Improved PDF-formatted Rogue Scholar blog posts

On Monday, the Rogue Scholar science blog archive launched the export of blog posts to PDF and other formats. Over the last few days I have been busy improving the PDF output, and today I am releasing a new version, available for all of the more than 13K science blog posts archived by Rogue Scholar.

The PDF is generated using the open source Pandoc universal document converter in combination with the WeasyPrint library. There are many other options for automatic PDF generation, but WeasyPrint is a nice option for flexibly generating beautiful PDFs. The PDF files are generated on demand by the Rogue Scholar API based on the metadata and content stored in the Rogue Scholar search index, using the Markdown format. The first page of the PDF shows the main metadata, including the abstract and feature image, just what you would expect from a typical scholarly article.
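To give a rough idea of how Pandoc and WeasyPrint fit together, here is a minimal sketch in Python. It is not the Rogue Scholar implementation: the function names, file names, and the optional stylesheet are assumptions made for illustration. The only parts taken from the tools themselves are Pandoc's `--pdf-engine=weasyprint`, `--from`, `--output`, `--standalone`, and `--css` options.

```python
import shutil
import subprocess


def build_pandoc_command(markdown_path, pdf_path, css_path=None):
    """Return the pandoc argument list for a Markdown-to-PDF conversion.

    Hypothetical helper: the paths and the optional stylesheet are
    illustrative and not taken from the Rogue Scholar code.
    """
    command = [
        "pandoc",
        str(markdown_path),
        "--from=markdown",
        "--standalone",
        "--pdf-engine=weasyprint",  # pandoc renders HTML, WeasyPrint turns it into PDF
        f"--output={pdf_path}",
    ]
    if css_path is not None:
        # A stylesheet controls the page layout that WeasyPrint applies.
        command.append(f"--css={css_path}")
    return command


def convert_markdown_to_pdf(markdown_path, pdf_path, css_path=None):
    """Run the conversion; requires pandoc and weasyprint on the PATH."""
    for tool in ("pandoc", "weasyprint"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} was not found on PATH")
    subprocess.run(
        build_pandoc_command(markdown_path, pdf_path, css_path),
        check=True,  # raise CalledProcessError if pandoc reports a failure
    )
```

With both tools installed, calling `convert_markdown_to_pdf("post.md", "post.pdf")` would write `post.pdf` next to the (hypothetical) input file `post.md`.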

Links in the PDF are clickable, including the author ORCID iD(s), the DOI of the citation (formatted in APA style), and references in the full text.
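One way such links could end up clickable is to write them as ordinary Markdown links before the conversion, so that they are carried through to the PDF. The sketch below assumes exactly that and is not the Rogue Scholar code: the helper names are hypothetical, and the ORCID iD and DOI in the usage note are placeholders. The `https://orcid.org/` and `https://doi.org/` prefixes are the standard resolver URLs.

```python
def orcid_link(name, orcid):
    """Format an author name as a Markdown link to the ORCID record.

    Hypothetical helper for illustration.
    """
    return f"[{name}](https://orcid.org/{orcid})"


def doi_link(doi):
    """Format a DOI as a Markdown link that shows the full resolver URL."""
    url = f"https://doi.org/{doi}"
    return f"[{url}]({url})"
```

For example, `orcid_link("Jane Doe", "0000-0002-1825-0097")` and `doi_link("10.1234/example")` return Markdown links that a converter can turn into clickable links in the PDF. Both values are placeholders, not real records from the archive.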

Because the PDF generation is fully automated, it doesn't handle all possible edge cases, and some metadata (e.g. funding information) aren't included yet. But the Rogue Scholar PDF is fully functional, nice-looking, and will only get better over time. Please let me know (e.g. in the comments) if you discover issues with the PDF or have a feature request.


Fenner, M. (2024). Every Rogue Scholar blog post now available in Markdown, ePub, and PDF formats. Front Matter.

Copyright © 2024 Martin Fenner. Distributed under the terms of the Creative Commons Attribution 4.0 License.