The Rogue Scholar archive reaches a milestone: 1000 searchable full-text science blog posts with DOIs

The Rogue Scholar science blog archive launched in April, and I have been busy building out its core features: archiving the full text of blog posts, establishing full-text search, and registering DOIs and metadata for all posts. My announced goal was to complete this work by the end of the second quarter.

It is now July, and I am happy to report that the core features are working. The Rogue Scholar now includes 1,000 blog posts that are available via full-text search, each with a DOI linking to the original post on one of 35 science blogs. That is an important milestone worth celebrating.

Rogue Scholar blog posts

Equally important, this milestone has been reached without major technical work for the participating blogs. Rogue Scholar works with all blogging platforms that publish scholarly content and have an RSS or Atom feed with full-text content distributed under a Creative Commons Attribution (CC BY 4.0) license – currently nine different blogging platforms, from WordPress, Blogger, and Ghost to several static site generators. The major issue was author names, which was usually resolved by configuration changes, e.g. in the WordPress author profile.
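To make the two feed requirements concrete (an author name and full-text content per post), here is a minimal sketch of how a blog owner might check an Atom feed before submitting it. It is not the Rogue Scholar's actual import code: the function name `check_atom_entries` and the sample feed are made up for illustration, and the check is simplified (it only looks at the text of the `content` element, so `type="xhtml"` content would need extra handling, and RSS feeds are not covered).

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace

# Hypothetical two-entry feed: one complete entry, one with only a summary.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry>
    <title>Post with full text</title>
    <author><name>Jane Doe</name></author>
    <content type="html">&lt;p&gt;Complete post body.&lt;/p&gt;</content>
  </entry>
  <entry>
    <title>Summary only</title>
    <summary>Short teaser.</summary>
  </entry>
</feed>"""


def check_atom_entries(feed_xml: str) -> list[dict]:
    """Report, per entry, whether an author name and full-text content are present."""
    root = ET.fromstring(feed_xml)
    # In Atom, a feed-level author applies to entries that have none of their own.
    feed_author = root.findtext(f"{ATOM}author/{ATOM}name")
    report = []
    for entry in root.findall(f"{ATOM}entry"):
        author = entry.findtext(f"{ATOM}author/{ATOM}name") or feed_author
        content = entry.findtext(f"{ATOM}content")
        report.append({
            "title": entry.findtext(f"{ATOM}title", default=""),
            "has_author": bool(author and author.strip()),
            "has_full_text": bool(content and content.strip()),
        })
    return report


for row in check_atom_entries(SAMPLE_FEED):
    print(row)
```

In this sample, the first entry passes both checks and the second fails both, which is the kind of feed that would need a configuration change on the blog side.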

And the implementation doesn't take any shortcuts: the DOI metadata include abstract, language, license, and subject category (OECD Fields of Science) for all posts, and author ORCID iD and references for some posts.

Crossref Participation Reports for blog posts registered by Front Matter

Rogue Scholar also does this without major costs for the participating blogs (free for up to 50 posts per year, and a one-time fee of $1 per post thereafter) or for Front Matter, which hosts the blog archive (monthly costs under $200 plus $0.25 per DOI registration). This is possible because Rogue Scholar follows three principles: using open source software (more details in another post), automation as much as possible, and community participation.
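The numbers above are easy to turn into a back-of-the-envelope calculation. The sketch below is only an illustration of the arithmetic, not Front Matter's billing logic: the function names are hypothetical, it assumes the 50 free posts are counted per blog per year, and it treats the $200 monthly figure as an upper bound. Amounts are kept in cents to avoid floating-point rounding.

```python
FREE_POSTS_PER_YEAR = 50             # posts per blog per year at no charge
FEE_PER_EXTRA_POST_CENTS = 100       # one-time $1 fee for each further post
DOI_REGISTRATION_CENTS = 25          # $0.25 per DOI registration
MONTHLY_HOSTING_CEILING_CENTS = 200_00  # hosting stated as under $200 per month


def blog_fee_cents(posts_in_year: int) -> int:
    """Fee for one blog in one year (assumed per-blog, per-year counting)."""
    if posts_in_year < 0:
        raise ValueError("posts_in_year must not be negative")
    extra_posts = max(0, posts_in_year - FREE_POSTS_PER_YEAR)
    return extra_posts * FEE_PER_EXTRA_POST_CENTS


def archive_cost_ceiling_cents(months: int, doi_registrations: int) -> int:
    """Upper bound on archive costs: hosting ceiling plus DOI registrations."""
    return (months * MONTHLY_HOSTING_CEILING_CENTS
            + doi_registrations * DOI_REGISTRATION_CENTS)


def as_dollars(cents: int) -> str:
    """Format an integer number of cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


print(as_dollars(blog_fee_cents(120)))                  # blog with 120 posts in a year
print(as_dollars(archive_cost_ceiling_cents(3, 1000)))  # three months, 1,000 DOIs
```

Under these assumptions, a blog publishing 120 posts in a year would pay $70.00, and registering 1,000 DOIs accounts for $250.00 of the archive's costs.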

Reaching this milestone demonstrates that a central archive of science blogs with full-text content, and with DOIs and relevant metadata for all blog posts, is feasible, making an important contribution to open scholarly infrastructure.

What comes next? Besides a lot of detailed work (e.g. working with six blogs registered with the Rogue Scholar that have RSS/Atom feeds with incomplete author metadata or no full-text RSS feed), the main goals for the next three months are:

  • Improve the payment workflow, including automated payment processing and a sponsorship option for organizations wanting to support Rogue Scholar and/or specific blogs
  • Include more science blogs, more blog posts (so far I have included all posts from 12 blogs), and improve the metadata (e.g. ORCID identifiers and references). In particular, I want to include blogs that publish posts in languages other than English (currently only three of the 35 blogs).
  • Build a community of Rogue Scholar bloggers and users. I have gotten a lot of feedback in the last few months but would like to better understand how people are currently using the Rogue Scholar and what can be improved. The starting point is the Rogue Scholar Discord community, but there are also other feedback channels, including email, Mastodon, Zoom, and of course personal communications (you can find me at the Open Science Festival Cologne this week).


Fenner M. The Rogue Scholar: An Archive for Scholarly blogs. Published online January 31, 2023. doi:10.54900/bj4g7p2-2f0fn9b

Fenner M. The Rogue Scholar is now open for business. Published online April 4, 2023. doi:10.53731/z9v2s-bh329

Fenner M. Starting to register DOIs for all blog posts included in the Rogue Scholar. Published online June 5, 2023. doi:10.53731/m9fs5-nap05

Bilder G, Lin J, Neylon C. Principles for Open Scholarly Infrastructures-v1. Published online 2015. doi:10.6084/M9.FIGSHARE.1314859

Copyright © 2023 Martin Fenner. Distributed under the terms of the Creative Commons Attribution 4.0 License.