The Rogue Scholar science blog archive depends on RSS feeds to automatically collect metadata and content. Atom, JSON Feed, and JSON APIs (e.g. for the WordPress, Ghost, or Substack platforms) are closely related to RSS. The Rogue Scholar API understands these different formats and regularly (currently every 10 min) checks participating blogs for new or updated content and associated metadata.
Rogue Scholar uses these metadata to register DOIs for posts of participating blogs, unless the blog does its own DOI registration. The latter makes sense if the blog is run by an organization that is a Crossref or DataCite member, which is currently the case for six blogs participating in Rogue Scholar. Unfortunately, the DOIs registered by these six blogs are not included in their respective RSS/Atom feeds, so every new blog post requires manual work (copying and pasting the DOI into the Rogue Scholar database).
This becomes a problem when a blog publishes frequently or when a new blog with hundreds of existing posts is added to Rogue Scholar. In addition, the metadata that Rogue Scholar shows come from the RSS feed and sometimes differ slightly from the DOI metadata, depending on how the blog registers them.
Another issue is that blogs participating in Rogue Scholar sometimes want to know the DOI at publication time, e.g. when the blog posts are anticipated to be widely shared.
My main goals regarding DOI registration when starting Rogue Scholar in 2023 were twofold:
- provide DOI registration for bloggers who can't afford Crossref or DataCite membership (at least $275 or €3000 a year respectively, plus a fee for each DOI),
- make DOI registration quick and easy while still providing relevant metadata. This goal was addressed by automating as much as possible and by building workflows that complete DOI registration less than 30 min after publication of the blog post.
While these goals have been achieved, the issues mentioned above need to be addressed, and one crucial step is to include the DOI in the RSS feed.
RSS feeds have two metadata elements for each item that are relevant here:
guid
guid stands for globally unique identifier. It's a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new. Called id in Atom and JSON Feed.
link
The URL of the item. Called url in JSON Feed.
DOI registration requires both a DOI and a URL, so that the DOI resolver at https://doi.org can redirect DOI requests to the registered URL. DOIs are globally unique identifiers, so they can serve as the guid/id in your blog feed. This is straightforward if you have full control over how your feed is generated.
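As an illustration of the paragraph above, here is a minimal Python sketch of an RSS item that carries the DOI as its guid and the post URL as its link. The function name `build_rss_item`, the blog URL and the DOI are all made up for this example, and how you would hook something like this into your own feed generator depends on your blogging platform.

```python
import xml.etree.ElementTree as ET


def build_rss_item(title: str, url: str, doi: str) -> ET.Element:
    """Build an RSS <item> that uses the DOI (expressed as a URL) as guid."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # link: the URL of the post, i.e. the URL registered for the DOI
    ET.SubElement(item, "link").text = url
    # guid: the DOI written as a resolvable https://doi.org URL
    ET.SubElement(item, "guid").text = f"https://doi.org/{doi}"
    return item


# Hypothetical post and DOI, for illustration only.
item = build_rss_item(
    title="Example post",
    url="https://blog.example.org/posts/example-post",
    doi="10.5555/example-post",
)
print(ET.tostring(item, encoding="unicode"))
```

An aggregator reading this item gets both pieces of information needed for DOI handling: the identifier from guid and the landing page from link.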
Unfortunately, this is often not the case: your blog may be hosted elsewhere, with only limited options for customizing the feed of your content. For blogs powered by databases (e.g. WordPress or Ghost), you also need a place in the database where you can store the DOI. A good option is a field for a custom canonical URL, which the Ghost platform, for example, provides. WordPress – by far the most popular blogging platform – unfortunately has no built-in support for storing DOIs or custom canonical URLs and requires a plugin.
The Atom feed format has several important advantages over RSS, including support for author metadata (e.g. the ORCID ID), multiple authors, and a post modification date. In the context of DOI support, another feature is important: support for multiple links via the rel attribute, e.g. rel="alternate" for the post URL and rel="cite-as" for the post DOI. The JSON Feed spec has url and external_url, and external_url can be used to include the DOI (not equivalent to cite-as, but close enough for practical purposes).
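To make this concrete, here is a rough Python sketch of how a feed reader could pick up the DOI from an Atom entry (via rel="cite-as") or from a JSON Feed item (via external_url). This is not how the Rogue Scholar API is implemented; the function names, the sample entry and the DOI are assumptions made up for illustration, and the sketch assumes the DOI is written as an https://doi.org URL.

```python
import json
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
DOI_PREFIX = "https://doi.org/"  # assumption: DOIs appear as https://doi.org URLs


def doi_from_atom_entry(entry: ET.Element) -> Optional[str]:
    """Return the href of the first rel="cite-as" link that is a DOI URL."""
    for link in entry.findall("atom:link", ATOM_NS):
        href = link.get("href", "")
        if link.get("rel") == "cite-as" and href.startswith(DOI_PREFIX):
            return href
    return None


def doi_from_json_feed_item(item: dict) -> Optional[str]:
    """Return external_url if it is a DOI URL, otherwise None."""
    external_url = item.get("external_url") or ""
    return external_url if external_url.startswith(DOI_PREFIX) else None


# Hypothetical Atom entry with separate links for the post URL and the DOI.
ATOM_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example post</title>
  <id>https://doi.org/10.5555/example-post</id>
  <link rel="alternate" href="https://blog.example.org/posts/example-post"/>
  <link rel="cite-as" href="https://doi.org/10.5555/example-post"/>
</entry>"""

# The same hypothetical post as a JSON Feed item.
JSON_ITEM = json.loads("""{
  "id": "https://doi.org/10.5555/example-post",
  "url": "https://blog.example.org/posts/example-post",
  "external_url": "https://doi.org/10.5555/example-post"
}""")

print(doi_from_atom_entry(ET.fromstring(ATOM_ENTRY)))
print(doi_from_json_feed_item(JSON_ITEM))
```

Both functions return None when no DOI is found, so a reader could fall back to whatever it does today for feeds without DOIs.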
In the coming months, Rogue Scholar will add support for these various ways to include DOIs in blog feeds. Please reach out if you are interested in (or have already added) this functionality. I will start work on plugins for two popular blogging platforms that add this functionality (and more): WordPress and Hugo. With these plugins, I also want to help bloggers who are not Crossref or DataCite members by enabling DOI registration (as pending publication) for draft blog posts.