Improving Rogue Scholar references

Photo by Grégoire Bertaud / Unsplash

This week's update of the Rogue Scholar science blog archive makes it easier to find and track the references in science blog posts, both on the website and via the API. The update again builds on functionality of the InvenioRDM repository platform, with some minor tweaks.

InvenioRDM is based on the DataCite metadata model, which treats references as a subset of related identifiers, with more than 20 different relation types. This can be confusing and has bothered me for a long time; I am much more comfortable with the simpler Crossref metadata model for references. An additional challenge with related identifiers is that some references don't include a link to a digital resource. In 2021 DataCite introduced related items to address this limitation, but unfortunately that makes the combination of related identifiers and related items even more complicated. There is confusion about when to use cites, references, or is supplemented by, and each of these relation types can go in either direction, e.g. references and is referenced by. In addition, there is an ongoing discussion about where to put these references: in a references section at the end of the publication, in a separate references section for data and/or software, or inline in the text where the external resource is mentioned. Finally, some communities use multiple kinds of persistent identifiers that may not have a minimal set of standard metadata available at a clearly defined machine-readable location. All this makes it much harder to find and track references, which remains an ongoing challenge, in particular for data and software citation.

In science blog posts, references are not (yet) widely used, and the focus has been on discussions via comments, linkbacks, and social media. But there is no good reason not to use traditional scholarly references in science blog posts if it makes sense for the content. As of today there are 799 Rogue Scholar blog posts with references included in the Crossref metadata (and many more informal references in the text). This number was higher a few months ago: a very prolific blogger had to stop using the WordPress plugin that generated the references for him because it was no longer supported by the newest PHP and WordPress versions. The old argument that scholarly citations take months to years to appear no longer holds for science blogs, where you can discuss and cite a paper and publish the blog post the next day or week, as in the example from Ross Mounce above, or in this post, where I cite the commonmeta software used to build this functionality, released on Monday.

The inclusion of references fortunately doesn't require a specific tool or functionality of the blogging platform, just standard formatting with a section called References and a list of DOI/URL links, as in this blog post.

Picture by Geoff Bilder

The full text needs to be available to Rogue Scholar for reference extraction, which is the case for all Rogue Scholar posts.
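To make the convention concrete, here is a minimal sketch of how a section called References followed by a list of DOI/URL links could be parsed from the plain text of a post. This is not the Rogue Scholar implementation: the function name `extract_references`, the regular expressions, and the output keys (`reference`, `scheme`, `identifier`) are all assumptions chosen for illustration, and entries without a link are simply skipped.

```python
import re

# Hypothetical patterns for this sketch, not taken from Rogue Scholar.
URL_PATTERN = re.compile(r"https?://\S+")
DOI_PATTERN = re.compile(r"^https?://(?:dx\.)?doi\.org/(10\.\d{4,9}/\S+)$")


def extract_references(full_text: str) -> list[dict]:
    """Return one dict per linked entry found after a 'References' heading."""
    lines = [line.strip() for line in full_text.splitlines()]
    try:
        start = lines.index("References") + 1
    except ValueError:
        return []  # no References section in this post

    references = []
    for line in lines[start:]:
        urls = URL_PATTERN.findall(line)
        if not urls:
            continue  # blank lines and entries without a link are skipped
        # Assume the link is the last URL in the entry; trim trailing punctuation.
        url = urls[-1].rstrip(".,;)")
        doi = DOI_PATTERN.match(url)
        references.append(
            {
                "reference": line,  # the unstructured text of the entry
                "scheme": "doi" if doi else "url",
                "identifier": doi.group(1) if doi else url,
            }
        )
    return references
```

Applied to the text of this post, each entry in the References section would yield its DOI together with the full reference string, while the copyright line (which contains no link) would be ignored.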

InvenioRDM understands DataCite related identifiers, but it also understands references. These references are a combination of an identifier and an unstructured text string, which aligns much better with the Crossref references data model that Rogue Scholar is using. The work this week mainly changed how references are displayed in the InvenioRDM user interface, e.g. as a numbered list instead of a bulleted list, with the DOI/URL displayed as a link.

In the initial implementation, I am showing the title and publication year together with the DataCite or Crossref DOI, or the URL. Additional metadata such as author name(s), journal name, or content type (journal article, preprint, dataset, etc.) would be desirable but would dramatically increase the required work.
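As a rough illustration of this display logic, the sketch below models a reference as an optional identifier plus title, publication year, and an unstructured fallback string, and renders a numbered list. The actual InvenioRDM user interface is built with templates rather than Python; the `Reference` class, its field names, and `render_references` are hypothetical and only mirror the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reference:
    """Hypothetical reference record: an identifier plus descriptive text."""

    identifier: Optional[str] = None  # DOI or URL, if the reference has one
    scheme: Optional[str] = None  # "doi" or "url" (labels assumed for this sketch)
    title: Optional[str] = None
    publication_year: Optional[int] = None
    unstructured: Optional[str] = None  # free-text fallback


def link_for(ref: Reference) -> Optional[str]:
    """Turn the identifier into a clickable link, if there is one."""
    if ref.identifier is None:
        return None
    if ref.scheme == "doi":
        return f"https://doi.org/{ref.identifier}"
    return ref.identifier


def render_references(refs: list[Reference]) -> str:
    """Render references as a numbered list: title, year, then the link."""
    lines = []
    for number, ref in enumerate(refs, start=1):
        label = ref.title or ref.unstructured or "Untitled reference"
        if ref.title and ref.publication_year:
            label = f"{ref.title} ({ref.publication_year})"
        link = link_for(ref)
        parts = [label] + ([link] if link else [])
        lines.append(f"{number}. " + " ".join(parts))
    return "\n".join(lines)
```

Keeping the unstructured string as a fallback means a reference without a link, or one whose metadata could not be looked up, still appears in the list instead of being dropped.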

The same information is also available via the InvenioRDM API, and references can be searched for via the web interface or the API. It will take a few weeks until all references are properly displayed in Rogue Scholar, but the functionality is now in place for all new blog posts. I hope the display of the references encourages more blog authors to include them and helps blog readers discover more relevant resources. In 2025 I will start tracking the citations of Rogue Scholar blog posts, which is of course the flip side of including references. As a Crossref member, Front Matter has access to this information via the Crossref cited-by service and will make it available to blog authors and readers in Rogue Scholar. With this work, Rogue Scholar further supports the Initiative for Open Citations: references not only become available under an open license, but authors are also enabled and encouraged to include formal references in their blog posts.
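A search for posts that cite a given DOI could look roughly like the sketch below, which only builds the request URL and does not call the service. InvenioRDM exposes record search at `/api/records` with a `q` query parameter; the host name, the field path `metadata.references.identifier`, and the helper name `build_reference_search_url` are assumptions on my part and may not match how Rogue Scholar indexes references.

```python
from urllib.parse import urlencode


def build_reference_search_url(
    identifier: str,
    base_url: str = "https://rogue-scholar.org",  # assumed host
    size: int = 10,
) -> str:
    """Build a records search URL for posts referencing the given identifier."""
    # Assumed field path; quoting the value keeps the DOI slash out of the query syntax.
    query = f'metadata.references.identifier:"{identifier}"'
    return f"{base_url}/api/records?" + urlencode({"q": query, "size": size})


# Example: posts referencing the commonmeta release cited in this post.
url = build_reference_search_url("10.5281/zenodo.14217168")
```

The resulting URL can be opened in a browser or passed to any HTTP client; if the field path differs in practice, only the `query` string needs to change.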

For questions about references in Rogue Scholar, reach out via email. To follow the latest Rogue Scholar developments, sign up for the RSS feed or newsletter of this blog.

References

Lowenberg, D., Lammey, R., Jones, M. B., Chodacki, J., & Fenner, M. (2021). Data Citation: Let’s Choose Adoption Over Perfection. Zenodo. https://doi.org/10.5281/zenodo.4701079

Fenner, M. (2024, March 27). Tracking references of Rogue Scholar blog posts. Front Matter. https://doi.org/10.53731/j77gv-54g66

Mounce, R. (2024, November 18). Web of Science can’t handle innovation. A Blog by Ross Mounce. https://doi.org/10.59350/nsgwq-1mb74

Sadki, R. (2024, November 11). Anecdote or lived experience: Reimagining knowledge for climate-resilient health systems. Reda Sadki. https://doi.org/10.59350/hgkcb-52q38

Fenner, M. (2024). front-matter/commonmeta: V0.6.10 (Version v0.6.10) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.14217168

Shotton, D. M. (2017, April 6). The Initiative for Open Citations. OpenCitations Blog. https://doi.org/10.59350/jdwj8-at997

Copyright © 2024 Martin Fenner. Distributed under the terms of the Creative Commons Attribution 4.0 License.