Starting in February, the Rogue Scholar science blog archive is showing the citations (publications that cite the blog post) of science blog posts it archives. Rogue Scholar uses the same format for this as the blog post references – a reference string in APA style and a DOI link to the citing publication:

New preprint on Transformative Agreements https://doi.org/10.59350/fe9s4-8a179

This information is available on the website and in the REST API, enabling participating science blogs to fetch this information programmatically. Users can search for blog posts with citations (from one particular blog/community or from all blog posts) via Rogue Scholar search:

Searching for citations:* returns all blog posts with citations, whereas citations:10.1371* finds all blog posts cited by a DOI with prefix 10.1371 (PLOS), and citations:10.1101/167619 finds all blog posts cited by the bioRxiv preprint with DOI 10.1101/167619.
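The three query forms above can be built programmatically. The following is a minimal sketch, not the official client: the helper names are hypothetical, and the search URL is an assumption based on the records search endpoint that InvenioRDM installations usually expose, so check the Rogue Scholar API documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumption: InvenioRDM-style records search endpoint (not stated in this post).
SEARCH_URL = "https://rogue-scholar.org/api/records"


def citations_query(doi_or_prefix=None):
    """Build a search query for the citations field (hypothetical helper)."""
    if doi_or_prefix is None:
        # All blog posts that have at least one citation.
        return "citations:*"
    if "/" not in doi_or_prefix:
        # A bare DOI prefix such as 10.1371: match every citing DOI under it.
        return f"citations:{doi_or_prefix}*"
    # A full DOI: match blog posts cited by exactly this publication.
    return f"citations:{doi_or_prefix}"


def search_url(query, base=SEARCH_URL):
    """Return a percent-encoded search URL for the given query."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for target in (None, "10.1371", "10.1101/167619"):
        print(search_url(citations_query(target)))
```

The query strings are the same ones you would type into the search box on the website; only the URL wrapper is assumed here.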

Crossref Cited-by service

Rogue Scholar uses the Crossref Cited-by service, which is available to Crossref members. A small number (nine as of today) of science blogs participating in Rogue Scholar register their own DOIs and do this with DataCite. Citations of those DOIs are currently not available in Rogue Scholar, nor are citations of any Rogue Scholar blog posts by scholarly publications registered with DataCite.

As of today, there are 948 citations of 568 Rogue Scholar blog posts discovered by the Crossref Cited-by service. You can go to all of these publications and check how and why they cite a science blog post. An initial manual investigation of the Cited-by links found almost all Rogue Scholar blog posts in the references of those citing publications. Very rarely, a reference pointed to a related publication by the same author(s) instead of the blog post; this needs more investigation.

Science blog posts do not appear to be formally cited very often by other scholarly works. There may be several reasons for this, including:

  • Blog posts are mentioned in the text (maybe including a link), but don't appear in the formal references,
  • Blog posts have been included in the references, but the publisher removed the reference, as referencing web resources was discouraged or the journal workflow couldn't handle web links,
  • The blog moved several times since blog post publication, making it difficult for the citing author, publisher, and Crossref to keep track of the cited resource.

Similar arguments are often found in the context of citing (or not citing) data or software.

Citations in the InvenioRDM repository software

The InvenioRDM repository software that powers Rogue Scholar is modeled around the DataCite metadata model, which uses relatedIdentifiers with currently more than 20 relation types rather than the more straightforward concept of references and citations. Rogue Scholar therefore implemented a custom_field for citations. The Rogue Scholar API automatically fetches citations from Crossref, formats the metadata in the APA citation style, and sends the information to Rogue Scholar. I will discuss with the broader InvenioRDM community whether there is interest in storing citations this way, which is different from what, for example, the Zenodo repository is doing with software citation.
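To illustrate the formatting step, here is a rough sketch of turning citation metadata into an APA-style reference string with a DOI link. It is not the Rogue Scholar implementation: the dictionary keys and function names are hypothetical, and real APA formatting covers many more cases (missing dates, organizational authors, long author lists).

```python
from datetime import date

# Month names are spelled out here to avoid depending on the system locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def initials(given):
    """Reduce given names to initials, e.g. 'J. B.' or 'Jane Beth' -> 'J. B.'."""
    return " ".join(f"{part[0]}." for part in given.split())


def format_authors(authors):
    """Join (family, given) pairs in APA style, with ', &' before the last."""
    names = [f"{family}, {initials(given)}" for family, given in authors]
    if len(names) <= 1:
        return "".join(names)
    return ", ".join(names[:-1]) + ", & " + names[-1]


def format_reference(item):
    """Build a simplified APA-style reference string ending in a DOI link."""
    d = item["published"]
    stamp = f"{d.year}, {MONTHS[d.month - 1]} {d.day}"
    return (f"{format_authors(item['authors'])} ({stamp}). "
            f"{item['title']}. {item['container']}. "
            f"https://doi.org/{item['doi']}")


# Hypothetical input shape, filled with an entry from the reference list below.
example = {
    "authors": [("Wedel", "M.")],
    "published": date(2024, 12, 22),
    "title": "About that Saurophaganax paper",
    "container": "Sauropod Vertebra Picture of the Week",
    "doi": "10.59350/ffgmk-zjj78",
}

if __name__ == "__main__":
    print(format_reference(example))
```

Storing the finished string next to the DOI keeps the landing page simple to render, at the cost of having to regenerate the string if the citation style changes.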

Timeline of citations

The time it takes for citations to appear can be much shorter if the citing publication is a blog post. The Rogue Scholar January Newsletter was published on January 30 (four days ago) and cited a blog post published on December 22. The citation appeared in the Crossref Cited-by service a day later and can thus be shown on the Rogue Scholar landing page for that blog post:

About that Saurophaganax paper. https://doi.org/10.59350/ffgmk-zjj78

This dramatically changes the timeline of citations from years (books and journal articles as the citing publication) and months (preprints and conference proceedings) to days (blog posts) – similar to the dynamics of usage stats and altmetrics.

On the other hand, I find citations of science blog posts years after publication. The Crossref Cited-by service is often smart enough to figure out that a citation using a web link to a blog post is the same as a citation using a DOI. This means many citations of blog posts archived in Rogue Scholar are available even though Rogue Scholar only launched in 2023. For example, this blog post from 2010:

Introducing the Semantic Publishing and Referencing (SPAR) Ontologies. https://doi.org/10.59350/rs068-15d95

What kind of content is cited

Science blog posts include a variety of different document types, from short announcements and opinion pieces to detailed reports and investigations that include data and code.

A blog post very popular on social media in 2012 has not been cited much (at least Crossref Cited-by doesn't find that):

Sick of Impact Factors. https://doi.org/10.59350/xqrv5-7bv94

An example of a data science article with data and code is the following:

A guide to modeling proportions with Bayesian beta and zero-inflated beta regression models. https://doi.org/10.59350/7p1a4-0tw75

For the Principles for Open Scholarly Infrastructures, a number of scholarly publications cite the original blog post, even though that blog post asks authors to cite the PDF version deposited in the Figshare repository.

Principles for Open Scholarly Infrastructures. https://doi.org/10.59350/b7mtv-gpn88

A blog post about how to cite R and R packages is a good example of the type of content that is frequently cited:

How to Cite R and R Packages. https://doi.org/10.59350/t79xt-tf203

What kind of content is citing

Rogue Scholar citations are currently collected from the Crossref Cited-by service, meaning that software, datasets, presentations, or other content types not registered with Crossref are currently not discovered as citing items. If the content has multiple versions, it is common that they all cite the Rogue Scholar blog post:

Hybrid open access is unreliable. https://doi.org/10.59350/pberv-kyt47

Referencing other blog posts is common and often includes self-citation. Self-citation is often frowned upon in journal articles or book chapters, but understandable and desired in blog posts. We can see this pattern also in Rogue Scholar citations, and I will check how many of the nearly 1,000 citations in the Crossref Cited-by service are by other science blogs.

Please reach out via email or the new Slack community if your blog is archived by Rogue Scholar and you have questions or feedback regarding the new citations service.

References

Rothfritz, L. (2024, November 17). New preprint on Transformative Agreements. Research Group Information Management @ Humboldt-Universität Zu Berlin. https://doi.org/10.59350/fe9s4-8a179

Ioannidis, A., & Gonzalez Lopez, J. B. (2019). Asclepias: Flower Power for Software Citation. https://doi.org/10.5281/ZENODO.2548643

Wedel, M. (2024, December 22). About that Saurophaganax paper. Sauropod Vertebra Picture of the Week. https://doi.org/10.59350/ffgmk-zjj78

Shotton, D. M. (2010, October 14). Introducing the Semantic Publishing and Referencing (SPAR) Ontologies. OpenCitations Blog. https://doi.org/10.59350/rs068-15d95

Curry, S. (2012, August 13). Sick of Impact Factors. Reciprocal Space. https://doi.org/10.59350/xqrv5-7bv94

Heiss, A. (2021, November 8). A guide to modeling proportions with Bayesian beta and zero-inflated beta regression models. Andrew Heiss’s Blog. https://doi.org/10.59350/7p1a4-0tw75

Bilder, G. (2015, February 23). Principles for Open Scholarly Infrastructures. Science in the Open. https://doi.org/10.59350/b7mtv-gpn88

LaZerte, S. (2021, November 16). How to Cite R and R Packages. rOpenSci - Open Tools for Open Science. https://doi.org/10.59350/t79xt-tf203

Mounce, R. (2017, February 20). Hybrid open access is unreliable. A Blog by Ross Mounce. https://doi.org/10.59350/pberv-kyt47