This week the commonmeta-py Python library adds an important new feature: metadata lists. With this feature, commonmeta-py no longer operates only on metadata for a single scholarly work (e.g. a journal article, book, dataset, software, or blog post), but can also handle lists of scholarly works.

Most of the metadata formats supported by commonmeta-py (Crossref and DataCite JSON and XML, BibTeX, RIS, CSL JSON, or Schema.org) can represent lists of works. Typical examples of such lists are:

  • a keyword query with the Crossref or DataCite REST APIs,
  • all articles in a journal issue or volume, or all chapters in a book,
  • all references in a journal article, or citations of that article,
  • all references in a Zotero reference manager category,
  • all versions of a preprint, dataset, or software.
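
As an illustration of the first kind of list, a keyword query against the Crossref or DataCite REST API returns many works in one response. The sketch below only builds the query URLs and does not call commonmeta-py; the helper names are hypothetical, and the parameters shown (query and rows for Crossref, query and page[size] for DataCite) are the ones assumed here.

```python
from urllib.parse import urlencode


def crossref_query_url(keywords: str, rows: int = 20) -> str:
    """Build a Crossref REST API URL for a keyword query (hypothetical helper)."""
    params = {"query": keywords, "rows": rows}
    return "https://api.crossref.org/works?" + urlencode(params)


def datacite_query_url(keywords: str, page_size: int = 20) -> str:
    """Build a DataCite REST API URL for a keyword query (hypothetical helper)."""
    params = {"query": keywords, "page[size]": page_size}
    return "https://api.datacite.org/dois?" + urlencode(params)


if __name__ == "__main__":
    # Each URL returns a JSON document that contains a list of works.
    print(crossref_query_url("science blog", rows=5))
    print(datacite_query_url("science blog", page_size=5))
```

Fetching either URL yields one JSON response with several records, which is the kind of input a metadata list is meant to represent.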

The initial support for metadata lists in version 0.14 of the commonmeta-py library allows reading and writing metadata lists via a single Python method or command-line argument. Over time, more helper methods and error checks will be added, e.g. to support lists of works from ORCID records.

This new feature also addresses an important use case for the Rogue Scholar science blog archive. The Crossref API that Rogue Scholar uses for DOI registrations supports batch processing of DOI metadata. Instead of registering or updating one DOI record every 10 minutes, Rogue Scholar now updates 100 DOIs every 10 minutes by uploading a single XML file that contains the metadata for all of them. This requires two method calls: 1) fetch all Rogue Scholar blog posts that need updating and store them in a single JSON file, and 2) generate a single XML file and upload it to the Crossref API. With this streamlined workflow, all 15k Rogue Scholar metadata records can be updated with Crossref in about 24 hours, compared to the weeks it took before (a 100x speed improvement). Rogue Scholar can therefore continue to use GitHub Actions for DOI registrations and, for the time being, avoid more complex and expensive infrastructure setups.
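
The batching idea and the arithmetic behind the speedup can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code Rogue Scholar runs: the function names are hypothetical, the XML element names are placeholders and not the Crossref deposit schema, and the upload step is omitted.

```python
import xml.etree.ElementTree as ET

BATCH_SIZE = 100        # DOIs per XML file, as described above
INTERVAL_MINUTES = 10   # one scheduled run every 10 minutes


def batches(records, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def batch_to_xml(batch):
    """Serialize one batch into a single XML string.

    Placeholder element names only; a real deposit would follow the
    Crossref schema.
    """
    root = ET.Element("batch")
    for record in batch:
        item = ET.SubElement(root, "record", doi=record["doi"])
        ET.SubElement(item, "title").text = record["title"]
    return ET.tostring(root, encoding="unicode")


def hours_to_update(total_records, per_run, interval_minutes=INTERVAL_MINUTES):
    """Hours needed when each run handles `per_run` records."""
    runs = -(-total_records // per_run)  # ceiling division
    return runs * interval_minutes / 60


if __name__ == "__main__":
    posts = [{"doi": f"10.5555/{i}", "title": f"Post {i}"} for i in range(250)]
    print([len(b) for b in batches(posts)])
    print(hours_to_update(15_000, 100), "hours in batches of 100")
    print(hours_to_update(15_000, 1) / 24, "days one record at a time")
```

Under these assumptions, 15,000 records take 150 runs, or roughly a day, when handled 100 at a time, versus more than 100 days when handled one record per run.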

Rogue Scholar can now quickly update DOI metadata in bulk, e.g. add an ORCID iD to all blog post authors, update the blog name and blog post URLs if a blog moves to a different platform, or add the archive location for all blog posts (once Crossref enables archive_locations metadata for content type posted-content). Registration of new content in batches is not implemented yet, as it requires additional logic for generating a random DOI string for every new content item.
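
To give a sense of what that additional logic might involve, here is a minimal sketch that generates distinct random DOI strings for a batch of new items. It is an assumption-laden illustration, not the algorithm commonmeta-py or Rogue Scholar uses: the function names, the suffix length and grouping, the lack of a checksum, and the example prefix 10.5555 are all choices made for this sketch.

```python
import secrets

# Crockford base32 alphabet in lowercase (omits i, l, o, u to avoid confusion)
CROCKFORD_ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def random_doi(prefix, length=10, group=5):
    """Return prefix/suffix, where the suffix is random and hyphen-grouped."""
    chars = "".join(secrets.choice(CROCKFORD_ALPHABET) for _ in range(length))
    groups = [chars[i:i + group] for i in range(0, length, group)]
    return f"{prefix}/{'-'.join(groups)}"


def random_dois(prefix, count, taken=frozenset()):
    """Generate `count` distinct DOIs that do not collide with `taken`."""
    result = set()
    while len(result) < count:
        candidate = random_doi(prefix)
        if candidate not in taken:
            result.add(candidate)
    return sorted(result)


if __name__ == "__main__":
    # 10.5555 is used here purely as an example prefix.
    for doi in random_dois("10.5555", 3):
        print(doi)
```

In a batch workflow, each new item would need such a string assigned, and checked against already registered DOIs, before the XML file for the batch is generated.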

Commonmeta-py can be installed via PyPI and can be found on GitHub. The Ruby version of this library will be updated with similar functionality in the coming weeks.

References

Fenner, M. (2023). Announcing commonmeta. Front Matter. https://doi.org/10.53731/cp7apdj-jk5f471

Fenner, M. (2024). Commonmeta-py (013.2) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.8340374