On Wednesday, DataCite released version 4.5 of the DataCite metadata schema. Today I released updated versions of the commonmeta Ruby and Python libraries that fully support the new schema. You can install them via RubyGems and PyPI, respectively.

DataCite schema 4.5 is a minor update, and the first schema update since schema 4.4 was released in March 2021 (when I was still DataCite Technical Director). The new version adds the resourceTypeGeneral values Instrument and StudyProtocol, and allows an optional publisher identifier in the required publisher field.

PIDs for instruments have been discussed for several years, coordinated by a dedicated Research Data Alliance Working Group. They make it easy to connect research outputs in metadata to the instruments that helped generate them, via the new relation types Collects and IsCollectedBy.
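A minimal sketch of how the two relation types pair up, written as plain Python dictionaries in the style of DataCite JSON related identifiers. The helper name collection_relations and both DOIs are hypothetical placeholders for illustration, not part of commonmeta or of any real record.

```python
def collection_relations(instrument_doi, dataset_doi):
    """Return the reciprocal related-identifier entries for a dataset
    and the instrument that collected it (illustrative helper only)."""
    # Goes into the dataset's metadata: the dataset IsCollectedBy the instrument.
    dataset_side = {
        "relatedIdentifier": instrument_doi,
        "relatedIdentifierType": "DOI",
        "relationType": "IsCollectedBy",
    }
    # Goes into the instrument's metadata: the instrument Collects the dataset.
    instrument_side = {
        "relatedIdentifier": dataset_doi,
        "relatedIdentifierType": "DOI",
        "relationType": "Collects",
    }
    return dataset_side, instrument_side


# Placeholder DOIs, assumed for this example only.
dataset_side, instrument_side = collection_relations(
    "10.5555/example-instrument", "10.5555/example-dataset"
)
print(dataset_side)
print(instrument_side)
```

Each record points at the other, so the link can be followed from either the dataset or the instrument.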

Publisher has been a required DataCite metadata field since the launch of DataCite. With schema 4.5 this property can now include an optional identifier, typically a Research Organization Registry (ROR) ID. This will make it easier to find all scholarly resources with DataCite DOIs published by a particular publisher.
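As a rough illustration, the publisher element with and without the optional identifier can be built with Python's standard library. The attribute names follow my reading of the schema 4.5 documentation (publisherIdentifier, publisherIdentifierScheme, schemeURI). The helper publisher_element, the publisher name, and the ROR ID are placeholders made up for this sketch.

```python
import xml.etree.ElementTree as ET


def publisher_element(name, ror_id=None):
    """Build a DataCite <publisher> element, optionally with a ROR ID
    (illustrative helper, not part of commonmeta)."""
    element = ET.Element("publisher")
    element.text = name
    if ror_id is not None:
        # Optional attributes introduced with schema 4.5.
        element.set("publisherIdentifier", ror_id)
        element.set("publisherIdentifierScheme", "ROR")
        element.set("schemeURI", "https://ror.org/")
    return element


# Before 4.5: the publisher name only.
plain = publisher_element("Example Publisher")
# With 4.5: the same name plus a placeholder (not real) ROR ID.
with_ror = publisher_element("Example Publisher", "https://ror.org/0example00")

print(ET.tostring(plain, encoding="unicode"))
print(ET.tostring(with_ror, encoding="unicode"))
```

Because the identifier is optional, metadata that only provides a publisher name remains valid under schema 4.5.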

Commonmeta is a metadata standard for scholarly metadata that I launched in early 2023. Currently implemented via open source libraries for the Ruby and Python languages, commonmeta allows easy conversion between common scholarly metadata formats, including Crossref and DataCite metadata, BibTeX, RIS, Schema.org, and formatted citations in any of the thousands of available citation styles.

While this is a minor update to the commonmeta-ruby library (commonmeta already supported publisher identifiers and the Instrument resource type), the changes for commonmeta-py are bigger, as I am working towards a version 1.0 release in the next few months. The main driver at the moment is DOI registration for science blog posts included in the Rogue Scholar science blog archive. I want to migrate the current workflow, which uses GitHub Actions and commonmeta-ruby, to background workers in the Rogue Scholar API written with commonmeta-py. The main reason is the performance bottleneck of GitHub Actions, which works great for continuous integration but is not the best fit for very frequent background jobs.

You can see commonmeta-py at work generating DataCite or Crossref metadata on the fly via the Rogue Scholar API using the following links (Crossref, DataCite) for the 2020 blog post by Henry Rzepa mentioned earlier. Similar links work for all Rogue Scholar blog post DOIs. The publisher metadata of these DataCite DOIs doesn't include a publisher identifier yet, as this information has not been systematically collected for Rogue Scholar blogs, and/or ROR identifiers are not (yet) available for single-person organizations such as Front Matter. Work to do in 2024.

References

Stathis, K., Ross, C., Vogt, S., & Siziva, K. (2024). Introducing DataCite Metadata Schema 4.5. DataCite Blog. https://doi.org/10.5438/JVKK-8198

Rzepa, H. (2020). The Persistent Identifier ecosystem expands – to instruments! Henry Rzepa’s Blog. https://doi.org/10.59350/3jb9h-jcd77

Fenner, M. (2023). Announcing commonmeta. Front Matter. https://doi.org/10.53731/cp7apdj-jk5f471

Fenner, M. (2024). Improving rogue scholar metadata conversions. Front Matter. https://doi.org/10.53731/n4vfp-nwb08