Major update on Commonmeta Crossref DOI registration
Today I released a new version of the commonmeta-py Python library with major improvements in Crossref DOI registration, including refactoring to use the Python marshmallow library, XML schema validation, and API calls to Crossref and InvenioRDM instances via the commonmeta-py command-line interface.
Using the marshmallow library
Marshmallow is a popular Python library for converting complex objects to and from simple Python datatypes. The InvenioRDM repository software heavily uses marshmallow to convert metadata to and from JSON. Marshmallow is not specific to JSON, and writing Crossref metadata in XML requires an additional serialization step. commonmeta-py already used xmltodict to convert XML to Python data structures and now also uses it for writing XML, replacing lxml and the ElementTree API. The previous approach worked well but didn't integrate with the rest of commonmeta-py, as the Crossref XML writer is the only place where commonmeta-py currently writes XML. More importantly, this change will make it easier to integrate commonmeta-py into InvenioRDM for Crossref DOI registration.
XML schema validation
Crossref metadata are fairly complex and have different requirements depending on content type. For example, International Standard Serial Numbers (ISSNs) are only supported for some content types, and the required order of metadata elements can differ. For this reason, XML schema validation before submission is critical, and commonmeta-py now supports this, using the recently released schema 5.4.0. A large part of the work for this update was generating and validating XML for the various Crossref content types. I could not cover all use cases, so feedback is appreciated, e.g., by sending me DOIs that are registered with Crossref but fail validation in commonmeta-py.
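To illustrate why element order matters in XSD validation, here is a small sketch using lxml as the validator. Both the choice of lxml and the tiny inline schema are assumptions for this example. The inline schema is a toy stand-in, not the Crossref 5.4.0 schema, and the function name is hypothetical.

```python
from lxml import etree

# Toy XSD (not the Crossref schema): xs:sequence requires title before doi.
XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="journal_article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="doi" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))


def validation_errors(xml: bytes) -> list:
    """Return a list of error messages; an empty list means the XML is valid."""
    doc = etree.fromstring(xml)
    if schema.validate(doc):
        return []
    return [f"line {error.line}: {error.message}" for error in schema.error_log]


valid = b"<journal_article><title>A</title><doi>10.5555/1</doi></journal_article>"
wrong_order = b"<journal_article><doi>10.5555/1</doi><title>A</title></journal_article>"

print(validation_errors(valid))
print(validation_errors(wrong_order))
```

The second document contains the same elements as the first, but in a different order, so it fails validation. Catching this kind of problem locally is cheaper than discovering it after submission.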
Commonmeta (both the Python and Go versions) relies heavily on JSON schema validation, which I greatly prefer over XML Schema Definition (XSD) validation. But until Crossref allows content registration via JSON metadata (similarly to the change DataCite made a few years ago), XML schema validation remains important. The commonmeta Go library does not yet use XML schema validation.
API calls via the CLI
The Rogue Scholar science blogging archive switched to the InvenioRDM repository platform in October 2024 and uses the commonmeta Go library and GitHub Actions for Crossref DOI registration. GitHub Actions are wonderful, but for more complex workflows it is easier to have the logic built into the application running in the GitHub Action. Since May 2024 that was the commonmeta Go library, and commonmeta-py now has similar functionality, including calling the Crossref and InvenioRDM APIs directly.
Starting today, the GitHub Actions for Rogue Scholar DOI registrations and updates use commonmeta-py instead of the commonmeta Go library. Over the next two weeks, I will monitor them carefully for any issues that might have escaped testing.
The next major milestone is integrating Crossref DOI registration directly into InvenioRDM. This will not only simplify the workflows for Rogue Scholar, but also make InvenioRDM a more interesting option for repositories with original textual content, e.g. preprints, reports or dissertations.
References
- Fenner, M. (2025). Commonmeta-py (Version 0.113) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.15524711
- Feeney, P. (2025, March 19). Version 5.4.0 metadata schema update now available. Crossref Blog. https://doi.org/10.13003/325070