Last week I released updated Python and Go versions of the commonmeta library that can now read metadata from OpenAlex. OpenAlex is an open index of over 250 million scholarly works from 250k sources. OpenAlex uses its own identifiers for works, people, organizations, sources, and concepts, but also understands common identifiers for works (e.g. DOI or PMID), people (e.g. ORCID), and organizations, including funders (e.g. ROR). Commonmeta can now fetch metadata from the OpenAlex API and convert it into the commonmeta format or any other supported format.
An example command-line call would look like this:
commonmeta convert https://pubmed.ncbi.nlm.nih.gov/17160063 --from openalex
Or you could fetch a random sample of 100 preprints:
commonmeta list --sample --type preprint -n 100 --from openalex
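For readers who want to see roughly what such calls translate to, here is a minimal Python sketch that builds the corresponding OpenAlex API URLs. It is not taken from commonmeta; the function names `work_url` and `sample_url` are made up for illustration, and it assumes the OpenAlex conventions of addressing a work as `pmid:` or `doi:` plus the `filter`, `sample`, and `per-page` query parameters.

```python
from urllib.parse import urlencode, urlparse

OPENALEX_API = "https://api.openalex.org"


def work_url(identifier: str) -> str:
    """Build an OpenAlex API URL for one work from a PubMed or DOI URL."""
    parsed = urlparse(identifier)
    path = parsed.path.strip("/")
    if parsed.netloc == "pubmed.ncbi.nlm.nih.gov":
        return f"{OPENALEX_API}/works/pmid:{path}"
    if parsed.netloc in ("doi.org", "dx.doi.org"):
        return f"{OPENALEX_API}/works/doi:{path}"
    raise ValueError(f"unsupported identifier: {identifier}")


def sample_url(work_type: str, number: int) -> str:
    """Build an OpenAlex API URL for a random sample of works of one type."""
    # Assumption: per-page is set to the sample size so the whole sample
    # arrives on a single page.
    query = urlencode(
        {"filter": f"type:{work_type}", "sample": number, "per-page": number},
        safe=":",
    )
    return f"{OPENALEX_API}/works?{query}"


if __name__ == "__main__":
    print(work_url("https://pubmed.ncbi.nlm.nih.gov/17160063"))
    print(sample_url("preprint", 100))
```

The sketch only constructs URLs, so it can be run without network access; fetching and converting the response is what commonmeta takes care of.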
OpenAlex is an impressive service for the scholarly community, launched three years ago when the Microsoft Academic Graph database stopped being updated. I particularly like the following features:
- coverage of a large number of text publications, including content registered via Crossref and DataCite,
- links to legal copies of full-text versions of publications,
- enrichment of metadata with persistent identifiers, e.g. affiliation information,
- rich automated subject area classification into 4500 topics.
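To make these features a bit more concrete, the following Python sketch pulls the relevant pieces out of an OpenAlex work record. It assumes the record is already parsed into a dictionary and uses the OpenAlex field names `best_oa_location`, `authorships`, `institutions`, and `primary_topic`; the function name `summarize_work` and the example record are made up for illustration and are not part of commonmeta.

```python
from typing import Any


def summarize_work(work: dict[str, Any]) -> dict[str, Any]:
    """Extract full-text link, affiliation ROR IDs, and topic from a work."""
    location = work.get("best_oa_location") or {}
    topic = work.get("primary_topic") or {}
    rors: list[str] = []
    for authorship in work.get("authorships") or []:
        for institution in authorship.get("institutions") or []:
            ror = institution.get("ror")
            if ror and ror not in rors:
                rors.append(ror)
    return {
        "doi": work.get("doi"),
        # Prefer a direct PDF link, fall back to the landing page.
        "full_text_url": location.get("pdf_url") or location.get("landing_page_url"),
        "affiliation_rors": rors,
        "topic": topic.get("display_name"),
    }


# Hand-written record for illustration only, not real OpenAlex output.
example = {
    "doi": "https://doi.org/10.1234/example",
    "best_oa_location": {"pdf_url": None, "landing_page_url": "https://example.org/paper"},
    "authorships": [
        {"institutions": [{"ror": "https://ror.org/00example"}]},
        {"institutions": [{"ror": "https://ror.org/00example"}, {"ror": None}]},
    ],
    "primary_topic": {"display_name": "Example Topic"},
}

if __name__ == "__main__":
    print(summarize_work(example))
```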
When working on integrating OpenAlex into commonmeta, I noticed some areas where the service (still only three years old) could be improved upon:
- personal names are not treated as a combination of given and family names. This can cause problems with unusual names and with formatted citations, which typically split personal names into given and family names,
- metadata enrichment should not be based on personal names, as name matching is very difficult and may have privacy implications. My OpenAlex profile – which covers publications over 30 years in different research areas (mainly basic and clinical cancer research and scholarly infrastructure) – contains most of my publications but also publications not written by me, including several papers published before I finished high school in 1983,
- license information uses a simple schema that aligns with Creative Commons licenses but does not, for example, distinguish between license versions (e.g. CC-BY 3.0 vs. CC-BY 4.0). Commonmeta supports the SPDX license list, which includes all Creative Commons license versions as well as many software licenses.
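The license point can be illustrated with a small Python sketch. It assumes that OpenAlex reports Creative Commons licenses as short, versionless strings such as `cc-by`, while SPDX identifiers always carry a version such as `CC-BY-4.0`. The function `openalex_license_to_spdx` and its `assumed_version` parameter are hypothetical and do not reflect how commonmeta does the mapping; the version has to be guessed precisely because the source does not provide it.

```python
from typing import Optional


def openalex_license_to_spdx(
    license_id: Optional[str], assumed_version: str = "4.0"
) -> Optional[str]:
    """Map a versionless license string to an SPDX identifier, guessing the version."""
    if not license_id:
        return None
    key = license_id.strip().lower()
    if key == "cc0":
        # CC0 exists only as version 1.0 in the SPDX license list.
        return "CC0-1.0"
    if key.startswith("cc-by"):
        # Assumption: the version is unknown, so a default is appended.
        return f"{key.upper()}-{assumed_version}"
    # Anything else cannot be mapped to SPDX in this sketch.
    return None


if __name__ == "__main__":
    print(openalex_license_to_spdx("cc-by"))
    print(openalex_license_to_spdx("cc-by-nc-nd", assumed_version="3.0"))
```

A work licensed under CC-BY 3.0 would be mislabeled by the default here, which is exactly the information loss described above.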
The initial OpenAlex support in commonmeta is the result of a wonderful pull request for the Python version. I mainly added test coverage and ported the same functionality to the Go version. Please provide feedback via email, Slack, or GitHub if you discover bugs or missing functionality in the OpenAlex support in commonmeta.
References
Fenner, M. (2025). Commonmeta-py (Version 0.107) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.15465786
Fenner, M. (2025). front-matter/commonmeta: V0.25.0 (Version v0.25.0) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.15461402