All DataCite DOIs have associated metadata, described in the DataCite Metadata Schema Documentation (DataCite Metadata Working Group (2017)), validated and stored as XML in the DataCite Metadata Store (MDS). These metadata are then made available via DataCite APIs and services. XML is not always the best format for these services, so we also provide the metadata in other formats, most notably JSON. The problem with our approach so far has been that this JSON was not properly defined, creating overhead and ambiguity both for our internal services and for our users. To change this situation, and to make it easier to work with metadata for DataCite DOIs, today we are announcing DataCite JSON.
DataCite JSON represents all metadata elements and attributes available in DataCite XML, and can be converted from and to DataCite XML via several DataCite services (MDS API, EZ API, DOI Fabrica, Content Negotiation) that internally all use the bolognese metadata conversion library (Fenner (2017)), which also provides a command-line utility. Both our new Elasticsearch Search index and the updated JSON REST API (more on those in another blog post) use DataCite JSON. The bolognese metadata conversion library uses DataCite JSON as the intermediary format, for example when converting BibTeX to schema.org JSON-LD or JATS XML.
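The intermediary-format design can be pictured as readers and writers that meet at one common structure, so that each additional format needs one reader and one writer rather than a converter for every pair of formats. The following is a minimal Python sketch of that idea, not bolognese's actual code (bolognese is a Ruby library). The function names and the simplified field names in the common dictionary are assumptions made for illustration.

```python
import json

def read_simple_bibtex(fields):
    """Hypothetical reader: map already-parsed BibTeX fields to a common dict."""
    return {
        "titles": [{"title": fields["title"]}],
        "publicationYear": int(fields["year"]),
    }

def write_schema_org(common):
    """Hypothetical writer: map the common dict to a schema.org-style dict."""
    return {
        "@context": "http://schema.org",
        "name": common["titles"][0]["title"],
        "datePublished": str(common["publicationYear"]),
    }

def convert(source, reader, writer):
    """Convert by going through the common intermediary structure."""
    return writer(reader(source))

# Illustrative input only; not taken from a real record.
bibtex_fields = {"title": "An example dataset", "year": "2018"}
result = convert(bibtex_fields, read_simple_bibtex, write_schema_org)
print(json.dumps(result, indent=2))
```

Adding a further output format in this sketch would mean writing one more writer function, while the existing readers stay unchanged.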
There are minor differences between DataCite JSON and DataCite XML, mainly to make working with the metadata easier. These include an identifiers object that combines the identifier and alternateIdentifier properties, and a types object that stores not only resourceTypeGeneral and resourceType information, but also the type information from RIS, BibTeX, Citeproc and schema.org, to avoid losing type information when converting between these formats. There is also a new container property that stores information about the repository or journal where the content is located. This information can be provided in DataCite XML via the relatedIdentifier (with relationType IsPartOf) and description (with descriptionType SeriesInformation) elements, but the process is cumbersome. DataCite JSON also includes information not available in DataCite XML, including the url registered for the DOI and the date the DOI was registered.
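As a rough illustration of the combined identifiers object described above, the sketch below folds a primary identifier and a list of alternate identifiers into a single list. The input layout, the output key names and the alternate identifier value are assumptions made for this example, not the published DataCite JSON definition.

```python
def combine_identifiers(identifier, alternate_identifiers):
    """Merge the primary identifier and any alternate identifiers into one list."""
    combined = [{
        "identifier": identifier["value"],
        "identifierType": identifier["identifierType"],
    }]
    for alt in alternate_identifiers:
        combined.append({
            "identifier": alt["value"],
            "identifierType": alt["alternateIdentifierType"],
        })
    return combined

# Hypothetical XML-like input; the alternate identifier is made up.
xml_like = {
    "identifier": {"value": "10.5438/0014", "identifierType": "DOI"},
    "alternateIdentifiers": [
        {"value": "example-local-id-1", "alternateIdentifierType": "Local ID"},
    ],
}

identifiers = combine_identifiers(
    xml_like["identifier"], xml_like["alternateIdentifiers"]
)
print(identifiers)
```

A consumer of such a list can look up any identifier by its type without checking two separate properties.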
To see DataCite JSON in action, look up the metadata of your favorite DOI in our JSON REST API, e.g. https://api.datacite.org/dois/10.5438/0014, or, if you are a DataCite member or client, in DOI Fabrica. Alternatively, install bolognese (via gem install bolognese) and fetch metadata via the command bolognese 10.5438/0014 -t datacite_json. Documentation of DataCite JSON is unfortunately still sparse. In early 2019 we will provide better documentation via our support site, including updated documentation of the JSON REST API and a JSON Schema to validate the metadata, aligned with our XSD Schema for DataCite XML.
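The REST API lookup can also be scripted. The following minimal Python sketch uses only the standard library and the URL pattern shown above. The helper names doi_url and fetch_doi_metadata are made up for this example, and the layout of the returned JSON is not assumed here, so inspect the response before relying on particular keys.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.datacite.org/dois/"

def doi_url(doi):
    """Build the REST API URL for a DOI, percent-encoding unusual characters."""
    return API_BASE + quote(doi, safe="/")

def fetch_doi_metadata(doi, timeout=10):
    """Fetch and parse the JSON document for a DOI (requires network access)."""
    with urllib.request.urlopen(doi_url(doi), timeout=timeout) as response:
        return json.load(response)

# Building the URL needs no network access.
example_url = doi_url("10.5438/0014")
print(example_url)
```

Calling fetch_doi_metadata("10.5438/0014") should then return the parsed record as a Python dictionary, provided the service is reachable.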
We hope that DataCite JSON makes it easier to work with DataCite metadata, helping to improve metadata quality and re-use. We encourage users to adapt their tools to take advantage of DataCite JSON, and to consider it also for metadata not associated with a DataCite DOI, whenever scholarly resources need to be described with standard metadata in JSON. Watch out for more information about DataCite JSON in 2019, or reach out to us with questions or feedback via email@example.com.
This blog post was originally published on the DataCite Blog.
DataCite Metadata Working Group. (2017). DataCite metadata schema for the publication and citation of research data v4.1. DataCite. https://doi.org/10.5438/0014
Fenner, M. (2017). Bolognese: A Ruby library for conversion of DOI metadata. DataCite. https://doi.org/10.5438/n138-z3mk