Personal names remain among the hardest scholarly metadata to capture properly, including for science blog posts. This week, the Rogue Scholar science blog archive therefore changed how it stores blog post author names. A single name field, the standard in RSS, Atom, and JSON Feed, is now used only for organizational authors, while personal authors are stored with separate given and family names.
This follows the best practices established by ORCID, Crossref, and the Citation Style Language. DataCite and InvenioRDM (the repository platform that powers Rogue Scholar and is based on the DataCite metadata model) keep the name field for personal names alongside the given and family names. That approach eased the transition from names to given and family names, but it also created confusion: personal names should be stored in the name field as family name, given name (Doe, John), but this is not trivial to enforce, and many organizations still submit DataCite metadata with given name followed by family name (e.g. John Doe) as the name. For this reason, Rogue Scholar is following the Crossref model and dropping name for personal names, even though this is a breaking change.
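As a rough illustration of the two record shapes, here is a minimal Python sketch. The keys `given`, `family`, and `name` and the helper `author_record` are assumptions chosen for illustration, not Rogue Scholar's actual code.

```python
from typing import Optional


def author_record(name: Optional[str] = None,
                  given: Optional[str] = None,
                  family: Optional[str] = None) -> dict:
    """Build an author record (hypothetical helper).

    Organizations carry only `name`; persons carry only
    `given` and/or `family`, never `name`.
    """
    if given or family:
        if name:
            raise ValueError("personal authors must not carry a name field")
        record = {}
        if given:
            record["given"] = given
        if family:
            record["family"] = family
        return record
    if not name:
        raise ValueError("an author needs either a name or given/family")
    return {"name": name}


person = author_record(given="John", family="Doe")
organization = author_record(name="Alfred P. Sloan Foundation")
```

Rejecting records that mix `name` with `given` or `family` is one way to avoid the ambiguity described above, where the same field may hold either "Doe, John" or "John Doe".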
Extracting the given and family names from a name cannot be fully automated, as there are important edge cases: a) organization names that look like personal names (Alfred P. Sloan Foundation) and b) personal names with multiple family names (Bastian Greshake Tsovaras) vs. multiple given names (Martin Paul Eve) vs. names with prepositions (Wilma van Weezenbeck). More background and additional edge cases (e.g. a given name without a family name) can be found in a 2011 W3C document.
To handle these special cases, Rogue Scholar has started a curated list of author names that fall into one of these categories. With this list, names can be split correctly into given and family names, or left unsplit when they are organizational names. Migrating the more than 40,000 blog posts to the new format will take some time, but is only urgent for the edge cases mentioned above.
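The combination of a curated list and an automated fallback could look roughly like the following Python sketch. Everything in it is an assumption for illustration: the `CURATED` table, its entries, the `split_name` function, and the "last token is the family name" fallback are not taken from Rogue Scholar's implementation.

```python
from typing import Dict

# Hypothetical curated list: raw feed name -> correct author record.
# It only needs entries for names the naive fallback would get wrong.
CURATED: Dict[str, dict] = {
    # Organization that looks like a personal name: keep unsplit.
    "Alfred P. Sloan Foundation": {"name": "Alfred P. Sloan Foundation"},
    # Multiple family names (illustrative example, not from the archive).
    "Gabriel García Márquez": {"given": "Gabriel", "family": "García Márquez"},
    # Family name with a preposition (illustrative example).
    "Ludwig van Beethoven": {"given": "Ludwig", "family": "van Beethoven"},
}


def split_name(raw: str) -> dict:
    """Split a raw author name, preferring the curated list."""
    name = " ".join(raw.split())  # normalize whitespace
    if name in CURATED:
        return dict(CURATED[name])
    if "," in name:
        # Treat "Family, Given" as already structured. Organization
        # names containing a comma would need a curated entry.
        family, given = [part.strip() for part in name.split(",", 1)]
        return {"given": given, "family": family}
    parts = name.rsplit(" ", 1)
    if len(parts) == 1:
        # A single token cannot be split; leave it for manual curation.
        return {"name": name}
    # Naive fallback: the last token is the family name.
    return {"given": parts[0], "family": parts[1]}
```

With this fallback, a name with multiple given names such as "Martin Paul Eve" splits correctly without a curated entry, which is why only the remaining edge cases need to be curated.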
There are further issues with personal names and metadata for scholarly blogs, including handling multiple authors (which not all blogging platforms support), author identifiers (ORCID, etc.), and author affiliations (identifiers such as ROR, and affiliations that change over time), but that is material for another blog post.
References
- Personal names around the world. (n.d.). Retrieved May 6, 2025, from https://www.w3.org/International/questions/qa-personal-names.en