Last Tuesday Nucleic Acids Research published a nice paper describing the UK PubMed Central (UKPMC) database (McEntyre 2010). UKPMC was started in 2007, and the enhanced version described in the paper was launched in January 2010. In November 2009 I published an interview with Phil Vaughan, the senior author of the paper. The paper describes the specific enhancements made to PubMed Central, including an integrated search of PubMed and PubMed Central, “Cited by” information, and semantically enriched content generated by text mining.
Thanks to Duncan Hull we know that PubMed currently contains information about 20 million papers (Twenty million papers in PubMed: a triumph or a tragedy?). About 10% of these papers are available as full text from PubMed Central. What wasn’t clear to me, and what I learned from the paper, is that only 194,000 papers, or 1% of PubMed content, are from the PMC Open Access Subset (and that includes papers with a non-commercial OA license). All of these papers are available for download or via the PMC-OAI service. Although 1.8 million papers (the majority of them digitized back issues) can be freely downloaded as full text from PubMed Central, they carry a publisher copyright and can’t be reused for research purposes (e.g. full-text mining) without explicit permission from the publisher.
McEntyre JR, Ananiadou S, Andrews S, Black WJ, Boulderstone R, Buttery P, et al. UKPMC: a full text article resource for the life sciences. Nucleic Acids Research. 2010 November; DOI: http://doi.org/10.1093/nar/gkq1063.