Hacking the PLoS Article-Level Metrics API Server

The Science Online London 2011 Conference was a great event that took place last Friday and Saturday. I was able to celebrate the first PLoS Blogs anniversary together with community manager Brian Mossop, but a detailed conference post will follow later. The blog posts covering the event are here, and the list is growing by the hour.

On Sunday a few brave souls met at the Mendeley offices for the Science Online London hackathon to spend some time on a cool programming project. We were greeted by Victor Henning and Jason Hoyt and quickly came up with a few good ideas. Jason finished work on Twendeley, a cool Twitter/Mendeley mashup that looks for papers mentioned in your Twitter stream and finds relevant articles in Mendeley. I wanted to do some work on the PLoS Article-Level Metrics API.

Article-Level Metrics looks at the usage and reach of an individual article instead of using the Journal Impact Factor as a proxy for the impact of a paper. The PLoS Article-Level Metrics API provides access to some interesting numbers: not only citations, HTML pageviews and PDF downloads, but also social metrics such as number of bookmarks in CiteULike or number of readers in Mendeley.

But for the hackathon I was not interested in building a tool that talks to the Article-Level Metrics API. On Sunday morning I had discovered that the PLoS Article-Level Metrics server software is not only available as open source software, but is also built with Ruby on Rails, a programming framework that I’m familiar with. So I thought it would be fun to start improving the software by adding metrics on the author level. And by building your own Article-Level Metrics server, you are not limited to papers published in PLoS journals, or to the kinds of metrics provided by PLoS. Wouldn’t it be nice, for example, to also include download counts for Dryad datasets?

With the help of Kristi Holmes and Cameron Neylon we quickly got the API server up and running, added a few papers and retrieved citation counts from PubMed Central and the number of bookmarks from CiteULike (see above). A few hours later we could add authors, and today I was able to automatically import the first papers by author.

For author-level metrics we of course need a widely used unique author identifier, and an API we can talk to. Otherwise author disambiguation quickly becomes frustrating. Until ORCID comes along next year, the Microsoft Academic Search API looks like one of the best ways to retrieve DOIs for papers published by a particular author.
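Once the DOIs for an author are known, author-level metrics are essentially the article-level numbers summed over that author's papers. The following is a minimal sketch of that idea in Python, not the code from the hackathon project (which is a Ruby on Rails application). The DOIs, the metric names, and the `author_totals` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical article-level metrics keyed by DOI.
# The DOIs and all counts are invented for this example.
article_metrics = {
    "10.1234/example.1": {"citations": 12, "citeulike_bookmarks": 3, "mendeley_readers": 40},
    "10.1234/example.2": {"citations": 5, "citeulike_bookmarks": 0, "mendeley_readers": 17},
    "10.1234/example.3": {"citations": 1, "citeulike_bookmarks": 2, "mendeley_readers": 9},
}


def author_totals(dois, metrics_by_doi):
    """Sum every metric over the given DOIs, skipping DOIs without metrics."""
    totals = Counter()
    for doi in dois:
        # Counter.update adds the counts of a mapping to the running totals.
        totals.update(metrics_by_doi.get(doi, {}))
    return dict(totals)


# DOIs attributed to one author, e.g. as returned by an author lookup service.
author_dois = ["10.1234/example.1", "10.1234/example.2", "10.1234/unknown"]
print(author_totals(author_dois, article_metrics))
```

The sketch silently skips DOIs it has no metrics for. A real server would more likely queue those papers for import, and would have to deal with the author disambiguation problem before the list of DOIs can be trusted.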

The code for this project is available on GitHub. This obviously needs a lot more work, but it shouldn’t take more than a few months to have the author part working properly and to find a host for a server. And because this is an API server based on the PLoS code, all tools that interact with the PLoS Article-Level Metrics server can use this system immediately.

This service should make it a little bit easier to build a professional trading card for scientists, or a dashboard of the total impact of your research.

Copyright © 2011 Martin Fenner. Distributed under the terms of the Creative Commons Attribution 4.0 License.