In October, Jen Gibson started as the new Executive Director of the Dryad Data Repository. I used the opportunity to ask Jen a few questions about Dryad, challenges with data sharing, and ideas about moving Dryad forward. I was particularly interested in the interview as I served on the Dryad Board of Directors from 2013 to 2016. In fact, one post on this blog is about a presentation I gave in 2013 (Metrics and attribution: my thoughts for the panel at the ORCID-Dryad symposium on research attribution).
Dryad is an open data publishing platform and community committed to the open availability and routine re-use of all research data. We’re a generalist, curated data repository – working with researchers across disciplines to assemble details about their work and their data to increase its discoverability and use by others. We’re a non-profit initiative working in partnership with other systems and communities working to advance open research.
Dryad is designed to support openly accessible and fully re-usable data, and publishes under a Creative Commons Public Domain License (CC0). So, we’re not able to publish data with incompatible licensing terms. We also don’t accept datasets that contain personally identifiable human subject information or specific locations for endangered species – but our team of curators will work with researchers to see if the data can be made appropriate for sharing.
We can accept up to 300GB per dataset and thousands of files, but curators are looking for the most accessible file types and downloads to support re-use. So, we ask that files be small and zipped when possible. Users with files larger than 300GB are asked simply to contact us to coordinate.
No, Dryad only publishes data. So, we’ve partnered with Zenodo so that users can load software and other supplementary information at the same time that they load data to Dryad. Software can then be made available under a different, more appropriate license and be given its own citation, but is also linked from the data publication at Dryad.
Yes. Often the cost to researchers is covered or discounted through an arrangement with their institution, publisher, or funder. Where there is not yet an arrangement in place, the submitting researcher is asked to pay $120 – with overage fees if the data is larger than 300GB. Waivers are, of course, available. Publication fees, memberships, and partnerships with institutions, publishers and funders help cover our costs for data curation and long-term preservation.
There are so many challenges for researchers with respect to open sharing of data – and they’re different in every discipline. There are behaviour changes, workflow changes, and culture changes to overcome – although several disciplines have carved a path, including astronomy and ecology. I hope that the open data-sharing around COVID will help inspire more people to follow suit.
Dryad helps to overcome these challenges, of course. We help first of all by making it possible to share data properly – openly, with a CC0 license – and second by making it easy to share data through our friendly user interface and our integrations with publishers and partners such as Zenodo. (With the terms ‘possible, easy, rewarding and normative,’ I’m invoking Brian Nosek’s Strategy for Culture Change.)
We’re helping to make data sharing rewarding by standardising usage metrics (through Make Data Count – an important, community-led initiative to develop metrics for open research data assessment) and encouraging citation, though these are just a couple of the pieces needed to put data and data sharing at the centre of research assessment.
Finally, what I’m very much looking forward to doing with Dryad is supporting and building communities that share and re-use data to make this practice normative. Connecting people with people – and people with data – is key in nurturing and normalising the regular exchange and re-use of data.
Authors whose communities don’t already rely on a domain repository for data, such as WormBase or the Protein Data Bank, should publish their data with Dryad because:
We are an open source project and our work is driven by researchers' needs. Get in touch with the help desk to discuss feature needs with our product team or leave a ticket on our public GitHub product board.
I’ve worked in open research since 2005, when I joined SPARC – the Scholarly Publishing and Academic Resources Coalition – to work on open-access advocacy and policy efforts. Immediately before Dryad I was a founding member of the team for eLife – the open-access journal for biology and medicine, and an initiative from three of the largest, most prestigious, private biomedical research funders to put science publishing back in the hands of science. I’d say my main contributions so far have been in building communities (among librarians, open science advocates, students, early-career or late-stage researchers, and others), advocacy (for open research practice at different levels of practice and policy), and adoption strategies (for researchers in particular).
Absolutely. Beyond my near-term objectives for optimising operations, I expect us to be:
Longer term, I’m enthusiastic about the potential for pushing data to the forefront of discovery, and taking steps to facilitate re-use and extend credit. Like Make Data Count, Dryad is committed to making data a first-class citizen in research and research assessment, and I'll be really pleased if we’re able to help properly reward researchers for their data publications. We’re going to need to if we want to really accelerate discovery and build that open, global network for the exchange of research objects.