Dryad: Interview with Jen Gibson

In October, Jen Gibson started as the new Executive Director of the Dryad Data Repository. I used the opportunity to ask Jen a few questions about Dryad, challenges with data sharing, and ideas about moving Dryad forward. I was particularly interested in the interview as I served on the Dryad Board of Directors from 2013 to 2016. In fact, one post on this blog is about a presentation I gave in 2013 (Metrics and attribution: my thoughts for the panel at the ORCID-Dryad symposium on research attribution).

1. Can you describe what Dryad is?

Dryad is an open data publishing platform and community committed to the open availability and routine re-use of all research data. We’re a generalist, curated data repository – working with researchers across disciplines to assemble details about their work and their data to increase its discoverability and use by others. We’re a non-profit initiative working in partnership with other systems and communities working to advance open research.

2. Are there data that should not be submitted to Dryad?

Dryad is designed to support openly accessible and fully re-usable data, and publishes under a Creative Commons Public Domain License (CC0). So, we’re not able to publish data with incompatible licensing terms. We also don’t accept datasets that contain personally identifiable human subject information or specific locations for endangered species – but our team of curators will work with researchers to see if the data can be made appropriate for sharing.

3. Are there limits in the number or size of files in a data submission?

We can accept up to 300GB per dataset and thousands of files, but curators are looking for the most accessible file types and downloads to support re-use. So, we ask that files be small and zipped when possible. Users with files larger than 300GB are asked simply to contact us to coordinate.

4. Does Dryad also accept software submissions?

No, Dryad only publishes data. So, we’ve partnered with Zenodo so that users can load software and other supplementary information at the same time that they load data to Dryad. Software can then be made available under a different, more appropriate license and be given its own citation, but is also linked from the data publication at Dryad.

5. Does a data submission to Dryad cost money?

Yes. Often the cost to researchers is covered or discounted through an arrangement with their institution, publisher, or funder. Where there is not yet an arrangement in place, the submitting researcher is asked to pay $120 – with overage fees if the data is larger than 300GB. Waivers are, of course, available. Publication fees, memberships, and partnerships with institutions, publishers and funders help cover our costs for data curation and long-term preservation.

6. What are the main challenges for authors?

There are so many challenges for researchers with respect to open sharing of data – and they’re different in every discipline. There are behaviour changes, workflow changes and culture changes to overcome – although several disciplines, including astronomy and ecology, have carved a path. I hope that the open data-sharing around COVID will help inspire more people to follow suit.

Dryad helps to overcome these challenges, of course. First, we make it possible to share data properly: openly, with a CC0 license. Second, we make it easy to share data through our friendly user interface and our integrations with publishers and partners such as Zenodo. (With the terms ‘possible, easy, rewarding and normative,’ I’m invoking Brian Nosek’s Strategy for Culture Change.)

We’re helping to make data sharing rewarding by standardising usage metrics (through Make Data Count – an important, community-led initiative to develop metrics for open research data assessment) and encouraging citation, though these are just a couple of the pieces needed to put data and data sharing at the centre of research assessment.

Finally, what I’m very much looking forward to doing with Dryad is supporting and building communities that share and re-use data to make this practice normative. Connecting people with people – and people with data – is key in nurturing and normalising the regular exchange and re-use of data.

7. Why should authors submit their data to Dryad?

Authors whose communities don’t already rely on a domain repository for data, such as WormBase or the Protein Data Bank, should publish their data with Dryad because:

Our curation service increases the quality and discoverability of new data, by ensuring key descriptive information (metadata) is available alongside the data itself
Our curation service increases the integrity of new data, by ensuring it’s readable and usable by other users
We put data in context, with links to publications, software, institutions, funders and more
Data published in Dryad is citable and accessible via a persistent DOI
As a non-profit and open-source initiative, our values are closely aligned with the research community
It’s easy, and affordable

8. How can authors give feedback, e.g. to report problems or request features?

We are an open-source project and our work is driven by researchers' needs. Get in touch with the help desk to discuss feature needs with our product team or leave a ticket on our public GitHub product board.

9. What did you do before starting to work for Dryad?

I’ve worked in open research since 2005, when I joined SPARC – the Scholarly Publishing and Academic Resources Coalition – to work on open-access advocacy and policy efforts. Immediately before Dryad I was a founding member of the team for eLife, the open-access journal for biology and medicine and an initiative from three of the largest, most prestigious private biomedical research funders to put science publishing back in the hands of science. I’d say my main contributions so far have been in building communities (among librarians, open science advocates, students, early-career or late-stage researchers, and others), advocacy (for open research practice at different levels of practice and policy), and adoption strategies (for researchers in particular).

10. Do you want to talk about future plans for Dryad?

Absolutely. Beyond my near-term objectives for optimising operations, I expect us to be:

Leveraging support from institutions and expanding our membership program.
Working with our publisher partners to attract as much data as possible.
More actively diversifying our profile – specifically reaching out to different geographic and disciplinary communities.
Expanding our roadmap for modeling data publishing into the future, building on our progress over the last couple of years.

Longer term, I’m enthusiastic about the potential for pushing data to the forefront of discovery, and taking steps to facilitate re-use and extend credit. Like Make Data Count, Dryad is committed to making data a first-class citizen in research and research assessment, and I'll be really pleased if we’re able to help properly reward researchers for their data publications. We’ll need to do so if we are to really accelerate discovery and build that open, global network for the exchange of research objects.