Data Management for the Sciences

A guide to best practices for management of research data, including links to data services from the University of California.

Why Share Data?

Major funding agencies now encourage data sharing, and many journals require it as a prerequisite for publication. Beyond meeting these requirements, sharing data can broaden the impact of your research, advance science, and support reproducibility. Depositing your data in a subject or institutional repository makes the data easier to access and reuse.

Why Deposit Data?

You can share your data easily by emailing it to a colleague or posting a file on a website. Unfortunately, such informal methods make it difficult for other researchers to find your data. Depositing your data in an archive or repository makes it easier to discover, preserve, and cite properly. Repositories are maintained by many academic discipline communities, by funding agencies to provide access to funded research, and by academic institutions to protect the research of their community members.

Source: University of New Hampshire

Repository Directories

The following maintain lists of data repositories:

  • re3data.org: global registry of research data repositories that covers repositories from different academic disciplines. As of 2015, it is a merged listing of re3data and DataBib, managed by DataCite.
  • Open Access Directory Data Repositories: a list of repositories and databases for open data. The list is arranged alphabetically, by subject.
  • NIH Data Repositories: a table listing descriptions of, links to, and information on submission and access guidelines for NIH data repositories.
  • Share: provides a comprehensive registry of research across academic disciplines.

Selected Science Repositories

  • BIRN - Biomedical Informatics Research Network: a national initiative to advance biomedical research through data sharing and online collaboration. Funded by the National Center for Research Resources (NCRR), a component of the US National Institutes of Health (NIH), BIRN provides data-sharing infrastructure, software tools, strategies and advisory services.
  • Climate Data: Weather, temperature, carbon, water, soil and all other kinds of climate related open data. Including (but not limited to) data about: the atmosphere, weather, air quality, temperature, the ocean, rivers, water levels, water quality, carbon emissions, pollution, soil erosion, and biodiversity.
  • Dryad: Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences.
  • Extragalactic Database: NASA's archive of data for over 3 million extragalactic objects.
  • GitHub: free public repositories, collaborator management, issue tracking, wikis, downloads, code review, graphs and much more. Hosts developer libraries such as Ruby on Rails, IronRuby, jQuery, Perl.
  • SIMBAD: the SIMBAD astronomical database provides basic data, cross-identifications, bibliography and measurements for astronomical objects outside the solar system.
  • Zenodo: An open dependable home for the long-tail of science, enabling researchers to share and preserve any research outputs in any size, any format and from any science. Funded by European organizations.
  • The Cancer Imaging Archive: contains imaging data organized by cancer type and anatomical site.

Selected Social Sciences Repositories

  • ICPSR: the Inter-university Consortium for Political and Social Research (ICPSR) encourages and welcomes data deposit. Among its services, ICPSR describes data fully for Web discovery, protects respondent privacy, and ensures long-term data availability. ICPSR staff are available to answer questions about downloading and using data.
  • UCLA Social Science Data Archive: established in 1977, the UCLA Social Science Data Archive provides a foundation for social science research with faculty support throughout an entire research project involving original data collection or the reuse of publicly available studies.
  • UK Data Archive: the UK's largest collection of digital research data in the social sciences and humanities. The archive is open to offers of any data collection which may be of use to social scientists and historians, whether large scale or small scale, in most formats.
  • County of Los Angeles Open Data: data sets collected from various public institutions within the county, covering education, health, administration, crime, and elections.

Selected Humanities Repositories

  • Cultural Policy and the Arts National Data Archive (CPANDA): CPANDA strives to acquire, archive, document, and preserve data sets on topics in art and cultural policy, including arts funding, arts education, the arts and economic development, public participation in the arts, and attitudes towards the arts. Data is provided in a user-friendly format for scholars, journalists, policy makers, artists, and cultural organizations.
  • Speech & Language Data Repository: a data repository offering labs and scholars a free-of-charge service for sharing their oral/linguistic data and archiving it with the help of procedures compliant with the OAIS model for long-term preservation.
  • VADS: online resource for visual arts. VADS collects, catalogues, manages, preserves, and encourages the re-use of digital resources created by, and of relevance to, the visual arts education community.

General Repositories

Data Citation Index

The Data Citation Index on the Web of Science provides a single point of access to quality research data from repositories across disciplines and around the world. Read more information about the coverage and selection process of the Data Citation Index here.