Data Management for the Sciences

A guide to best practices for management of research data, including links to data services from the University of California.

Data Citation

Data citation is an important component of data sharing and data reuse. Citing data gives data creators credit for creating and sharing their work, and creates a trail of research progress similar to the citation of articles and books.

Basic Data Citation

There is broad agreement on the minimal components of a data citation:

Creator (Year) Title. Publisher. Identifier

Core Elements

  • Creator(s): Individual(s) or organization responsible for creating the dataset.
  • Year: Year the dataset was published, not necessarily created.
  • Title: Should be as descriptive as possible.
  • Publisher: Organization that provides access to the dataset (e.g. Dryad, Zenodo).
  • Identifier: Persistent, unique identifier (e.g. a DOI).
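
As an illustration, the five core elements can be assembled into the pattern above with a few lines of Python. This is only a minimal sketch: the class name `DataCitation`, its field names, and the example dataset (authors, title, and DOI) are hypothetical and are not part of any citation standard or library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """Holds the five core elements of a data citation (hypothetical helper)."""
    creators: List[str]   # individuals or organizations that created the dataset
    year: int             # year the dataset was published
    title: str            # descriptive title
    publisher: str        # organization providing access, e.g. a repository
    identifier: str       # persistent identifier, e.g. a DOI

    def format(self) -> str:
        """Render the citation as: Creator (Year) Title. Publisher. Identifier"""
        creator_text = "; ".join(self.creators)
        # Drop a trailing period so the pattern does not produce "..".
        title = self.title.rstrip(".")
        publisher = self.publisher.rstrip(".")
        return f"{creator_text} ({self.year}) {title}. {publisher}. {self.identifier}"


# Made-up example values, for illustration only.
example = DataCitation(
    creators=["Doe, J.", "Roe, R."],
    year=2020,
    title="Example stream temperature measurements",
    publisher="Dryad",
    identifier="https://doi.org/10.1234/example",
)
print(example.format())
```

Journals often prescribe their own punctuation and ordering, so the output of a sketch like this should be checked against the target style.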

Additional Elements

  • Location / Availability: The web address of the dataset is essential when the identifier can’t be used to reach the dataset.
  • Version / Edition: Version of the dataset used in the present publication. Needed to reproduce analysis of versioned dynamic datasets.
  • Access Date: Date of access for analysis in the present publication. Needed to reproduce analysis of continuously updated dynamic datasets.
  • Format / Material Designator: e.g. database, CD-ROM.
  • Feature Name: A description of the subset of the dataset used. May be a formal title or a list of variables (e.g. concentration, optical density).
  • Verifier: Used to confirm that two datasets are identical. Most commonly a UNF or MD5 checksum.
  • Series: Used if the dataset is part of a series of releases (e.g. monthly, yearly).
  • Contributor: e.g. editor, compiler.
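
To make the Verifier element concrete, the sketch below computes an MD5 checksum for a data file using Python's standard `hashlib` module and compares two files. The function names `md5_checksum` and `same_dataset` are illustrative choices, not part of any citation tool. Note that MD5 compares raw bytes, so the same data saved in a different file format will produce a different checksum. A UNF, by contrast, is designed to be independent of file format and is not shown here.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_dataset(path_a, path_b):
    """Return True if the two files are byte-for-byte identical by MD5."""
    return md5_checksum(path_a) == md5_checksum(path_b)
```

A checksum recorded in the citation can later be compared with the checksum of a downloaded copy to confirm that the file has not changed. MD5 is adequate for detecting accidental changes, but it is not considered secure against deliberate tampering.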

For datasets that have DOIs, DataCite and CrossRef provide a citation formatter to generate a citation matching any of a wide array of journal styles.
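
A formatted citation can also be requested programmatically. The sketch below assumes the DOI content negotiation service, which accepts the `text/x-bibliography` media type with a `style` parameter on requests to `https://doi.org/`. The function names are hypothetical, the DOI in the usage comment is a placeholder, and the set of supported style names should be checked against the service's own documentation.

```python
import urllib.request
from urllib.parse import quote


def build_citation_request(doi, style="apa"):
    """Build (but do not send) a request for a formatted citation of `doi`."""
    url = "https://doi.org/" + quote(doi)  # quote() leaves "/" unescaped
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi, style="apa", timeout=30):
    """Send the request and return the citation text (needs network access)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage, with a real dataset DOI substituted for the placeholder:
# print(fetch_citation("10.1234/example", style="apa"))
```

Keeping the request construction separate from the network call makes it possible to inspect the URL and headers without contacting the service.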

To learn more, see this DataPub blog post on Data Citation or the Joint Declaration of Data Citation Principles.


EZID (easy-eye-dee) is a service for researchers and others to obtain and manage long-term identifiers for digital content, including data. These identifiers make digital objects easier to access and verify, which encourages re-use and citation, and they aid data management, data sharing, and citation tracking.

EZID makes it easy to create and manage long-term, globally unique identifiers for your data and sources, ensuring their future discoverability. Use EZID to:

  • Create identifiers for anything: texts, data, bones, terms, etc.
  • Manage your research objects more easily with shareable, unbreakable links
  • Store citation information for the objects in a variety of formats
  • Fit identifiers into your automated workflows with our standards-based API

Read more at the EZID Learn page.

Data Citation Index

The Data Citation Index on the Web of Science provides a single point of access to quality research data from repositories across disciplines and around the world. Read more about the coverage and selection process of the Data Citation Index here.