Data Management for the Sciences

A guide to best practices for management of research data, including links to data services from the University of California.

What Is Data?

While definitions of research data abound, it is arguably less important to settle on a single definition of the term than to recognize when you have such data on hand. A working definition is that research data are the pieces of information generated during a research project, which often form the backbone of a research study or publication. Examples of research data include:

  • Non-digital text (lab books, field notebooks)
  • Digital texts or digital copies of text
  • Spreadsheets
  • Audio, video
  • Computer-aided design (CAD) files
  • Statistical data files (SPSS, SAS)
  • Databases 
  • Geographic Information Systems (GIS) and spatial data
  • Non-digital images
  • Digital copies of images
  • Matlab files
  • Biological/organic/inorganic samples or specimens
  • Computer code
  • Protein or genetic sequences
  • Artistic products
  • Web files
  • Standard operating procedures and protocols
  • Collection of digital objects acquired and generated during research

Source: Georgia Tech.

Managing Data

If you're still reading at this point, then it's likely that you have some form of research data, either your own, your laboratory's, or your advisor's. It's also likely you need to systematically manage this data. Make this as easy a task as it can be by breaking the data management process into discrete, doable steps:

  • First, consider a data planning checklist. This helps you answer basic questions about your data and look down the road to the issues that will probably arise as you go through your project. Learn more . . .
  • Document and create metadata, or descriptions, for your data. Documentation and metadata are essential if any of the following describes your research:
    • Many people on your research team will be working with the data
    • You want to return to using this data after a period of time has passed
    • You need to publish your data alongside any articles or other output
    • You want others to find and cite your data
    • Learn more . . .
  • Organizing your data. Organize your data by following certain conventions so that you can easily track what has been produced and where it is. Learn more . . .
  • Backing up and securing your data. Ensure continual access to your data, and control who can access it, by following some simple rules. Learn more . . .
  • Sharing your data. After all the work you put into managing your data, you may as well make it pay off by sharing it, especially since doing so can:
    • Increase the visibility and impact of your research
    • Assist in the dissemination of knowledge by allowing others to replicate your results or discover new results of their own (and cite you for it!)
    • Satisfy funding agencies' requirements for disseminating research outputs
    • Learn more . . .
  • Creating a data management plan. If you want your research to be funded by NSF, NIH, or a number of other federal, state, or nonprofit organizations, you may have to submit a data management plan along with your grant proposal. Let our tools help you do that in the least painful, least time-consuming way possible. Learn more . . .

Source: adapted from MIT Libraries.
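
As a rough illustration of the documenting, organizing, and backing-up steps above, here is a minimal Python sketch. The naming pattern (project, date, description, version), the function names, and the metadata fields are all assumptions made for this example; they are not conventions prescribed by this guide, and your lab or discipline may already have its own.

```python
import hashlib
from datetime import date
from pathlib import Path


def slug(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with single hyphens."""
    cleaned = "".join(c.lower() if c.isalnum() else "-" for c in text)
    return "-".join(part for part in cleaned.split("-") if part)


def build_filename(project: str, description: str, day: date,
                   version: int, ext: str) -> str:
    """Build a file name under a hypothetical convention:
    <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>
    """
    return (f"{slug(project)}_{day.isoformat()}_"
            f"{slug(description)}_v{version:02d}.{ext.lstrip('.')}")


def describe_file(path: Path, creator: str, description: str) -> dict:
    """Build a small metadata record for one file (hypothetical fields).

    The SHA-256 checksum lets you confirm later that a backup copy
    is byte-for-byte identical to the original.
    """
    data = path.read_bytes()
    return {
        "file": path.name,
        "creator": creator,
        "description": description,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

For instance, `build_filename("Soil Survey", "Site A pH", date(2024, 3, 5), 2, ".csv")` returns `soil-survey_2024-03-05_site-a-ph_v02.csv`. Putting the date in ISO order makes files sort chronologically, and keeping a metadata record next to each file helps teammates, and your future self, understand what the file contains.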