Open Science

Data Management Plans

During the planning stages of research, it is helpful to create a data management plan. This plan deals with where data will be stored, how it will be named, and how you will define and organize that data. A DMP typically describes:

  • What data will be produced as a part of the project
  • How each type of data will be organized, documented, standardized, stored, protected, shared and archived
  • Who will take responsibility for carrying out the activities listed above, and
  • When these activities will take place over the course of the project (and beyond) 

Creating a Data Management Plan:

  • DMP Tool - a free, open-source application that helps researchers create data management plans (DMPs). Many funding agencies require these plans as part of the grant proposal submission process. The DMP Tool provides a click-through wizard for creating a DMP that complies with funder requirements. It also offers direct links to funder websites, help text for answering questions, and resources on data management best practices.
  • DMP Tool Funder Requirements - Templates for data management plans based on specific requirements listed in funder policy documents

 

Data Storage

Data storage: Choosing an appropriate drive or repository is one of the first things to plan for. The storage location should be secure, offer enough functionality and storage capacity for the research team’s needs, and comply with all applicable data protection regulations. Data can be stored in a repository like OSF.io (Open Science Framework) or on a private drive like Google Drive or OneDrive.

Use a data repository to provide for:

  • Persistent identifiers for your data (like DOI) that are unique and citable
  • Persistent access
  • Preservation
  • Backup
  • Management of access
  • Versioning
  • Licensing

Naming Conventions and Directory Structures

Data naming: Establishing a naming convention for files can also help researchers increase transparency and replicability. A meaningful, standardized file name structure makes data easier to both find and understand. There is no “best way” to create a naming convention, because conventions depend on the context of the research project, but a few common practices include:

  1. Including dates in a standard format, such as YYYY-MM-DD, at the beginning of the file name. This way, sorting files by name also sorts them by date.

  2. Using an underscore to separate each element of the name. For example, if the naming convention uses the date, the experiment, and the version, a file name could look like this: “2024-05-25_experiment1_v2”

  3. Being consistent about casing, whether that is upper or lower case

  4. Ordering the elements from most broad to most specific. For example: “YYYY-MM-DD_Experiment1_Group2_JohnsonD_v1” indicates first the date that the data collection took place, then which research experiment the data is for, which group the participant is in, which participant the data is about, and finally the version of the file.
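The practices above can be sketched in a few lines of Python. This is only an illustration under assumptions: the pattern (date, lower-case experiment label, version) mirrors the “2024-05-25_experiment1_v2” example, and the names NAME_PATTERN, build_name, and is_valid_name are hypothetical helpers, not part of any tool mentioned on this page. A real project would adapt the pattern to its own convention.

```python
import re
from datetime import date

# Hypothetical pattern for names such as 2024-05-25_experiment1_v2:
# ISO date, lower-case experiment label, then a version number.
NAME_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2})_([a-z0-9]+)_v(\d+)")


def build_name(collected: date, experiment: str, version: int) -> str:
    """Join date, experiment label, and version with underscores."""
    return f"{collected.isoformat()}_{experiment.lower()}_v{version}"


def is_valid_name(name: str) -> bool:
    """Return True if the name fits the pattern and holds a real calendar date."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return False
    try:
        date.fromisoformat(match.group(1))
    except ValueError:
        return False
    return True
```

Because the date comes first and uses YYYY-MM-DD, sorting such names alphabetically also sorts them chronologically.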

Data dictionary: A data dictionary defines and describes the variables and values used in your data set, providing essential context. Data dictionaries are documents that include information such as variable names, descriptions, units of measurement, and any coding schemes. Maintaining one increases consistency in the collection and reporting of data across collaborators and helps streamline data analysis. Here is an example of the information a data dictionary might include for each variable.

 

Variable | Description | Units | Coding Scheme
---------------------------------------------------------------
Q1_AGE | Age of respondent | Years | N/A
Q2_GENDER | Gender of respondent | N/A | 1 = Male, 2 = Female
Q3_INCOME | Annual household income | US Dollars | N/A
Q4_EDUCATION | Highest level of education | N/A | 1 = High School, 2 = Bachelor’s, 3 = Master’s, 4 = Doctorate
Q5_OCCUPATION | Respondent’s occupation | N/A | N/A
Q6_SATISFACTION | Satisfaction with service | Likert Scale (1-5) | 1 = Very Unsatisfied, 2 = Unsatisfied, 3 = Neutral, 4 = Satisfied, 5 = Very Satisfied
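A data dictionary can also be kept in a machine-readable form so that records can be checked against it. The sketch below is one possible approach, not a prescribed method: it encodes three of the example variables as a Python dictionary, and the names DATA_DICTIONARY and find_problems are hypothetical.

```python
# Three entries from the example data dictionary. "codes" is None
# when the variable has no coding scheme.
DATA_DICTIONARY = {
    "Q1_AGE": {"description": "Age of respondent", "units": "Years", "codes": None},
    "Q2_GENDER": {
        "description": "Gender of respondent",
        "units": None,
        "codes": {1: "Male", 2: "Female"},
    },
    "Q6_SATISFACTION": {
        "description": "Satisfaction with service",
        "units": "Likert Scale (1-5)",
        "codes": {
            1: "Very Unsatisfied",
            2: "Unsatisfied",
            3: "Neutral",
            4: "Satisfied",
            5: "Very Satisfied",
        },
    },
}


def find_problems(record: dict) -> list:
    """List variables that are undocumented or use an undefined code."""
    problems = []
    for variable, value in record.items():
        entry = DATA_DICTIONARY.get(variable)
        if entry is None:
            problems.append(f"{variable}: not in data dictionary")
        elif entry["codes"] is not None and value not in entry["codes"]:
            problems.append(f"{variable}: {value!r} is not a defined code")
    return problems
```

Running a check like this when data is entered can catch undocumented variables and out-of-range codes before analysis begins.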

 

Data directory: Once files are named and organized according to the data management plan, a logical directory structure helps anyone who needs to navigate the place where the data is stored. A data directory makes it easier for collaborators and others to find their way through the different stages of a project. It is typically a simple text README file that lists everything in the drive or repository in one place.

For an example of a data set that includes a README file, see this Scoping Review Project on OSF. 
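A listing for a README file can be generated rather than typed by hand. The following is a minimal sketch, assuming the project lives in a local folder; the function name directory_listing is hypothetical, and the output format (two spaces of indentation per level, a trailing slash on folders) is just one reasonable choice.

```python
from pathlib import Path


def directory_listing(root: Path) -> str:
    """Return an indented, alphabetically sorted listing of everything under root."""
    lines = [f"{root.name}/"]
    # Sorting by path parts keeps each folder directly above its contents.
    for path in sorted(root.rglob("*"), key=lambda p: p.relative_to(root).parts):
        depth = len(path.relative_to(root).parts)
        suffix = "/" if path.is_dir() else ""
        lines.append("  " * depth + path.name + suffix)
    return "\n".join(lines)
```

The returned text can be pasted into the README, followed by a short description of what each folder contains.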

Directory structures

  • Organize directories (main folders) and subdirectories (nested folders) so that research materials are discoverable and understandable
  • Create subdirectories for like materials: separate data, code, and results
  • Locations should be distinctive, consistent, and informative:
    • What it is
    • Why it exists
    • How it relates to other files
  • For more information on data organization, see Karl Broman's work
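As a small illustration of the structure described above, the sketch below creates separate data, code, and results subdirectories under a project folder. The folder names come from the list above; the function name create_skeleton and the constant SUBDIRECTORIES are hypothetical, and a real project may need additional folders.

```python
from pathlib import Path

# One subdirectory per kind of material, as suggested in the list above.
SUBDIRECTORIES = ("data", "code", "results")


def create_skeleton(root: Path) -> list:
    """Create the standard subdirectories under root and return their paths."""
    created = []
    for name in SUBDIRECTORIES:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```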

University of Cincinnati Libraries

PO Box 210033 Cincinnati, Ohio 45221-0033

Phone: 513-556-1424

© 2021 University of Cincinnati