Edward G. Miner Library

Data Management: Storing Data

This guide provides resources for managing and sharing your research data no matter the discipline.

UR Research Repository (URRR)

The UR Research Repository (URRR) offers a place for faculty, researchers, students, staff, and UR community members to deposit their research outputs. URRR allows you to:

  • Share your data, papers, presentations, dissertations, and other research outputs

  • Make your work easily accessible to the global research community

  • Meet publisher and funder requirements (such as the new NIH Data Management and Sharing Policy)

  • Reserve a DOI for your research output

  • Connect your data and research outputs to your publications and ORCID

  • Benefit from the UR Libraries’ data curation process

Every UR community member is allotted 10 GB of initial storage. Data submissions also undergo a light curation process by the UR Libraries, which helps ensure your data meets funder and publisher standards.

Visit URRR to get started or contact our team of data librarians to get personalized guidance.  

Health and Biomedical Sciences Data Repositories

Biological Magnetic Resonance Data Bank (BMRB) - NMR spectroscopy data on proteins, peptides, nucleic acids, and other biomolecules

National Center for Biotechnology Information (NCBI) - Numerous databases with a genomic/proteomic focus

Neuroscience Information Framework - A virtual community of data, materials, and web-based neuroscience resources with the goal of enabling discovery and access to public research data and tools worldwide through an open source, networked environment

The PhysioBank archives of PhysioNet -- Digital recordings of physiologic signals and related data for use by the biomedical research community

OpenNeuro is a free and open platform for sharing MRI, MEG, EEG, iEEG, and ECoG data. It could be suitable for sharing MRI data from your project.

National Institute of Mental Health Data Archive (NDA): The NDA is a cloud-based data repository that stores and shares data from all research funded by the National Institute of Mental Health (NIMH). This repository could be suitable for demographic, clinical, neurocognitive, and MRI data.

Repository Registry Portals

FAIRsharing.org -- A curated, informative and educational resource on data and metadata standards.

Databib.org -- Databib was a tool for helping researchers identify and locate online repositories of research data. It has since merged into re3data.org.

re3data.org -- The Registry of Research Data Repositories (re3data) is a global registry of research data repositories which covers research data repositories from different academic disciplines.

NIH-Supported Data Repositories

To help researchers locate an appropriate repository for sharing or accessing data, the Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) maintains lists of data sharing repositories. Domain-specific repositories are typically limited to data of a certain type or related to a certain discipline. Generalist repositories accept data regardless of data type, format, content, or disciplinary focus.

Repositories for Qualitative Research

Qualitative Data Repository (QDR) is a dedicated archive for storing and sharing digital data (and accompanying documentation) generated or collected through qualitative and multi-method research in the social sciences. It is housed at Syracuse University. QDR provides search, browsing, and download access to data in their original formats, and allows researchers to upload study materials themselves.

Inter-university Consortium for Political and Social Research (ICPSR) is an international consortium of academic institutions and research organizations that provides leadership and training in data access, curation, and methods of analysis for a vast archive of social science data. ICPSR is a suitable repository for a wide range of research data, including qualitative data.

Section 4 Rubric: Data Preservation, Access, and Timelines

  • The name of the repository(ies) where scientific data and metadata arising from the project will be archived. NIH has provided additional information to assist in selecting suitable repositories for scientific data resulting from funded research: NOT-OD-21-016.
  • How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.
  • When the scientific data will be made available to other users (i.e., the larger research community, institutions, and/or the broader public) and for how long.
Performance levels for each criterion: Complete/detailed; Addressed issue, but incomplete; Did not address.

4.1 Provides details on where the data will be made publicly available

  • Complete/detailed: Clearly specifies where the data will be made available to people outside of the project. Aligned with FOA requirements, as needed.
  • Addressed issue, but incomplete: Verifies that the data will be made available outside of the project but does not identify a specific repository.
  • Did not address: Does not specify where the data will be made available outside of the project.

4.2 How the scientific data will be findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools.

  • Complete/detailed: Clearly specifies findability of the data by describing how a unique and persistent identifier will be obtained for the data.
  • Addressed issue, but incomplete: Describes how a landing page URL for the data file will be created, but no identifier.
  • Did not address: Does not specify data findability parameters.

4.3 When the scientific data will be made available to other users (i.e., the research community, institutions, and/or the broader public) and for how long.

  • Complete/detailed: Clearly specifies when the data will be made available to people outside of the project.
  • Addressed issue, but incomplete: Verifies that the data will be made available outside of the project but does not identify timing.
  • Did not address: Does not specify when the data will be made available outside of the project.

Human participant considerations

De-identification

When collecting data from and with human participants and communities, Element 5.C. of the DMS Plan requires that you describe how participants will be protected, including de-identification of the data. Be explicit about the process you will use to address direct and indirect identifiers.

  • Direct identifiers should be completely removed from data. This includes the 18 identifiers described in the HIPAA Safe Harbor Method and any other information that directly ties to an individual.
  • Indirect identifiers require close examination: look for variables that, when combined with other variables, datasets, or publicly available information, could re-identify participants.
  • The process of data curation usually involves some level of inspection for direct and indirect identifiers. Not all data repositories have data curators, and if they do, curation may not be a free service.
  • Note that a de-identified dataset is not the same as an anonymized one. When crafting DMS Plans, IRB applications, and participant agreements, prefer language such as "de-identified" or "confidential" over "anonymous" or "anonymized."
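
As an illustration of the two steps above, here is a minimal Python sketch, not a complete or authoritative de-identification procedure. The column names (`name`, `email`, `zip3`, `birth_year`, `sex`), the sample records, and the threshold `k` are all hypothetical assumptions. The second function is a simple k-anonymity-style count: it flags combinations of indirect identifiers shared by fewer than `k` records. It does not replace review against the HIPAA Safe Harbor list or curation by a data librarian.

```python
from collections import Counter

# Hypothetical column names (assumptions); adjust to your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "mrn"}
INDIRECT_IDENTIFIERS = ("zip3", "birth_year", "sex")


def drop_direct_identifiers(rows, direct=DIRECT_IDENTIFIERS):
    """Return copies of the rows with direct-identifier columns removed."""
    return [{k: v for k, v in row.items() if k not in direct} for row in rows]


def risky_combinations(rows, indirect=INDIRECT_IDENTIFIERS, k=5):
    """Return indirect-identifier combinations shared by fewer than k rows.

    Small groups are candidates for closer inspection, since a rare
    combination of values may allow a participant to be re-identified.
    """
    counts = Counter(tuple(row.get(col) for col in indirect) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up sample records for illustration only.
records = [
    {"name": "A. Example", "email": "a@example.org",
     "zip3": "146", "birth_year": 1980, "sex": "F", "score": 12},
    {"name": "B. Example", "email": "b@example.org",
     "zip3": "146", "birth_year": 1980, "sex": "F", "score": 15},
    {"name": "C. Example", "email": "c@example.org",
     "zip3": "100", "birth_year": 1931, "sex": "M", "score": 9},
]

shareable = drop_direct_identifiers(records)
flagged = risky_combinations(shareable, k=2)  # k=2 only because the sample is tiny
```

In this sample, the third record is the only one with its combination of `zip3`, `birth_year`, and `sex`, so it is flagged for review. Flagged variables might then be generalized (for example, reporting age ranges instead of birth year) or withheld before the data is shared.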


Informed consent language

When collecting data from and with human participants and communities, Element 5.A. of the DMS Plan requires that you describe how informed consent will be obtained for data sharing and whether there will be any access restrictions to the data related to consent. Be explicit about the language you will use in the consent form, given that the Certificate of Confidentiality (issued to all NIH awardees) requires explicit consent for data sharing. Include:

  • How the data will be processed before it is shared, including de-identification methods
  • What data will be shared (and what data will not be shared)
  • Where the data will be shared (name the specific repository)
  • How the data will be accessed (publicly available or restricted to specific requesters)
  • Who will grant access (the repository or the PI)


Section 5 Rubric: Access, Distribution, or Reuse Considerations

Describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data related to whether access to scientific data derived from humans will be controlled (i.e., made available by a data repository only after approval).
Performance levels for each criterion: Complete/detailed; Addressed issue, but incomplete; Did not address.

5.1 Provides details for access to scientific data derived from patient data (if any).

  • Complete/detailed: Clearly specifies restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements, and any other considerations that may limit the extent of data sharing.
  • Addressed issue, but incomplete: Verifies that the data are derived from patient data and must be controlled but does not specify controls.
  • Did not address: Does not specify whether there are patient-derived data.

5.2 Describes what protections will be put into place to protect privacy or confidentiality of human research subjects, including vulnerable populations (if applicable)

  • Complete/detailed: Clearly describes the actions that will be taken to address the sharing of sensitive data and demonstrates an appropriate balance of protecting sensitive data and sharing non-sensitive data.
  • Addressed issue, but incomplete: Actions that will be taken to address the sharing of sensitive data are vaguely described.
  • Did not address: Actions that will be taken to address the sharing of sensitive data are not described.

5.3 Describes what intellectual property rights to the data and supporting materials will be given to the public and which will be retained by project personnel (if any)

  • Complete/detailed: Clearly defines the IP rights the public (or designated group) has in accessing the data and the rights retained by project personnel (if any).
  • Addressed issue, but incomplete: Vaguely defines the IP rights the public (or designated group) has in accessing the data or that are retained by project personnel.
  • Did not address: Does not address IP rights for the public, intended audiences, or personnel in the research group.

5.4 Describes security measures that will be in place to protect the data from unauthorized access

  • Complete/detailed: Clearly describes the security measures that will be put into place to prevent unauthorized access to the data.
  • Addressed issue, but incomplete: Vaguely describes the security measures that will be put into place to prevent unauthorized access to the data.
  • Did not address: Does not describe the security measures that will be put into place to prevent unauthorized access to the data.

5.5 Describes any factors that limit the ability to share data, e.g., proprietary nature or commercialization of the data

  • Complete/detailed: Clearly defines the population to whom the data will be made available, any conditions on access, and a justification for its limited release.
  • Addressed issue, but incomplete: Vaguely discusses who will have access to the data or conditions on access.
  • Did not address: Does not state who will be able to gain access to the data.


Key Aspects to Consider:

  • Informed consent
  • Privacy and confidentiality protections consistent with applicable federal, Tribal, state, and local laws, regulations, and policies
  • Whether access to data derived from humans will be controlled 
  • Any restrictions imposed by federal, Tribal, or state laws, regulations, or policies, or existing or anticipated agreements
  • Any other considerations that may limit the extent of data sharing. Any potential limitations on subsequent data use should be communicated to the individuals or entities (for example, data repository managers) that will preserve and share the scientific data. The NIH IC will assess whether an applicant’s DMS Plan appropriately considers and describes these factors.