Metadata provides the essential information for discovering and reusing data, such as a bibliographic citation. It is important to add metadata tags to your data. Including keywords and phrases that describe your data and research ensures that other researchers can search for and locate your data in a given repository.
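To illustrate how keywords support discovery, here is a minimal sketch of a keyword search over metadata records. The records, the field names (`title`, `keywords`), and the `search` function are hypothetical and do not reflect any particular repository's schema or search engine.

```python
# Hypothetical metadata records; field names are illustrative only.
records = [
    {
        "title": "Spatial transcriptomics of mouse cortex",
        "keywords": ["spatial transcriptomics", "gene expression", "mouse"],
    },
    {
        "title": "Interview transcripts on remote work",
        "keywords": ["qualitative", "interviews", "remote work"],
    },
]


def search(records, term):
    """Return titles of records with a keyword containing the term (case-insensitive)."""
    term = term.lower()
    return [
        record["title"]
        for record in records
        if any(term in keyword.lower() for keyword in record["keywords"])
    ]


print(search(records, "gene expression"))
```

A dataset with no keywords would never be returned by a search like this, which is the practical reason for tagging data before depositing it.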
The Data Cards Playbook is an emerging metadata standard focused on increasing transparency and providing structured documentation for machine learning datasets and models. The following resource provides a basic description: https://ai.googleblog.com/2022/11/the-data-cards-playbook-toolkit-for.html
The MIAME Standard is widely used for gene expression data and can be adapted for spatial transcriptomics data.
Data Documentation Initiative (DDI): A widely used international standard for describing data from social, behavioral, and economic sciences. DDI allows you to document and manage different stages of the data life cycle, from conceptualization to data distribution.
Dublin Core Metadata Initiative (DCMI): Dublin Core is a versatile metadata standard that can be applied to diverse digital resources, including audio recordings and text transcripts.
Qualitative Data Repository (QDR) offers a set of guidelines for documenting and managing qualitative data, including interview transcripts and notes. This can be combined with the DDI standard to create comprehensive metadata for mixed-methods research.
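As a rough illustration of what a Dublin Core description of audio recordings and text transcripts might look like, the sketch below builds a simple XML record with Python's standard library. The element names come from the Dublin Core element set and its namespace (`http://purl.org/dc/elements/1.1/`); the wrapper element `metadata`, the helper `build_dc_record`, and all field values are assumptions made for the example, not a repository requirement.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one dc:* child per value (hypothetical helper)."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


# Illustrative values only; not taken from a real dataset.
fields = {
    "title": ["Interviews on remote work (hypothetical)"],
    "creator": ["Doe, Jane"],
    "subject": ["remote work", "qualitative interviews"],
    "type": ["Sound", "Text"],
    "format": ["audio/wav", "text/plain"],
    "language": ["en"],
}

record = build_dc_record(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Repeating an element, such as `subject` or `type` here, is how a single record can describe both the recordings and their transcripts.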
Performance Criteria | Performance level: Complete/detailed | Performance level: Addressed issue, but incomplete | Performance level: Did not address |
3.1 Identifies metadata standards and/or metadata formats that will be used for the proposed project | The metadata standard that will be followed is clearly stated and described. If no disciplinary standard exists, a project-specific approach is clearly described. | The metadata standard that will be followed is vaguely stated. If no disciplinary standard exists, a project-specific approach is vaguely described. | The metadata standard that will be followed is not stated and no project-specific approach is described. |
3.2 Describes data formats created or used during project | Clearly describes data format standard(s) for the data. | Describes some but not all data formats, or data format standards for the data. Where standards do not exist, does not propose how this will be addressed. | Does not include information about data format standards. |
3.3 Identifies data formats that will be used for storing data | Clearly describes data formats that will be used for storing data and explains rationale or complicating factors. | Only partially describes data formats that will be used for storing data and/or the rationale or complicating factors. | Does not describe data formats that will be used for storing data and does not explain rationale or complicating factors. |
3.4 If the proposed project includes the use of unusual data formats, the plan discusses the proposed solution for converting data into more accessible formats | Explains how the data will be converted to a more accessible format or otherwise made available to interested parties. In general, solutions and remedies should be provided. | Vaguely explains how the data may be converted to a more accessible format or otherwise made available to interested parties. | Does not explain how the data will be converted to a more accessible format or otherwise made available to interested parties. |
The Digital Curation Centre (DCC) -- General list of metadata standards
Allen Brain Atlas -- Offers a detailed map of gene expression in the brain.
Human Protein Atlas (HPA) -- Provides a map of the proteins in human cells, tissues, and organs.
CDISC (Clinical Data Interchange Standards Consortium) -- A set of standards designed specifically for the healthcare industry and widely used for data related to clinical trials. More info at cdisc.org/standards