6.4 Annotations and Metadata Schemas

Last modified by publicadmin on 2025/12/16 13:04


About Annotations and Metadata Schemas

Metadata annotations are machine-readable descriptions, defined by a Dataset author, of what a Dataset contains. Search and query engines can use this structure to facilitate future discovery and reuse by the research community. Adding metadata annotations does not change the content of the data itself.

Certain research communities have launched initiatives to develop standard metadata annotation schemas for specific scientific research domains. One example is the open Metadata Initiative for Neuroscience Data Structures (openMINDS) schema supported by the Human Brain Project and EBRAINS. Where no such domain-specific standards exist, generic metadata models can be used, such as the DAta Tag Suite (DATS) model.

Annotating your Dataset with a metadata schema enables Dataset findability in the VRE Knowledge Graph. The VRE provides several options for researchers to annotate their Datasets:

  1. The VRE Default Schema is interoperable with the openMINDS and DATS models and can represent a wide range of research domains.
  2. VRE Custom Schemas are flexible schemas whose elements are defined entirely by you.
  3. Supported external schemas allow researchers to upload predefined standard metadata schemas in JSON format. Currently, the EBRAINS openMINDS schema is supported.

Changes to your Dataset's metadata are tracked and can be viewed in the Activity Stream.  When new versions of your Dataset are released, any metadata that has been defined for your Dataset at the time of release will be stored and will be available to download as part of that version.


VRE Schemas

VRE Schema

The VRE Schema is a collection of metadata schema templates that a Dataset creator can use to annotate their Datasets and make them findable in the VRE Knowledge Graph. The Essential schema stores the key mandatory information (title, code, authors, and description) collected during the creation of a new Dataset. If desired, you can use the additional metadata fields in the VRE Schema to describe your Dataset in greater detail.

The metadata schema templates in the VRE Schema are listed below.  Complete field descriptions for these metadata schemas can be viewed in the VRE Schema Description.

  • Essential - basic information about the Dataset, including the information collected at the time of Dataset creation.
    • Title
    • Dataset Code
    • Type
    • Authors
    • Description
    • Modality
    • Collection Method
    • License
    • Number of Subjects
    • Dataset Identifier
    • Dataset Identifier Source
  • Subjects - information about each data subject in the Dataset (nested/repeating entries)
    • Subject ID
    • Subject Sex
    • Subject Species
    • Subject Age Category
  • Disease - information about the disease condition
    • Disease Name
    • Disease Diagnosis Date
    • Disease Status
    • Identifier
    • Identifier Source
  • Distribution - information about the Dataset's distribution properties (format, web URL, authorization)
    • Dataset Distribution Access Landing Page
    • Dataset Distribution Technical Format
    • Dataset Distribution Access Authorization
  • Contributors - information about the persons or organizations who contributed to the Dataset.
    • Person - name/email address of Dataset creators
    • Organization - name and abbreviation of a contributing organization
  • Grant - information about the grant that supported the work reported by the Dataset.
    • Grant Name
    • Grant Funder (Person or Organization), and applicable information about each
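
As an illustration of how the templates above fit together, the following Python sketch represents Essential and Subjects annotations as one nested structure and serializes it to JSON. The key names and all values are placeholders chosen for this example; the VRE's actual JSON keys are not given on this page. The helper subject_count_matches is a sanity check of our own, not a rule the VRE is known to enforce.

```python
import json

# Illustrative only: key names and values are placeholders,
# not the VRE's actual storage or export format.
dataset_metadata = {
    "essential": {
        "title": "Example Dataset",
        "dataset_code": "exampledataset",
        "authors": ["A. Researcher"],
        "description": "Placeholder description of the Dataset.",
        "number_of_subjects": 2,
    },
    # Subjects is a nested/repeating schema: one entry per data subject.
    "subjects": [
        {"subject_id": "sub-01", "sex": "female",
         "species": "Homo sapiens", "age_category": "adult"},
        {"subject_id": "sub-02", "sex": "male",
         "species": "Homo sapiens", "age_category": "adult"},
    ],
}


def subject_count_matches(metadata: dict) -> bool:
    """Check that Number of Subjects equals the number of Subjects entries."""
    declared = metadata["essential"]["number_of_subjects"]
    return declared == len(metadata["subjects"])


print(json.dumps(dataset_metadata, indent=2))
print(subject_count_matches(dataset_metadata))
```

Keeping repeating entries such as Subjects in a list makes it straightforward to compare them against summary fields like Number of Subjects before you submit.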

Custom Schema

If you would like to annotate your Dataset with information that is not listed in the VRE Schemas, you can define a Custom Schema template. You can create your own unique fields, designate each field as required or optional, and then save the template and fill it out with your annotations.

Field types that are available in the Custom Schema template include: 

  • Text
  • Multiple Choice
  • Numeric
  • Date
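
To make the four field types and the required/optional distinction concrete, here is a minimal Python sketch of a custom template and a check of annotation entries against it. The class Field, the function validate, and the example field titles are hypothetical names invented for this illustration; they do not describe how the VRE implements Custom Schemas.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any

FIELD_TYPES = {"text", "multiple choice", "numeric", "date"}


@dataclass
class Field:
    title: str
    type: str
    optional: bool = False
    values: list[str] = field(default_factory=list)  # multiple choice only

    def __post_init__(self) -> None:
        if self.type not in FIELD_TYPES:
            raise ValueError(f"unknown field type: {self.type!r}")
        if self.type == "multiple choice" and not self.values:
            raise ValueError("multiple choice fields need accepted values")

    def accepts(self, value: Any) -> bool:
        if self.type == "text":
            return isinstance(value, str)
        if self.type == "numeric":
            # bool is a subclass of int, so exclude it explicitly
            return isinstance(value, (int, float)) and not isinstance(value, bool)
        if self.type == "date":
            return isinstance(value, date)
        return value in self.values  # multiple choice


def validate(fields: list[Field], entries: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the entries pass."""
    problems = []
    for f in fields:
        if f.title not in entries:
            if not f.optional:
                problems.append(f"missing required field: {f.title}")
        elif not f.accepts(entries[f.title]):
            problems.append(f"invalid value for {f.title}")
    return problems


template = [
    Field("Scanner Model", "text"),
    Field("Field Strength (T)", "numeric"),
    Field("Session", "multiple choice", values=["baseline", "follow-up"]),
    Field("Acquisition Date", "date", optional=True),
]

print(validate(template, {"Scanner Model": "X", "Field Strength (T)": 3,
                          "Session": "baseline"}))
# → []
print(validate(template, {"Scanner Model": "X", "Session": "week 2"}))
# → ['missing required field: Field Strength (T)', 'invalid value for Session']
```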

How to create a Custom Schema template

  1. Open the Datasets feature from the top VRE menu bar, then navigate to your Dataset.
  2. Click the Metadata tab. Under Existing Schemas click VRE Schemas.
  3. In the Schemas section on the right panel, click in the Select schema to complete dropdown menu and select + Create Custom Schema.
  4. Enter a Template Name.
  5. Click Add field to create a new field:
    • Select a Type (text, multiple choice, numeric, date)
    • Enter a Title for the field
    • If Type was Multiple Choice, define the accepted Values (hit Enter after each entry). 
    • Check the Optional box if the field is not a required annotation for your Dataset.
  6. Click the green checkmark to save the new field, or the red X to remove the field.
  7. Repeat steps 5-6 until all fields have been added.
  8. Click Submit to save your new custom schema template. 
  9. To add more fields to your VRE Custom Schema Template after it has been saved and annotated, navigate to Existing Schema (left panel of the Metadata tab) and select your Custom Schema from the Existing Schema list. Click the "eye" icon to view the schema in the right Schemas panel, then click Manage Template.

Warning

Note: Once a custom schema template has been saved, new fields may be added to the template but existing fields cannot be edited or removed.  
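
The rule in the note above amounts to an append-only template. The Python sketch below models that behaviour under our own assumptions: TemplateField and SavedTemplate are hypothetical names, and rejecting a duplicate field title is a choice made for this example rather than documented VRE behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a saved field cannot be edited in place
class TemplateField:
    title: str
    type: str
    optional: bool = False


class SavedTemplate:
    """A saved template: fields can be appended, never edited or removed."""

    def __init__(self, name: str, fields: list[TemplateField]) -> None:
        self.name = name
        self._fields: tuple[TemplateField, ...] = tuple(fields)

    @property
    def fields(self) -> tuple[TemplateField, ...]:
        # A tuple is returned so callers cannot delete or reorder entries.
        return self._fields

    def add_field(self, new_field: TemplateField) -> None:
        # Assumption for this sketch: titles must be unique in a template.
        if any(f.title == new_field.title for f in self._fields):
            raise ValueError(f"field already exists: {new_field.title!r}")
        self._fields = self._fields + (new_field,)


template = SavedTemplate("Imaging", [TemplateField("Scanner Model", "text")])
template.add_field(TemplateField("Session", "multiple choice", optional=True))
print([f.title for f in template.fields])
# → ['Scanner Model', 'Session']
```

Because saved fields are fixed, it is worth settling field titles, types, and required/optional status before you click Submit.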


How to annotate your dataset using VRE Default or Custom Schema templates

  1. Open the Datasets feature from the top VRE menu bar, then navigate to your Dataset.
  2. Click the Metadata tab. Under Existing Schemas click VRE Schemas.
  3. In the Schemas section on the right panel, click in the Select schema to fill dropdown menu and select one of the VRE default schema templates or a Custom Schema created by you. 
  4. Enter the requested fields.  For complete field descriptions of the VRE Schema, see VRE Schema Description.
  5. Click Save as Draft to save the annotations and return later (this option can be used if not all required fields have been filled out).
  6. Click Submit to save the schema to your Dataset.

To view the metadata entries on your Dataset:

  • Navigate to Existing Schema (left panel of the Metadata tab).
  • Select VRE Schemas.
  • Click the schema name, then click the "eye" icon.
  • The entries can be viewed in Schemas (right panel of the Metadata tab).
  • Note: The metadata viewing function in the Existing Schema panel is unavailable while a Custom Schema Template is being created or edited in the Schemas panel, and the "eye" icon turns grey.

To make changes to the existing metadata entries on your Dataset:

  • You can make changes to a VRE Default or Custom Schema's metadata entries by selecting the schema from the dropdown list and clicking Edit.
  • Make the desired changes, then click Update to save.

Supported External Schemas

openMINDS Schema

The open Metadata Initiative for Neuroscience Data Structures (openMINDS) is an open-source, community-driven research infrastructure initiative powered by EBRAINS and the Human Brain Project. The openMINDS schema gathers a set of metadata models that can be used for describing heterogeneous neuroscience data. The data can originate from human, animal or simulated studies, computational models, and software tools, as well as metadata or data models. Metadata stored in the openMINDS format can be uploaded directly to your Dataset in the Metadata tab of the Dataset Explorer.

Information

NOTE: Before starting, you must have one or more JSON files in openMINDS format.

How to annotate your Dataset in the openMINDS format

  1. Open the Datasets feature from the top VRE menu bar, then navigate to your Dataset.
  2. Click the Metadata tab.
  3. Under Existing Schemas click openMINDS Schemas.
  4. Click Upload Schema.
  5. Click Select Schema.
  6. Select the JSON file(s) that contain your metadata in the openMINDS format and click Open.
  7. Click Upload.
  8. Your schemas will appear in the Existing Schemas list. Click the eye icon to view a schema, or click the trash icon to delete it.
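
Before selecting files in step 6 above, it can help to confirm that each file at least parses as JSON. The Python sketch below does that and, as an assumption on our part, also looks for the JSON-LD keys "@type" and "@id" that individual openMINDS instances typically carry. It is not a full openMINDS validation, the function name check_json_file is hypothetical, and files that bundle several instances may need a different check.

```python
import json
import sys
from pathlib import Path


def check_json_file(path: Path) -> list[str]:
    """Return problems found in one file; an empty list means it passed."""
    try:
        doc = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:  # unreadable, bad encoding, or bad JSON
        return [f"not readable as JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top level is not a JSON object"]
    # Assumption: a single openMINDS instance is a JSON-LD object
    # that has both "@type" and "@id".
    return [f"missing {key!r}" for key in ("@type", "@id") if key not in doc]


if __name__ == "__main__":
    # Usage: python check_files.py file1.json file2.json
    for name in sys.argv[1:]:
        problems = check_json_file(Path(name))
        print(name, "OK" if not problems else "; ".join(problems))
```

A file that fails here can be corrected before upload rather than after it appears in the Existing Schemas list.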

Contributing Metadata Annotations into the Knowledge Graph

After annotating your VRE Dataset with a schema template, the metadata annotations can be ingested into the VRE Knowledge Graph to make your Dataset searchable and discoverable by other researchers. At present, this is done by importing the annotations in JSON format into the VRE Knowledge Graph. For more information, see: Importing Dataset schemas into a Knowledge Graph with the Guacamole VM and VRE Command Line Tool.
 


See Also: 

Dataset Creation

Dataset Versioning

Importing Dataset schemas into a Knowledge Graph with the Guacamole VM and VRE Command Line Tool


Copyright © 2022, Indoc Research. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0  International License.