Data Standards and the BICCN

Standardization in protocols, data production, format, ontology, and analysis is key to producing a quality Integrated Data Set (IDS). Standardization achieved by the BICCN will have a practical and broad impact in defining how the field uses and extends these data. The BCDC has established an executive standards board that will help select topical leads and facilitate progress. Identified below are major areas requiring standardization effort. These may be separated into working groups focusing on standardization of protocols, metadata, and attributes, as well as those concerned with infrastructure, transgenics, data analysis, and outreach. Initial efforts outlined below will center on identifying and describing what data will be generated, the required metadata, file sharing, and a common description language. Specific issues to be addressed are:
 

• Standardized protocols and experimental design to enable quantitative interoperability of data between groups. Protocols will be deposited into protocols.io and linked to the BICCN repository.

• Standardized definition of each of the different data unit types and levels.

• Inventory of the number and size of each of the different data units.

• Common coordinate frameworks for standard and effective data mapping. 

• Standardized file formats to enable efficient data sharing.

• Minimal metadata to enable data re-use.

• Application of controlled vocabularies.

• QC/QA protocols. 

• Requirements for the R24 submission pipeline and validation tools.
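
As an illustration of how minimal metadata and controlled vocabularies could be checked at submission time, the following is a small sketch only. The field names, vocabulary terms, and the function `validate_record` are hypothetical and are not taken from any BICCN specification; an actual validation tool would follow the standards documents once they are published.

```python
# Hypothetical minimal metadata fields; NOT an actual BICCN specification.
REQUIRED_FIELDS = ("sample_id", "species", "modality", "protocol_url", "file_format")

# Hypothetical controlled vocabularies, for illustration only.
CONTROLLED_VOCAB = {
    "species": {"mouse", "human", "marmoset"},
    "modality": {"transcriptomics", "epigenomics", "morphology", "patch-seq"},
}


def validate_record(record):
    """Return a list of problems found in one metadata record.

    An empty list means the record has every required field and uses
    only terms from the controlled vocabularies.
    """
    problems = []

    # Every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not str(record.get(name) or "").strip():
            problems.append(f"missing required field: {name}")

    # Fields governed by a vocabulary must use one of its terms.
    for name, allowed in CONTROLLED_VOCAB.items():
        value = str(record.get(name) or "").strip()
        if value and value not in allowed:
            problems.append(f"{name}={value!r} is not in the controlled vocabulary")

    return problems


if __name__ == "__main__":
    example = {"sample_id": "S2", "species": "rat", "modality": "morphology"}
    for problem in validate_record(example):
        print(problem)
```

Returning a list of problems rather than failing on the first one lets a submitter correct a record in a single pass.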

Standards specifications and documents will be made available as they are developed. Later phases of the standardization effort will center on feature definition and feature computation for the Integrated Data Set (IDS), involving cross-linkage of data sets for cell-type-specific comparisons.

 

Working Groups within the BICCN

The working groups in the BICCN cover topics ranging from management of transgenic lines to neuronal reconstruction and infrastructure. Special Standards Working Groups (SWGs) span the core set of activities that require accurate documentation in protocol, metadata, and feature description. Activities of the SWGs include writing up all protocols across laboratories, both for comparison and as a future reference when analysis discrepancies arise. Another key charge is determining the features and data resolution that are relevant to cell type and are to be reported to the BCDC. Specific activities for each Standards Working Group are summarized below.

Transcriptomics/Epigenomics

Single-cell transcriptomics is one of the most basic and highest-production data modalities for the BICCN. It has demonstrated fundamental importance in the elucidation and classification of cell types, and developing standards for its data production in the brain is critical to forming a solid basis for the BICCN. Issues requiring standardization and experimental control include:

  • Comparison of Drop-Seq and 10x methodology and correspondence of signal detection between these approaches.
  • Single-nucleus epigenetic approaches, including single-nucleus mC-seq and ATAC-seq.
  • Appropriate data archiving, utilization of R24 NEMO, DCP infrastructure and deployment of data.
  • Common analysis workflows and collaboration between single-cell groups.
  • Metadata standards for single-cell transcriptomic and epigenetic modalities.
  • Quality control metrics for each modality.

 

Anatomy and Morphology

The Neuronal Anatomy and Morphology SWG focuses on standardization in anatomic description, ontology, and morphological features to be captured.    

  • Description of how morphological data are produced, with written documentation.
  • Definition of a comprehensive, directional, and quantitative 3D atlas of long-range neuronal connections.
  • Identification and description of projection targets.
  • Minimum metadata standards for morphology.
  • Description of multiple labeling strategies.
  • Sharing morphology issues and challenges in reconstructions.
  • Definition of the level of imaging data intended for the BCDC (voxel-level data, computed features, quantification of axons and dendrites).
  • Quality control metrics for each modality.
  • Handling multiple versions of anatomical terminologies.

 

 

Spatial Transcriptomics

Spatial transcriptomic imaging methods will provide a key to determining the connection between anatomy and the combinatorics of cell type. Many variants of these techniques are under development, and standardization within the BICCN will be key to successfully deciphering cell types.

  • Write-up of key protocols and comparison with results generated by other groups.
  • Mapping of spatial transcriptomic data to the CCF and the key issues for its use in analysis.
  • Relationship of spatial transcriptomics and scRNA-seq methods.
  • Minimum metadata for experimental use.
  • Quantification issues in transcript identification.

 

Multimodal (Patch-Seq)

True multimodal approaches such as Patch-Seq are central to directly associating electrophysiology, morphology, and transcriptomic data. These techniques address the correspondence problem between modalities directly by recording from, extracting RNA from, and reconstructing the same cells. Key efforts in standardization include:

  • Methods and protocols for cross-modality association.
  • Variance and accuracy measurements in mapping associations.
  • Reconciliation with single-modality measurement protocols.
  • Checking the accuracy of transcriptomically mapped data.

 

Common Coordinate Frameworks

Central to the BICCN mission is the accurate mapping of cellular and imaging data to a common coordinate framework (CCF) such that data can be referenced, searched, and analyzed by the community. There are several data mapping issues, including:

  • Spatial coordinate systems, global vs local structures.
  • Protocols for mapping single cell data in the absence of imaging context.
  • Metadata necessary for accurate cellular and image mapping.
  • Descriptions of the CCF and how IDS users interact with this resource.
  • Registration methodology for imaging data including whole brain and slice data.
  • Distinctions in the CCF for mouse, non-human primate, and human, and a proposed roadmap.
  • Versioning and change logs that document differences in delineations across versions, for backwards compatibility between CCF versions.
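
To make the idea of a delineation change log concrete, here is a minimal sketch of how labels annotated against an older framework version might be carried forward to a newer one. Everything in it is an assumption made for illustration: the version names, the structure labels (`area_A`, `area_B`, and so on), the change categories, and the function `translate_label` are hypothetical and do not describe any actual CCF release.

```python
# Hypothetical change log between two framework versions.
# All labels and version names are invented for illustration.
CHANGE_LOG = {
    ("v2", "v3"): {
        "renamed": {"area_A": "area_A1"},
        "merged": {"area_B": "area_BC", "area_C": "area_BC"},
        "removed": {"area_D"},
    }
}


def translate_label(label, from_version, to_version):
    """Map a structure label from one version to another.

    Returns the label to use in the target version, or None when the
    structure was removed and has no counterpart. Labels not mentioned
    in the change log are assumed to be unchanged.
    """
    entry = CHANGE_LOG.get((from_version, to_version))
    if entry is None:
        raise KeyError(f"no change log from {from_version} to {to_version}")

    if label in entry["removed"]:
        return None
    for kind in ("renamed", "merged"):
        if label in entry[kind]:
            return entry[kind][label]
    return label


if __name__ == "__main__":
    for old in ("area_A", "area_B", "area_D", "area_X"):
        print(old, "->", translate_label(old, "v2", "v3"))
```

Keeping renames, merges, and removals as separate categories preserves why a label changed, which matters when a merge makes the reverse mapping ambiguous.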

 

Human and Non-Human Primate Mapping

The human data generation groups are beginning to establish the framework for a larger, more comprehensive survey of the brain. These data types pose special challenges, including the lack of a cellular mapping atlas and the inability to use transgenic approaches. This group addresses issues specific to human, developmental human, and non-human primate data and their comparison to mouse cell data. Areas of interest include:

  • Data mapping issues and common coordinate identification for these data types.
  • Multimodal atlases of human brain cell types.
  • Human slice physiology, standardization, and metadata.
  • Development of standardized methodologies for human-scale characterization and mapping.
  • Comparison of the marmoset work with the Brain/MINDS project of Japan.

 

Imaging Protocols

The BICCN is employing several advances in imaging technology to enable higher-throughput production and analysis of imaging data, and these data are fundamental to morphology and connectivity and their relationship to cell type. Several opportunities for standardizing practices include:

  • Documentation of all imaging methods and approaches, including image processing workflow for anatomic data sets.
  • Defining the level of imaging data intended for the BCDC (voxel-level data, computed features, quantification of axons and dendrites).
  • Detailing of the fMOST protocol and other precision imaging approaches and their mapping to the CCF v3.
  • Description and standardization for cell counting approaches.
  • Quality assurance and control.

For further information on standards activities within the BICCN, please contact info@biccn.org.