Data Curation

EMAGE curation staff take incoming data from many sources, check it for errors and correct them, assess it for consistency, and convert it into a standard format. The structure this format adds allows the data to be interrogated and exchanged. For an in-depth report on data curation at EMAGE, please refer to the EMAGE Case Study commissioned by the Digital Curation Centre SCARP project.

The information aspects we check and standardise for every EMAGE entry (based on both the supplied information and extra information we source) are:


During the curation process we refer to several external sources, each of which is the accepted authority for one aspect of the information:

 

Information aspect

Source and comments

Gene or Protein Symbol and Name

MGI gene/protein symbols and names. Mouse gene/protein name and symbol information is assigned according to the guidelines of the Mouse Gene Nomenclature Committee and maintained by staff at MGI. The data includes Gene Name, Gene Symbol and a unique identifier. At EMAGE we assign the correct gene/protein ID to incoming data (e.g. MGI:99604).

Mouse Strains

MGI mouse strain information. Mouse strain information is maintained by staff at MGI. At EMAGE we assign the strain name in MGI format to incoming data (e.g. "129S2/SvPas * C57BL/6 * CD-1").

Mouse alleles

MGI mouse allele information. Mouse allele information is maintained by staff at MGI. At EMAGE we assign the correct allele ID to incoming data (e.g. MGI:3702935).

Nucleic Acid Sequences

INSDC sequence database. We use versioned INSDC sequence identifiers when referring to nucleic acid sequences (e.g. NM_021459.4).

Amino Acid Sequences

NCBI protein sequence database. We use versioned NCBI sequence identifiers when referring to amino acid sequences (e.g. NP_067434.3).

Probes or antisera

MGI database. If a probe or antibody has been previously described by MGI curators, we use the MGI ID (e.g. MGI:1334951). Otherwise we assign a new ID in-house. These are displayed in EMAGE as "GeneNameprobeA", "ProteinNameAntibodyB", etc.

Mouse Embryo Anatomy Descriptions

EMAP Mouse Anatomy Ontology. We express all text-based descriptions of sites of expression using terms from the EMAP mouse anatomy ontology.
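
As an illustration of the identifier formats listed above, the following Python sketch shows how incoming values could be checked before curation. It is not EMAGE's actual tooling. The patterns are assumptions inferred only from the examples in the table (MGI:99604, NM_021459.4, NP_067434.3, "129S2/SvPas * C57BL/6 * CD-1"), not from any official specification, and all function names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed patterns, inferred only from the example identifiers in the table.
MGI_ID = re.compile(r"MGI:\d+")
VERSIONED_ACCESSION = re.compile(r"([A-Z]{2}_\d+)\.(\d+)")


class VersionedAccession(NamedTuple):
    accession: str  # the part before the dot, e.g. "NM_021459"
    version: int    # the part after the dot, e.g. 4


def is_mgi_id(text: str) -> bool:
    """Return True if text looks like an MGI identifier such as MGI:99604."""
    return MGI_ID.fullmatch(text.strip()) is not None


def parse_versioned_accession(text: str) -> Optional[VersionedAccession]:
    """Split an identifier such as NM_021459.4 into accession and version.

    Returns None when the text has no version suffix or does not match
    the assumed pattern.
    """
    match = VERSIONED_ACCESSION.fullmatch(text.strip())
    if match is None:
        return None
    return VersionedAccession(match.group(1), int(match.group(2)))


def split_strain_name(text: str) -> List[str]:
    """Split a strain name on '*' (assumed here to separate component strains)."""
    return [part.strip() for part in text.split("*") if part.strip()]
```

A check of this kind only confirms that a value is well formed. Confirming that an identifier refers to the intended gene, allele, or sequence still requires looking it up in the authoritative source.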

 

During our spatial annotation procedure, we also comment on the clarity of the expression pattern seen in the image and on the morphological match between the data embryo and the EMAP embryo template that houses the spatial annotation.

 

 
