ABSTRACT
Circadian systems provide a fitness advantage to organisms by allowing them to adapt to daily changes of environmental cues, such as light/dark cycles. The molecular mechanism underlying the circadian clock has been well characterized. However, how internal circadian clocks are entrained with regular daily light/dark cycles remains unclear. By collecting and analyzing indirect calorimetry (IC) data from more than 2000 wild-type mice available from the International Mouse Phenotyping Consortium (IMPC), we show that the onset time and peak phase of activity and food intake rhythms are reliable parameters for screening defects of circadian misalignment. We developed a machine learning algorithm to quantify these two parameters in our misalignment screen (SyncScreener) with existing datasets and used it to screen 750 mutant mouse lines from five IMPC phenotyping centres. Mutants of five genes (Slc7a11, Rhbdl1, Spop, Ctc1 and Oxtr) were found to be associated with altered patterns of activity or food intake. By further studying the Slc7a11tm1a/tm1a mice, we confirmed their advanced activity phase phenotype in response to simulated jet lag and skeleton photoperiod stimuli. Disruption of Slc7a11 affected the intercellular communication in the suprachiasmatic nucleus, suggesting a defect in synchronization of clock neurons. Our study has established a systematic phenotype analysis approach that can be used to uncover the mechanism of circadian entrainment in mice.
Subject(s)
Circadian Rhythm/genetics , Amino Acid Transport System y+/genetics , Animals , Machine Learning , Male , Mice , Mice, Inbred C57BL , Mutation , Receptors, Oxytocin/genetics , Repressor Proteins/genetics , Serine Endopeptidases/genetics , Telomere-Binding Proteins/genetics , Ubiquitin-Protein Ligase Complexes/genetics
ABSTRACT
Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.
Subject(s)
Databases, Genetic , Gene Expression Profiling , Plants/metabolism , Proteins/metabolism , Proteomics , Animals , Cell Line, Tumor , Humans , Plants/genetics , User-Computer Interface
ABSTRACT
The International Mouse Phenotyping Consortium (IMPC) is building a catalogue of mammalian gene function by producing and phenotyping a knockout mouse line for every protein-coding gene. To date, the IMPC has generated and characterised 5186 mutant lines. One-third of the lines have been found to be non-viable and over 300 new mouse models of human disease have been identified thus far. While current bioinformatics efforts are focused on translating results to better understand human disease processes, IMPC data also aid the understanding of genetic function and processes in other species. Here we show, using gorilla genomic data, how genes essential to development in mice can be used to help assess the potentially deleterious impact of gene variants in other species. This type of analysis could be used to select optimal breeders in endangered species to maintain or increase fitness and avoid variants associated with impaired-health phenotypes or loss-of-function mutations in genes of critical importance. We also show, using selected examples from various mammal species, how IMPC data can aid in the identification of candidate genes for studying a condition of interest, deliver information about the mechanisms involved, or support predictions for the function of genes that may play a role in adaptation. With genotyping costs decreasing and the continued improvement of bioinformatics tools, the analyses we demonstrate can be routinely applied.
ABSTRACT
Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.
Subject(s)
Databases, Genetic , Gene Expression Profiling , Genomics , Humans , Internet , Oligonucleotide Array Sequence Analysis , Proteins/genetics , Proteins/metabolism , RNA Isoforms/metabolism , Sequence Analysis, RNA
ABSTRACT
Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of the International Mouse Phenotyping Consortium (IMPC) and find 974 gene knockouts with strong metabolic phenotypes. 429 of those had no previous link to metabolism and 51 genes remain functionally completely unannotated. We compared human orthologues of these uncharacterized genes in five GWAS consortia and indeed 23 candidate genes are associated with metabolic disease. We further identify common regulatory elements in promoters of candidate genes. As each regulatory element is composed of several transcription factor binding sites, our data reveal an extensive metabolic phenotype-associated network of co-regulated genes. Our systematic mouse phenotype analysis thus paves the way for full functional annotation of the genome.
Subject(s)
Basal Metabolism/genetics , Blood Glucose/metabolism , Body Weight/genetics , Diabetes Mellitus, Type 2/genetics , Obesity/genetics , Oxygen Consumption/genetics , Triglycerides/metabolism , Animals , Area Under Curve , Gene Regulatory Networks , Genome-Wide Association Study , High-Throughput Screening Assays , Humans , Metabolic Diseases/genetics , Mice , Mice, Knockout , Phenotype
ABSTRACT
The developmental and physiological complexity of the auditory system is likely reflected in the underlying set of genes involved in auditory function. In humans, over 150 non-syndromic loci have been identified, and there are more than 400 human genetic syndromes with a hearing loss component. Over 100 non-syndromic hearing loss genes have been identified in mouse and human, but we remain ignorant of the full extent of the genetic landscape involved in auditory dysfunction. As part of the International Mouse Phenotyping Consortium, we undertook a hearing loss screen in a cohort of 3006 mouse knockout strains. In total, we identify 67 candidate hearing loss genes. We detect known hearing loss genes, but the vast majority, 52, of the candidate genes were novel. Our analysis reveals a large and unexplored genetic landscape involved with auditory function. The full extent of the genetic basis for hearing impairment is unknown. Here, as part of the International Mouse Phenotyping Consortium, the authors perform a hearing loss screen in 3006 mouse knockout strains and identify 52 new candidate genes for genetic hearing loss.
Subject(s)
Hearing Loss/genetics , Protein Interaction Maps/genetics , Animals , Datasets as Topic , Genetic Testing , Hearing Loss/epidemiology , Hearing Tests , Mice , Mice, Knockout , Phenotype
ABSTRACT
We describe the creation process of the Minimum Information Specification for In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE). We modeled this new specification for gene expression localization experiments on the existing minimum information specification for microarray data, initially to facilitate data sharing within a consortium. After successful use within the consortium, the specification was circulated to members of the wider biomedical research community for comment and refinement. After a period of acquiring many new suggested requirements, it was necessary to enter a final phase of excluding those requirements that were deemed inappropriate as a minimum requirement for all experiments. The full specification will soon be published as a version 1.0 proposal to the community, upon which a fuller discussion must take place so that the final specification may be achieved with the involvement of the whole community.
Subject(s)
Computational Biology/standards , Immunohistochemistry/standards , In Situ Hybridization/standards , Computational Biology/methods , Immunohistochemistry/methods , In Situ Hybridization/methods
ABSTRACT
We present an extensible software model for the genotype and phenotype community, XGAP. Readers can download a standard XGAP (http://www.xgap.org) or auto-generate a custom version using MOLGENIS with programming interfaces to R-software and web-services or user interfaces for biologists. XGAP has simple load formats for any type of genotype, epigenotype, transcript, protein, metabolite or other phenotype data. Current functionality includes tools ranging from eQTL analysis in mouse to genome-wide association studies in humans.
Subject(s)
Genetic Association Studies/methods , Genomics/methods , Models, Genetic , Software , Animals , Humans , Mice , Quantitative Trait Loci/genetics
ABSTRACT
One purpose of the biomedical literature is to report results in sufficient detail that the methods of data collection and analysis can be independently replicated and verified. Here we present reporting guidelines for gene expression localization experiments: the minimum information specification for in situ hybridization and immunohistochemistry experiments (MISFISHIE). MISFISHIE is modeled after the Minimum Information About a Microarray Experiment (MIAME) specification for microarray experiments. Both guidelines define what information should be reported without dictating a format for encoding that information. MISFISHIE describes six types of information to be provided for each experiment: experimental design, biomaterials and treatments, reporters, staining, imaging data and image characterizations. This specification has benefited the consortium within which it was developed and is expected to benefit the wider research community. We welcome feedback from the scientific community to help improve our proposal.