1.
Bioinform Adv ; 4(1): vbae036, 2024.
Article in English | MEDLINE | ID: mdl-38577542

ABSTRACT

Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes. Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples are, then performance of models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement. Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
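
The degree-aware idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is in the linked repository): it assumes an undirected graph given as a list of positive edges, and it draws each endpoint of a candidate negative edge from the multiset of positive-edge endpoints, so that a node is picked in proportion to its degree. The function name `sample_negatives_degree_aware` and the `max_attempts` parameter are hypothetical.

```python
import random


def sample_negatives_degree_aware(positives, k, seed=0, max_attempts=100_000):
    """Sample up to k negative edges whose endpoints follow the positive degree distribution.

    Assumption (not taken from the paper): endpoints are drawn independently from
    the multiset of positive-edge endpoints, so each node is chosen with
    probability proportional to its degree among the positive edges.
    """
    rng = random.Random(seed)
    # Each node appears once per incident positive edge.
    endpoints = [node for edge in positives for node in edge]
    positive_set = {frozenset(edge) for edge in positives}
    negatives = set()
    attempts = 0
    while len(negatives) < k and attempts < max_attempts:
        attempts += 1
        u, v = rng.choice(endpoints), rng.choice(endpoints)
        if u == v:
            continue  # skip self-loops
        pair = frozenset((u, v))
        if pair in positive_set:
            continue  # skip known positive edges
        negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)


# Toy graph with made-up node names, for illustration only.
positives = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "f")]
negatives = sample_negatives_degree_aware(positives, k=4)
print(negatives)
```

Uniform sampling would instead draw both endpoints from the set of distinct nodes, which tends to pair low-degree nodes and produces the degree imbalance the abstract describes.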

2.
Med ; 4(12): 913-927.e3, 2023 Dec 08.
Article in English | MEDLINE | ID: mdl-37963467

ABSTRACT

BACKGROUND: Navigating the clinical literature to determine the optimal clinical management for rare diseases presents significant challenges. We introduce the Medical Action Ontology (MAxO), an ontology specifically designed to organize medical procedures, therapies, and interventions. METHODS: MAxO incorporates logical structures that link MAxO terms to numerous other ontologies within the OBO Foundry. Term development involves a blend of manual and semi-automated processes. Additionally, we have generated annotations detailing diagnostic modalities for specific phenotypic abnormalities defined by the Human Phenotype Ontology (HPO). We introduce a web application, POET, that facilitates MAxO annotations for specific medical actions for diseases using the Mondo Disease Ontology. FINDINGS: MAxO encompasses 1,757 terms spanning a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. These terms annotate phenotypic features associated with specific diseases (using HPO and Mondo). Presently, there are over 16,000 MAxO diagnostic annotations that target HPO terms. Through POET, we have created 413 MAxO annotations specifying treatments for 189 rare diseases. CONCLUSIONS: MAxO offers a computational representation of treatments and other actions taken for the clinical management of patients. Its development is closely coupled to Mondo and HPO, broadening the scope of our computational modeling of diseases and phenotypic features. We invite the community to contribute disease annotations using POET (https://poet.jax.org/). MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO). FUNDING: NHGRI 1U24HG011449-01A1 and NHGRI 5RM1HG010860-04.


Subject(s)
Biological Ontologies , Humans , Rare Diseases , Software , Computer Simulation
3.
Article in English | MEDLINE | ID: mdl-37684057

ABSTRACT

We identified a de novo heterozygous transient receptor potential cation channel subfamily M (melastatin) member 3 (TRPM3) missense variant, p.(Asn1126Asp), in a patient with developmental delay and manifestations of cerebral palsy (CP) using phenotype-driven prioritization analysis of whole-genome sequencing data with Exomiser. The variant is localized in the functionally important ion transport domain of the TRPM3 protein and predicted to impact the protein structure. Our report adds TRPM3 to the list of Mendelian disease-associated genes that can be associated with CP and provides further evidence for the pathogenicity of the variant p.(Asn1126Asp).


Subject(s)
Cerebral Palsy , Intellectual Disability , Nervous System Malformations , TRPM Cation Channels , Humans , Cerebral Palsy/genetics , Intellectual Disability/genetics , Mutation, Missense/genetics , Phenotype , TRPM Cation Channels/genetics
4.
medRxiv ; 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37503136

ABSTRACT

Navigating the vast landscape of clinical literature to find optimal treatments and management strategies can be a challenging task, especially for rare diseases. To address this task, we introduce the Medical Action Ontology (MAxO), the first ontology specifically designed to organize medical procedures, therapies, and interventions in a structured way. Currently, MAxO contains 1757 medical action terms added through a combination of manual and semi-automated processes. MAxO was developed with logical structures that make it compatible with several other ontologies within the Open Biological and Biomedical Ontologies (OBO) Foundry. These cover a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. We have created a database of over 16000 annotations that describe diagnostic modalities for specific phenotypic abnormalities as defined by the Human Phenotype Ontology (HPO). Additionally, 413 annotations are provided for medical actions for 189 rare diseases. We have developed a web application called POET (https://poet.jax.org/) for the community to use to contribute MAxO annotations. MAxO provides a computational representation of treatments and other actions taken for the clinical management of patients. The development of MAxO is closely coupled to the Mondo Disease Ontology (Mondo) and the Human Phenotype Ontology (HPO) and expands the scope of our computational modeling of diseases and phenotypic features to include diagnostics and therapeutic actions. MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO).

5.
PLoS One ; 18(5): e0285433, 2023.
Article in English | MEDLINE | ID: mdl-37196000

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.


Subject(s)
Neoplasms , Software , Humans , Genomics , Databases, Factual , Gene Library
6.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-36929917

ABSTRACT

MOTIVATION: Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations. RESULTS: We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.


Subject(s)
Motivation , Software , Humans , Protein Isoforms/genetics , Alternative Splicing , Sequence Analysis, RNA
8.
Am J Med Genet C Semin Med Genet ; 190(2): 231-242, 2022 06.
Article in English | MEDLINE | ID: mdl-35872606

ABSTRACT

Technological advances in both genome sequencing and prenatal imaging are increasing our ability to accurately recognize and diagnose Mendelian conditions prenatally. Phenotype-driven early genetic diagnosis of fetal genetic disease can help to strategize treatment options and clinical preventive measures during the perinatal period, to plan in utero therapies, and to inform parental decision-making. Fetal phenotypes of genetic diseases are often unique and at present are not well understood; more comprehensive knowledge about prenatal phenotypes and computational resources have an enormous potential to improve diagnostics and translational research. The Human Phenotype Ontology (HPO) has been widely used to support diagnostics and translational research in human genetics. To better support prenatal usage, the HPO consortium conducted a series of workshops with a group of domain experts in a variety of medical specialties, diagnostic techniques, as well as diseases and phenotypes related to prenatal medicine, including perinatal pathology, musculoskeletal anomalies, neurology, medical genetics, hydrops fetalis, craniofacial malformations, cardiology, neonatal-perinatal medicine, fetal medicine, placental pathology, prenatal imaging, and bioinformatics. We expanded the representation of prenatal phenotypes in HPO by adding 95 new phenotype terms under the Abnormality of prenatal development or birth (HP:0001197) grouping term, and revised definitions, synonyms, and disease annotations for most of the 152 terms that existed before the beginning of this effort. The expansion of prenatal phenotypes in HPO will support phenotype-driven prenatal exome and genome sequencing for precision genetic diagnostics of rare diseases to support prenatal care.


Subject(s)
Computational Biology , Placenta , Infant, Newborn , Humans , Female , Pregnancy , Computational Biology/methods , Phenotype , Rare Diseases , Exome Sequencing
9.
Genet Med ; 24(5): 986-998, 2022 05.
Article in English | MEDLINE | ID: mdl-35101336

ABSTRACT

PURPOSE: Several professional societies have published guidelines for the clinical interpretation of somatic variants, which specifically address diagnostic, prognostic, and therapeutic implications. Although these guidelines for the clinical interpretation of variants include data types that may be used to determine the oncogenicity of a variant (eg, population frequency, functional, and in silico data or somatic frequency), they do not provide a direct, systematic, and comprehensive set of standards and rules to classify the oncogenicity of a somatic variant. This insufficient guidance leads to inconsistent classification of rare somatic variants in cancer, generates variability in their clinical interpretation, and, importantly, affects patient care. Therefore, it is essential to address this unmet need. METHODS: Clinical Genome Resource (ClinGen) Somatic Cancer Clinical Domain Working Group and ClinGen Germline/Somatic Variant Subcommittee, the Cancer Genomics Consortium, and the Variant Interpretation for Cancer Consortium used a consensus approach to develop a standard operating procedure (SOP) for the classification of oncogenicity of somatic variants. RESULTS: This comprehensive SOP has been developed to improve consistency in somatic variant classification and has been validated on 94 somatic variants in 10 common cancer-related genes. CONCLUSION: The comprehensive SOP is now available for classification of oncogenicity of somatic variants.


Subject(s)
Genome, Human , Neoplasms , Genetic Testing/methods , Genetic Variation/genetics , Genome, Human/genetics , Genomics/methods , Humans , Neoplasms/genetics , Virulence
10.
CEUR Workshop Proc ; 3073: 122-127, 2022.
Article in English | MEDLINE | ID: mdl-37324543

ABSTRACT

Ontologies have emerged to become critical to support data and knowledge representation, standardization, integration, and analysis. The SARS-CoV-2 pandemic led to the rapid proliferation of COVID-19 data, as well as the development of many COVID-19 ontologies. In the interest of supporting data interoperability, we initiated a community-based effort to harmonize COVID-19 ontologies. Our effort involves the collaborative discussion among developers of seven COVID-19 related ontologies, and the merging of four ontologies. This effort demonstrates the feasibility of harmonizing these ontologies in an interoperable framework to support integrative representation and analysis of COVID-19 related data and knowledge.

11.
NAR Genom Bioinform ; 3(4): lqab113, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34888523

ABSTRACT

Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach based on natural language processing and machine learning to investigate the relations between PKs and cancers, predicting PKs whose inhibition would be efficacious to treat a certain cancer. Our approach represents PKs and cancers as semantically meaningful 100-dimensional vectors based on word and concept neighborhoods in PubMed abstracts. We use information about phase I-IV trials in ClinicalTrials.gov to construct a training set for random forest classification. Our results with historical data show that associations between PKs and specific cancers can be predicted years in advance with good accuracy. Our tool can be used to predict the relevance of inhibiting PKs for specific cancers and to support the design of well-focused clinical trials to discover novel PKIs for cancer therapy.

12.
EBioMedicine ; 74: 103722, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34839263

ABSTRACT

BACKGROUND: Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 (PASC or "long COVID"), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations. Patient-led studies are of particular importance for understanding the natural history of COVID-19, but integration is hampered because they often use different terms to describe the same symptom or condition. This significant disparity in patient versus clinical characterization motivated the proposed ontological approach to specifying manifestations, which will improve capture and integration of future long COVID studies. METHODS: The Human Phenotype Ontology (HPO) is a widely used standard for exchange and analysis of phenotypic abnormalities in human disease but has not yet been applied to the analysis of COVID-19. FINDINGS: We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to HPO terms. We present layperson synonyms and definitions that can be used to link patient self-report questionnaires to standard medical terminology. Long COVID clinical manifestations are not assessed consistently across studies, and most manifestations have been reported with a wide range of synonyms by different authors. Across at least 10 cohorts, authors reported 31 unique clinical features corresponding to HPO terms; the most commonly reported feature was Fatigue (median 45.1%) and the least commonly reported was Nausea (median 3.9%), but the reported percentages varied widely between studies. INTERPRETATION: Translating long COVID manifestations into computable HPO terms will improve analysis, data capture, and classification of long COVID patients. If researchers, clinicians, and patients share a common language, then studies can be compared/pooled more effectively. Furthermore, mapping lay terminology to HPO will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, thereby improving the stratification, diagnosis, and treatment of long COVID. FUNDING: U24TR002306; UL1TR001439; P30AG024832; GBMF4552; R01HG010067; UL1TR002535; K23HL128909; UL1TR002389; K99GM145411.


Subject(s)
COVID-19/complications , COVID-19/pathology , COVID-19/diagnosis , Humans , SARS-CoV-2 , Post-Acute COVID-19 Syndrome
14.
Am J Hum Genet ; 108(9): 1564-1577, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34289339

ABSTRACT

A critical challenge in genetic diagnostics is the computational assessment of candidate splice variants, specifically the interpretation of nucleotide changes located outside of the highly conserved dinucleotide sequences at the 5' and 3' ends of introns. To address this gap, we developed the Super Quick Information-content Random-forest Learning of Splice variants (SQUIRLS) algorithm. SQUIRLS generates a small set of interpretable features for machine learning by calculating the information-content of wild-type and variant sequences of canonical and cryptic splice sites, assessing changes in candidate splicing regulatory sequences, and incorporating characteristics of the sequence such as exon length, disruptions of the AG exclusion zone, and conservation. We curated a comprehensive collection of disease-associated splice-altering variants at positions outside of the highly conserved AG/GT dinucleotides at the termini of introns. SQUIRLS trains two random-forest classifiers for the donor and for the acceptor and combines their outputs by logistic regression to yield a final score. We show that SQUIRLS transcends previous state-of-the-art accuracy in classifying splice variants as assessed by rank analysis in simulated exomes, and is significantly faster than competing methods. SQUIRLS provides tabular output files for incorporation into diagnostic pipelines for exome and genome analysis, as well as visualizations that contextualize predicted effects of variants on splicing to make it easier to interpret splice variants in diagnostic settings.
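
The information-content comparison of wild-type and variant sequences mentioned in this abstract can be illustrated with a short sketch. It assumes the standard individual-information measure for DNA, R_i = sum over positions of (2 + log2 f(b, l)), where f(b, l) is the frequency of base b at position l in a set of aligned sites; whether SQUIRLS uses exactly this form is an assumption here. The function name and the frequency matrix are hypothetical, and the numbers are made up rather than taken from a real splice-site model.

```python
import math


def individual_information(sequence, frequencies):
    """Return the individual information content of a sequence in bits.

    frequencies[pos][base] is the (non-zero) frequency of `base` at position `pos`.
    """
    if len(sequence) != len(frequencies):
        raise ValueError("sequence and frequency matrix must have the same length")
    return sum(
        2.0 + math.log2(frequencies[pos][base])
        for pos, base in enumerate(sequence)
    )


# Hypothetical 3-position frequency matrix (illustrative values only).
toy_matrix = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]

wild_type = individual_information("AGT", toy_matrix)
variant = individual_information("AGC", toy_matrix)
delta = variant - wild_type  # negative when the variant weakens the site
print(f"wild type: {wild_type:.2f} bits, variant: {variant:.2f} bits, delta: {delta:.2f}")
```

A feature of this kind is easy to interpret: a large negative difference suggests that the variant moves the sequence away from the consensus of the site.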


Subject(s)
Algorithms , Data Curation/methods , Genetic Diseases, Inborn/genetics , RNA Splice Sites , RNA Splicing , Software , Base Sequence , Computational Biology/methods , Exome , Exons , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/pathology , High-Throughput Nucleotide Sequencing , Humans , Introns , Mutation , Exome Sequencing
16.
Nucleic Acids Res ; 49(D1): D1207-D1217, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33264411

ABSTRACT

The Human Phenotype Ontology (HPO, https://hpo.jax.org) was launched in 2008 to provide a comprehensive logical standard to describe and computationally analyze phenotypic abnormalities found in human disease. The HPO is now a worldwide standard for phenotype exchange. The HPO has grown steadily since its inception due to considerable contributions from clinical experts and researchers from a diverse range of disciplines. Here, we present recent major extensions of the HPO for neurology, nephrology, immunology, pulmonology, newborn screening, and other areas. For example, the seizure subontology now reflects the International League Against Epilepsy (ILAE) guidelines and these enhancements have already shown clinical validity. We present new efforts to harmonize computational definitions of phenotypic abnormalities across the HPO and multiple phenotype ontologies used for animal models of disease. These efforts will benefit software such as Exomiser by improving the accuracy and scope of cross-species phenotype matching. The computational modeling strategy used by the HPO to define disease entities and phenotypic features and distinguish between them is explained in detail. We also report on recent efforts to translate the HPO into indigenous languages. Finally, we summarize recent advances in the use of HPO in electronic health record systems.


Subject(s)
Biological Ontologies , Computational Biology/methods , Databases, Factual , Disease/genetics , Genome , Phenotype , Software , Animals , Disease Models, Animal , Genotype , Humans , Infant, Newborn , International Cooperation , Internet , Neonatal Screening/methods , Pharmacogenetics/methods , Terminology as Topic
17.
Am J Hum Genet ; 107(3): 403-417, 2020 09 03.
Article in English | MEDLINE | ID: mdl-32755546

ABSTRACT

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
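
The likelihood ratio framework referred to in this abstract rests on a standard identity: posttest odds equal pretest odds multiplied by the likelihood ratios of the observations. The sketch below shows that arithmetic only. It assumes the individual likelihood ratios can be multiplied (conditional independence), and the function name `posttest_probability` is hypothetical; it is not the LIRICAL implementation.

```python
import math


def posttest_probability(pretest_probability, likelihood_ratios):
    """Combine a pretest probability with likelihood ratios into a posttest probability.

    Assumption: the likelihood ratios are conditionally independent, so the
    composite ratio is their product (computed here as a sum of logarithms).
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must lie strictly between 0 and 1")
    if any(lr <= 0.0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be positive")
    log_odds = math.log(pretest_probability / (1.0 - pretest_probability))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative numbers only: a 1% pretest probability and two observations,
# each ten times more likely under the candidate diagnosis than otherwise.
print(round(posttest_probability(0.01, [10.0, 10.0]), 4))
```

Because each observation contributes its own ratio, the contribution of an individual phenotype to the final result can be read off directly, which is the interpretability property the abstract emphasizes.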


Subject(s)
Computational Biology , Databases, Genetic , Genomics , Rare Diseases/diagnosis , Algorithms , Exome/genetics , Humans , Phenotype , Rare Diseases/genetics , Software
18.
Orphanet J Rare Dis ; 15(1): 206, 2020 08 12.
Article in English | MEDLINE | ID: mdl-32787960

ABSTRACT

BACKGROUND: Rare diseases are individually rare but globally affect around 6% of the population, and in over 70% of cases are genetically determined. Their rarity translates into a delayed diagnosis, with 25% of patients waiting 5 to 30 years for one. It is essential to raise awareness among patients and clinicians of existing gene- and variant-specific therapeutics at the time of diagnosis, so that treatment delays do not add to the diagnostic odyssey of patients with rare diseases and their families. AIMS: This paper aims to provide guidance and give detailed instructions on how to write homogeneous systematic reviews of rare diseases' treatments in a manner that allows the capture of the results in a computer-accessible form. The published results need to comply with the FAIR guiding principles for scientific data management and stewardship to facilitate the extraction of datasets that are easily transposable into machine-actionable information. The ultimate purpose is the creation of a database of rare disease treatments ("Treatabolome") at gene and variant levels as part of the H2020 research project Solve-RD. RESULTS: Each systematic review follows a written protocol to address one or more rare diseases in which the authors are experts. The bibliographic search strategy requires detailed documentation to allow its replication. Data capture forms should be built to facilitate the filling of a data capture spreadsheet and to record the application of the inclusion and exclusion criteria to each search result. A PRISMA flowchart is required to provide an overview of the processes of search and selection of papers. A separate table condenses the data collected during the Systematic Review, appraised according to their level of evidence. CONCLUSIONS: This paper provides a template that includes the instructions for writing FAIR-compliant systematic reviews of rare diseases' treatments that enables the assembly of a Treatabolome database that complements existing diagnostic and management support tools with treatment awareness data.


Subject(s)
Data Management , Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/therapy , Research Design , Systematic Reviews as Topic , Writing
19.
Orphanet J Rare Dis ; 15(1): 40, 2020 02 04.
Article in English | MEDLINE | ID: mdl-32019583

ABSTRACT

BACKGROUND: Defects in the glycosylphosphatidylinositol (GPI) biosynthesis pathway can result in a group of congenital disorders of glycosylation known as the inherited GPI deficiencies (IGDs). To date, defects in 22 of the 29 genes in the GPI biosynthesis pathway have been identified in IGDs. The early phase of the biosynthetic pathway assembles the GPI anchor (Synthesis stage) and the late phase transfers the GPI anchor to a nascent peptide in the endoplasmic reticulum (ER) (Transamidase stage), stabilizes the anchor in the ER membrane using fatty acid remodeling and then traffics the GPI-anchored protein to the cell surface (Remodeling stage). RESULTS: We addressed the hypothesis that disease-associated variants in either the Synthesis stage or Transamidase+Remodeling-stage GPI pathway genes have distinct phenotypic spectra. We reviewed clinical data from 58 publications describing 152 individual patients and encoded the phenotypic information using the Human Phenotype Ontology (HPO). We showed statistically significant differences between the Synthesis and Transamidase+Remodeling Groups in the frequencies of phenotypes in the musculoskeletal system, cleft palate, nose phenotypes, and cognitive disability. Finally, we hypothesized that phenotypic defects in the IGDs are likely to be at least partially related to defective GPI anchoring of their target proteins. Twenty-two of one hundred forty-two proteins that receive a GPI anchor are associated with one or more Mendelian diseases and 12 show some phenotypic overlap with the IGDs, represented by 34 HPO terms. Interestingly, GPC3 and GPC6, members of the glypican family of heparan sulfate proteoglycans bound to the plasma membrane through a covalent GPI linkage, are associated with 25 of these phenotypic abnormalities. CONCLUSIONS: IGDs associated with Synthesis and Transamidase+Remodeling stages of the GPI biosynthesis pathway have significantly different phenotypic spectra. GPC2 and GPC6 genes may represent a GPI target of general disruption to the GPI biosynthesis pathway that contributes to the phenotypes of some IGDs.


Subject(s)
Glycosylphosphatidylinositols , Seizures , Aminoacyltransferases , Glycosylphosphatidylinositols/genetics , Glypicans , Humans , Mutation/genetics , Phenotype
20.
Nucleic Acids Res ; 48(D1): D704-D715, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701156

ABSTRACT

In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regards to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.


Subject(s)
Computational Biology/methods , Genotype , Phenotype , Algorithms , Animals , Biological Ontologies , Databases, Genetic , Exome , Genetic Association Studies , Genetic Variation , Genomics , Humans , Internet , Software , Translational Research, Biomedical , User-Computer Interface