Search | BVS Search Portal

1.

The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.

Putman, Tim E; Schaper, Kevin; Matentzoglu, Nicolas; Rubinetti, Vincent P; Alquaddoomi, Faisal S; Cox, Corey; Caufield, J Harry; Elsarboukh, Glass; Gehrke, Sarah; Hegde, Harshad; Reese, Justin T; Braun, Ian; Bruskiewich, Richard M; Cappelletti, Luca; Carbon, Seth; Caron, Anita R; Chan, Lauren E; Chute, Christopher G; Cortes, Katherina G; De Souza, Vinícius; Fontana, Tommaso; Harris, Nomi L; Hartley, Emily L; Hurwitz, Eric; Jacobsen, Julius O B; Krishnamurthy, Madan; Laraway, Bryan J; McLaughlin, James A; McMurry, Julie A; Moxon, Sierra A T; Mullen, Kathleen R; O'Neil, Shawn T; Shefchek, Kent A; Stefancsik, Ray; Toro, Sabrina; Vasilevsky, Nicole A; Walls, Ramona L; Whetzel, Patricia L; Osumi-Sutherland, David; Smedley, Damian; Robinson, Peter N; Mungall, Christopher J; Haendel, Melissa A; Munoz-Torres, Monica C.

Nucleic Acids Res ; 52(D1): D938-D949, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-38000386

ABSTRACT

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.

Subjects

Databases, Factual , Disease , Genes , Phenotype , Humans , Internet , Databases, Factual/standards , Software , Genes/genetics , Disease/genetics

2.

Interpretable prioritization of splice variants in diagnostic next-generation sequencing.

Danis, Daniel; Jacobsen, Julius O B; Carmody, Leigh C; Gargano, Michael A; McMurry, Julie A; Hegde, Ayushi; Haendel, Melissa A; Valentini, Giorgio; Smedley, Damian; Robinson, Peter N.

Am J Hum Genet ; 108(9): 1564-1577, 2021 09 02.

Article in English | MEDLINE | ID: mdl-34289339

ABSTRACT

A critical challenge in genetic diagnostics is the computational assessment of candidate splice variants, specifically the interpretation of nucleotide changes located outside of the highly conserved dinucleotide sequences at the 5' and 3' ends of introns. To address this gap, we developed the Super Quick Information-content Random-forest Learning of Splice variants (SQUIRLS) algorithm. SQUIRLS generates a small set of interpretable features for machine learning by calculating the information-content of wild-type and variant sequences of canonical and cryptic splice sites, assessing changes in candidate splicing regulatory sequences, and incorporating characteristics of the sequence such as exon length, disruptions of the AG exclusion zone, and conservation. We curated a comprehensive collection of disease-associated splice-altering variants at positions outside of the highly conserved AG/GT dinucleotides at the termini of introns. SQUIRLS trains two random-forest classifiers for the donor and for the acceptor and combines their outputs by logistic regression to yield a final score. We show that SQUIRLS transcends previous state-of-the-art accuracy in classifying splice variants as assessed by rank analysis in simulated exomes, and is significantly faster than competing methods. SQUIRLS provides tabular output files for incorporation into diagnostic pipelines for exome and genome analysis, as well as visualizations that contextualize predicted effects of variants on splicing to make it easier to interpret splice variants in diagnostic settings.
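The information-content feature mentioned in this abstract can be illustrated with a minimal sketch. It assumes one common definition, the per-sequence ("individual") information of a site computed as the sum over positions of 2 + log2 of the base frequency, without small-sample correction. The abstract does not give SQUIRLS's matrices or exact feature definitions, so the frequency table, the 3-nt toy site, and the function name below are all hypothetical.

```python
from math import log2

# Hypothetical per-position base frequencies for a 3-nt toy site.
# Real splice-site matrices are longer and estimated from annotated sites.
TOY_FREQS = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]


def individual_information(seq, freqs):
    """Sum of (2 + log2 f(base, position)) in bits over all positions."""
    if len(seq) != len(freqs):
        raise ValueError("sequence length must match the number of matrix columns")
    return sum(2.0 + log2(column[base]) for base, column in zip(seq, freqs))


# Compare a wild-type sequence with a variant that changes the second base.
wild_type_bits = individual_information("AGT", TOY_FREQS)
variant_bits = individual_information("ACT", TOY_FREQS)
delta_bits = variant_bits - wild_type_bits  # negative: the variant weakens the toy site
```

A negative difference indicates that the variant sequence matches the toy site model less well than the wild type, which is the kind of signal a classifier could take as one input feature.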

Subjects

Algorithms , Data Curation/methods , Genetic Diseases, Inborn/genetics , RNA Splice Sites , RNA Splicing , Software , Base Sequence , Computational Biology/methods , Exome , Exons , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/pathology , High-Throughput Nucleotide Sequencing , Humans , Introns , Mutation , Exome Sequencing

3.

Interpretable Clinical Genomics with a Likelihood Ratio Paradigm.

Robinson, Peter N; Ravanmehr, Vida; Jacobsen, Julius O B; Danis, Daniel; Zhang, Xingmin Aaron; Carmody, Leigh C; Gargano, Michael A; Thaxton, Courtney L; Karlebach, Guy; Reese, Justin; Holtgrewe, Manuel; Köhler, Sebastian; McMurry, Julie A; Haendel, Melissa A; Smedley, Damian.

Am J Hum Genet ; 107(3): 403-417, 2020 09 03.

Article in English | MEDLINE | ID: mdl-32755546

ABSTRACT

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
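The likelihood ratio arithmetic described in this abstract can be illustrated with a minimal sketch. It converts a pretest probability to odds, multiplies by the LR of each finding (assuming the findings are conditionally independent), and converts back to a posttest probability. The function name, the pretest probability, and the LR values are hypothetical and are not taken from LIRICAL.

```python
from math import prod


def posttest_probability(pretest_prob: float, likelihood_ratios: list[float]) -> float:
    """Combine a pretest probability with per-finding likelihood ratios."""
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    # Assumption: findings are conditionally independent, so their LRs multiply.
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical case: 1% pretest probability, two supporting findings (LR > 1)
# and one finding that argues against the diagnosis (LR < 1).
example = posttest_probability(0.01, [20.0, 5.0, 0.5])
```

Because each finding contributes its own factor, the individual LRs show how much any single observation moved the result, which is the interpretability property the abstract emphasizes.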

Subjects

Computational Biology , Databases, Genetic , Genomics , Rare Diseases/diagnosis , Algorithms , Exome/genetics , Humans , Phenotype , Rare Diseases/genetics , Software

4.

Coding long COVID: characterizing a new disease through an ICD-10 lens.

Pfaff, Emily R; Madlock-Brown, Charisse; Baratta, John M; Bhatia, Abhishek; Davis, Hannah; Girvin, Andrew; Hill, Elaine; Kelly, Elizabeth; Kostka, Kristin; Loomba, Johanna; McMurry, Julie A; Wong, Rachel; Bennett, Tellen D; Moffitt, Richard; Chute, Christopher G; Haendel, Melissa.

BMC Med ; 21(1): 58, 2023 02 16.

Article in English | MEDLINE | ID: mdl-36793086

ABSTRACT

BACKGROUND: Naming a newly discovered disease is a difficult process; in the context of the COVID-19 pandemic and the existence of post-acute sequelae of SARS-CoV-2 infection (PASC), which includes long COVID, it has proven especially challenging. Disease definitions and assignment of a diagnosis code are often asynchronous and iterative. The clinical definition and our understanding of the underlying mechanisms of long COVID are still in flux, and the deployment of an ICD-10-CM code for long COVID in the USA took nearly 2 years after patients had begun to describe their condition. Here, we leverage the largest publicly available HIPAA-limited dataset about patients with COVID-19 in the US to examine the heterogeneity of adoption and use of U09.9, the ICD-10-CM code for "Post COVID-19 condition, unspecified." METHODS: We undertook a number of analyses to characterize the N3C population with a U09.9 diagnosis code (n = 33,782), including assessing person-level demographics and a number of area-level social determinants of health; diagnoses commonly co-occurring with U09.9, clustered using the Louvain algorithm; and quantifying medications and procedures recorded within 60 days of U09.9 diagnosis. We stratified all analyses by age group in order to discern differing patterns of care across the lifespan. RESULTS: We established the diagnoses most commonly co-occurring with U09.9 and algorithmically clustered them into four major categories: cardiopulmonary, neurological, gastrointestinal, and comorbid conditions. Importantly, we discovered that the population of patients diagnosed with U09.9 is demographically skewed toward female, White, non-Hispanic individuals, as well as individuals living in areas with low poverty and low unemployment. Our results also include a characterization of common procedures and medications associated with U09.9-coded patients. 
CONCLUSIONS: This work offers insight into potential subtypes and current practice patterns around long COVID and speaks to the existence of disparities in the diagnosis of patients with long COVID. This latter finding in particular requires further research and urgent remediation.

Subjects

COVID-19 , Post-Acute COVID-19 Syndrome , Humans , Female , International Classification of Diseases , Pandemics , COVID-19/diagnosis , COVID-19/epidemiology , SARS-CoV-2

5.

The Ontology of Biological Attributes (OBA)-computational traits for the life sciences.

Stefancsik, Ray; Balhoff, James P; Balk, Meghan A; Ball, Robyn L; Bello, Susan M; Caron, Anita R; Chesler, Elissa J; de Souza, Vinicius; Gehrke, Sarah; Haendel, Melissa; Harris, Laura W; Harris, Nomi L; Ibrahim, Arwa; Koehler, Sebastian; Matentzoglu, Nicolas; McMurry, Julie A; Mungall, Christopher J; Munoz-Torres, Monica C; Putman, Tim; Robinson, Peter; Smedley, Damian; Sollis, Elliot; Thessen, Anne E; Vasilevsky, Nicole; Walton, David O; Osumi-Sutherland, David.

Mamm Genome ; 34(3): 364-378, 2023 09.

Article in English | MEDLINE | ID: mdl-37076585

ABSTRACT

Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focussed measurable trait data. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.

Subjects

Biological Ontologies , Biological Science Disciplines , Genome-Wide Association Study , Phenotype

6.

The Human Phenotype Ontology in 2021.

Köhler, Sebastian; Gargano, Michael; Matentzoglu, Nicolas; Carmody, Leigh C; Lewis-Smith, David; Vasilevsky, Nicole A; Danis, Daniel; Balagura, Ganna; Baynam, Gareth; Brower, Amy M; Callahan, Tiffany J; Chute, Christopher G; Est, Johanna L; Galer, Peter D; Ganesan, Shiva; Griese, Matthias; Haimel, Matthias; Pazmandi, Julia; Hanauer, Marc; Harris, Nomi L; Hartnett, Michael J; Hastreiter, Maximilian; Hauck, Fabian; He, Yongqun; Jeske, Tim; Kearney, Hugh; Kindle, Gerhard; Klein, Christoph; Knoflach, Katrin; Krause, Roland; Lagorce, David; McMurry, Julie A; Miller, Jillian A; Munoz-Torres, Monica C; Peters, Rebecca L; Rapp, Christina K; Rath, Ana M; Rind, Shahmir A; Rosenberg, Avi Z; Segal, Michael M; Seidel, Markus G; Smedley, Damian; Talmy, Tomer; Thomas, Yarlalu; Wiafe, Samuel A; Xian, Julie; Yüksel, Zafer; Helbig, Ingo; Mungall, Christopher J; Haendel, Melissa A.

Nucleic Acids Res ; 49(D1): D1207-D1217, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33264411

ABSTRACT

The Human Phenotype Ontology (HPO, https://hpo.jax.org) was launched in 2008 to provide a comprehensive logical standard to describe and computationally analyze phenotypic abnormalities found in human disease. The HPO is now a worldwide standard for phenotype exchange. The HPO has grown steadily since its inception due to considerable contributions from clinical experts and researchers from a diverse range of disciplines. Here, we present recent major extensions of the HPO for neurology, nephrology, immunology, pulmonology, newborn screening, and other areas. For example, the seizure subontology now reflects the International League Against Epilepsy (ILAE) guidelines and these enhancements have already shown clinical validity. We present new efforts to harmonize computational definitions of phenotypic abnormalities across the HPO and multiple phenotype ontologies used for animal models of disease. These efforts will benefit software such as Exomiser by improving the accuracy and scope of cross-species phenotype matching. The computational modeling strategy used by the HPO to define disease entities and phenotypic features and distinguish between them is explained in detail. We also report on recent efforts to translate the HPO into indigenous languages. Finally, we summarize recent advances in the use of HPO in electronic health record systems.

Subjects

Biological Ontologies , Computational Biology/methods , Databases, Factual , Disease/genetics , Genome , Phenotype , Software , Animals , Disease Models, Animal , Genotype , Humans , Infant, Newborn , International Cooperation , Internet , Neonatal Screening/methods , Pharmacogenetics/methods , Terminology as Topic

7.

Risk factors associated with post-acute sequelae of SARS-CoV-2: an N3C and NIH RECOVER study.

Hill, Elaine L; Mehta, Hemalkumar B; Sharma, Suchetha; Mane, Klint; Singh, Sharad Kumar; Xie, Catherine; Cathey, Emily; Loomba, Johanna; Russell, Seth; Spratt, Heidi; DeWitt, Peter E; Ammar, Nariman; Madlock-Brown, Charisse; Brown, Donald; McMurry, Julie A; Chute, Christopher G; Haendel, Melissa A; Moffitt, Richard; Pfaff, Emily R; Bennett, Tellen D.

BMC Public Health ; 23(1): 2103, 2023 10 25.

Article in English | MEDLINE | ID: mdl-37880596

ABSTRACT

BACKGROUND: More than one-third of individuals experience post-acute sequelae of SARS-CoV-2 infection (PASC, which includes long-COVID). The objective is to identify risk factors associated with PASC/long-COVID diagnosis. METHODS: This was a retrospective case-control study including 31 health systems in the United States from the National COVID Cohort Collaborative (N3C). A total of 8,325 individuals with PASC (defined by the presence of the International Classification of Diseases, version 10 code U09.9 or a long-COVID clinic visit) were matched to 41,625 controls within the same health system, with a COVID index date within ± 45 days of the corresponding case's earliest COVID index date. Measurements of risk factors included demographics, comorbidities, treatment and acute characteristics related to COVID-19. Multivariable logistic regression, random forest, and XGBoost were used to determine the associations between risk factors and PASC. RESULTS: Among 8,325 individuals with PASC, the majority were > 50 years of age (56.6%), female (62.8%), and non-Hispanic White (68.6%). In logistic regression, middle-age categories (40 to 69 years; OR ranging from 2.32 to 2.58), female sex (OR 1.4, 95% CI 1.33-1.48), hospitalization associated with COVID-19 (OR 3.8, 95% CI 3.05-4.73), long (8-30 days, OR 1.69, 95% CI 1.31-2.17) or extended hospital stay (30 + days, OR 3.38, 95% CI 2.45-4.67), receipt of mechanical ventilation (OR 1.44, 95% CI 1.18-1.74), and several comorbidities including depression (OR 1.50, 95% CI 1.40-1.60), chronic lung disease (OR 1.63, 95% CI 1.53-1.74), and obesity (OR 1.23, 95% CI 1.16-1.3) were associated with increased likelihood of PASC diagnosis or care at a long-COVID clinic. Characteristics associated with a lower likelihood of PASC diagnosis or care at a long-COVID clinic included younger age (18 to 29 years), male sex, non-Hispanic Black race, and comorbidities such as substance abuse, cardiomyopathy, psychosis, and dementia. More doctors per capita in the county of residence was associated with an increased likelihood of PASC diagnosis or care at a long-COVID clinic. Our findings were consistent in sensitivity analyses using a variety of analytic techniques and approaches to select controls.
CONCLUSIONS: This national study identified important risk factors for PASC diagnosis such as middle age, severe COVID-19 disease, and specific comorbidities. Further clinical and epidemiological research is needed to better understand underlying mechanisms and the potential role of vaccines and therapeutics in altering PASC course.

Subjects

COVID-19 , SARS-CoV-2 , Middle Aged , Female , Male , Humans , Adult , Aged , Adolescent , Young Adult , COVID-19/epidemiology , Post-Acute COVID-19 Syndrome , Case-Control Studies , Retrospective Studies , Risk Factors , Disease Progression

8.

NSAID use and clinical outcomes in COVID-19 patients: a 38-center retrospective cohort study.

Reese, Justin T; Coleman, Ben; Chan, Lauren; Blau, Hannah; Callahan, Tiffany J; Cappelletti, Luca; Fontana, Tommaso; Bradwell, Katie R; Harris, Nomi L; Casiraghi, Elena; Valentini, Giorgio; Karlebach, Guy; Deer, Rachel; McMurry, Julie A; Haendel, Melissa A; Chute, Christopher G; Pfaff, Emily; Moffitt, Richard; Spratt, Heidi; Singh, Jasvinder A; Mungall, Christopher J; Williams, Andrew E; Robinson, Peter N.

Virol J ; 19(1): 84, 2022 05 15.

Article in English | MEDLINE | ID: mdl-35570298

ABSTRACT

BACKGROUND: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. METHODS: A 38-center retrospective cohort study was performed that leveraged the harmonized, high-granularity electronic health record data of the National COVID Cohort Collaborative. A propensity-matched cohort of 19,746 COVID-19 inpatients was constructed by matching cases (treated with NSAIDs at the time of admission) and 19,746 controls (not treated) from 857,061 patients with COVID-19 available for analysis. The primary outcome of interest was COVID-19 severity in hospitalized patients, which was classified as: moderate, severe, or mortality/hospice. Secondary outcomes were acute kidney injury (AKI), extracorporeal membrane oxygenation (ECMO), invasive ventilation, and all-cause mortality at any time following COVID-19 diagnosis. RESULTS: Logistic regression showed that NSAID use was not associated with increased COVID-19 severity (OR: 0.57 95% CI: 0.53-0.61). Analysis of secondary outcomes using logistic regression showed that NSAID use was not associated with increased risk of all-cause mortality (OR 0.51 95% CI: 0.47-0.56), invasive ventilation (OR: 0.59 95% CI: 0.55-0.64), AKI (OR: 0.67 95% CI: 0.63-0.72), or ECMO (OR: 0.51 95% CI: 0.36-0.7). In contrast, the odds ratios indicate reduced risk of these outcomes, but our quantitative bias analysis showed E-values of between 1.9 and 3.3 for these associations, indicating that comparatively weak or moderate confounder associations could explain away the observed associations. 
CONCLUSIONS: Study interpretation is limited by the observational design. Recording of NSAID use may have been incomplete. Our study demonstrates that NSAID use is not associated with increased COVID-19 severity, all-cause mortality, invasive ventilation, AKI, or ECMO in COVID-19 inpatients. A conservative interpretation in light of the quantitative bias analysis is that there is no evidence that NSAID use is associated with risk of increased severity or the other measured outcomes. Our results confirm and extend analogous findings in previous observational studies using a large cohort of patients drawn from 38 centers in a nationally representative multicenter database.
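The E-values reported in this abstract can be related to the effect estimates with a minimal sketch of the commonly used E-value formula for a risk ratio: E = RR + sqrt(RR × (RR − 1)), after inverting protective estimates so that RR ≥ 1. The abstract does not state how the study computed its E-values, so treating an odds ratio as a risk ratio (rare outcome) or taking its square root (common outcome) is an assumption here, and the function names are hypothetical.

```python
from math import sqrt


def e_value(risk_ratio: float) -> float:
    """E-value for a point estimate on the risk-ratio scale."""
    if risk_ratio <= 0.0:
        raise ValueError("risk_ratio must be positive")
    # Protective estimates (RR < 1) are inverted before applying the formula.
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    return rr + sqrt(rr * (rr - 1.0))


def e_value_from_odds_ratio(odds_ratio: float, rare_outcome: bool = True) -> float:
    """Assumption: OR approximates RR for rare outcomes, sqrt(OR) otherwise."""
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    rr = odds_ratio if rare_outcome else sqrt(odds_ratio)
    return e_value(rr)


# An odds ratio of 0.51, read as a risk ratio, gives an E-value of about 3.33.
example = e_value_from_odds_ratio(0.51)
```

An E-value is the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both exposure and outcome to fully explain away the observed estimate; smaller E-values mean the finding is easier to explain by confounding.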

Subjects

Acute Kidney Injury , COVID-19 , Anti-Inflammatory Agents, Non-Steroidal/adverse effects , COVID-19 Testing , Cohort Studies , Humans , Pandemics , Retrospective Studies

9.

Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources.

Köhler, Sebastian; Carmody, Leigh; Vasilevsky, Nicole; Jacobsen, Julius O B; Danis, Daniel; Gourdine, Jean-Philippe; Gargano, Michael; Harris, Nomi L; Matentzoglu, Nicolas; McMurry, Julie A; Osumi-Sutherland, David; Cipriani, Valentina; Balhoff, James P; Conlin, Tom; Blau, Hannah; Baynam, Gareth; Palmer, Richard; Gratian, Dylan; Dawkins, Hugh; Segal, Michael; Jansen, Anna C; Muaz, Ahmed; Chang, Willie H; Bergerson, Jenna; Laulederkind, Stanley J F; Yüksel, Zafer; Beltran, Sergi; Freeman, Alexandra F; Sergouniotis, Panagiotis I; Durkin, Daniel; Storm, Andrea L; Hanauer, Marc; Brudno, Michael; Bello, Susan M; Sincan, Murat; Rageth, Kayli; Wheeler, Matthew T; Oegema, Renske; Lourghi, Halima; Della Rocca, Maria G; Thompson, Rachel; Castellanos, Francisco; Priest, James; Cunningham-Rundles, Charlotte; Hegde, Ayushi; Lovering, Ruth C; Hajek, Catherine; Olry, Annie; Notarangelo, Luigi; Similuk, Morgan.

Nucleic Acids Res ; 47(D1): D1018-D1027, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30476213

ABSTRACT

The Human Phenotype Ontology (HPO)-a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases-is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. The HPO's interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. It also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data. Since the HPO was first introduced in 2008, its users have become both more numerous and more diverse. To meet these emerging needs, the project has added new content, language translations, mappings and computational tooling, as well as integrations with external community data. The HPO continues to collaborate with clinical adopters to improve specific areas of the ontology and extend standardized disease descriptions. The newly redesigned HPO website (www.human-phenotype-ontology.org) simplifies browsing terms and exploring clinical features, diseases, and human genes.

Subjects

Biological Ontologies , Computational Biology/methods , Congenital Abnormalities/genetics , Genetic Predisposition to Disease/genetics , Knowledge Bases , Rare Diseases/genetics , Congenital Abnormalities/diagnosis , Databases, Genetic , Genetic Variation , Humans , Internet , Phenotype , Rare Diseases/diagnosis , Whole Genome Sequencing/methods

10.

Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data.

McMurry, Julie A; Juty, Nick; Blomberg, Niklas; Burdett, Tony; Conlin, Tom; Conte, Nathalie; Courtot, Mélanie; Deck, John; Dumontier, Michel; Fellows, Donal K; Gonzalez-Beltran, Alejandra; Gormanns, Philipp; Grethe, Jeffrey; Hastings, Janna; Hériché, Jean-Karim; Hermjakob, Henning; Ison, Jon C; Jimenez, Rafael C; Jupp, Simon; Kunze, John; Laibe, Camille; Le Novère, Nicolas; Malone, James; Martin, Maria Jesus; McEntyre, Johanna R; Morris, Chris; Muilu, Juha; Müller, Wolfgang; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Sariyar, Murat; Snoep, Jacky L; Soiland-Reyes, Stian; Stanford, Natalie J; Swainston, Neil; Washington, Nicole; Williams, Alan R; Wimalaratne, Sarala M; Winfree, Lilly M; Wolstencroft, Katherine; Goble, Carole; Mungall, Christopher J; Haendel, Melissa A; Parkinson, Helen.

PLoS Biol ; 15(6): e2001414, 2017 Jun.

Article in English | MEDLINE | ID: mdl-28662064

ABSTRACT

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.

Subjects

Biological Science Disciplines/methods , Computational Biology/methods , Data Mining/methods , Software Design , Software , Biological Science Disciplines/statistics & numerical data , Biological Science Disciplines/trends , Computational Biology/trends , Data Mining/statistics & numerical data , Data Mining/trends , Databases, Factual/statistics & numerical data , Databases, Factual/trends , Forecasting , Humans , Internet

11.

Interpretable prioritization of splice variants in diagnostic next-generation sequencing.

Danis, Daniel; Jacobsen, Julius O B; Carmody, Leigh C; Gargano, Michael A; McMurry, Julie A; Hegde, Ayushi; Haendel, Melissa A; Valentini, Giorgio; Smedley, Damian; Robinson, Peter N.

Am J Hum Genet ; 108(11): 2205, 2021 Nov 04.

Article in English | MEDLINE | ID: mdl-34739835

12.

A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease.

Smedley, Damian; Schubach, Max; Jacobsen, Julius O B; Köhler, Sebastian; Zemojtel, Tomasz; Spielmann, Malte; Jäger, Marten; Hochheiser, Harry; Washington, Nicole L; McMurry, Julie A; Haendel, Melissa A; Mungall, Christopher J; Lewis, Suzanna E; Groza, Tudor; Valentini, Giorgio; Robinson, Peter N.

Am J Hum Genet ; 99(3): 595-606, 2016 09 01.

Article in English | MEDLINE | ID: mdl-27569544

ABSTRACT

The interpretation of non-coding variants still constitutes a major challenge in the application of whole-genome sequencing in Mendelian disease, especially for single-nucleotide and other small non-coding variants. Here we present Genomiser, an analysis framework that is able not only to score the relevance of variation in the non-coding genome, but also to associate regulatory variants to specific Mendelian diseases. Genomiser scores variants through either existing methods such as CADD or a bespoke machine learning method and combines these with allele frequency, regulatory sequences, chromosomal topological domains, and phenotypic relevance to discover variants associated to specific Mendelian disorders. Overall, Genomiser is able to identify causal regulatory variants as the top candidate in 77% of simulated whole genomes, allowing effective detection and discovery of regulatory variants in Mendelian disease.

Subjects

Algorithms , Genetic Diseases, Inborn/genetics , Genome, Human/genetics , Mutation/genetics , Gene Frequency , Genome-Wide Association Study , Humans , Machine Learning , Open Reading Frames/genetics , Phenotype , Point Mutation/genetics

13.

The Monarch Initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species.

Mungall, Christopher J; McMurry, Julie A; Köhler, Sebastian; Balhoff, James P; Borromeo, Charles; Brush, Matthew; Carbon, Seth; Conlin, Tom; Dunn, Nathan; Engelstad, Mark; Foster, Erin; Gourdine, J P; Jacobsen, Julius O B; Keith, Dan; Laraway, Bryan; Lewis, Suzanna E; NguyenXuan, Jeremy; Shefchek, Kent; Vasilevsky, Nicole; Yuan, Zhou; Washington, Nicole; Hochheiser, Harry; Groza, Tudor; Smedley, Damian; Robinson, Peter N; Haendel, Melissa A.

Nucleic Acids Res ; 45(D1): D712-D722, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899636

ABSTRACT

The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype-phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms. Advanced informatics tools can identify phenotypically relevant disease models in research and diagnostic contexts. Large-scale integration of model organism and clinical research data can provide a breadth of knowledge not available from individual sources and can provide contextualization of data back to these sources. The Monarch Initiative (monarchinitiative.org) is a collaborative, open science effort that aims to semantically integrate genotype-phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. Our integrated knowledge graph, analytic tools, and web services enable diverse users to explore relationships between phenotypes and genotypes across species.

Subjects

Databases, Genetic , Genetic Association Studies/methods , Genotype , Phenotype , Animals , Biological Evolution , Computational Biology/methods , Data Curation , Humans , Search Engine , Software , Species Specificity , User-Computer Interface , Web Browser

14.

The Human Phenotype Ontology in 2017.

Köhler, Sebastian; Vasilevsky, Nicole A; Engelstad, Mark; Foster, Erin; McMurry, Julie; Aymé, Ségolène; Baynam, Gareth; Bello, Susan M; Boerkoel, Cornelius F; Boycott, Kym M; Brudno, Michael; Buske, Orion J; Chinnery, Patrick F; Cipriani, Valentina; Connell, Laureen E; Dawkins, Hugh J S; DeMare, Laura E; Devereau, Andrew D; de Vries, Bert B A; Firth, Helen V; Freson, Kathleen; Greene, Daniel; Hamosh, Ada; Helbig, Ingo; Hum, Courtney; Jähn, Johanna A; James, Roger; Krause, Roland; Laulederkind, Stanley J F; Lochmüller, Hanns; Lyon, Gholson J; Ogishima, Soichi; Olry, Annie; Ouwehand, Willem H; Pontikos, Nikolas; Rath, Ana; Schaefer, Franz; Scott, Richard H; Segal, Michael; Sergouniotis, Panagiotis I; Sever, Richard; Smith, Cynthia L; Straub, Volker; Thompson, Rachel; Turner, Catherine; Turro, Ernest; Veltman, Marijcke W M; Vulliamy, Tom; Yu, Jing; von Ziegenweidt, Julie.

Nucleic Acids Res ; 45(D1): D865-D876, 2017 Jan 04.

Article in English | MEDLINE | ID: mdl-27899602

ABSTRACT

Deep phenotyping has been defined as the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described. The three components of the Human Phenotype Ontology (HPO; www.human-phenotype-ontology.org) project are the phenotype vocabulary, disease-phenotype annotations and the algorithms that operate on these. These components are being used for computational deep phenotyping and precision medicine as well as integration of clinical data into translational research. The HPO is being increasingly adopted as a standard for phenotypic abnormalities by diverse groups such as international rare disease organizations, registries, clinical labs, biomedical resources, and clinical software tools and will thereby contribute toward nascent efforts at global data exchange for identifying disease etiologies. This update article reviews the progress of the HPO project since the debut Nucleic Acids Research database article in 2014, including specific areas of expansion such as common (complex) disease, new algorithms for phenotype driven genomic discovery and diagnostics, integration of cross-species mapping efforts with the Mammalian Phenotype Ontology, an improved quality control pipeline, and the addition of patient-friendly terminology.

Subjects

Biological Ontologies; Computational Biology; Genomics; Phenotype; Algorithms; Computational Biology/methods; Genetic Association Studies/methods; Genomics/methods; Humans; Precision Medicine/methods; Rare Diseases/diagnosis; Rare Diseases/etiology; Software; Translational Research, Biomedical/methods

15.

UVB Radiation Alone May Not Explain Sunlight Inactivation of SARS-CoV-2.

Luzzatto-Fegiz, Paolo; Temprano-Coleto, Fernando; Peaudecerf, François J; Landel, Julien R; Zhu, Yangying; McMurry, Julie A.

J Infect Dis ; 223(8): 1500-1502, 2021 Apr 23.

Article in English | MEDLINE | ID: mdl-33544845

Subjects

COVID-19; Sunlight; Humans; SARS-CoV-2; Ultraviolet Rays

16.

Expression Atlas update--a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments.

Petryszak, Robert; Burdett, Tony; Fiorelli, Benedetto; Fonseca, Nuno A; Gonzalez-Porta, Mar; Hastings, Emma; Huber, Wolfgang; Jupp, Simon; Keays, Maria; Kryvych, Nataliya; McMurry, Julie; Marioni, John C; Malone, James; Megy, Karine; Rustici, Gabriella; Tang, Amy Y; Taubert, Jan; Williams, Eleanor; Mannion, Oliver; Parkinson, Helen E; Brazma, Alvis.

Nucleic Acids Res ; 42(Database issue): D926-32, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24304889

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.

Subjects

Databases, Genetic; Gene Expression Profiling; Genomics; Humans; Internet; Oligonucleotide Array Sequence Analysis; Proteins/genetics; Proteins/metabolism; RNA Isoforms/metabolism; Sequence Analysis, RNA

17.

Sharing Clinical and Genomic Data on Cancer - The Need for Global Solutions.

Lawler, Mark; Haussler, David; Siu, Lillian L; Haendel, Melissa A; McMurry, Julie A; Knoppers, Bartha M; Chanock, Stephen J; Calvo, Fabien; Teh, Bin T; Walia, Guneet; Banks, Ian; Yu, Peter P; Staudt, Louis M; Sawyers, Charles L.

N Engl J Med ; 376(21): 2006-2009, 2017 May 25.

Article in English | MEDLINE | ID: mdl-28538124

Subjects

Genomics; Information Dissemination; International Cooperation; Neoplasms/genetics; Humans; Information Dissemination/ethics; Information Dissemination/legislation & jurisprudence

18.

Finding Long-COVID: Temporal Topic Modeling of Electronic Health Records from the N3C and RECOVER Programs.

O'Neil, Shawn T; Madlock-Brown, Charisse; Wilkins, Kenneth J; McGrath, Brenda M; Davis, Hannah E; Assaf, Gina S; Wei, Hannah; Zareie, Parya; French, Evan T; Loomba, Johanna; McMurry, Julie A; Zhou, Andrea; Chute, Christopher G; Moffitt, Richard A; Pfaff, Emily R; Yoo, Yun Jae; Leese, Peter; Chew, Robert F; Lieberman, Michael; Haendel, Melissa A.

medRxiv ; 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38947087

ABSTRACT

Post-Acute Sequelae of SARS-CoV-2 infection (PASC), also known as Long-COVID, encompasses a variety of complex and varied outcomes following COVID-19 infection that are still poorly understood. We clustered over 600 million condition diagnoses from 14 million patients available through the National COVID Cohort Collaborative (N3C), generating hundreds of highly detailed clinical phenotypes. Assessing patient clinical trajectories using these clusters allowed us to identify individual conditions and phenotypes strongly increased after acute infection. We found many conditions increased in COVID-19 patients compared to controls, and using a novel method to associate patients with clusters over time, we additionally found phenotypes specific to patient sex, age, wave of infection, and PASC diagnosis status. While many of these results reflect known PASC symptoms, the resolution provided by this unprecedented data scale suggests avenues for improved diagnostics and mechanistic understanding of this multifaceted disease.

19.

The Environmental Conditions, Treatments, and Exposures Ontology (ECTO): connecting toxicology and exposure to human health and beyond.

Chan, Lauren E; Thessen, Anne E; Duncan, William D; Matentzoglu, Nicolas; Schmitt, Charles; Grondin, Cynthia J; Vasilevsky, Nicole; McMurry, Julie A; Robinson, Peter N; Mungall, Christopher J; Haendel, Melissa A.

J Biomed Semantics ; 14(1): 3, 2023 Feb 24.

Article in English | MEDLINE | ID: mdl-36823605

ABSTRACT

BACKGROUND: Evaluating the impact of environmental exposures on organism health is a key goal of modern biomedicine and is critically important in an age of greater pollution and chemicals in our environment. Environmental health utilizes many different research methods and generates a variety of data types. However, to date, no comprehensive database represents the full spectrum of environmental health data. Due to a lack of interoperability between databases, tools for integrating these resources are needed. In this manuscript we present the Environmental Conditions, Treatments, and Exposures Ontology (ECTO), a species-agnostic ontology focused on exposure events that occur as a result of natural and experimental processes, such as diet, work, or research activities. ECTO is intended for use in harmonizing environmental health data resources to support cross-study integration and inference for mechanism discovery. METHODS AND FINDINGS: ECTO is an ontology designed for describing organismal exposures such as toxicological research, environmental variables, dietary features, and patient-reported data from surveys. ECTO utilizes the base model established within the Exposure Ontology (ExO). ECTO is developed using a combination of manual curation and Dead Simple OWL Design Patterns (DOSDP), and contains over 2700 environmental exposure terms, and incorporates chemical and environmental ontologies. ECTO is an Open Biological and Biomedical Ontology (OBO) Foundry ontology that is designed for interoperability, reuse, and axiomatization with other ontologies. ECTO terms have been utilized in axioms within the Mondo Disease Ontology to represent diseases caused or influenced by environmental factors, as well as for survey encoding for the Personalized Environment and Genes Study (PEGS). 
CONCLUSIONS: We constructed ECTO to meet Open Biological and Biomedical Ontology (OBO) Foundry principles to increase translation opportunities between environmental health and other areas of biology. ECTO has a growing community of contributors consisting of toxicologists, public health epidemiologists, and health care providers to provide the necessary expertise for areas that have been identified previously as gaps.

Subjects

Biological Ontologies; Humans; Databases, Factual

20.

The N3C governance ecosystem: A model socio-technical partnership for the future of collaborative analytics at scale.

Suver, Christine; Harper, Jeremy; Loomba, Johanna; Saltz, Mary; Solway, Julian; Anzalone, Alfred Jerrod; Walters, Kellie; Pfaff, Emily; Walden, Anita; McMurry, Julie; Chute, Christopher G; Haendel, Melissa.

J Clin Transl Sci ; 7(1): e252, 2023.

Article in English | MEDLINE | ID: mdl-38229902

ABSTRACT

The National COVID Cohort Collaborative (N3C) is a public-private-government partnership established during the Coronavirus pandemic to create a centralized data resource called the "N3C data enclave." This resource contains individual-level health data from participating healthcare sites nationwide to support rapid collaborative analytics. N3C has enabled analytics within a cloud-based enclave of data from electronic health records from over 17 million people (with and without COVID-19) in the USA. To achieve this goal of a shared data resource, N3C implemented a shared governance strategy involving stakeholders in decision-making. The approach leveraged best practices in data stewardship and team science to rapidly enable COVID-19-related research at scale while respecting the privacy of data subjects and participating institutions. N3C balanced equitable access to data, team-based scientific productivity, and individual professional recognition - a key incentive for academic researchers. This governance approach makes N3C research sustainable and effective beyond the initial days of the pandemic. N3C demonstrated that shared governance can overcome traditional barriers to data sharing without compromising data security and trust. The governance innovations described herein are a helpful framework for other privacy-preserving data infrastructure programs and provide a working model for effective team science beyond COVID-19.
