Results 1 - 10 of 10
1.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. 
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
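
The maximum F-measure described in the abstract can be illustrated with a minimal sketch. It assumes each distinct EPCR value is tried as a threshold and that F means the balanced F1 score; the function name, variant identifiers, and toy values are hypothetical and are not taken from the challenge.

```python
def max_f_measure(predictions, causal):
    """Return the best F1 over all EPCR thresholds.

    predictions: list of (variant_id, epcr) pairs from one model.
    causal: set of variant ids known to be causal (the answer key).
    """
    best = 0.0
    # Assumption: every distinct EPCR value is a candidate threshold.
    for threshold in sorted({epcr for _, epcr in predictions}):
        called = {vid for vid, epcr in predictions if epcr >= threshold}
        true_pos = len(called & causal)
        if true_pos == 0:
            continue
        precision = true_pos / len(called)
        recall = true_pos / len(causal)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data for illustration only.
toy_predictions = [("v1", 0.9), ("v2", 0.8), ("v3", 0.4), ("v4", 0.1)]
toy_causal = {"v1", "v3"}
best_f = max_f_measure(toy_predictions, toy_causal)
```

Sweeping the threshold rather than fixing one means a model is not penalized for how it calibrates its EPCR values, only for how it orders variants.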


Subjects
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Genome, Human/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
2.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. 
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.
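
A rank-based view of performance, such as counting families whose causal variant falls within the top few ranked predictions, might be sketched as follows. The abstract does not give the weighting used for the rank-based score, so the reciprocal-rank average below is a stand-in assumption; all function names, family labels, and variant ids are hypothetical.

```python
def causal_rank(ranked_variants, causal):
    """1-based rank of the first causal variant in a ranked list, or None."""
    for rank, variant in enumerate(ranked_variants, start=1):
        if variant in causal:
            return rank
    return None


def families_recalled(ranks, k=5):
    """Number of families whose causal variant is ranked within the top k."""
    return sum(1 for r in ranks.values() if r is not None and r <= k)


def reciprocal_rank_score(ranks):
    """Assumed weighting: mean of 1/rank, with missed families counting as 0."""
    return sum(1.0 / r for r in ranks.values() if r is not None) / len(ranks)


# Toy submission and answer key for illustration only.
toy_submission = {
    "fam_A": ["v7", "v2", "v9"],
    "fam_B": ["v4", "v5", "v1", "v8", "v3", "v6"],
    "fam_C": ["v11", "v12"],
}
toy_truth = {"fam_A": {"v2"}, "fam_B": {"v6"}, "fam_C": {"v99"}}
ranks = {fam: causal_rank(toy_submission[fam], toy_truth[fam])
         for fam in toy_submission}
top5 = families_recalled(ranks, k=5)
score = reciprocal_rank_score(ranks)
```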

3.
BMC Bioinformatics ; 24(1): 294, 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37479972

ABSTRACT

BACKGROUND: Identifying variants associated with diseases is a challenging task in medical genetics research. Current studies that prioritize variants within individual genomes generally rely on known variants, evidence from literature and genomes, and patient symptoms and clinical signs. The functionalities of the existing tools, which rank variants based on given patient symptoms and clinical signs, are restricted to the coverage of ontologies such as the Human Phenotype Ontology (HPO). However, most clinicians do not limit themselves to HPO while describing patient symptoms/signs and their associated variants/genes. There is thus a need for an automated tool that can prioritize variants based on freely expressed patient symptoms and clinical signs. RESULTS: STARVar is a Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes. STARVar uses patient symptoms and clinical signs, either linked to HPO or expressed in free text format. It returns a ranked list of variants based on a combined score from two classifiers utilizing evidence from genomics and literature. STARVar improves over related tools on a set of synthetic patients. In addition, we demonstrated its distinct contribution to the domain on another synthetic dataset covering publicly available clinical genotype-phenotype associations by using symptoms and clinical signs expressed in free text format. CONCLUSIONS: STARVar stands as a unique and efficient tool that has the advantage of ranking variants with flexibly expressed patient symptoms in free-form text. Therefore, STARVar can be easily integrated into bioinformatics workflows designed to analyze disease-associated genomes. AVAILABILITY: STARVar is freely available from https://github.com/bio-ontology-research-group/STARVar .
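
Ranking variants by a score combined from two classifiers can be pictured with a short sketch. The abstract does not say how STARVar combines its genomic and literature scores, so the weighted average here is purely an assumption, and the function name, `weight` parameter, and variant labels are hypothetical rather than part of the STARVar code base.

```python
def rank_variants(genomic_scores, literature_scores, weight=0.5):
    """Rank variants by a weighted average of two per-variant scores.

    A variant missing from one score table is treated as scoring 0.0 there.
    Ties are broken alphabetically by variant id for reproducibility.
    """
    combined = {}
    for variant in genomic_scores.keys() | literature_scores.keys():
        g = genomic_scores.get(variant, 0.0)
        lit = literature_scores.get(variant, 0.0)
        combined[variant] = weight * g + (1 - weight) * lit
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Toy scores for illustration only.
toy_genomic = {"var_a": 0.9, "var_b": 0.4}
toy_literature = {"var_a": 0.2, "var_b": 0.9, "var_c": 0.5}
ranking = rank_variants(toy_genomic, toy_literature)
ranked_ids = [variant for variant, _ in ranking]
```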


Subjects
Genomics , Software , Humans , Phenotype , Computational Biology , Genetic Association Studies
4.
Clin Genet ; 99(1): 99-110, 2021 01.
Article in English | MEDLINE | ID: mdl-32888189

ABSTRACT

Pyridoxamine-5'-phosphate oxidase (PNPO) deficiency is an autosomal recessive pyridoxal 5'-phosphate (PLP)-vitamin-responsive epileptic encephalopathy. The emerging feature of PNPO deficiency is the occurrence of refractory seizures in the first year of life. Prematurity and fetal distress, combined with neonatal seizures, are other associated key characteristics. The phenotype results from a dependency on PLP, which regulates several enzymes in the body. We present the phenotypic and genotypic spectrum of PNPO deficiency based on a literature review (2002-2020) of reports (n = 33) of patients with confirmed PNPO deficiency (n = 87). All patients who received PLP (n = 36) showed a clinical response, with a complete, dramatic PLP response with seizure cessation observed in 61% of patients. In spite of effective seizure control with PLP, approximately 56% of patients affected with PLP-dependent epilepsy suffer developmental delay/intellectual disability. There is no diagnostic biomarker, and molecular testing is required for diagnosis. However, we noted that cerebrospinal fluid (CSF) PLP was low in 81%, CSF glycine was high in 80% and urinary vanillactic acid was high in 91% of the cases. We observed only a weak correlation between the severity of PNPO protein disruption and disease outcomes, indicating the importance of other factors, including seizure onset and time of therapy initiation. We found that prematurity, delay in initiation of PLP therapy, and early onset of seizures correlate with a poor neurocognitive outcome. Given the amenability of PNPO deficiency to PLP therapy for seizure control, early diagnosis is essential.


Subjects
Brain Diseases, Metabolic/genetics , Epilepsy/genetics , Hypoxia-Ischemia, Brain/genetics , Metabolic Diseases/genetics , Pyridoxaminephosphate Oxidase/deficiency , Pyridoxaminephosphate Oxidase/genetics , Seizures/genetics , Brain Diseases, Metabolic/metabolism , Brain Diseases, Metabolic/physiopathology , Epilepsy/physiopathology , Humans , Hypoxia-Ischemia, Brain/metabolism , Hypoxia-Ischemia, Brain/physiopathology , Metabolic Diseases/metabolism , Metabolic Diseases/physiopathology , Mutation/genetics , Pyridoxal Phosphate/genetics , Pyridoxal Phosphate/metabolism , Pyridoxaminephosphate Oxidase/metabolism , Seizures/metabolism , Seizures/physiopathology
5.
Front Pediatr ; 8: 569389, 2020.
Article in English | MEDLINE | ID: mdl-33262960

ABSTRACT

Background: Cystinuria is an inborn error of metabolism that manifests with renal stones due to defective renal epithelial cell transport of cystine, resulting from pathogenic variants in the SLC3A1 and/or SLC7A9 genes. Among nephrolithiasis diseases, cystinuria is potentially treatable, and further stone formation may be preventable. We report 23 patients who were identified biochemically and genetically to have cystinuria, showing the diversity of the phenotype of cystinuria and expanding the genotype by identifying a broad spectrum of mutations. Patients and Methods: This is a multicenter retrospective chart review in which clinical and biochemical data, genetic analysis and the progress of the disease were documented over five years at two centers from 2014 to 2019. Results: Of 23 patients who were identified biochemically and/or genetically to have cystinuria, 14 (62%) were male. Thirteen patients were homozygous, and two were heterozygous for the SLC3A1 gene. Seven were homozygous and one was compound heterozygous for the SLC7A9 gene. We detected 12 genetic variants, including five novel variants. The SLC3A1 gene variant c.1400T>A (p.Met467Lys) was found in 38% of our cohort. Although 21 patients required surgical intervention, none developed ESRD. The number of stone episodes per year varied widely (median frequency of 0.45 stones per year, range 0.06 to 78.2), with no significant difference in stone events per year between sexes (P = 0.73). Conclusion: Despite the high rate of consanguinity in Saudi Arabia, there was a broad spectrum of genetic variants. Most of our patients are homozygous recessive for SLC genes with multiple generations affected, which indicates a need for early screening and prevention of disease in these families. Phenotypic heterogeneity is well documented in our cohort even with the same genotype, and the age at first stone episode was variable but most commonly fell in the first decade of life.

6.
Clin Genet ; 98(6): 555-561, 2020 12.
Article in English | MEDLINE | ID: mdl-32869858

ABSTRACT

In recent years, several genes have been implicated in the variable disease presentation of global developmental delay (GDD) and intellectual disability (ID). The endoplasmic reticulum membrane protein complex (EMC) family is known to be involved in GDD and ID. Homozygous variants of EMC1 are associated with GDD, scoliosis, and cerebellar atrophy, indicating the relevance of this pathway for neurogenetic disorders. EMC10 is a bone marrow-derived angiogenic growth factor that plays an important role in infarct vascularization and promoting tissue repair. However, this gene has not been previously associated with human disease. Herein, we describe a Saudi family with two individuals segregating a recessive neurodevelopmental disorder. Both of the affected individuals showed mild ID, speech delay, and GDD. Whole-exome sequencing (WES) and Sanger sequencing were performed to identify candidate genes. Further, to elucidate the functional effects of the variant, quantitative real-time PCR (RT-qPCR)-based expression analysis was performed. WES revealed a homozygous splice acceptor site variant (c.679-1G>A) in EMC10 (chromosome 19q13.33) that segregated perfectly within the family. RT-qPCR showed a substantial decrease in the relative EMC10 gene expression in the patients, indicating the pathogenicity of the identified variant. For the first time in the literature, the EMC10 gene variant was associated with mild ID, speech delay, and GDD. Thus, this gene plays a key role in developmental milestones, with the potential to cause neurodevelopmental disorders in humans.
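
Relative expression from RT-qPCR is commonly computed with the 2^(-ΔΔCt) method. The abstract does not state which quantification method was used, so the sketch below is only an illustration of that common approach; the function name, the choice of a reference gene, and all Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control (2^-ddCt).

    Each Ct is the cycle threshold for the target gene or a reference
    (housekeeping) gene, measured in the sample or the control.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles later
# in the patient than in the control, relative to the reference gene.
fold_change = relative_expression(
    ct_target_sample=28.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
```

A fold change well below 1 corresponds to reduced expression in the sample relative to the control.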


Subjects
Developmental Disabilities/genetics , Intellectual Disability/genetics , Language Development Disorders/genetics , Membrane Proteins/genetics , Adolescent , Child , Consanguinity , Developmental Disabilities/physiopathology , Genetic Predisposition to Disease , Homozygote , Humans , Intellectual Disability/physiopathology , Language Development Disorders/physiopathology , Male , Mutation/genetics , Pedigree , RNA Splice Sites/genetics , Saudi Arabia/epidemiology , Exome Sequencing
7.
BMC Med Genomics ; 13(1): 103, 2020 07 17.
Article in English | MEDLINE | ID: mdl-32680510

ABSTRACT

BACKGROUND: Choosing a testing strategy is crucial for genetics clinics and testing laboratories. In this study, we compared the hit rate between solo, trio and trio plus testing, and between trio and sibship testing. Finally, we studied the impact of extended family analysis, mainly in complex and unsolved cases. METHODS: Three cohorts were used for this analysis: one cohort to assess the hit rate between solo, trio and trio plus testing, another cohort to examine the impact of the testing strategy of sibship genome vs trio-based analysis, and a third cohort to test the impact of an extended family analysis of up to eight family members to lower the number of candidate variants. RESULTS: The hit rates in solo, trio and trio plus testing were 39, 40, and 41%, respectively. The total number of candidate variants in the sibship testing strategy was 117 variants compared to 59 variants in the trio-based analysis. In trio-based analysis, the average number of candidate variants was 1192 coding and 26,454 noncoding, and this number was lowered by 50-75% after adding additional family members, leaving only up to two coding and 66 noncoding homozygous variants in families with eight family members. CONCLUSION: There was no difference in the hit rate between solo testing and testing that included additional family members. Trio-based analysis was a better approach than sibship testing, even in a consanguineous population. Finally, each additional family member helped to narrow down the number of variants by 50-75%. Our findings could help clinicians, researchers and testing laboratories select the most cost-effective and appropriate sequencing approach for their patients. Furthermore, extended family analysis is a very useful tool for complex cases with novel genes.
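
How additional family members shrink a candidate list can be shown with a small segregation filter. This is a simplified sketch that assumes a fully penetrant autosomal recessive model and homozygous candidates only; the study's actual filtering pipeline is not described in the abstract, and the function name, genotype encoding, and toy data are hypothetical.

```python
HOM_ALT = "1/1"  # homozygous for the alternate allele (VCF-style genotype)


def recessive_candidates(genotypes, affected, members):
    """Keep variants homozygous in every affected member considered and
    not homozygous in any unaffected member considered.

    genotypes: variant id -> {family member: genotype string}
    affected: set of affected family members
    members: set of family members included in the analysis
    """
    kept = []
    for variant, calls in genotypes.items():
        considered = {m: gt for m, gt in calls.items() if m in members}
        if not all(considered.get(m) == HOM_ALT for m in affected if m in members):
            continue
        if any(gt == HOM_ALT for m, gt in considered.items() if m not in affected):
            continue
        kept.append(variant)
    return kept


# Toy family for illustration only.
toy_genotypes = {
    "var1": {"proband": "1/1", "mother": "0/1", "father": "0/1", "sibling": "0/1"},
    "var2": {"proband": "1/1", "mother": "0/1", "father": "0/1", "sibling": "1/1"},
    "var3": {"proband": "1/1", "mother": "1/1", "father": "0/1", "sibling": "0/0"},
}
toy_affected = {"proband"}
trio = recessive_candidates(toy_genotypes, toy_affected,
                            {"proband", "mother", "father"})
trio_plus = recessive_candidates(toy_genotypes, toy_affected,
                                 {"proband", "mother", "father", "sibling"})
```

In this toy family the unaffected sibling removes one further candidate, which mirrors the narrowing effect the abstract reports for each added relative.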


Subjects
Consanguinity , Exome , Family , Genetic Markers , Genetic Predisposition to Disease , Genetic Testing , Genetic Variation , Adult , Child , Female , Humans , Male , Retrospective Studies , Exome Sequencing
8.
Orphanet J Rare Dis ; 15(1): 146, 2020 06 11.
Article in English | MEDLINE | ID: mdl-32527280

ABSTRACT

BACKGROUND: Inborn errors of metabolism (IEM) represent a subclass of rare inherited diseases caused by a wide range of defects in metabolic enzymes or their regulation. Of over a thousand characterized IEMs, only about half are understood at the molecular level, and overall the development of treatment and management strategies has proved challenging. An overview of the changing landscape of therapeutic approaches is helpful in assessing strategic patterns in the approach to therapy, but the information is scattered throughout the literature and public data resources. RESULTS: We gathered data on therapeutic strategies for 300 diseases into the Drug Database for Inborn Errors of Metabolism (DDIEM). Therapeutic approaches, including both successful and ineffective treatments, were manually classified by their mechanisms of action using a new ontology. CONCLUSIONS: We present a manually curated, ontologically formalized knowledgebase of drugs, therapeutic procedures, and mitigated phenotypes. DDIEM is freely available through a web interface and for download at http://ddiem.phenomebrowser.net.


Subjects
Databases, Pharmaceutical , Metabolism, Inborn Errors , Humans , Phenotype , Rare Diseases/drug therapy
9.
J Biomed Semantics ; 11(1): 1, 2020 01 13.
Article in English | MEDLINE | ID: mdl-31931870

ABSTRACT

BACKGROUND: Ontologies are widely used across biology and biomedicine for the annotation of databases. Ontology development is often a manual, time-consuming, and expensive process. Automatic or semi-automatic identification of classes that can be added to an ontology can make ontology development more efficient. RESULTS: We developed a method that uses machine learning and word embeddings to identify words and phrases that are used to refer to an ontology class in biomedical Europe PMC full-text articles. Once labels and synonyms of a class are known, we use machine learning to identify the super-classes of a class. For this purpose, we identify lexical term variants, use word embeddings to capture context information, rely on automated reasoning over ontologies to generate features, and use an artificial neural network as the classifier. We demonstrate the utility of our approach in identifying terms that refer to diseases in the Human Disease Ontology and in distinguishing between different types of diseases. CONCLUSIONS: Our method is capable of discovering labels that refer to a class in an ontology but are not present in the ontology, and it can identify whether a class should be a subclass of some high-level ontology classes. Our approach can therefore be used for the semi-automatic extension and quality control of ontologies. The algorithm, corpora and evaluation datasets are available at https://github.com/bio-ontology-research-group/ontology-extension.
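
The idea of using word embeddings to find terms that refer to an ontology class can be illustrated with a much simpler stand-in than the published method: a cosine-similarity lookup over toy vectors. The authors' approach uses a trained neural network classifier with additional features; everything below, including the function names, the three-dimensional vectors, and the 0.9 threshold, is a hypothetical illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_labels(class_vector, term_vectors, threshold=0.9):
    """Terms whose embedding is close to the class embedding, best first."""
    scored = [(term, cosine(class_vector, vec))
              for term, vec in term_vectors.items()]
    close = [(term, sim) for term, sim in scored if sim >= threshold]
    return sorted(close, key=lambda pair: -pair[1])


# Toy embeddings for illustration only; real embeddings have many more dimensions.
toy_class_vector = [1.0, 0.0, 0.0]
toy_terms = {
    "term_near": [0.9, 0.1, 0.0],
    "term_mid": [0.8, 0.3, 0.1],
    "term_far": [0.0, 1.0, 0.0],
}
candidates = [term for term, _ in candidate_labels(toy_class_vector, toy_terms)]
```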


Subjects
Biological Ontologies , Automation , Disease , Humans , Nerve Net
10.
Sci Data ; 6(1): 79, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31160594

ABSTRACT

Understanding the relationship between the pathophysiology of infectious disease, the biology of the causative agent and the development of therapeutic and diagnostic approaches is dependent on the synthesis of a wide range of types of information. Provision of a comprehensive and integrated disease phenotype knowledgebase has the potential to provide novel and orthogonal sources of information for the understanding of infectious agent pathogenesis, and support for research on disease mechanisms. We have developed PathoPhenoDB, a database containing pathogen-to-phenotype associations. PathoPhenoDB relies on manual curation of pathogen-disease relations, on ontology-based text mining as well as manual curation to associate host disease phenotypes with infectious agents. Using Semantic Web technologies, PathoPhenoDB also links to knowledge about drug resistance mechanisms and drugs used in the treatment of infectious diseases. PathoPhenoDB is accessible at http://patho.phenomebrowser.net/ , and the data are freely available through a public SPARQL endpoint.


Subjects
Communicable Diseases , Host-Pathogen Interactions , Phenotype , Databases, Factual , Humans , Semantic Web , User-Computer Interface