1.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams with identifying causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
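
The rank-based evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the challenge's scoring code. The abstract does not state the weighting scheme, so mean reciprocal rank is used here purely as a stand-in for "a weighted score based on rank position". The function names, the `(family_id, variant_id, epcr)` tuple layout, and the rule that a family counts as recalled when any one of its causal variants appears in the top k are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_variants(predictions):
    """Group (family_id, variant_id, epcr) tuples by family, highest EPCR first."""
    by_family = defaultdict(list)
    for family_id, variant_id, epcr in predictions:
        by_family[family_id].append((epcr, variant_id))
    return {
        family_id: [v for _, v in sorted(items, key=lambda item: -item[0])]
        for family_id, items in by_family.items()
    }


def families_recalled_in_top_k(ranked, causal_by_family, k=5):
    """Count solved families with at least one causal variant in the top k.

    Assumption: one hit is enough; compound heterozygous pairs are not
    required to be recovered together.
    """
    recalled = 0
    for family_id, causal in causal_by_family.items():
        top_k = ranked.get(family_id, [])[:k]
        if any(variant in causal for variant in top_k):
            recalled += 1
    return recalled


def mean_reciprocal_rank(ranked, causal_by_family):
    """Hypothetical rank-weighted score: mean of 1/rank of the best causal hit."""
    total = 0.0
    for family_id, causal in causal_by_family.items():
        ranks = [
            position
            for position, variant in enumerate(ranked.get(family_id, []), start=1)
            if variant in causal
        ]
        total += 1.0 / min(ranks) if ranks else 0.0
    return total / len(causal_by_family)
```

Under this sketch, a model that places the causal variant first in every solved family scores 1.0, and the score falls as causal variants slide down the list or go unreported.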


Subjects
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Human Genome/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
2.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families consent to the sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge that places variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided with a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on rank position of true positive causal variants, and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.
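
The maximum F-measure mentioned in the Methods can be sketched as follows. This is an illustrative reading of "precision and recall of causal variants across EPCR thresholds", not the assessors' implementation. The function name, the tuple layout, and the choice to pool predictions from all families before thresholding are assumptions.

```python
def max_f_measure(predictions, causal):
    """Return the best F1 over all EPCR cutoffs.

    predictions: iterable of (family_id, variant_id, epcr) tuples.
    causal: set of (family_id, variant_id) pairs known to be causal.
    Assumption: predictions from all families are pooled, and a variant
    is "called" at a threshold when its EPCR is >= that threshold.
    """
    predictions = list(predictions)
    best = 0.0
    # Every distinct EPCR value is a candidate threshold.
    for threshold in sorted({epcr for _, _, epcr in predictions}, reverse=True):
        called = {(fam, var) for fam, var, epcr in predictions if epcr >= threshold}
        true_positives = len(called & causal)
        if true_positives == 0:
            continue  # F1 is 0 here, so it cannot improve the best value
        precision = true_positives / len(called)
        recall = true_positives / len(causal)
        f1 = 2 * precision * recall / (precision + recall)
        best = max(best, f1)
    return best
```

Because the maximum is taken over thresholds, this metric rewards models whose EPCR values separate causal from non-causal variants well at some cutoff, even if the absolute probabilities are poorly calibrated.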

3.
Blood ; 128(10): 1408-17, 2016 09 08.
Article in English | MEDLINE | ID: mdl-27385790

ABSTRACT

Chronic myelomonocytic leukemia (CMML) is a myelodysplastic/myeloproliferative neoplasm with a variable clinical course. To predict the clinical outcome, we previously developed a CMML-specific prognostic scoring system (CPSS) based on clinical parameters and cytogenetics. In this work, we tested the hypothesis that accounting for gene mutations would further improve risk stratification of CMML patients. We therefore sequenced 38 genes to explore the role of somatic mutations in disease phenotype and clinical outcome. Overall, 199 of 214 (93%) CMML patients carried at least 1 somatic mutation. Stepwise linear regression models showed that these mutations accounted for 15% to 24% of the variability of clinical phenotype. Based on multivariable Cox regression analyses, cytogenetic abnormalities and mutations in RUNX1, NRAS, SETBP1, and ASXL1 were independently associated with overall survival (OS). Using these parameters, we defined a genetic score that identified 4 categories with significantly different OS and cumulative incidence of leukemic evolution. In multivariable analyses, genetic score, red blood cell transfusion dependency, white blood cell count, and marrow blasts retained independent prognostic value. These parameters were incorporated into a clinical/molecular CPSS (CPSS-Mol) model that identified 4 risk groups with markedly different median OS (from >144 to 18 months, hazard ratio [HR] = 2.69) and cumulative incidence of leukemic evolution (from 0% to 48% at 4 years, HR = 3.84) (P < .001). The CPSS-Mol fully retained its ability to stratify risk in an independent validation cohort of 260 CMML patients. In conclusion, integrating conventional parameters and gene mutations significantly improves risk stratification of CMML patients, providing a robust basis for clinical decision-making and a reliable tool for clinical trials.


Subjects
Tumor Biomarkers/genetics , Chromosome Aberrations , Chronic Myelomonocytic Leukemia/genetics , Mutation/genetics , Risk Assessment/methods , Adult , Aged , Aged 80 and Over , Clinical Decision-Making , Cohort Studies , Female , Follow-Up Studies , Humans , Chronic Myelomonocytic Leukemia/pathology , Male , Middle Aged , Neoplasm Grading , Phenotype , Prognosis , Risk Factors , Survival Rate , Young Adult
4.
BMC Med Genomics ; 8: 64, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26470712

ABSTRACT

BACKGROUND: While next-generation sequencing (NGS) costs have plummeted in recent years, the cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered into medically actionable reports within a time window of hours and at a cost in the tens of dollars. RESULTS: We take a step towards addressing this challenge by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows, making optimal use of high-performance compute clusters. Here we show that the Amazon Web Services (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. CONCLUSIONS: Our systematic benchmarking reveals important new insights and considerations for achieving clinical turnaround of whole genome analysis through optimization and workflow management, including strategic batching of individual genomes and efficient cluster resource configuration.


Subjects
Cloud Computing/economics , Cost-Benefit Analysis , Genotyping Techniques/economics , High-Throughput Nucleotide Sequencing/economics , Benchmarking , Genomics , Humans