1.
Am J Hum Genet ; 109(9): 1605-1619, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36007526

ABSTRACT

Newborn screening (NBS) dramatically improves outcomes in severe childhood disorders by treatment before symptom onset. In many genetic diseases, however, outcomes remain poor because NBS has lagged behind drug development. Rapid whole-genome sequencing (rWGS) is attractive for comprehensive NBS because it concomitantly examines almost all genetic diseases and is gaining acceptance for genetic disease diagnosis in ill newborns. We describe prototypic methods for scalable, parentally consented, feedback-informed NBS and diagnosis of genetic diseases by rWGS and virtual, acute management guidance (NBS-rWGS). Using established criteria and the Delphi method, we reviewed 457 genetic diseases for NBS-rWGS, retaining 388 (85%) with effective treatments. Simulated NBS-rWGS in 454,707 UK Biobank subjects with 29,865 pathogenic or likely pathogenic variants associated with 388 disorders had a true negative rate (specificity) of 99.7% following root cause analysis. In 2,208 critically ill children with suspected genetic disorders and 2,168 of their parents, simulated NBS-rWGS for 388 disorders identified 104 (87%) of 119 diagnoses previously made by rWGS and 15 findings not previously reported (NBS-rWGS negative predictive value 99.6%, true positive rate [sensitivity] 88.8%). Retrospective NBS-rWGS diagnosed 15 children with disorders that had been undetected by conventional NBS. In 43 of the 104 children, had NBS-rWGS-based interventions been started on day of life 5, the Delphi consensus was that symptoms could have been avoided completely in seven critically ill children, mostly in 21, and partially in 13. We invite groups worldwide to refine these NBS-rWGS conditions and join us to prospectively examine clinical utility and cost effectiveness.
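
The screening metrics quoted in this abstract (true negative rate, true positive rate, negative predictive value) are all ratios over a confusion matrix. The sketch below shows the standard definitions and re-derives the two percentages whose numerator and denominator both appear in the abstract. The confusion-matrix counts in the second half are hypothetical, chosen only for illustration; the abstract does not give the underlying counts for its reported specificity, sensitivity, or NPV.

```python
def sensitivity(tp, fn):
    """True positive rate: affected cases the screen flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: unaffected cases the screen clears."""
    return tn / (tn + fp)


def negative_predictive_value(tn, fn):
    """Share of negative screens that are truly unaffected."""
    return tn / (tn + fn)


def percent(part, whole, digits=0):
    """Express part/whole as a rounded percentage."""
    return round(100 * part / whole, digits)


# Ratios stated in the abstract.
retained = percent(388, 457)      # disorders retained after Delphi review
recovered = percent(104, 119)     # prior rWGS diagnoses identified again

# Hypothetical counts, NOT from the paper, to exercise the definitions.
tp, fn, tn, fp = 90, 10, 880, 20
print(retained, recovered)
print(sensitivity(tp, fn), specificity(tn, fp), negative_predictive_value(tn, fn))
```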


Subjects
Neonatal Screening; Precision Medicine; Child; Critical Illness; Genetic Testing/methods; Humans; Infant, Newborn; Neonatal Screening/methods; Retrospective Studies
2.
Am J Hum Genet ; 105(4): 719-733, 2019 10 03.
Article in English | MEDLINE | ID: mdl-31564432

ABSTRACT

The second Newborn Sequencing in Genomic Medicine and Public Health study was a randomized, controlled trial of the effectiveness of rapid whole-genome or -exome sequencing (rWGS or rWES, respectively) in seriously ill infants with diseases of unknown etiology. Here we report comparisons of analytic and diagnostic performance. Of 1,248 ill inpatient infants, 578 (46%) had diseases of unknown etiology. A total of 213 infants (37% of those eligible) were enrolled within 96 h of admission. Twenty-four infants (11%) were very ill and received ultra-rapid whole-genome sequencing (urWGS). The remaining infants were randomized, 95 to rWES and 94 to rWGS. The analytic performance of rWGS was superior to rWES, including variants likely to affect protein function, and ClinVar pathogenic/likely pathogenic variants (p < 0.0001). The diagnostic performance of rWGS and rWES was similar (18 diagnoses in 94 infants [19%] versus 19 diagnoses in 95 infants [20%], respectively), as was time to result (median 11.0 versus 11.2 days, respectively). However, the proportion diagnosed by urWGS (11 of 24 [46%]) was higher than rWES/rWGS (p = 0.004) and time to result was less (median 4.6 days, p < 0.0001). The incremental diagnostic yield of reflexing to trio after negative proband analysis was 0.7% (1 of 147). In conclusion, rapid genomic sequencing can be performed as a first-tier diagnostic test in inpatient infants. urWGS had the shortest time to result, which was important in unstable infants, and those in whom a genetic diagnosis was likely to impact immediate management. Further comparison of urWGS and rWES is warranted because genomic technologies and knowledge of variant pathogenicity are evolving rapidly.
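
The comparison of diagnostic yield between urWGS (11 of 24) and the pooled rWES/rWGS arms (18 + 19 = 37 of 189) can be laid out as a 2x2 table. The abstract reports p = 0.004 but does not name the test, so the two-sided Fisher exact test below is an assumption made for illustration, and its p-value should not be expected to match the published figure exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Rows: urWGS, pooled rWES/rWGS. Columns: diagnosed, not diagnosed.
p_value = fisher_exact_two_sided(11, 24 - 11, 37, 189 - 37)
print(f"urWGS yield {11 / 24:.1%}, pooled yield {37 / 189:.1%}, p = {p_value:.4f}")
```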


Subjects
Exome Sequencing; Whole Genome Sequencing; Genetic Testing; Humans; Infant; Infant, Newborn
4.
Am J Hum Genet ; 99(2): 481-8, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27486782

ABSTRACT

Circulating blood cell counts and indices are important indicators of hematopoietic function and a number of clinical parameters, such as blood oxygen-carrying capacity, inflammation, and hemostasis. By performing whole-exome sequence association analyses of hematologic quantitative traits in 15,459 community-dwelling individuals, followed by in silico replication in up to 52,024 independent samples, we identified two previously undescribed coding variants associated with lower platelet count: a common missense variant in CPS1 (rs1047891, MAF = 0.33, discovery + replication p = 6.38 × 10⁻¹⁰) and a rare synonymous variant in GFI1B (rs150813342, MAF = 0.009, discovery + replication p = 1.79 × 10⁻²⁷). By performing CRISPR/Cas9 genome editing in hematopoietic cell lines and follow-up targeted knockdown experiments in primary human hematopoietic stem and progenitor cells, we demonstrate an alternative splicing mechanism by which the GFI1B rs150813342 variant suppresses formation of a GFI1B isoform that preferentially promotes megakaryocyte differentiation and platelet production. These results demonstrate how unbiased studies of natural variation in blood cell traits can provide insight into the regulation of human hematopoiesis.


Subjects
Alternative Splicing/genetics; DNA Mutational Analysis; Exome/genetics; Genetic Loci/genetics; Hematopoiesis/genetics; Proto-Oncogene Proteins/genetics; Repressor Proteins/genetics; Blood Platelets/cytology; CRISPR-Cas Systems; Gene Editing; Hematopoietic Stem Cells/cytology; Humans; Megakaryocytes/cytology; Platelet Count
7.
Genome Res ; 24(7): 1180-92, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24899342

ABSTRACT

Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI ("sequence-based ultrarapid pathogen identification"), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7-500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times.


Subjects
Computational Biology/methods; High-Throughput Nucleotide Sequencing; Metagenomics/methods; Databases, Nucleic Acid; Humans; ROC Curve; Reproducibility of Results; Software
8.
BMC Bioinformatics ; 17(1): 361, 2016 Sep 10.
Article in English | MEDLINE | ID: mdl-27612449

ABSTRACT

BACKGROUND: The decreasing costs of sequencing are driving the need for cost-effective and real-time variant calling of whole genome sequencing data. The scale of these projects is far beyond the capacity of the computing resources typically available to most research labs. Other infrastructures, such as the AWS cloud environment and supercomputers, also have limitations that make large-scale joint variant calling infeasible, and infrastructure-specific variant calling strategies either fail to scale up to large datasets or abandon joint calling strategies. RESULTS: We present a high-throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages hybrid computing infrastructure consisting of cloud AWS, supercomputers and local high performance computing infrastructures. We present a novel binning approach for large-scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of analysis on the Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset, in which joint calling, imputation and phasing of over 5300 whole genome samples were completed in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms. CONCLUSIONS: Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost-effective and fast manner by using heterogeneous computing platforms without compromising on the quality of variants.
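
The abstract mentions a binning approach for large-scale joint calling but gives no details. As a rough illustration of the general idea only, the sketch below cuts each chromosome into fixed-width, 1-based inclusive windows that could be dispatched to separate workers. The function name, the window size, and the toy chromosome lengths are all hypothetical; this is generic genomic windowing, not the authors' method.

```python
def make_bins(chrom_lengths, bin_size):
    """Split each chromosome into consecutive 1-based inclusive windows.

    chrom_lengths maps a chromosome name to its length in base pairs.
    The last window of a chromosome is truncated at the chromosome end.
    """
    bins = []
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, bin_size):
            end = min(start + bin_size - 1, length)
            bins.append((chrom, start, end))
    return bins


# Toy lengths for illustration; real chromosomes are far longer.
toy_genome = {"chrA": 2500, "chrB": 1000}
for chrom, start, end in make_bins(toy_genome, bin_size=1000):
    print(f"{chrom}:{start}-{end}")
```

Each window is independent of the others, which is what makes this kind of partitioning convenient for spreading work across heterogeneous compute platforms.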


Subjects
Genome, Human; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Databases, Genetic; Humans
9.
Hum Mutat ; 37(3): 231-234, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26670213

ABSTRACT

As the amount of human genomic sequence available from personal genomes and exomes has increased, so too has the observation of genomic positions having two or more alternative alleles, so-called multiallelic sites. For portions of the haploid genome that are present in more than one copy, including segmental duplications, variation at such multisite variant positions becomes even more complex. Despite the frequency of multiallelic variants, a number of commonly used resources and tools in genomic research and diagnostics either do not support these multiallelic variants altogether or require special modifications. Here, we explore the frequency of multiallelic sites in large samples with whole exome sequencing and discuss potential outcomes of failing to account for multiple variant alleles. We also briefly discuss some commonly utilized resources that fully support multiallelic sites.
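
To make the notion of a multiallelic site concrete: in the VCF format, a site with several alternative alleles carries them as a comma-separated ALT field. The sketch below flags such records and splits them into one record per alternative allele, which is one common workaround for tools that expect biallelic input. It is deliberately simplified and is an illustration rather than anything described in the paper: the record layout and function names are hypothetical, and real splitting tools also rewrite genotypes and per-allele annotations, which this does not attempt.

```python
def is_multiallelic(alt_field):
    """A VCF ALT field listing more than one allele marks a multiallelic site."""
    return "," in alt_field


def split_multiallelic(chrom, pos, ref, alt_field):
    """Return one (chrom, pos, ref, alt) tuple per alternative allele."""
    return [(chrom, pos, ref, alt) for alt in alt_field.split(",")]


# Toy records (chrom, pos, ref, alt), invented for illustration.
records = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T,G"),
]
n_multi = sum(is_multiallelic(alt) for _, _, _, alt in records)
biallelic = [row for rec in records for row in split_multiallelic(*rec)]
print(n_multi, biallelic)
```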


Subjects
Alleles; Exome/genetics; Genome, Human/genetics; Humans
10.
BMC Genomics ; 16: 286, 2015 Apr 11.
Article in English | MEDLINE | ID: mdl-25886820

ABSTRACT

BACKGROUND: Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods. RESULTS: We demonstrate Parliament's efficacy via integrated analyses of data from whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus. CONCLUSIONS: HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.


Subjects
Genome, Human; Genomic Structural Variation; Sequence Analysis, DNA/methods; Computational Biology; Databases, Genetic; Diploidy; Humans; Software
11.
BMC Bioinformatics ; 15: 30, 2014 Jan 29.
Article in English | MEDLINE | ID: mdl-24475911

ABSTRACT

BACKGROUND: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. RESULTS: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. CONCLUSIONS: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.


Subjects
Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Internet; Software; Genome/genetics; Humans
13.
PLoS Pathog ; 8(9): e1002924, 2012 Sep.
Article in English | MEDLINE | ID: mdl-23028323

ABSTRACT

Deep sequencing was used to discover a novel rhabdovirus (Bas-Congo virus, or BASV) associated with a 2009 outbreak of 3 human cases of acute hemorrhagic fever in Mangala village, Democratic Republic of Congo (DRC), Africa. The cases, presenting over a 3-week period, were characterized by abrupt disease onset, high fever, mucosal hemorrhage, and, in two patients, death within 3 days. BASV was detected in an acute serum sample from the lone survivor at a concentration of 1.09 × 10⁶ RNA copies/mL, and 98.2% of the genome was subsequently de novo assembled from ≈ 140 million sequence reads. Phylogenetic analysis revealed that BASV is highly divergent and shares less than 34% amino acid identity with any other rhabdovirus. High convalescent neutralizing antibody titers of >1:1000 were detected in the survivor and an asymptomatic nurse directly caring for him, both of whom were health care workers, suggesting the potential for human-to-human transmission of BASV. The natural animal reservoir host or arthropod vector and precise mode of transmission for the virus remain unclear. BASV is an emerging human pathogen associated with acute hemorrhagic fever in Africa.


Subjects
Hemorrhagic Fevers, Viral/virology; Rhabdoviridae Infections/virology; Rhabdoviridae; Adolescent; Adult; Animals; Antibodies, Viral/blood; Democratic Republic of the Congo; Disease Outbreaks; Female; Genome, Viral; Hemorrhagic Fevers, Viral/epidemiology; Hemorrhagic Fevers, Viral/transmission; High-Throughput Nucleotide Sequencing; Humans; Male; Mice; Molecular Sequence Data; Phylogeny; Rhabdoviridae/classification; Rhabdoviridae/genetics; Rhabdoviridae/immunology; Rhabdoviridae/isolation & purification; Rhabdoviridae Infections/epidemiology; Rhabdoviridae Infections/pathology; Rhabdoviridae Infections/transmission
14.
JAMA ; 312(18): 1870-9, 2014 Nov 12.
Article in English | MEDLINE | ID: mdl-25326635

ABSTRACT

IMPORTANCE: Clinical whole-exome sequencing is increasingly used for diagnostic evaluation of patients with suspected genetic disorders. OBJECTIVE: To perform clinical whole-exome sequencing and report (1) the rate of molecular diagnosis among phenotypic groups, (2) the spectrum of genetic alterations contributing to disease, and (3) the prevalence of medically actionable incidental findings such as FBN1 mutations causing Marfan syndrome. DESIGN, SETTING, AND PATIENTS: Observational study of 2000 consecutive patients with clinical whole-exome sequencing analyzed between June 2012 and August 2014. Whole-exome sequencing tests were performed at a clinical genetics laboratory in the United States. Results were reported by clinical molecular geneticists certified by the American Board of Medical Genetics and Genomics. Tests were ordered by the patient's physician. The patients were primarily pediatric (1756 [88%]; mean age, 6 years; 888 females [44%], 1101 males [55%], and 11 fetuses [1% gender unknown]), demonstrating diverse clinical manifestations most often including nervous system dysfunction such as developmental delay. MAIN OUTCOMES AND MEASURES: Whole-exome sequencing diagnosis rate overall and by phenotypic category, mode of inheritance, spectrum of genetic events, and reporting of incidental findings. RESULTS: A molecular diagnosis was reported for 504 patients (25.2%) with 58% of the diagnostic mutations not previously reported. Molecular diagnosis rates for each phenotypic category were 143/526 (27.2%; 95% CI, 23.5%-31.2%) for the neurological group, 282/1147 (24.6%; 95% CI, 22.1%-27.2%) for the neurological plus other organ systems group, 30/83 (36.1%; 95% CI, 26.1%-47.5%) for the specific neurological group, and 49/244 (20.1%; 95% CI, 15.6%-25.8%) for the nonneurological group. 
The Mendelian disease patterns of the 527 molecular diagnoses included 280 (53.1%) autosomal dominant, 181 (34.3%) autosomal recessive (including 5 with uniparental disomy), 65 (12.3%) X-linked, and 1 (0.2%) mitochondrial. Of 504 patients with a molecular diagnosis, 23 (4.6%) had blended phenotypes resulting from 2 single gene defects. About 30% of the positive cases harbored mutations in disease genes reported since 2011. There were 95 medically actionable incidental findings in genes unrelated to the phenotype but with immediate implications for management in 92 patients (4.6%), including 59 patients (3%) with mutations in genes recommended for reporting by the American College of Medical Genetics and Genomics. CONCLUSIONS AND RELEVANCE: Whole-exome sequencing provided a potential molecular diagnosis for 25% of a large cohort of patients referred for evaluation of suspected genetic conditions, including detection of rare genetic events and new mutations contributing to disease. The yield of whole-exome sequencing may offer advantages over traditional molecular diagnostic approaches in certain patients.
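
The counts in this abstract can be cross-checked against one another: the four phenotypic groups should add up to the 2000 patients and 504 diagnosed patients, and the 527 molecular diagnoses should equal 504 patients plus the 23 who carried two single-gene diagnoses. The sketch below runs that tally using only numbers quoted above; the variable names are illustrative.

```python
# (diagnosed, tested) per phenotypic group, as quoted in the abstract.
groups = {
    "neurological": (143, 526),
    "neurological plus other organ systems": (282, 1147),
    "specific neurological": (30, 83),
    "nonneurological": (49, 244),
}
# Molecular diagnoses by inheritance pattern, as quoted in the abstract.
inheritance = {
    "autosomal dominant": 280,
    "autosomal recessive": 181,
    "X-linked": 65,
    "mitochondrial": 1,
}

diagnosed_patients = sum(d for d, _ in groups.values())
tested_patients = sum(n for _, n in groups.values())
total_diagnoses = sum(inheritance.values())
dual_diagnosis_patients = 23

rates = {name: round(100 * d / n, 1) for name, (d, n) in groups.items()}
shares = {name: round(100 * k / total_diagnoses, 1) for name, k in inheritance.items()}

print(diagnosed_patients, tested_patients, total_diagnoses)
print(rates)
print(shares)
# Each dual-diagnosis patient contributes one extra diagnosis.
print(diagnosed_patients + dual_diagnosis_patients == total_diagnoses)
```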


Subjects
Exome; Genetic Diseases, Inborn/diagnosis; Molecular Diagnostic Techniques; Sequence Analysis, DNA/methods; Adolescent; Adult; Child; Child, Preschool; Female; Fetus; Genetic Testing; Genomics; Humans; Incidental Findings; Infant; Infant, Newborn; Male; Mutation; Phenotype; Referral and Consultation
16.
NPJ Genom Med ; 8(1): 5, 2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36788231

ABSTRACT

Universal newborn screening (NBS) is a highly successful public health intervention. Archived dried bloodspots (DBS) collected for NBS represent a rich resource for population genomic studies. To fully harness this resource in such studies, DBS must yield high-quality genomic DNA (gDNA) for whole genome sequencing (WGS). In this pilot study, we hypothesized that gDNA of sufficient quality and quantity for WGS could be extracted from archived DBS up to 20 years old without PCR (polymerase chain reaction) amplification. We describe simple methods for gDNA extraction and WGS library preparation from several types of DBS. We tested these methods in DBS from 25 individuals who had previously undergone diagnostic, clinical WGS and 29 randomly selected DBS cards collected for NBS from the California State Biobank. While gDNA from DBS had significantly less yield than from EDTA blood from the same individuals, it was of sufficient quality and quantity for WGS without PCR. All DBS samples yielded WGS that met quality control metrics for high-confidence variant calling. Twenty-eight variants of various types that had been reported clinically in 19 samples were recapitulated in WGS from DBS. There were no significant effects of age or paper type on WGS quality. Archived DBS appear to be a suitable sample type for WGS in population genomic studies.

17.
Biochemistry ; 50(13): 2672-82, 2011 Apr 05.
Article in English | MEDLINE | ID: mdl-21348498

ABSTRACT

The hepatitis delta virus (HDV) ribozyme uses both metal ion and nucleobase catalysis in its cleavage mechanism. A reverse G·U wobble was observed in a recent crystal structure of the precleaved state. This unusual base pair positions a Mg²⁺ ion to participate in catalysis. Herein, we used molecular dynamics (MD) and X-ray crystallography to characterize the conformation and metal binding characteristics of this base pair in product and precleaved forms. Beginning with a crystal structure of the product form, we observed formation of the reverse G·U wobble during MD trajectories. We also demonstrated that this base pair is compatible with the diffraction data for the product-bound state. During MD trajectories of the product form, Na⁺ ions interacted with the reverse G·U wobble in the RNA active site, and a Mg²⁺ ion, introduced in certain trajectories, remained bound at this site. Beginning with a crystal structure of the precleaved form, the reverse G·U wobble with bound Mg²⁺ remained intact during MD simulations. When we removed Mg²⁺ from the starting precleaved structure, Na⁺ ions interacted with the reverse G·U wobble. In support of the computational results, we observed competition between Na⁺ and Mg²⁺ in the precleaved ribozyme crystallographically. Nonlinear Poisson-Boltzmann calculations revealed a negatively charged patch near the reverse G·U wobble. This anionic pocket likely serves to bind metal ions and to help shift the pKₐ of the catalytic nucleobase, C75. Thus, the reverse G·U wobble motif serves to organize two catalytic elements, a metal ion and catalytic nucleobase, within the active site of the HDV ribozyme.


Subjects
Catalytic Domain; Hepatitis Delta Virus/metabolism; Magnesium/metabolism; Protein Interaction Domains and Motifs; RNA, Catalytic/chemistry; RNA, Catalytic/metabolism; Sodium/metabolism; Binding, Competitive; Biocatalysis; Databases, Nucleic Acid; Kinetics; Models, Molecular; Molecular Dynamics Simulation; Nucleic Acid Conformation; Poisson Distribution; Surface Properties
18.
NPJ Genom Med ; 6(1): 29, 2021 Apr 22.
Article in English | MEDLINE | ID: mdl-33888711

ABSTRACT

Congenital heart disease (CHD) is the most common congenital anomaly and a major cause of infant morbidity and mortality. While morbidity and mortality are highest in infants with underlying genetic conditions, molecular diagnoses are ascertained in only ~20% of cases using widely adopted genetic tests. Furthermore, cost of care for children and adults with CHD has increased dramatically. Rapid whole genome sequencing (rWGS) of newborns in intensive care units with suspected genetic diseases has been associated with increased rate of diagnosis and a net reduction in cost of care. In this study, we explored whether the clinical utility of rWGS extends to critically ill infants with structural CHD through a retrospective review of rWGS study data obtained from inpatient infants < 1 year with structural CHD at a regional children's hospital. rWGS diagnosed genetic disease in 46% of the enrolled infants. Moreover, genetic disease was identified five times more frequently with rWGS than microarray ± gene panel testing in 21 of these infants (rWGS diagnosed 43% versus 10% with microarray ± gene panels, p = 0.02). Molecular diagnoses ranged from syndromes affecting multiple organ systems to disorders limited to the cardiovascular system. The average daily hospital spending was lower in the time period post blood collection for rWGS compared to prior (p = 0.003) and further decreased after rWGS results (p = 0.000). The cost was not prohibitive to rWGS implementation in the care of this cohort of infants. rWGS provided timely actionable information that impacted care and there was evidence of decreased hospital spending around rWGS implementation.

19.
Genome Med ; 13(1): 153, 2021 10 14.
Article in English | MEDLINE | ID: mdl-34645491

ABSTRACT

BACKGROUND: Clinical interpretation of genetic variants in the context of the patient's phenotype is becoming the largest component of cost and time expenditure for genome-based diagnosis of rare genetic diseases. Artificial intelligence (AI) holds promise to greatly simplify and speed genome interpretation by integrating predictive methods with the growing knowledge of genetic disease. Here we assess the diagnostic performance of Fabric GEM, a new, AI-based, clinical decision support tool for expediting genome interpretation. METHODS: We benchmarked GEM in a retrospective cohort of 119 probands, mostly NICU infants, diagnosed with rare genetic diseases, who received whole-genome or whole-exome sequencing (WGS, WES). We replicated our analyses in a separate cohort of 60 cases collected from five academic medical centers. For comparison, we also analyzed these cases with current state-of-the-art variant prioritization tools. Included in the comparisons were trio, duo, and singleton cases. Variants underpinning diagnoses spanned diverse modes of inheritance and types, including structural variants (SVs). Patient phenotypes were extracted from clinical notes by two means: manually and using an automated clinical natural language processing (CNLP) tool. Finally, 14 previously unsolved cases were reanalyzed. RESULTS: GEM ranked over 90% of the causal genes among the top or second candidate and prioritized for review a median of 3 candidate genes per case, using either manually curated or CNLP-derived phenotype descriptions. Ranking of trios and duos was unchanged when analyzed as singletons. In 17 of 20 cases with diagnostic SVs, GEM identified the causal SVs as the top candidate and in 19/20 within the top five, irrespective of whether SV calls were provided or inferred ab initio by GEM using its own internal SV detection algorithm. GEM showed similar performance in absence of parental genotypes. 
Analysis of 14 previously unsolved cases resulted in a novel finding for one case, candidates ultimately not advanced upon manual review for 3 cases, and no new findings for 10 cases. CONCLUSIONS: GEM enabled diagnostic interpretation inclusive of all variant types through automated nomination of a very short list of candidate genes and disorders for final review and reporting. In combination with deep phenotyping by CNLP, GEM enables substantial automation of genetic disease diagnosis, potentially decreasing cost and expediting case review.
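
The ranking results in this abstract ("top candidate", "within the top five") are top-k fractions: the share of cases whose true causal gene or variant appears at or above rank k in the tool's candidate list. The sketch below shows that calculation. The rank list is hypothetical, constructed only so that it reproduces the 17 of 20 and 19 of 20 figures quoted for diagnostic SVs; the per-case ranks are not given in the abstract.

```python
def top_k_fraction(ranks, k):
    """Share of cases whose causal candidate is ranked k or better.

    ranks holds one 1-based rank per case, or None when the causal
    candidate was not listed at all.
    """
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Hypothetical ranks for 20 cases: 17 at rank 1, two at rank 3, one at rank 12.
sv_ranks = [1] * 17 + [3, 3, 12]

print(top_k_fraction(sv_ranks, 1))  # share ranked first
print(top_k_fraction(sv_ranks, 5))  # share within the top five
```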


Subjects
Artificial Intelligence; Rare Diseases/genetics; Databases, Genetic; Female; Genomics/methods; Genotype; Humans; Male; Phenotype; Retrospective Studies; Exome Sequencing
20.
NPJ Genom Med ; 5: 33, 2020.
Article in English | MEDLINE | ID: mdl-32821428

ABSTRACT

To investigate the diagnostic and clinical utility of a partially automated reanalysis pipeline, forty-eight cases of seriously ill children with suspected genetic disease who did not receive a diagnosis upon initial manual analysis of whole-genome sequencing (WGS) were reanalyzed at least 1 year later. Clinical natural language processing (CNLP) of medical records provided automated, updated patient phenotypes, and an automated analysis system delivered limited lists of possible diagnostic variants for each case. CNLP identified a median of 79 new clinical features per patient at least 1 year later. Compared to a standard manual reanalysis pipeline, the partially automated pipeline reduced the number of variants to be analyzed by 90% (range: 74%-96%). In 2 cases, diagnoses were made upon reinterpretation, representing an incremental diagnostic yield of 4.2% (2/48, 95% CI: 0.5-14.3%). Four additional cases were flagged with a possible diagnosis to be considered during subsequent reanalysis. Separately, copy number analysis led to diagnoses in two cases. Ongoing discovery of new disease genes and refined variant classification necessitate periodic reanalysis of negative WGS cases. The clinical features of patients sequenced as infants evolve rapidly with age. Partially automated reanalysis, including automated re-phenotyping through CNLP, has the potential to identify molecular diagnoses with reduced expert labor intensity.
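
With only 2 diagnoses in 48 cases, the confidence interval around the 4.2% incremental yield is wide and asymmetric. The abstract does not say how its interval was computed; the sketch below assumes an exact (Clopper-Pearson) binomial interval, found by bisection on the binomial tail probabilities, and should land close to the reported 0.5% to 14.3%. The function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_increasing(func, target, iterations=100):
    """Find p in [0, 1] with func(p) == target, for func increasing in p."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion."""
    if successes == 0:
        lower = 0.0
    else:
        # Smallest p at which seeing >= successes has probability alpha/2.
        lower = _bisect_increasing(
            lambda p: 1 - binom_cdf(successes - 1, n, p), alpha / 2
        )
    if successes == n:
        upper = 1.0
    else:
        # Largest p at which seeing <= successes has probability alpha/2.
        upper = _bisect_increasing(
            lambda p: 1 - binom_cdf(successes, n, p), 1 - alpha / 2
        )
    return lower, upper


low, high = clopper_pearson(2, 48)
print(f"yield {2 / 48:.1%}, 95% CI {low:.1%} to {high:.1%}")
```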
