ABSTRACT
Universal newborn screening (NBS) is a highly successful public health intervention. Archived dried bloodspots (DBS) collected for NBS represent a rich resource for population genomic studies. To fully harness this resource in such studies, DBS must yield high-quality genomic DNA (gDNA) for whole genome sequencing (WGS). In this pilot study, we hypothesized that gDNA of sufficient quality and quantity for WGS could be extracted from archived DBS up to 20 years old without polymerase chain reaction (PCR) amplification. We describe simple methods for gDNA extraction and WGS library preparation from several types of DBS. We tested these methods in DBS from 25 individuals who had previously undergone diagnostic, clinical WGS and in 29 randomly selected DBS cards collected for NBS from the California State Biobank. While DBS yielded significantly less gDNA than EDTA blood from the same individuals, the gDNA was of sufficient quality and quantity for WGS without PCR. All DBS samples yielded WGS that met quality control metrics for high-confidence variant calling. Twenty-eight variants of various types that had been reported clinically in 19 samples were recapitulated in WGS from DBS. There were no significant effects of age or paper type on WGS quality. Archived DBS appear to be a suitable sample type for WGS in population genomic studies.
ABSTRACT
Newborn screening (NBS) dramatically improves outcomes in severe childhood disorders by treatment before symptom onset. In many genetic diseases, however, outcomes remain poor because NBS has lagged behind drug development. Rapid whole-genome sequencing (rWGS) is attractive for comprehensive NBS because it concomitantly examines almost all genetic diseases and is gaining acceptance for genetic disease diagnosis in ill newborns. We describe prototypic methods for scalable, parentally consented, feedback-informed NBS and diagnosis of genetic diseases by rWGS and virtual, acute management guidance (NBS-rWGS). Using established criteria and the Delphi method, we reviewed 457 genetic diseases for NBS-rWGS, retaining 388 (85%) with effective treatments. Simulated NBS-rWGS in 454,707 UK Biobank subjects with 29,865 pathogenic or likely pathogenic variants associated with 388 disorders had a true negative rate (specificity) of 99.7% following root cause analysis. In 2,208 critically ill children with suspected genetic disorders and 2,168 of their parents, simulated NBS-rWGS for 388 disorders identified 104 (87%) of 119 diagnoses previously made by rWGS and 15 findings not previously reported (NBS-rWGS negative predictive value 99.6%, true positive rate [sensitivity] 88.8%). Retrospective NBS-rWGS diagnosed 15 children with disorders that had been undetected by conventional NBS. In 43 of the 104 children, had NBS-rWGS-based interventions been started on day of life 5, the Delphi consensus was that symptoms could have been avoided completely in seven critically ill children, mostly in 21, and partially in 13. We invite groups worldwide to refine these NBS-rWGS conditions and join us to prospectively examine clinical utility and cost effectiveness.
Subjects
Neonatal Screening; Precision Medicine; Child; Critical Illness; Genetic Testing/methods; Humans; Infant, Newborn; Neonatal Screening/methods; Retrospective Studies
ABSTRACT
BACKGROUND: Clinical interpretation of genetic variants in the context of the patient's phenotype is becoming the largest component of cost and time expenditure for genome-based diagnosis of rare genetic diseases. Artificial intelligence (AI) holds promise to greatly simplify and speed genome interpretation by integrating predictive methods with the growing knowledge of genetic disease. Here we assess the diagnostic performance of Fabric GEM, a new, AI-based, clinical decision support tool for expediting genome interpretation. METHODS: We benchmarked GEM in a retrospective cohort of 119 probands, mostly NICU infants, diagnosed with rare genetic diseases, who received whole-genome or whole-exome sequencing (WGS, WES). We replicated our analyses in a separate cohort of 60 cases collected from five academic medical centers. For comparison, we also analyzed these cases with current state-of-the-art variant prioritization tools. Included in the comparisons were trio, duo, and singleton cases. Variants underpinning diagnoses spanned diverse modes of inheritance and types, including structural variants (SVs). Patient phenotypes were extracted from clinical notes by two means: manually and using an automated clinical natural language processing (CNLP) tool. Finally, 14 previously unsolved cases were reanalyzed. RESULTS: GEM ranked over 90% of the causal genes among the top or second candidate and prioritized for review a median of 3 candidate genes per case, using either manually curated or CNLP-derived phenotype descriptions. Ranking of trios and duos was unchanged when analyzed as singletons. In 17 of 20 cases with diagnostic SVs, GEM identified the causal SVs as the top candidate and in 19/20 within the top five, irrespective of whether SV calls were provided or inferred ab initio by GEM using its own internal SV detection algorithm. GEM showed similar performance in the absence of parental genotypes. Analysis of 14 previously unsolved cases resulted in a novel finding for one case, candidates ultimately not advanced upon manual review for 3 cases, and no new findings for 10 cases.
CONCLUSIONS: GEM enabled diagnostic interpretation inclusive of all variant types through automated nomination of a very short list of candidate genes and disorders for final review and reporting. In combination with deep phenotyping by CNLP, GEM enables substantial automation of genetic disease diagnosis, potentially decreasing cost and expediting case review.
Subjects
Artificial Intelligence; Rare Diseases/genetics; Databases, Genetic; Female; Genomics/methods; Genotype; Humans; Male; Phenotype; Retrospective Studies; Exome Sequencing
Subjects
Brain Diseases/genetics; High-Throughput Nucleotide Sequencing; Metabolism, Inborn Errors/genetics; Whole Genome Sequencing/methods; Brain/diagnostic imaging; Brain Diseases/congenital; Humans; Infant; Male; Metabolism, Inborn Errors/complications; Metabolism, Inborn Errors/diagnosis; Precision Medicine; Time Factors; Tomography, X-Ray Computed
ABSTRACT
Congenital heart disease (CHD) is the most common congenital anomaly and a major cause of infant morbidity and mortality. While morbidity and mortality are highest in infants with underlying genetic conditions, molecular diagnoses are ascertained in only ~20% of cases using widely adopted genetic tests. Furthermore, cost of care for children and adults with CHD has increased dramatically. Rapid whole genome sequencing (rWGS) of newborns in intensive care units with suspected genetic diseases has been associated with an increased rate of diagnosis and a net reduction in cost of care. In this study, we explored whether the clinical utility of rWGS extends to critically ill infants with structural CHD through a retrospective review of rWGS study data obtained from inpatient infants < 1 year of age with structural CHD at a regional children's hospital. rWGS diagnosed genetic disease in 46% of the enrolled infants. Moreover, genetic disease was identified five times more frequently with rWGS than with microarray ± gene panel testing in 21 of these infants (rWGS diagnosed 43% versus 10% with microarray ± gene panels, p = 0.02). Molecular diagnoses ranged from syndromes affecting multiple organ systems to disorders limited to the cardiovascular system. The average daily hospital spending was lower in the time period after blood collection for rWGS than before it (p = 0.003) and decreased further after rWGS results (p < 0.001). The cost was not prohibitive to rWGS implementation in the care of this cohort of infants. rWGS provided timely, actionable information that impacted care, and there was evidence of decreased hospital spending around rWGS implementation.
ABSTRACT
To investigate the diagnostic and clinical utility of a partially automated reanalysis pipeline, forty-eight cases of seriously ill children with suspected genetic disease who did not receive a diagnosis upon initial manual analysis of whole-genome sequencing (WGS) were reanalyzed at least 1 year later. Clinical natural language processing (CNLP) of medical records provided automated, updated patient phenotypes, and an automated analysis system delivered limited lists of possible diagnostic variants for each case. CNLP identified a median of 79 new clinical features per patient at least 1 year later. Compared to a standard manual reanalysis pipeline, the partially automated pipeline reduced the number of variants to be analyzed by 90% (range: 74%-96%). In 2 cases, diagnoses were made upon reinterpretation, representing an incremental diagnostic yield of 4.2% (2/48, 95% CI: 0.5-14.3%). Four additional cases were flagged with a possible diagnosis to be considered during subsequent reanalysis. Separately, copy number analysis led to diagnoses in two cases. Ongoing discovery of new disease genes and refined variant classification necessitate periodic reanalysis of negative WGS cases. The clinical features of patients sequenced as infants evolve rapidly with age. Partially automated reanalysis, including automated re-phenotyping through CNLP, has the potential to identify molecular diagnoses with reduced expert labor intensity.
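The interval quoted for the incremental diagnostic yield (2/48, 95% CI: 0.5-14.3%) is consistent with an exact binomial (Clopper-Pearson) interval, although the abstract does not name the method. The following is a minimal sketch under that assumption; the function names `binom_cdf`, `bisect_increasing` and `clopper_pearson` are illustrative and are not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_increasing(f, target, iters=60):
    """Find p in [0, 1] with f(p) == target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""
    # Lower bound: p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: p at which P(X <= k) equals alpha / 2,
    # i.e. P(X > k) equals 1 - alpha / 2.
    upper = 1.0 if k == n else bisect_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(2, 48)
print(f"yield = {2 / 48:.1%}, 95% CI = {lower:.1%} to {upper:.1%}")
```

Under this assumption the bounds land close to the reported values; a different interval method (for example Wilson) would give somewhat different limits.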
ABSTRACT
The second Newborn Sequencing in Genomic Medicine and Public Health study was a randomized, controlled trial of the effectiveness of rapid whole-genome or -exome sequencing (rWGS or rWES, respectively) in seriously ill infants with diseases of unknown etiology. Here we report comparisons of analytic and diagnostic performance. Of 1,248 ill inpatient infants, 578 (46%) had diseases of unknown etiology. 213 infants (37% of those eligible) were enrolled within 96 h of admission. 24 infants (11%) were very ill and received ultra-rapid whole-genome sequencing (urWGS). The remaining infants were randomized, 95 to rWES and 94 to rWGS. The analytic performance of rWGS was superior to that of rWES, including for variants likely to affect protein function and ClinVar pathogenic/likely pathogenic variants (p < 0.0001). The diagnostic performance of rWGS and rWES was similar (18 diagnoses in 94 infants [19%] versus 19 diagnoses in 95 infants [20%], respectively), as was time to result (median 11.0 versus 11.2 days, respectively). However, the proportion diagnosed by urWGS (11 of 24 [46%]) was higher than that by rWES/rWGS (p = 0.004) and time to result was shorter (median 4.6 days, p < 0.0001). The incremental diagnostic yield of reflexing to trio after negative proband analysis was 0.7% (1 of 147). In conclusion, rapid genomic sequencing can be performed as a first-tier diagnostic test in inpatient infants. urWGS had the shortest time to result, which was important in unstable infants and in those in whom a genetic diagnosis was likely to impact immediate management. Further comparison of urWGS and rWES is warranted because genomic technologies and knowledge of variant pathogenicity are evolving rapidly.
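The comparison of diagnostic yield between urWGS (11 of 24) and the randomized arms can be illustrated with a 2x2 test. The abstract does not say which test produced p = 0.004, so the sketch below makes two assumptions: the rWGS and rWES arms are pooled (18 + 19 = 37 diagnoses in 94 + 95 = 189 infants), and a Pearson chi-square test without continuity correction is used. The function name `chi2_2x2` is illustrative.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Rows: urWGS, pooled rWES/rWGS. Columns: diagnosed, not diagnosed.
stat, p_value = chi2_2x2(11, 24 - 11, 37, 189 - 37)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

With these assumptions the p-value falls near the reported 0.004; an exact test or a continuity correction would give a somewhat larger value.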
Subjects
Exome Sequencing; Whole Genome Sequencing; Genetic Testing; Humans; Infant; Infant, Newborn
ABSTRACT
Genetic disorders are a leading cause of morbidity and mortality in infants. Rapid whole-genome sequencing (rWGS) can diagnose genetic disorders in time to change acute medical or surgical management (clinical utility) and improve outcomes in acutely ill infants. We report a retrospective cohort study of acutely ill inpatient infants in a regional children's hospital from July 2016-March 2017. Forty-two families received rWGS for etiologic diagnosis of genetic disorders. Probands also received standard genetic testing as clinically indicated. Primary end-points were rate of diagnosis, clinical utility, and healthcare utilization. The latter was modelled in six infants by comparing actual utilization with matched historical controls and/or counterfactual utilization had rWGS been performed at different time points. The diagnostic sensitivity of rWGS was 43% (eighteen of 42 infants) and 10% (four of 42 infants) for standard genetic tests (P = .0005). The rate of clinical utility of rWGS (31%, thirteen of 42 infants) was significantly greater than for standard genetic tests (2%, one of 42; P = .0015). Eleven (26%) infants with diagnostic rWGS avoided morbidity, one had a 43% reduction in likelihood of mortality, and one started palliative care. In six of the eleven infants, the changes in management reduced inpatient cost by $800,000-$2,000,000. These findings replicate a prior study of the clinical utility of rWGS in acutely ill inpatient infants, and demonstrate improved outcomes and net healthcare savings. rWGS merits consideration as a first tier test in this setting.
ABSTRACT
Genetic disorders are a leading cause of morbidity and mortality in infants in neonatal and pediatric intensive care units (NICU/PICU). While genomic sequencing is useful for genetic disease diagnosis, results are usually reported too late to guide inpatient management. We performed an investigator-initiated, partially blinded, pragmatic, randomized, controlled trial to test the hypothesis that rapid whole-genome sequencing (rWGS) increased the proportion of NICU/PICU infants receiving a genetic diagnosis within 28 days. The participants were families with infants aged <4 months in a regional NICU and PICU, with illnesses of unknown etiology. The intervention was trio rWGS. Enrollment ran from October 2014 to June 2016, with follow-up until November 2016. In all, 26 female infants, 37 male infants, and 2 infants of undetermined sex were randomized to receive rWGS plus standard genetic tests (n = 32, cases) or standard genetic tests alone (n = 33, controls). The study was terminated early due to loss of equipoise: 73% (24) of controls received genomic sequencing as standard tests, and 15% (five) of controls underwent compassionate cross-over to receive rWGS. Nevertheless, intention-to-treat analysis showed the rate of genetic diagnosis within 28 days of enrollment (the primary end-point) to be higher in cases (31%, 10 of 32) than controls (3%, 1 of 33; difference, 28% [95% CI, 10-46%]; p = 0.003). Among infants enrolled in the first 25 days of life, the rate of neonatal diagnosis was higher in cases (32%, 7 of 22) than controls (0%, 0 of 23; difference, 32% [95% CI, 11-53%]; p = 0.004). Median age at diagnosis (25 days [range 14-90] in cases vs. 130 days [range 37-451] in controls) and median time to diagnosis (13 days [range 1-84] in cases vs. 107 days [range 21-429] in controls) were significantly less in cases than controls (p = 0.04). In conclusion, rWGS increased the proportion of NICU/PICU infants who received timely diagnoses of genetic diseases.
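The primary end-point is reported as a difference in proportions with a 95% confidence interval. The abstract does not state how that interval was computed; the sketch below uses the simple Wald (normal approximation) interval as an assumption, and the function name `risk_difference_ci` is illustrative.

```python
from math import sqrt


def risk_difference_ci(x1, n1, x2, n2, z=1.96):
    """Difference in proportions (group 1 minus group 2) with a Wald interval.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference of two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Diagnosis within 28 days: 10 of 32 cases versus 1 of 33 controls.
diff, low, high = risk_difference_ci(10, 32, 1, 33)
print(f"difference = {diff:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

The Wald interval comes out close to, though not identical to, the reported 10-46%, which suggests the authors used a different method or rounding. The Wald form is also known to behave poorly when a proportion is near zero, as with the 0 of 23 controls in the neonatal subgroup.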
ABSTRACT
[This corrects the article DOI: 10.1371/journal.ppat.1002924.].
ABSTRACT
Inhibition of platelet reactivity is a common therapeutic strategy in secondary prevention of cardiovascular disease. Genetic and environmental factors influence inter-individual variation in platelet reactivity. Identifying genes that contribute to platelet reactivity can reveal new biological mechanisms and possible therapeutic targets. Here, we examined rare coding variation to identify genes associated with platelet reactivity in a population-based cohort. To do so, we performed whole exome sequencing in the Framingham Heart Study and conducted single variant and gene-based association tests against platelet reactivity to collagen, adenosine diphosphate (ADP), and epinephrine agonists in up to 1,211 individuals. Single variant tests revealed no significant associations (p < 1.44×10⁻⁷), though we observed a suggestive association with previously implicated MRVI1 (rs11042902, p = 1.95×10⁻⁷). Using gene-based association tests of rare and low-frequency variants, we found significant associations of HYAL2 with increased ADP-induced aggregation (p = 1.07×10⁻⁷) and GSTZ1 with increased epinephrine-induced aggregation (p = 1.62×10⁻⁶). HYAL2 also showed suggestive associations with epinephrine-induced aggregation (p = 2.64×10⁻⁵). The rare variants in the HYAL2 gene-based association included a missense variant (N357S) at a known N-glycosylation site and a nonsense variant (Q406*) that removes a glycophosphatidylinositol (GPI) anchor from the resulting protein. These variants suggest that improper membrane trafficking of HYAL2 influences platelet reactivity. We also observed suggestive associations of AR (p = 7.39×10⁻⁶) and MAPRE1 (p = 7.26×10⁻⁶) with ADP-induced reactivity. Our study demonstrates that gene-based tests and other grouping strategies of rare variants are powerful approaches to detect associations in population-based analyses of complex traits not detected by single variant tests and possible new genetic influences on platelet reactivity.
Subjects
Blood Platelets/physiology; Cardiovascular Diseases/genetics; Cell Adhesion Molecules/genetics; Exome/genetics; Hyaluronoglucosaminidase/genetics; Mutation/genetics; Platelet Aggregation/genetics; Adenosine Diphosphate/metabolism; Adult; Alleles; Cardiovascular Diseases/epidemiology; Cohort Studies; Epinephrine/metabolism; Female; GPI-Linked Proteins/genetics; Genotype; Humans; Male; Massachusetts/epidemiology; Membrane Proteins/genetics; Middle Aged; Phosphoproteins/genetics; Polymorphism, Genetic; Population Groups; Protein Transport/genetics; Exome Sequencing
ABSTRACT
Kawasaki disease (KD) is the most common acquired pediatric heart disease. We analyzed whole genome sequences (WGS) from a 6-member African American family in which KD affected two of four children. We sought rare, potentially causative genotypes by sequentially applying the following WGS filters: sequence quality scores, inheritance model (recessive homozygous and compound heterozygous), predicted deleteriousness, allele frequency, genes in KD-associated pathways or with significant associations in published KD genome-wide association studies (GWAS), and genes with differential expression in KD blood transcriptomes. Biologically plausible genotypes were identified in twelve variants in six genes in the two affected children. The affected siblings were compound heterozygous for the rare variants p.Leu194Pro and p.Arg247Lys in Toll-like receptor 6 (TLR6), which affect TLR6 signaling. The affected children were also homozygous for three common, linked (r² = 1) intronic single nucleotide variants (SNVs) in TLR6 (rs56245262, rs56083757 and rs7669329) that have previously shown association with KD in cohorts of European descent. Using transcriptome data from pre-treatment whole blood of KD subjects (n = 146), expression quantitative trait loci (eQTL) analyses were performed. Subjects homozygous for the intronic risk allele (A allele of TLR6 rs56245262) had differential expression of Interleukin-6 (IL-6) as a function of genotype (p = 0.0007) and a higher erythrocyte sedimentation rate at diagnosis. TLR6 plays an important role in pathogen-associated molecular pattern recognition, and sequence variations may affect binding affinities that in turn influence KD susceptibility. This integrative genomic approach illustrates how the analysis of WGS in multiplex families with a complex genetic disease allows examination of both the common disease-common variant and common disease-rare variant hypotheses.
Subjects
Black or African American/genetics; Mucocutaneous Lymph Node Syndrome/genetics; Toll-Like Receptor 6/genetics; Female; Gene Frequency; Genetic Predisposition to Disease; Genome, Human; Genome-Wide Association Study; Humans; MEF2 Transcription Factors/genetics; Male; Polymorphism, Single Nucleotide; Quantitative Trait Loci
ABSTRACT
The rapid development of genomic sequencing technologies has decreased the cost of genetic analysis to the extent that it seems plausible that genome-scale sequencing could have widespread availability in pediatric care. Genomic sequencing provides a powerful diagnostic modality for patients who manifest symptoms of monogenic disease and an opportunity to detect health conditions before their development. However, many technical, clinical, ethical, and societal challenges should be addressed before such technology is widely deployed in pediatric practice. This article provides an overview of the Newborn Sequencing in Genomic Medicine and Public Health Consortium, which is investigating the application of genome-scale sequencing in newborns for both diagnosis and screening.
Subjects
Genetic Testing; Neonatal Screening; Public Health; Sequence Analysis, DNA; Exome/genetics; Genetic Carrier Screening; Genetic Research; Genome-Wide Association Study; Genomic Structural Variation/genetics; Humans; Infant, Newborn; Intensive Care Units, Neonatal; Predictive Value of Tests; Prospective Studies; United States
ABSTRACT
To comprehensively evaluate a European-American child with severe hypertension, whole-exome sequencing (WES) was performed on the child and parents, which identified causal variation of the proband's early-onset disease. The proband's hypertension was resistant to treatment, requiring a multiple drug regimen including amiloride, spironolactone, and hydrochlorothiazide. We suspected a monogenic form of hypertension because of the persistent hypokalemia with low plasma levels of renin and aldosterone. To address this, we focused on rare functional variants and indels, performed gene-based tests incorporating linkage scores and allele frequency, and filtered on deleterious functional mutations. Drawing upon clinical presentation, 27 genes with evidence of causing monogenic hypertension were selected and matched to the gene-based results. This resulted in the identification of a stop-gain mutation in an epithelial sodium channel (ENaC) gene, SCNN1B, an established Liddle syndrome gene, shared by the child and her father. Interestingly, the father also harbored a missense mutation (p.Trp552Arg) in the α-subunit of the ENaC trimer, SCNN1A, possibly pointing to pseudohypoaldosteronism type I. This case is unique in that we present the early-onset disease and treatment response caused by a canonical stop-gain mutation (p.Arg566*) as well as ENaC digenic hits in the father, emphasizing the utility of WES in informing precision medicine.
Subjects
Epithelial Sodium Channels/genetics; Liddle Syndrome/genetics; Adult; Aldosterone/blood; Alleles; Amiloride/therapeutic use; Child, Preschool; Epithelial Sodium Channels/metabolism; Exome; Female; Gene Frequency/genetics; Germ-Line Mutation; Humans; Hydrochlorothiazide/therapeutic use; Hypertension/drug therapy; Hypokalemia/drug therapy; Liddle Syndrome/metabolism; Male; Mutation; Mutation, Missense; Renin/blood; Exome Sequencing/methods
ABSTRACT
BACKGROUND: The decreasing costs of sequencing are driving the need for cost-effective, real-time variant calling of whole genome sequencing data. The scale of these projects is far beyond the capacity of the computing resources typically available to most research labs. Other infrastructures, such as the AWS cloud environment and supercomputers, also have limitations that make large-scale joint variant calling infeasible, and infrastructure-specific variant calling strategies either fail to scale up to large datasets or abandon joint calling strategies. RESULTS: We present a high-throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages hybrid computing infrastructure consisting of the AWS cloud, supercomputers and local high-performance computing infrastructures. We present a novel binning approach for large-scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of analysis on the Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset, in which joint calling, imputation and phasing of over 5300 whole genome samples were completed in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms.
CONCLUSIONS: Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost effective and fast manner by using heterogeneous computing platforms without compromising on the quality of variants.
Subjects
Genome, Human; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Databases, Genetic; Humans
ABSTRACT
Circulating blood cell counts and indices are important indicators of hematopoietic function and a number of clinical parameters, such as blood oxygen-carrying capacity, inflammation, and hemostasis. By performing whole-exome sequence association analyses of hematologic quantitative traits in 15,459 community-dwelling individuals, followed by in silico replication in up to 52,024 independent samples, we identified two previously undescribed coding variants associated with lower platelet count: a common missense variant in CPS1 (rs1047891, MAF = 0.33, discovery + replication p = 6.38 × 10⁻¹⁰) and a rare synonymous variant in GFI1B (rs150813342, MAF = 0.009, discovery + replication p = 1.79 × 10⁻²⁷). By performing CRISPR/Cas9 genome editing in hematopoietic cell lines and follow-up targeted knockdown experiments in primary human hematopoietic stem and progenitor cells, we demonstrate an alternative splicing mechanism by which the GFI1B rs150813342 variant suppresses formation of a GFI1B isoform that preferentially promotes megakaryocyte differentiation and platelet production. These results demonstrate how unbiased studies of natural variation in blood cell traits can provide insight into the regulation of human hematopoiesis.