Results 1 - 20 of 29
1.
Nucleic Acids Res ; 47(D1): D1018-D1027, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30476213

ABSTRACT

The Human Phenotype Ontology (HPO), a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases, is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. The HPO's interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. It also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data. Since the HPO was first introduced in 2008, its users have become both more numerous and more diverse. To meet these emerging needs, the project has added new content, language translations, mappings and computational tooling, as well as integrations with external community data. The HPO continues to collaborate with clinical adopters to improve specific areas of the ontology and extend standardized disease descriptions. The newly redesigned HPO website (www.human-phenotype-ontology.org) simplifies browsing terms and exploring clinical features, diseases, and human genes.


Subject(s)
Biological Ontologies , Computational Biology/methods , Congenital Abnormalities/genetics , Genetic Predisposition to Disease/genetics , Knowledge Bases , Rare Diseases/genetics , Congenital Abnormalities/diagnosis , Databases, Genetic , Genetic Variation , Humans , Internet , Phenotype , Rare Diseases/diagnosis , Whole Genome Sequencing/methods
2.
Physiol Genomics ; 50(4): 263-271, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29373073

ABSTRACT

RNA-Seq was used to better understand the molecular nature of the biological differences among the three major exocrine salivary glands in mammals. Transcriptional profiling found that the adult murine parotid, submandibular, and sublingual salivary glands express greater than 14,300 protein-coding genes, and nearly 2,000 of these genes were differentially expressed. Principal component analysis of the differentially expressed genes revealed three distinct clusters according to gland type. The three salivary gland transcriptomes were dominated by a relatively small number of highly expressed genes (6.3%) that accounted for more than 90% of transcriptional output. Of the 912 transcription factors expressed in the major salivary glands, greater than 90% of them were detected in all three glands, while expression for ~2% of them was enriched in an individual gland. Expression of these unique transcription factors correlated with sublingual- and parotid-specific subsets of both highly expressed and differentially expressed genes. Gene ontology analyses revealed that the highly expressed genes common to all glands were associated with global functions, while many of the genes expressed in a single gland play a major role in the function of that gland. In summary, transcriptional profiling of the three murine major salivary glands identified a limited number of highly expressed genes, differentially expressed genes, and unique transcription factors that represent the transcriptional signatures underlying gland-specific biological properties.


Subject(s)
Salivary Glands/metabolism , Transcriptome/genetics , Animals , Mice , Parotid Gland/metabolism , Sublingual Gland/metabolism
3.
Genet Med ; 18(12): 1303-1307, 2016 12.
Article in English | MEDLINE | ID: mdl-27253732

ABSTRACT

PURPOSE: Using single-nucleotide polymorphism (SNP) chip and exome sequence data from individuals participating in the National Institutes of Health (NIH) Undiagnosed Diseases Program (UDP), we evaluated the number and therapeutic informativeness of incidental pharmacogenetic variants. METHODS: Pharmacogenomics Knowledgebase (PharmGKB) annotated sequence variants were identified in 1,101 individuals. Medication records of participants were used to identify individuals prescribed medications with a genetic variant that might alter efficacy. RESULTS: A total of 395 sequence variants, including 19 PharmGKB 1A and 1B variants, were identified in SNP chip sequence data, and 388 variants, including 21 PharmGKB 1A and 1B variants, were identified in the exome sequence data. Nine participants had incidental pharmacogenetic variants associated with altered efficacy of a prescribed medication. CONCLUSIONS: Despite the small size of the NIH UDP patient cohort, we identified pharmacogenetic incidental findings potentially useful for guiding therapy. Consequently, groups conducting clinical genomic studies might consider reporting of pharmacogenetic incidental findings.


Subject(s)
Exome/genetics , Genomics , Pharmacogenetics , Polymorphism, Single Nucleotide/genetics , Humans , Incidental Findings , National Institutes of Health (U.S.) , United States
4.
Genet Med ; 18(6): 608-17, 2016 06.
Article in English | MEDLINE | ID: mdl-26562225

ABSTRACT

PURPOSE: Medical diagnosis and molecular or biochemical confirmation typically rely on the knowledge of the clinician. Although this is very difficult in extremely rare diseases, we hypothesized that the recording of patient phenotypes in Human Phenotype Ontology (HPO) terms and computationally ranking putative disease-associated sequence variants improves diagnosis, particularly for patients with atypical clinical profiles. METHODS: Using simulated exomes and the National Institutes of Health Undiagnosed Diseases Program (UDP) patient cohort and associated exome sequence, we tested our hypothesis using Exomiser. Exomiser ranks candidate variants based on patient phenotype similarity to (i) known disease-gene phenotypes, (ii) model organism phenotypes of candidate orthologs, and (iii) phenotypes of protein-protein association neighbors. RESULTS: Benchmarking showed Exomiser ranked the causal variant as the top hit in 97% of known disease-gene associations and ranked the correct seeded variant in up to 87% when detectable disease-gene associations were unavailable. Using UDP data, Exomiser ranked the causative variant(s) within the top 10 variants for 11 previously diagnosed variants and achieved a diagnosis for 4 of 23 cases undiagnosed by clinical evaluation. CONCLUSION: Structured phenotyping of patients and computational analysis are effective adjuncts for diagnosing patients with genetic disorders.


Subject(s)
Exome Sequencing/methods , Exome/genetics , Rare Diseases/genetics , Rare Diseases/physiopathology , Animals , Computational Biology , Databases, Genetic , Disease Models, Animal , Genetic Association Studies , Genetic Variation , Humans , Mice , National Institutes of Health (U.S.) , Patients , Phenotype , Rare Diseases/diagnosis , Rare Diseases/epidemiology , United States , Zebrafish
5.
Gastroenterology ; 144(1): 112-121.e2, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23041322

ABSTRACT

BACKGROUND & AIMS: Autosomal recessive polycystic kidney disease (ARPKD), the most common ciliopathy of childhood, is characterized by congenital hepatic fibrosis and progressive cystic degeneration of kidneys. We aimed to describe congenital hepatic fibrosis in patients with ARPKD, confirmed by detection of mutations in PKHD1. METHODS: Patients with ARPKD and congenital hepatic fibrosis were evaluated at the National Institutes of Health from 2003 to 2009. We analyzed clinical, molecular, and imaging data from 73 patients (age, 1-56 years; average, 12.7 ± 13.1 years) with kidney and liver involvement (based on clinical, imaging, or biopsy analyses) and mutations in PKHD1. RESULTS: Initial symptoms were liver related in 26% of patients, and others presented with kidney disease. One patient underwent liver and kidney transplantation, and 10 others received kidney transplants. Four presented with cholangitis and one with variceal bleeding. Sixty-nine percent of patients had enlarged left lobes on magnetic resonance imaging, 92% had increased liver echogenicity on ultrasonography, and 65% had splenomegaly. Splenomegaly started early in life; 60% of children younger than 5 years had enlarged spleens. Spleen volume had an inverse correlation with platelet count and prothrombin time but not with serum albumin level. Platelet count was the best predictor of spleen volume (area under the curve of 0.88905), and spleen length corrected for patient's height correlated inversely with platelet count (R² = 0.42, P < .0001). Spleen volume did not correlate with renal function or type of PKHD1 mutation. Twenty-two of 31 patients who underwent endoscopy were found to have varices. Five had variceal bleeding, and 2 had portosystemic shunts. Forty percent had Caroli syndrome, and 30% had an isolated dilated common bile duct. CONCLUSIONS: Platelet count is the best predictor of the severity of portal hypertension, which has early onset but is underdiagnosed in patients with ARPKD. Seventy percent of patients with ARPKD have biliary abnormalities. Kidney and liver disease are independent, and variability in severity is not explainable by type of PKHD1 mutation; ClinicalTrials.gov number, NCT00068224.


Subject(s)
Hypertension, Portal/physiopathology , Liver Cirrhosis/congenital , Liver Cirrhosis/pathology , Polycystic Kidney, Autosomal Recessive/genetics , Receptors, Cell Surface/genetics , Adolescent , Adult , Alkaline Phosphatase/blood , Child , Child, Preschool , Cholangiopancreatography, Magnetic Resonance , Endoscopy, Gastrointestinal , Esophageal and Gastric Varices/etiology , Female , Humans , Hypertension, Portal/blood , Hypertension, Portal/complications , Infant , Kidney Transplantation , Liver Cirrhosis/diagnostic imaging , Liver Cirrhosis/genetics , Liver Transplantation , Male , Middle Aged , Mutation , Organ Size , Platelet Count , Polycystic Kidney, Autosomal Recessive/complications , Portal Pressure , Prothrombin Time , Serum Albumin , Severity of Illness Index , Splenomegaly/diagnostic imaging , Ultrasonography, Doppler, Color , Young Adult , gamma-Glutamyltransferase/blood
6.
Genet Med ; 16(10): 741-50, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24784157

ABSTRACT

PURPOSE: Using exome sequence data from 159 families participating in the National Institutes of Health Undiagnosed Diseases Program, we evaluated the number and inheritance mode of reportable incidental sequence variants. METHODS: Following the American College of Medical Genetics and Genomics recommendations for reporting of incidental findings from next-generation sequencing, we extracted variants in 56 genes from the exome sequence data of 543 subjects and determined the reportable incidental findings for each participant. We also defined variant status as inherited or de novo for those with available parental sequence data. RESULTS: We identified 14 independent reportable variants among the 159 families (8.8%). For nine families with parental sequence data in our cohort, a parent transmitted the variant to one or more children (nine minor children and four adult children). The remaining five variants occurred in adults for whom parental sequences were unavailable. CONCLUSION: Our results are consistent with the expectation that a small percentage of exomes will result in identification of an incidental finding under the American College of Medical Genetics and Genomics recommendations. Additionally, our analysis of family sequence data highlights that genome and exome sequencing of families has unavoidable implications for immediate family members and therefore requires appropriate counseling for the family.


Subject(s)
Exome/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Sequence Analysis, DNA/methods , Adolescent , Adult , Child , Cohort Studies , Family Health , Female , Genetic Counseling , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , Genome, Human/genetics , Humans , Incidental Findings , Male , Metabolism, Inborn Errors/diagnosis , Metabolism, Inborn Errors/genetics , Middle Aged , National Institutes of Health (U.S.) , United States , Young Adult
7.
Mol Genet Metab ; 113(3): 161-70, 2014 Nov.
Article in English | MEDLINE | ID: mdl-24863970

ABSTRACT

The National Institutes of Health Undiagnosed Diseases Program evaluates patients for whom no diagnosis has been discovered despite a comprehensive diagnostic workup. Failure to diagnose a condition may arise from the mutation of genes previously unassociated with disease. However, we hypothesized that this could also co-occur with multiple genetic disorders. Demonstrating a complex syndrome caused by multiple disorders, we report two siblings manifesting both similar and disparate signs and symptoms. They shared a history of episodes of hypoglycemia and lactic acidosis, but had differing exam findings and developmental courses. Clinical acumen and exome sequencing combined with biochemical and functional studies identified three genetic conditions. One sibling had Smith-Magenis Syndrome and a nonsense mutation in the RAI1 gene. The second sibling had a de novo mutation in GRIN2B, which resulted in markedly reduced glutamate potency of the encoded receptor. Both siblings had a protein-destabilizing homozygous mutation in PCK1, which encodes the cytosolic isoform of phosphoenolpyruvate carboxykinase (PEPCK-C). In summary, we present the first clinically-characterized mutation of PCK1 and demonstrate that complex medical disorders can represent the co-occurrence of multiple diseases.


Subject(s)
Intracellular Signaling Peptides and Proteins/genetics , Phosphoenolpyruvate Carboxykinase (ATP)/deficiency , Phosphoenolpyruvate Carboxykinase (GTP)/genetics , Receptors, N-Methyl-D-Aspartate/genetics , Smith-Magenis Syndrome/diagnosis , Transcription Factors/genetics , Amino Acid Sequence , Base Sequence , Child , Child, Preschool , DNA Mutational Analysis , Female , Genetic Association Studies , HEK293 Cells , Humans , Molecular Sequence Data , Mutation, Missense , Polymorphism, Single Nucleotide , Smith-Magenis Syndrome/genetics , Trans-Activators
8.
JCO Oncol Pract ; 20(3): 370-377, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38194619

ABSTRACT

PURPOSE: Racial/ethnic inequities in next-generation sequencing (NGS) were examined for patients with advanced non-small-cell lung cancer (aNSCLC) at the practice and physician levels to inform policies to improve equitable quality of care. METHODS: This retrospective study used a nationwide electronic health record-derived deidentified database for patients with aNSCLC diagnosed between April 2018 and March 2022 in the community setting. Timely NGS was an NGS result between initial diagnosis and ≤60 days after advanced diagnosis. We studied how inequities were driven by (1) non-Latinx Black (Black) and Latinx patient under-representation at high testing practices versus (2) Black and Latinx patients being tested at lower rates than non-Latinx White (White) patients, even at the same practice. We defined these two concepts as across inequity and within inequity, respectively, with total inequity as their summation. Mean percentage point inequities were estimated using a Bayesian approach. RESULTS: A total of 12,045 patients (9,981 White; 1,528 Black; 536 Latinx) met study criteria. At the practice level, versus White patients, the mean percentage point difference in NGS testing total inequity was 7.49 for Black and 8.26 for Latinx. Within- and across-practice inequities contributed to total inequity in NGS testing for Black (48% v 52%) and Latinx patients (60% v 40%). At the physician level, versus White patients, the mean percentage point difference in total inequity was 7.73 for Black and 8.81 for Latinx patients. Within- versus across-physician inequities contributed to total inequity for Black and Latinx patients (77% v 23% and 67% v 33%). CONCLUSION: Within-practice, across-practice, and across-physician inequities were main contributors to total inequity in NGS testing, requiring a suite of interventions to effectively address inequities.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Physicians , Humans , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/therapy , Bayes Theorem , Retrospective Studies , Lung Neoplasms/genetics , Lung Neoplasms/therapy , High-Throughput Nucleotide Sequencing
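
The abstract above defines total inequity as the sum of within inequity and across inequity. The following is a minimal arithmetic sketch of that decomposition, using the practice-level figures quoted for Black patients (total 7.49 percentage points, 48% within, 52% across). The function name `decompose_inequity` is hypothetical, and the study's Bayesian estimation itself is not reproduced here.

```python
def decompose_inequity(total_pp: float, within_share: float) -> tuple[float, float]:
    """Split a total inequity (in percentage points) into within and across parts.

    `within_share` is the fraction of the total attributed to within inequity;
    the remainder is attributed to across inequity, so the two parts sum to the total.
    """
    if not 0.0 <= within_share <= 1.0:
        raise ValueError("within_share must be between 0 and 1")
    within_pp = total_pp * within_share
    across_pp = total_pp - within_pp
    return within_pp, across_pp


# Practice-level figures for Black patients as quoted in the abstract.
within_black, across_black = decompose_inequity(total_pp=7.49, within_share=0.48)
print(f"within: {within_black:.2f} pp, across: {across_black:.2f} pp")
```

The split is purely additive, which is why only the total and one share are needed as inputs.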
9.
J Am Med Inform Assoc ; 31(2): 536-541, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38037121

ABSTRACT

OBJECTIVE: Given the importance of AI in genomics and its potential impact on human health, the American Medical Informatics Association-Genomics and Translational Biomedical Informatics (GenTBI) Workgroup developed this assessment of factors that can further enable the clinical application of AI in this space. PROCESS: A list of relevant factors was developed through GenTBI workgroup discussions in multiple in-person and online meetings, along with review of pertinent publications. This list was then summarized and reviewed to achieve consensus among the group members. CONCLUSIONS: Substantial informatics research and development are needed to fully realize the clinical potential of such technologies. The development of larger datasets is crucial to emulating the success AI is achieving in other domains. It is important that AI methods do not exacerbate existing socio-economic, racial, and ethnic disparities. Genomic data standards are critical to effectively scale such technologies across institutions. With so much uncertainty, complexity and novelty in genomics and medicine, and with an evolving regulatory environment, the current focus should be on using these technologies in an interface with clinicians that emphasizes the value each brings to clinical decision-making.


Subject(s)
Artificial Intelligence , Medicine , Humans , Computational Biology , Genomics
10.
Diabetes Care ; 47(6): 1042-1047, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38652672

ABSTRACT

OBJECTIVE: To identify genetic risk factors for incident cardiovascular disease (CVD) among people with type 2 diabetes (T2D). RESEARCH DESIGN AND METHODS: We conducted a multiancestry time-to-event genome-wide association study for incident CVD among people with T2D. We also tested 204 known coronary artery disease (CAD) variants for association with incident CVD. RESULTS: Among 49,230 participants with T2D, 8,956 had incident CVD events (event rate 18.2%). We identified three novel genetic loci for incident CVD: rs147138607 (near CACNA1E/ZNF648, hazard ratio [HR] 1.23, P = 3.6 × 10⁻⁹), rs77142250 (near HS3ST1, HR 1.89, P = 9.9 × 10⁻⁹), and rs335407 (near TFB1M/NOX3, HR 1.25, P = 1.5 × 10⁻⁸). Among 204 known CAD loci, 5 were associated with incident CVD in T2D (multiple comparison-adjusted P < 0.00024, 0.05/204). A standardized polygenic score of these 204 variants was associated with incident CVD with HR 1.14 (P = 1.0 × 10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.


Subject(s)
Cardiovascular Diseases , Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Humans , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/complications , Cardiovascular Diseases/genetics , Cardiovascular Diseases/epidemiology , Female , Male , Middle Aged , Aged , Polymorphism, Single Nucleotide
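
Two figures in the abstract above follow directly from counts it reports: the event rate (8,956 of 49,230 participants) and the multiple-comparison threshold (0.05 divided by 204 tests). The short sketch below recomputes both; the variable names are illustrative and not taken from the study.

```python
# Counts and significance level as quoted in the abstract.
n_participants = 49_230
n_events = 8_956
alpha = 0.05
n_tests = 204  # known CAD variants tested

# Crude event rate: incident CVD events divided by participants.
event_rate = n_events / n_participants

# Bonferroni-style threshold: family-wise alpha divided by the number of tests.
bonferroni_threshold = alpha / n_tests

print(f"event rate: {event_rate:.1%}")
print(f"per-test threshold: {bonferroni_threshold:.2e}")
```

The threshold evaluates to roughly 2.45 × 10⁻⁴, which the abstract reports as 0.00024.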
11.
medRxiv ; 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37546893

ABSTRACT

BACKGROUND: Type 2 diabetes mellitus (T2D) confers a two- to three-fold increased risk of cardiovascular disease (CVD). However, the mechanisms underlying increased CVD risk among people with T2D are only partially understood. We hypothesized that a genetic association study among people with T2D at risk for developing incident cardiovascular complications could provide insights into molecular genetic aspects underlying CVD. METHODS: From 16 studies of the Cohorts for Heart & Aging Research in Genomic Epidemiology (CHARGE) Consortium, we conducted a multi-ancestry time-to-event genome-wide association study (GWAS) for incident CVD among people with T2D using Cox proportional hazards models. Incident CVD was defined based on a composite of coronary artery disease (CAD), stroke, and cardiovascular death that occurred at least one year after the diagnosis of T2D. Cohort-level estimated effect sizes were combined using inverse variance weighted fixed effects meta-analysis. We also tested 204 known CAD variants for association with incident CVD among patients with T2D. RESULTS: A total of 49,230 participants with T2D were included in the analyses (31,118 European ancestries and 18,112 non-European ancestries) which consisted of 8,956 incident CVD cases over a range of mean follow-up duration between 3.2 and 33.7 years (event rate 18.2%). We identified three novel, distinct genetic loci for incident CVD among individuals with T2D that reached the threshold for genome-wide significance (P < 5.0 × 10⁻⁸): rs147138607 (intergenic variant between CACNA1E and ZNF648) with a hazard ratio (HR) 1.23, 95% confidence interval (CI) 1.15 - 1.32, P = 3.6 × 10⁻⁹, rs11444867 (intergenic variant near HS3ST1) with HR 1.89, 95% CI 1.52 - 2.35, P = 9.9 × 10⁻⁹, and rs335407 (intergenic variant between TFB1M and NOX3) HR 1.25, 95% CI 1.16 - 1.35, P = 1.5 × 10⁻⁸. Among 204 known CAD loci, 32 were associated with incident CVD in people with T2D with P < 0.05, and 5 were significant after Bonferroni correction (P < 0.00024, 0.05/204). A polygenic score of these 204 variants was significantly associated with incident CVD with HR 1.14 (95% CI 1.12 - 1.16) per 1 standard deviation increase (P = 1.0 × 10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.
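
The methods above combine cohort-level effect sizes by inverse variance weighted fixed effects meta-analysis. The following is a minimal sketch of the textbook form of that estimator, applied to log hazard ratios. The cohort values are invented for illustration, and the function names `fixed_effects_meta` and `hazard_ratio_ci` are hypothetical; the study's actual software and per-cohort estimates are not given in the abstract.

```python
import math


def fixed_effects_meta(betas: list[float], ses: list[float]) -> tuple[float, float]:
    """Pool per-cohort effects (log hazard ratios) with inverse-variance weights.

    Each cohort gets weight 1 / se^2; the pooled effect is the weighted mean and
    the pooled standard error is sqrt(1 / sum of weights).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se**2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


def hazard_ratio_ci(beta: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Convert a log hazard ratio and its SE into HR with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative (made-up) log-HR estimates and standard errors from three cohorts.
example_betas = [0.20, 0.25, 0.15]
example_ses = [0.05, 0.10, 0.08]

pooled_beta, pooled_se = fixed_effects_meta(example_betas, example_ses)
hr, ci_low, ci_high = hazard_ratio_ci(pooled_beta, pooled_se)
print(f"pooled HR {hr:.2f} (95% CI {ci_low:.2f} - {ci_high:.2f})")
```

Because the weights are inverse variances, the most precisely estimated cohort (smallest standard error) pulls the pooled estimate toward its own value.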

12.
Hum Mutat ; 33(4): 609-13, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22294350

ABSTRACT

Disease gene discovery has been transformed by affordable sequencing of exomes and genomes. Identification of disease-causing mutations requires sifting through a large number of sequence variants. A subset of the variants are unlikely to be good candidates for disease causation based on one or more of the following criteria: (1) being located in genomic regions known to be highly polymorphic, (2) having characteristics suggesting assembly misalignment, and/or (3) being labeled as variants based on misleading reference genome information. We analyzed exome sequence data from 118 individuals in 29 families seen in the NIH Undiagnosed Diseases Program (UDP) to create lists of variants and genes with these characteristics. Specifically, we identified several groups of genes that are candidates for provisional exclusion during exome analysis: 23,389 positions with excess heterozygosity suggestive of alignment errors and 1,009 positions in which the hg18 human genome reference sequence appeared to contain a minor allele. Exclusion of such variants, which we provide in supplemental lists, will likely enhance identification of disease-causing mutations using exome sequence data.


Subject(s)
Exome , Genetic Diseases, Inborn/genetics , Genetic Variation , Sequence Analysis, DNA/methods , False Positive Reactions , Female , Homozygote , Humans , Loss of Heterozygosity , Mutation , National Institutes of Health (U.S.) , Polymorphism, Single Nucleotide , United States
13.
Hum Mutat ; 33(4): 593-8, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22290570

ABSTRACT

The analysis of variants generated by exome sequencing (ES) of families with rare Mendelian diseases is a time-consuming, manual process that represents one barrier to applying the technology routinely. To address this issue, we have developed a software tool, VAR-MD (http://research.nhgri.nih.gov/software/var-md/), for analyzing the DNA sequence variants produced by human ES. VAR-MD generates a ranked list of variants using predicted pathogenicity, Mendelian inheritance models, genotype quality, and population variant frequency data. VAR-MD was tested using two previously solved data sets and one unsolved data set. In the solved cases, the correct variant was listed at the top of VAR-MD's variant ranking. In the unsolved case, the correct variant was highly ranked, allowing for subsequent identification and validation. We conclude that VAR-MD has the potential to enhance mutation identification using family based, annotated next generation sequencing data. Moreover, we predict an incremental advancement in software performance as the reference databases, such as Single Nucleotide Polymorphism Database and Human Gene Mutation Database, continue to improve.


Subject(s)
Exome , Genetic Variation , Pedigree , Software , Female , Gene Frequency , Humans , Male , Mixed Function Oxygenases/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results , beta-Galactosidase/genetics
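
The abstract above says VAR-MD ranks variants using predicted pathogenicity, Mendelian inheritance models, genotype quality, and population variant frequency, but it does not describe how those criteria are combined. The sketch below is therefore not VAR-MD's algorithm; it is a hypothetical illustration of ranking by several criteria at once, with every field name, score scale, and ordering rule assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    pathogenicity: float     # assumed score in [0, 1]; higher means more likely damaging
    fits_inheritance: bool   # assumed flag: segregates under the chosen Mendelian model
    genotype_quality: float  # assumed quality score; higher means more reliable call
    population_freq: float   # assumed allele frequency; lower means rarer


def rank_variants(variants: list[Variant]) -> list[Variant]:
    """Order variants lexicographically by four illustrative criteria.

    Assumed priority: inheritance fit first, then higher pathogenicity,
    then lower population frequency, then higher genotype quality.
    """
    return sorted(
        variants,
        key=lambda v: (
            not v.fits_inheritance,  # False sorts before True, so fitting variants lead
            -v.pathogenicity,
            v.population_freq,
            -v.genotype_quality,
        ),
    )


# Made-up candidates for illustration only.
candidates = [
    Variant("var_A", 0.95, False, 99.0, 0.0001),
    Variant("var_B", 0.80, True, 60.0, 0.0),
    Variant("var_C", 0.80, True, 90.0, 0.02),
]
ranked = rank_variants(candidates)
print([v.name for v in ranked])
```

A lexicographic sort key is only one possible design; a weighted composite score would be an equally plausible way to combine the same criteria.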
14.
Hum Mutat ; 33(4): 614-26, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22311686

ABSTRACT

In this study, we assess exome sequencing (ES) as a diagnostic alternative for genetically heterogeneous disorders. Because ES readily identified a previously reported homozygous mutation in the CAPN3 gene for an individual with an undiagnosed limb girdle muscular dystrophy, we evaluated ES as a generalizable clinical diagnostic tool by assessing the targeting efficiency and sequencing coverage of 88 genes associated with muscle disease (MD) and spastic paraplegia (SPG). We used three exome-capture kits on 125 individuals. Exons constituting each gene were defined using the UCSC and CCDS databases. The three exome-capture kits targeted 47-92% of bases within the UCSC-defined exons and 97-99% of bases within the CCDS-defined exons. An average of 61.2-99.5% and 19.1-99.5% of targeted bases per gene were sequenced to 20X coverage within the CCDS-defined MD and SPG coding exons, respectively. Greater than 95-99% of targeted known mutation positions were sequenced to ≥1X coverage and 55-87% to ≥20X coverage in every exome. We conclude, therefore, that ES is a rapid and efficient first-tier method to screen for mutations, particularly within the CCDS annotated exons, although its application requires disclosure of the extent of coverage for each targeted gene and supplementation with second-tier Sanger sequencing for full coverage.


Subject(s)
Exome , Muscular Diseases/genetics , Paraplegia/genetics , Sequence Analysis, DNA/methods , Calpain/genetics , Female , Humans , Muscle Proteins/genetics , Muscular Dystrophies, Limb-Girdle/genetics , Mutation , Polymorphism, Single Nucleotide , Young Adult
15.
Hum Mutat ; 33(4): 599-608, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22290882

ABSTRACT

The Undiagnosed Diseases Program at the National Institutes of Health uses high-throughput sequencing (HTS) to diagnose rare and novel diseases. HTS techniques generate large numbers of DNA sequence variants, which must be analyzed and filtered to find candidates for disease causation. Despite the publication of an increasing number of successful exome-based projects, there has been little formal discussion of the analytic steps applied to HTS variant lists. We present the results of our experience with over 30 families for whom HTS sequencing was used in an attempt to find clinical diagnoses. For each family, exome sequence was augmented with high-density SNP-array data. We present a discussion of the theory and practical application of each analytic step and provide example data to illustrate our approach. The article is designed to provide an analytic roadmap for variant analysis, thereby enabling a wide range of researchers and clinical genetics practitioners to perform direct analysis of HTS data for their patients and projects.


Subject(s)
Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , High-Throughput Nucleotide Sequencing/methods , Software , Exome , Family , Genetic Variation , Humans
16.
Genet Med ; 14(1): 51-9, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22237431

ABSTRACT

PURPOSE: This report describes the National Institutes of Health Undiagnosed Diseases Program, details the Program's application of genomic technology to establish diagnoses, and details the Program's success rate during its first 2 years. METHODS: Each accepted study participant was extensively phenotyped. A subset of participants and selected family members (29 patients and 78 unaffected family members) was subjected to an integrated set of genomic analyses including high-density single-nucleotide polymorphism arrays and whole exome or genome analysis. RESULTS: Of 1,191 medical records reviewed, 326 patients were accepted and 160 were admitted directly to the National Institutes of Health Clinical Center on the Undiagnosed Diseases Program service. Of those, 47% were children, 55% were females, and 53% had neurologic disorders. Diagnoses were reached on 39 participants (24%) on clinical, biochemical, pathologic, or molecular grounds; 21 diagnoses involved rare or ultra-rare diseases. Three disorders were diagnosed based on single-nucleotide polymorphism array analysis and three others using whole exome sequencing and filtering of variants. Two new disorders were discovered. Analysis of the single-nucleotide polymorphism array study cohort revealed that large stretches of homozygosity were more common in affected participants relative to controls. CONCLUSION: The National Institutes of Health Undiagnosed Diseases Program addresses an unmet need, i.e., the diagnosis of patients with complex, multisystem disorders. It may serve as a model for the clinical application of emerging genomic technologies and is providing insights into the characteristics of diseases that remain undiagnosed after extensive clinical workup.


Subject(s)
Government Programs , National Health Programs , National Institutes of Health (U.S.) , Rare Diseases/diagnosis , Rare Diseases/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Biomedical Research , Child , Child, Preschool , Clinical Protocols , DNA Copy Number Variations , Exome , Female , Homozygote , Humans , Infant , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Rare Diseases/mortality , United States , Young Adult
17.
Mol Genet Metab ; 105(4): 665-71, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22277120

ABSTRACT

Medicine is rapidly applying exome and genome sequencing to the diagnosis and management of human disease. Somatic mosaicism, however, is not readily detectable by these means, and yet it accounts for a significant portion of undiagnosed disease. We present a rapid and sensitive method, the Continuous Distribution Function as applied to single nucleotide polymorphism (SNP) array data, to quantify somatic mosaicism throughout the genome. We also demonstrate application of the method to novel diseases and mechanisms.


Subject(s)
Chromosome Aberrations , Genome, Human , Mosaicism , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide/genetics , Chromosome Mapping , Humans
18.
Mol Genet Metab ; 104(1-2): 189-91, 2011.
Article in English | MEDLINE | ID: mdl-21767969

ABSTRACT

While genomic sequencing methods are powerful tools in the discovery of the genetic underpinnings of human disease, incidentally revealed novel genomic risk factors may be equally important, both scientifically and for direct patient care. We performed whole-exome sequencing on a child with VACTERL association who suffered severe post-surgical neonatal pulmonary hypertension, and identified a potential novel genetic risk factor for this complication: a heterozygous mutation in CPS1. Newborn screening results from this patient's monozygotic twin provided evidence that this mutation, in combination with an environmental trigger (in this case, surgery), may have resulted in pulmonary artery hypertension due to inadequate nitric oxide production. Identification of this genetic risk factor allows for targeted medical preventative measures in this patient as well as relatives with the same mutation, and illustrates the power of incidental medical information unearthed by whole-exome sequencing.


Subject(s)
Exome/genetics , Genomics , Precision Medicine , Familial Primary Pulmonary Hypertension , Humans , Hypertension, Pulmonary/genetics , Infant , Infant, Newborn , Male , Reproducibility of Results , Sequence Analysis, DNA
19.
IEEE Trans Med Imaging ; 38(4): 919-931, 2019 04.
Article in English | MEDLINE | ID: mdl-30334750

ABSTRACT

In this paper, we propose a novel deep learning framework for anatomy segmentation and automatic landmarking. Specifically, we focus on the challenging problem of mandible segmentation from cone-beam computed tomography (CBCT) scans and identification of 9 anatomical landmarks of the mandible on the geodesic space. The overall approach employs three inter-related steps. In the first step, we propose a deep neural network architecture with carefully designed regularization and network hyper-parameters to perform image segmentation without the need for data augmentation and complex post-processing refinement. In the second step, we formulate the landmark localization problem directly on the geodesic space for sparsely-spaced anatomical landmarks. In the third step, we utilize a long short-term memory network to identify the closely-spaced landmarks, which are rather difficult to obtain using other standard networks. The proposed fully automated method showed superior efficacy compared to the state-of-the-art mandible segmentation and landmarking approaches in craniofacial anomalies and diseased states. We used a very challenging CBCT data set of 50 patients with a high degree of craniomaxillofacial variability that is realistic in clinical practice. The qualitative visual inspection was conducted for distinct CBCT scans from 250 patients with high anatomical variability. We have also shown the state-of-the-art performance in an independent data set from the MICCAI Head-Neck Challenge (2015).


Subject(s)
Anatomic Landmarks/diagnostic imaging , Deep Learning , Image Interpretation, Computer-Assisted/methods , Adolescent , Adult , Algorithms , Child , Cone-Beam Computed Tomography/methods , Craniofacial Abnormalities/diagnostic imaging , Female , Humans , Male , Mandible/diagnostic imaging , Young Adult
20.
J Am Dent Assoc ; 150(11): 933-939.e2, 2019 11.
Article in English | MEDLINE | ID: mdl-31668172

ABSTRACT

BACKGROUND: A significant amount of clinical information captured as free-text narratives could be better used for several applications, such as clinical decision support, ontology development, evidence-based practice, and research. The Human Phenotype Ontology (HPO) is specifically used for semantic comparisons for diagnostic purposes. All these functions require quality coverage of the domain of interest. The authors used natural language processing to capture craniofacial and oral phenotype signatures from electronic health records and then used these signatures for evaluation of existing oral phenotype ontology coverage. METHODS: The authors applied a text-processing pipeline based on the clinical Text Analysis and Knowledge Extraction System to annotate the clinical notes with Unified Medical Language System codes. The authors extracted the disease or disorder phenotype terms, which were then compared with HPO terms and their synonyms. RESULTS: The authors retrieved 2,153 deidentified clinical notes from 558 patients. Finally, 2,416 unique disease or disorder phenotype terms were extracted, which included 210 craniofacial or oral phenotype terms. Twenty-six of these phenotypes were not found in the HPO. CONCLUSIONS: The authors demonstrated that natural language processing tools could extract relevant phenotype terms from clinical narratives, which could help identify gaps in existing ontologies and enhance craniofacial and dental phenotyping vocabularies. PRACTICAL IMPLICATIONS: The expansion of terms in the dental, oral, and craniofacial domains in the HPO is particularly important as the dental community moves toward electronic health records.


Subject(s)
Natural Language Processing , Vocabulary , Electronic Health Records , Humans , Narration , Phenotype