ABSTRACT
Correction for 'High-resolution DNA size enrichment using a magnetic nano-platform and application in non-invasive prenatal testing' by Bo Zhang et al., Analyst, 2020, 145, 5733-5739, DOI: 10.1039/D0AN00813C.
ABSTRACT
SUMMARY: The lollipop-diagram is one of the widely used graphical representations to visualize and explore translational effects of genetic mutations in cancer genomics. However, an easy-to-use lollipop-diagram tool with full functionality is still lacking. Here, we introduce g3viz, an R package that enables researchers to explore genetic mutation data using a lollipop-diagram in a web browser. With a few lines of R code, users can interactively visualize data details, annotate findings and export resultant diagrams in high-quality figures. Because of usefulness and usability, g3viz can be generally exploited by researchers with different levels of bioinformatics skills and programming experience. AVAILABILITY AND IMPLEMENTATION: The R package is freely available under the MIT license from CRAN (http://cran.r-project.org/web/packages/g3viz). The g3lollipop JavaScript package is freely available under MIT license at GitHub (https://github.com/g3viz/g3lollipop.js). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Genomics; Software; Mutation; Proteomics; Web Browser
ABSTRACT
Precise DNA sizing can boost sequencing efficiency, reduce cost, improve data quality, and even allow sequencing of low-input samples, while current pervasive DNA sizing approaches are incapable of differentiating DNA fragments under 200 bp with high resolution (<20 bp). In non-invasive prenatal testing (NIPT), the size distribution of cell-free fetal DNA in maternal plasma (main peak at 143 bp) is significantly different from that of maternal cell-free DNA (main peak at 166 bp). The current pervasive workflow of NIPT and DNA sizing is unable to take advantage of this 20 bp difference, resulting in sample rejection, test inaccuracy, and restricted clinical utility. Here we report a simple, automatable, high-resolution DNA size enrichment workflow, named MiniEnrich, on a magnetic nano-platform to exploit this 20 bp size difference and to enrich fetal DNA fragments from maternal blood. Two types of magnetic nanoparticles were developed, with one able to filter high-molecular-weight DNA with high resolution and the other able to recover the remaining DNA fragments under the size threshold of interest with >95% yield. Using this method, the average fetal fraction was increased from 13% to 20% after the enrichment, as measured by plasma DNA sequencing. This approach provides a new tool for high-resolution DNA size enrichment under 200 bp, which may improve NIPT accuracy by rescuing rejected non-reportable clinical samples, and enable NIPT earlier in pregnancy. It also has the potential to improve non-invasive screening for fetal monogenic disorders, differentiate tumor-related DNA in liquid biopsy and find more applications in autoimmune disease diagnosis.
Subject(s)
Cell-Free Nucleic Acids; Prenatal Diagnosis; DNA/genetics; Female; Humans; Magnetic Phenomena; Pregnancy; Sequence Analysis, DNA
ABSTRACT
Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large-scale various data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure exhibit challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. Additionally, we also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.
Subject(s)
Data Mining; Electronic Health Records; Genomics; Precision Medicine; Cloud Computing; Computational Biology/methods; Data Mining/methods; Databases, Factual; Genetic Variation; Genomics/methods; High-Throughput Nucleotide Sequencing; Humans; Medical Informatics/methods; Precision Medicine/methods
ABSTRACT
Chronic infection with the hepatitis C virus (HCV) affects 170 million people worldwide and is an important cause of liver-related morbidity and mortality. The standard of care therapy combines pegylated interferon (pegIFN) alpha and ribavirin (RBV), and is associated with a range of treatment-limiting adverse effects. One of the most important of these is RBV-induced haemolytic anaemia, which affects most patients and is severe enough to require dose modification in up to 15% of patients. Here we show that genetic variants leading to inosine triphosphatase deficiency, a condition not thought to be clinically important, protect against haemolytic anaemia in hepatitis-C-infected patients receiving RBV.
Subject(s)
Anemia, Hemolytic/chemically induced; Anemia, Hemolytic/genetics; Genetic Variation/genetics; Hepatitis C, Chronic/drug therapy; Pyrophosphatases/genetics; Alleles; Anemia, Hemolytic/complications; Antiviral Agents; Chromosomes, Human, Pair 20; Europe/ethnology; Genome-Wide Association Study; Hemoglobins/deficiency; Hemoglobins/metabolism; Hepatitis C, Chronic/complications; Humans; Polymorphism, Single Nucleotide/genetics; Pyrophosphatases/deficiency; Pyrophosphatases/metabolism; Racial Groups/genetics; Ribavirin/therapeutic use; United States; Inosine Triphosphatase
ABSTRACT
BACKGROUND AND AIMS: HBsAg loss is a desired, but rare, treatment-induced clinical endpoint in chronic hepatitis B (CHB). Few studies have evaluated viral factors contributing to HBsAg loss. METHODS: This study evaluated baseline interpatient sequence diversity across the HBV genome in tenofovir disoproxil fumarate-treated patients who lost HBsAg and compared it to that of control patients with high HBsAg levels throughout therapy. Twenty-one HBeAg+ patients (14 genotype (GT) A and 7 GT D) who achieved HBsAg loss and 27 controls (17 GT A and 10 GT D), were analyzed. Population sequencing was performed on baseline samples and pairwise genetic distances were calculated for 17 overlapping regions across the HBV genome as a measure of interpatient viral diversity. RESULTS: Overall, viral diversity was up to 10-fold higher across GT D patients compared to GT A patients throughout the HBV genome. Within the pol/RT and HBs genes, interpatient viral diversity was significantly lower among HBsAg loss patients for both GT A and D, with the difference driven largely by a reduction in diversity in the small S gene. Conversely, interpatient viral diversity was generally higher in HBsAg loss patients across the HBx gene regulatory elements and precore region. CONCLUSION: In HBsAg loss patients, less interpatient viral diversity was observed within structural-coding regions while specific regions across the HBx and precore genes encoding nonstructural regulatory elements generally displayed higher interpatient viral diversity. These distinct patterns may reflect different responses to adaptive pressure for HBV genomic structural and nonstructural elements.
Subject(s)
DNA, Viral/genetics; Genetic Variation/drug effects; Hepatitis B Surface Antigens/genetics; Hepatitis B virus; Hepatitis B, Chronic; Tenofovir/pharmacology; Adult; Antiviral Agents/pharmacology; Disease Transmission, Infectious; Female; Hepatitis B e Antigens/genetics; Hepatitis B virus/drug effects; Hepatitis B virus/genetics; Hepatitis B, Chronic/drug therapy; Hepatitis B, Chronic/immunology; Hepatitis B, Chronic/transmission; Hepatitis B, Chronic/virology; Humans; Male
ABSTRACT
Although there are many methods available for inferring copy-number variants (CNVs) from next-generation sequence data, there remains a need for a system that is computationally efficient but that retains good sensitivity and specificity across all types of CNVs. Here, we introduce a new method, estimation by read depth with single-nucleotide variants (ERDS), and use various approaches to compare its performance to other methods. We found that for common CNVs and high-coverage genomes, ERDS performs as well as the best method currently available (Genome STRiP), whereas for rare CNVs and high-coverage genomes, ERDS performs better than any available method. Importantly, ERDS accommodates both unique and highly amplified regions of the genome and does so without requiring separate alignments for calling CNVs and other variants. These comparisons show that for genomes sequenced at high coverage, ERDS provides a computationally convenient method that calls CNVs as well as or better than any currently available method.
Subject(s)
DNA Copy Number Variations; Genome, Human; Sequence Analysis, DNA/methods; Algorithms; Gene Deletion; Genotyping Techniques; Humans; Validation Studies as Topic
ABSTRACT
To date, the widely used genome-wide association studies (GWASs) of the human genome have reported thousands of variants that are significantly associated with various human traits. However, in the vast majority of these cases, the causal variants responsible for the observed associations remain unknown. In order to facilitate the identification of causal variants, we designed a simple computational method called the "preferential linkage disequilibrium (LD)" approach, which follows the variants discovered by GWASs to pinpoint the causal variants, even if they are rare compared with the discovery variants. The approach is based on the hypothesis that the GWAS-discovered variant is better at tagging the causal variants than are most other variants evaluated in the original GWAS. Applying the preferential LD approach to the GWAS signals of five human traits for which the causal variants are already known, we successfully placed the known causal variants among the top ten candidates in the majority of these cases. Application of this method to additional GWASs, including those of hepatitis C virus treatment response, plasma levels of clotting factors, and late-onset Alzheimer disease, has led to the identification of a number of promising candidate causal variants. This method represents a useful tool for delineating causal variants by bringing together GWAS signals and the rapidly accumulating variant data from next-generation sequencing.
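The hypothesis stated in this abstract, that the GWAS-discovered variant tags the causal variant better than most other variants evaluated in the same GWAS, can be illustrated with a small ranking sketch. The scoring rule below (the fraction of other GWAS variants whose r² with a candidate is lower than the discovery variant's r²) is an assumed simplification for illustration, not the authors' published statistic; the function names and the toy genotype vectors are hypothetical.

```python
from statistics import mean


def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return cov * cov / (va * vb)


def preferential_ld_score(candidate, discovery, other_gwas_variants):
    """Assumed score: share of other GWAS variants that tag the candidate
    less well (lower r^2) than the discovery variant does."""
    if not other_gwas_variants:
        return 0.0
    target = r_squared(candidate, discovery)
    worse = sum(r_squared(candidate, g) < target for g in other_gwas_variants)
    return worse / len(other_gwas_variants)


# Hypothetical toy genotypes for eight individuals.
discovery = [0, 1, 2, 0, 1, 0, 2, 1]
other_gwas = [
    [1, 0, 1, 2, 0, 1, 0, 1],
    [0, 0, 1, 0, 2, 1, 1, 0],
]
candidates = {
    "rare_1": [0, 0, 1, 0, 0, 0, 1, 0],
    "rare_2": [0, 0, 0, 1, 0, 0, 0, 0],
}

scores = {
    name: preferential_ld_score(genotypes, discovery, other_gwas)
    for name, genotypes in candidates.items()
}
# Candidates most preferentially tagged by the discovery variant come first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

In this toy data the first candidate is tagged better by the discovery variant than by either of the other GWAS variants, so it ranks at the top; a real analysis would use sequenced genotypes and many more comparison variants.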
Subject(s)
Genome-Wide Association Study; Linkage Disequilibrium; Computational Biology/methods; Gene Frequency; Genetic Predisposition to Disease; Genome, Human; Humans; Polymorphism, Single Nucleotide
ABSTRACT
We studied five individuals from three Jewish Bukharian families affected by an apparently autosomal-recessive form of hereditary spastic paraparesis accompanied by severe intellectual disability, fluctuating central hypoventilation, gastroesophageal reflux disease, wake apnea, areflexia, and unique dysmorphic features. Exome sequencing identified one homozygous variant shared among all affected individuals and absent in controls: a 1 bp frameshift TECPR2 deletion leading to a premature stop codon and predicting significant degradation of the protein. TECPR2 has been reported as a positive regulator of autophagy. We thus examined the autophagy-related fate of two key autophagic proteins, SQSTM1 (p62) and MAP1LC3B (LC3), in skin fibroblasts of an affected individual, as compared to a healthy control, and found that both protein levels were decreased and that there was a more pronounced decrease in the lipidated form of LC3 (LC3II). siRNA knockdown of TECPR2 showed similar changes, consistent with aberrant autophagy. Our results are strengthened by the fact that autophagy dysfunction has been implicated in a number of other neurodegenerative diseases. The discovered TECPR2 mutation implicates autophagy, a central intracellular mechanism, in spastic paraparesis.
Subject(s)
Autophagy/genetics; Carrier Proteins/genetics; Mutation; Nerve Tissue Proteins/genetics; Paraparesis, Spastic/genetics; Brain/pathology; Exons; Female; Fibroblasts/metabolism; Fibroblasts/ultrastructure; Genotype; HeLa Cells; Humans; Jews/genetics; Magnetic Resonance Imaging; Male; Neuroimaging; Paraparesis, Spastic/diagnosis; Paraparesis, Spastic/metabolism; Pedigree; Phenotype; Sequence Analysis, DNA
ABSTRACT
Schizophrenia is a severe psychiatric disorder with strong heritability and marked heterogeneity in symptoms, course, and treatment response. There is strong interest in identifying genetic risk factors that can help to elucidate the pathophysiology and that might result in the development of improved treatments. Linkage and genome-wide association studies (GWASs) suggest that the genetic basis of schizophrenia is heterogeneous. However, it remains unclear whether the underlying genetic variants are mostly moderately rare and can be identified by the genotyping of variants observed in sequenced cases in large follow-up cohorts or whether they will typically be much rarer and therefore more effectively identified by gene-based methods that seek to combine candidate variants. Here, we consider 166 persons who have schizophrenia or schizoaffective disorder and who have had either their genomes or their exomes sequenced to high coverage. From these data, we selected 5,155 variants that were further evaluated in an independent cohort of 2,617 cases and 1,800 controls. No single variant showed a study-wide significant association in the initial or follow-up cohorts. However, we identified a number of case-specific variants, some of which might be real risk factors for schizophrenia, and these can be readily interrogated in other data sets. Our results indicate that schizophrenia risk is unlikely to be predominantly influenced by variants just outside the range detectable by GWASs. Rather, multiple rarer genetic variants must contribute substantially to the predisposition to schizophrenia, suggesting that both very large sample sizes and gene-based association tests will be required for securely identifying genetic risk factors.
Subject(s)
Exome/genetics; Genetic Predisposition to Disease/genetics; Schizophrenia/genetics; Base Sequence; Finland; Genome-Wide Association Study; Genotype; Humans; Molecular Sequence Data; Risk Factors; Sequence Alignment; Sequence Analysis, DNA; United States
ABSTRACT
Idiopathic generalized epilepsy (IGE) is a complex disease with high heritability, but little is known about its genetic architecture. Rare copy-number variants have been found to explain nearly 3% of individuals with IGE; however, it remains unclear whether variants with moderate effect size and frequencies below what are reliably detected with genome-wide association studies contribute significantly to disease risk. In this study, we compare the exome sequences of 118 individuals with IGE and 242 controls of European ancestry by using next-generation sequencing. The exome-sequenced epilepsy cases include study subjects with two forms of IGE, including juvenile myoclonic epilepsy (n = 93) and absence epilepsy (n = 25). However, our discovery strategy did not assume common genetic control between the subtypes of IGE considered. In the sequence data, as expected, no variants were significantly associated with the IGE phenotype or more specific IGE diagnoses. We then selected 3,897 candidate epilepsy-susceptibility variants from the sequence data and genotyped them in a larger set of 878 individuals with IGE and 1,830 controls. Again, no variant achieved statistical significance. However, 1,935 variants were observed exclusively in cases either as heterozygous or homozygous genotypes. It is likely that this set of variants includes real risk factors. The lack of significant association evidence of single variants with disease in this two-stage approach emphasizes the high genetic heterogeneity of epilepsy disorders, suggests that the impact of any individual single-nucleotide variant in this disease is small, and indicates that gene-based approaches might be more successful for future sequencing studies of epilepsy predisposition.
Subject(s)
Epilepsy, Generalized/genetics; Exome/genetics; Genetic Predisposition to Disease/genetics; Base Sequence; Genome-Wide Association Study; Genotype; Humans; Molecular Sequence Data; Sequence Alignment; Sequence Analysis, DNA; White People/genetics
ABSTRACT
Chronic infection with hepatitis C virus (HCV) affects 170 million people worldwide and is the leading cause of cirrhosis in North America. Although the recommended treatment for chronic infection involves a 48-week course of peginterferon-alpha-2b (PegIFN-alpha-2b) or -alpha-2a (PegIFN-alpha-2a) combined with ribavirin (RBV), it is well known that many patients will not be cured by treatment, and that patients of European ancestry have a significantly higher probability of being cured than patients of African ancestry. In addition to limited efficacy, treatment is often poorly tolerated because of side effects that prevent some patients from completing therapy. For these reasons, identification of the determinants of response to treatment is a high priority. Here we report that a genetic polymorphism near the IL28B gene, encoding interferon-lambda-3 (IFN-lambda-3), is associated with an approximately twofold change in response to treatment, both among patients of European ancestry (P = 1.06 × 10^-25) and African-Americans (P = 2.06 × 10^-3). Because the genotype leading to better response is in substantially greater frequency in European than African populations, this genetic polymorphism also explains approximately half of the difference in response rates between African-Americans and patients of European ancestry.
Subject(s)
Genetic Variation/genetics; Hepacivirus/drug effects; Hepatitis C, Chronic/drug therapy; Hepatitis C, Chronic/genetics; Interferon-alpha/pharmacology; Interleukins/genetics; Polyethylene Glycols/pharmacology; Viral Load; Black or African American/genetics; Chromosomes, Human, Pair 19/genetics; Clinical Trials as Topic; Europe/ethnology; Asia, Eastern/ethnology; Gene Frequency; Genome, Human/genetics; Genome-Wide Association Study; Genotype; Hepatitis C, Chronic/ethnology; Hepatitis C, Chronic/virology; Hispanic or Latino/genetics; Humans; Interferon alpha-2; Interferon-alpha/adverse effects; Interferon-alpha/therapeutic use; Interferons; Pharmacogenetics; Polyethylene Glycols/adverse effects; Polyethylene Glycols/therapeutic use; Polymorphism, Single Nucleotide/genetics; Recombinant Proteins
ABSTRACT
Hepatitis C virus (HCV) infection is the most common blood-borne infection in the United States, with estimates of 4 million HCV-infected individuals in the United States and 170 million worldwide. Most (70-80%) HCV infections persist and about 30% of individuals with persistent infection develop chronic liver disease, including cirrhosis and hepatocellular carcinoma. Epidemiological, viral and host factors have been associated with the differences in HCV clearance or persistence, and studies have demonstrated that a strong host immune response against HCV favours viral clearance. Thus, variation in genes involved in the immune response may contribute to the ability to clear the virus. In a recent genome-wide association study, a single nucleotide polymorphism (rs12979860) 3 kilobases upstream of the IL28B gene, which encodes the type III interferon IFN-lambda-3, was shown to associate strongly with more than a twofold difference in response to HCV drug treatment. To determine the potential effect of rs12979860 variation on outcome to HCV infection in a natural history setting, we genotyped this variant in HCV cohorts comprised of individuals who spontaneously cleared the virus (n = 388) or had persistent infection (n = 620). We show that the C/C genotype strongly enhances resolution of HCV infection among individuals of both European and African ancestry. To our knowledge, this is the strongest and most significant genetic effect associated with natural clearance of HCV, and these results implicate a primary role for IL28B in resolution of HCV infection.
Subject(s)
Genetic Variation/genetics; Hepacivirus/immunology; Hepatitis C/genetics; Hepatitis C/immunology; Interleukins/genetics; Interleukins/immunology; Adult; Africa/ethnology; Europe/ethnology; Female; Gene Frequency; Genome-Wide Association Study; Genotype; Hepacivirus/drug effects; Hepacivirus/physiology; Hepatitis C/drug therapy; Hepatitis C/virology; Humans; Interferons; Male; Polymorphism, Single Nucleotide/genetics
ABSTRACT
One of the longest running debates in evolutionary biology concerns the kind of genetic variation that is primarily responsible for phenotypic variation in species. Here, we address this question for humans specifically from the perspective of population allele frequency of variants across the complete genome, including both coding and noncoding regions. We establish simple criteria to assess the likelihood that variants are functional based on their genomic locations and then use whole-genome sequence data from 29 subjects of European origin to assess the relationship between the functional properties of variants and their population allele frequencies. We find that for all criteria used to assess the likelihood that a variant is functional, the rarer variants are significantly more likely to be functional than the more common variants. Strikingly, these patterns disappear when we focus on only those variants in which the major alleles are derived. These analyses indicate that the majority of the genetic variation in terms of phenotypic consequence may result from a mutation-selection balance, as opposed to balancing selection, and have direct relevance to the study of human disease.
Subject(s)
Genetic Variation; Alleles; Conserved Sequence; Evolution, Molecular; Gene Frequency; Genes, Regulator; Genome, Human; Genome-Wide Association Study; Humans; Models, Genetic; Mutation; Phenotype; Polymorphism, Single Nucleotide; Selection, Genetic; White People/genetics
ABSTRACT
A genome-wide screen for large structural variants showed that a copy number variant (CNV) in the region encoding killer cell immunoglobulin-like receptors (KIR) associates with HIV-1 control as measured by plasma viral load at set point in individuals of European ancestry. This CNV encompasses the KIR3DL1-KIR3DS1 locus, encoding receptors that interact with specific HLA-Bw4 molecules to regulate the activation of lymphocyte subsets including natural killer (NK) cells. We quantified the number of copies of KIR3DS1 and KIR3DL1 in a large HIV-1 positive cohort, and showed that an increase in KIR3DS1 count associates with a lower viral set point if its putative ligand is present (p = 0.00028), as does an increase in KIR3DL1 count in the presence of KIR3DS1 and appropriate ligands for both receptors (p = 0.0015). We further provide functional data that demonstrate that NK cells from individuals with multiple copies of KIR3DL1, in the presence of KIR3DS1 and the appropriate ligands, inhibit HIV-1 replication more robustly, and associated with a significant expansion in the frequency of KIR3DS1+, but not KIR3DL1+, NK cells in their peripheral blood. Our results suggest that the relative amounts of these activating and inhibitory KIR play a role in regulating the peripheral expansion of highly antiviral KIR3DS1+ NK cells, which may determine differences in HIV-1 control following infection.
Subject(s)
DNA Copy Number Variations; HIV-1/physiology; Receptors, KIR/genetics; Cohort Studies; HIV-1/immunology; Humans; Killer Cells, Natural/metabolism; Killer Cells, Natural/physiology; Lymphocyte Activation; Models, Immunological; Receptors, KIR/metabolism; Viral Load; Virus Replication
ABSTRACT
Reduced fecundity, associated with severe mental disorders, places negative selection pressure on risk alleles and may explain, in part, why common variants have not been found that confer risk of disorders such as autism, schizophrenia and mental retardation. Thus, rare variants may account for a larger fraction of the overall genetic risk than previously assumed. In contrast to rare single nucleotide mutations, rare copy number variations (CNVs) can be detected using genome-wide single nucleotide polymorphism arrays. This has led to the identification of CNVs associated with mental retardation and autism. In a genome-wide search for CNVs associating with schizophrenia, we used a population-based sample to identify de novo CNVs by analysing 9,878 transmissions from parents to offspring. The 66 de novo CNVs identified were tested for association in a sample of 1,433 schizophrenia cases and 33,250 controls. Three deletions at 1q21.1, 15q11.2 and 15q13.3 showing nominal association with schizophrenia in the first sample (phase I) were followed up in a second sample of 3,285 cases and 7,951 controls (phase II). All three deletions significantly associate with schizophrenia and related psychoses in the combined sample. The identification of these rare, recurrent risk variants, having occurred independently in multiple founders and being subject to negative selection, is important in itself. CNV analysis may also point the way to the identification of additional and more prevalent risk variants in genes and pathways involved in schizophrenia.
Subject(s)
Genetic Predisposition to Disease/genetics; Schizophrenia/genetics; Sequence Deletion/genetics; China; Chromosomes, Human, Pair 1/genetics; Chromosomes, Human, Pair 15/genetics; Europe; Gene Dosage/genetics; Genome, Human/genetics; Genotype; Humans; Loss of Heterozygosity; Models, Genetic; Polymorphism, Single Nucleotide/genetics; Psychotic Disorders/genetics
ABSTRACT
Deletions at 16p13.11 are associated with schizophrenia, mental retardation, and most recently idiopathic generalized epilepsy. To evaluate the role of 16p13.11 deletions, as well as other structural variation, in epilepsy disorders, we used genome-wide screens to identify copy number variation in 3812 patients with a diverse spectrum of epilepsy syndromes and in 1299 neurologically-normal controls. Large deletions (> 100 kb) at 16p13.11 were observed in 23 patients, whereas no control had a deletion greater than 16 kb. Patients, even those with identically sized 16p13.11 deletions, presented with highly variable epilepsy phenotypes. For a subset of patients with a 16p13.11 deletion, we show a consistent reduction of expression for included genes, suggesting that haploinsufficiency might contribute to pathogenicity. We also investigated another possible mechanism of pathogenicity by using hybridization-based capture and next-generation sequencing of the homologous chromosome for ten 16p13.11-deletion patients to look for unmasked recessive mutations. Follow-up genotyping of suggestive polymorphisms failed to identify any convincing recessive-acting mutations in the homologous interval corresponding to the deletion. The observation that two of the 16p13.11 deletions were larger than 2 Mb in size led us to screen for other large deletions. We found 12 additional genomic regions harboring deletions > 2 Mb in epilepsy patients, and none in controls. Additional evaluation is needed to characterize the role of these exceedingly large, non-locus-specific deletions in epilepsy. Collectively, these data implicate 16p13.11 and possibly other large deletions as risk factors for a wide range of epilepsy disorders, and they appear to point toward haploinsufficiency as a contributor to the pathogenicity of deletions.
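The abstract reports large 16p13.11 deletions in 23 of 3812 patients and in none of 1299 controls, without quoting a test statistic. As a rough illustration only, the sketch below computes a one-sided Fisher exact probability for that 2×2 table; with a zero cell it reduces to a single hypergeometric term. The choice of test is an assumption made here and is not taken from the paper.

```python
from math import comb

# Counts quoted in the abstract.
patients = 3812
controls = 1299
carriers_in_patients = 23
carriers_in_controls = 0

total = patients + controls
carriers = carriers_in_patients + carriers_in_controls

# Under the null hypothesis, the carriers are a random draw from all
# individuals. The one-sided probability that every carrier is a patient
# is one hypergeometric term: C(patients, k) / C(total, k).
p_one_sided = comb(patients, carriers) / comb(total, carriers)

print(f"one-sided Fisher exact p = {p_one_sided:.2g}")
```

The result is on the order of 10^-3, which suggests that the imbalance is unlikely to come from the unequal group sizes alone; it says nothing about ascertainment or platform differences between the cohorts.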
Subject(s)
Chromosomes, Human, Pair 16; Disease Susceptibility; Epilepsy/genetics; Mutation; Sequence Deletion; Humans; Nucleic Acid Hybridization/genetics; Syndrome
ABSTRACT
Although more than 2,400 genes have been shown to contain variants that cause Mendelian disease, there are still several thousand such diseases yet to be molecularly defined. The ability of new whole-genome sequencing technologies to rapidly identify most of the genetic variants in any given genome opens an exciting opportunity to identify these disease genes. Here we sequenced the whole genome of a single patient with the dominant Mendelian disease, metachondromatosis (OMIM 156250), and used partial linkage data from her small family to focus our search for the responsible variant. In the proband, we identified an 11 bp deletion in exon 4 of PTPN11, which alters frame, results in premature translation termination, and co-segregates with the phenotype. In a second metachondromatosis family, we confirmed our result by identifying a nonsense mutation in exon 4 of PTPN11 that also co-segregates with the phenotype. Sequencing PTPN11 exon 4 in 469 controls showed no such protein truncating variants, supporting the pathogenicity of these two mutations. This combination of a new technology and a classical genetic approach provides a powerful strategy to discover the genes responsible for unexplained Mendelian disorders.
Subject(s)
Genetic Linkage; Genetic Predisposition to Disease; Genome, Human; Protein Tyrosine Phosphatase, Non-Receptor Type 11/genetics; Exons; Female; Genome-Wide Association Study; Humans; Male; Mutation; Pedigree; Sequence Analysis, DNA
ABSTRACT
We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.
Subject(s)
Genome, Human/genetics; Sequence Analysis, DNA; Base Sequence; Case-Control Studies; DNA Copy Number Variations/genetics; Databases, Genetic; Exons/genetics; Factor VIII/genetics; Gene Duplication/genetics; Gene Knockout Techniques; Genetics, Population; Genotype; Hemophilia A/genetics; Humans; INDEL Mutation/genetics; Oligonucleotide Array Sequence Analysis; Open Reading Frames/genetics; Polymorphism, Genetic; Polymorphism, Single Nucleotide/genetics
ABSTRACT
BACKGROUND: Information on nucleotide diversity along completely sequenced human genomes has increased tremendously over the last few years. This makes it possible to reassess the diversity status of distinct receptor proteins in different human individuals. To this end, we focused on the complete inventory of human olfactory receptor coding regions as a model for personal receptor repertoires. RESULTS: By performing data-mining from public and private sources we scored genetic variations in 413 intact OR loci, for which one or more individuals had an intact open reading frame. Using 1000 Genomes Project haplotypes, we identified a total of 4069 full-length polypeptide variants encoded by these OR loci, average of ~10 per locus, constituting a lower limit for the effective human OR repertoire. Each individual is found to harbor as many as 600 OR allelic variants, ~50% higher than the locus count. Because OR neuronal expression is allelically excluded, this has direct effect on smell perception diversity of the species. We further identified 244 OR segregating pseudogenes (SPGs), loci showing both intact and pseudogene forms in the population, twenty-six of which are annotatively "resurrected" from a pseudogene status in the reference genome. Using a custom SNP microarray we validated 150 SPGs in a cohort of 468 individuals, with every individual genome averaging 36 disrupted sequence variations, 15 in homozygote form. Finally, we generated a multi-source compendium of 63 OR loci harboring deletion Copy Number Variations (CNVs). Our combined data suggest that 271 of the 413 intact OR loci (66%) are affected by nonfunctional SNPs/indels and/or CNVs. CONCLUSIONS: These results portray a case of unusually high genetic diversity, and suggest that individual humans have a highly personalized inventory of functional olfactory receptors, a conclusion that might apply to other receptor multigene families.
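The summary ratios quoted in this abstract can be checked directly from its own counts. The short sketch below uses only the figures stated above; the variable names are illustrative, and the figure of 600 is the abstract's "as many as" upper value rather than a mean.

```python
# Counts quoted in the abstract.
intact_loci = 413
polypeptide_variants = 4069
alleles_per_individual = 600  # "as many as 600": an upper figure, not an average
affected_loci = 271

variants_per_locus = polypeptide_variants / intact_loci
excess_over_loci = alleles_per_individual / intact_loci - 1
affected_fraction = affected_loci / intact_loci

print(f"variants per locus: {variants_per_locus:.1f}")
print(f"allelic variants above locus count: {excess_over_loci:.0%}")
print(f"loci affected by SNPs/indels/CNVs: {affected_fraction:.0%}")
```

The first and third values round to the "~10 per locus" and "66%" given in the text; the second comes out near 45%, which the abstract rounds to "~50%".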