Results 1 - 20 of 153
1.
Trends Genet ; 38(6): 521-523, 2022 06.
Article in English | MEDLINE | ID: mdl-35232614

ABSTRACT

Variant annotation is one of the most essential steps in selecting candidates for further investigation. With advances in functional genomics, new variant annotation tools focus on annotation based on empirically generated data instead of theoretically based predictions. This is a direct result of the large national and international consortia that have generated enormous amounts of experiment-based or validated data at multiple omics levels. Here, we highlight recent empirically based annotation methods and discuss their strengths and weaknesses.


Subject(s)
Genomics , Software , Molecular Sequence Annotation
2.
J Biol Chem ; 299(2): 102839, 2023 02.
Article in English | MEDLINE | ID: mdl-36581210

ABSTRACT

Data from gnomAD indicate that a missense mutation encoding the T118M variation in human peripheral myelin protein 22 (PMP22) is found in roughly one of every 75 genomes of western European lineage (1:120 in the overall human population). It is unusual among PMP22 variants that cause Charcot-Marie-Tooth (CMT) disease in that it is not 100% penetrant. Here, we conducted cellular and biophysical studies to determine why T118M PMP22 predisposes humans to CMT, but with only incomplete penetrance. We found that T118M PMP22 is prone to mistraffic but differs even from the WT protein in that increased expression levels do not result in a reduction in trafficking efficiency. Moreover, the T118M mutant exhibits a reduced tendency to form large intracellular aggregates relative to other disease mutants and even WT PMP22. NMR spectroscopy revealed that the structure and dynamics of T118M PMP22 resembled those of WT. These results show that the main consequence of T118M PMP22 in WT/T118M heterozygous individuals is a reduction in surface-trafficked PMP22, unaccompanied by formation of toxic intracellular aggregates. This explains the incomplete disease penetrance and the mild neuropathy observed for WT/T118M CMT cases. We also analyzed BioVU, a biobank linked to deidentified electronic medical records, and found a statistically robust association of the T118M mutation with the occurrence of long and/or repeated episodes of carpal tunnel syndrome. Collectively, our results illuminate the cellular effects of the T118M PMP22 variation leading to CMT disease and indicate a second disorder for which it is a risk factor.


Subject(s)
Charcot-Marie-Tooth Disease , Myelin Proteins , Humans , Charcot-Marie-Tooth Disease/genetics , Mutation, Missense , Myelin Proteins/genetics , Genetic Predisposition to Disease
3.
Pharmacogenet Genomics ; 34(2): 25-32, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37910437

ABSTRACT

BACKGROUND: Excessive weight gain affects some persons with HIV after switching to integrase strand transfer inhibitor (INSTI)-containing antiretroviral therapy (ART). We studied associations between CYP2B6 genotype and weight gain after ART switch among ACTG A5001 and A5322 participants. METHODS: Eligible participants switched from efavirenz- to INSTI-containing ART, had genotype data, and had weight data at least once from 4 weeks to 2 years post-switch. Multivariable linear mixed effects models adjusted for race/ethnicity, CD4, age, BMI and INSTI type assessed relationships between CYP2B6 genotype and estimated differences in weight change. RESULTS: A total of 159 eligible participants switched ART from 2007 to 2019, of whom 138 had plasma HIV-1 RNA < 200 copies/mL (65 CYP2B6 normal, 56 intermediate, 17 poor metabolizers). Among participants with switch HIV-1 RNA < 200 copies/mL, weight increased in all 3 CYP2B6 groups. The rate of weight gain was greater in CYP2B6 poor than in CYP2B6 normal metabolizers overall, and within 9 subgroups (male, female, White, Black, Hispanic, dolutegravir, elvitegravir, raltegravir, and TDF in the pre-switch regimen); only in Hispanic and elvitegravir subgroups were these associations statistically significant (P < 0.05). Compared to normal metabolizers, CYP2B6 intermediate status was not consistently associated with weight gain. CONCLUSION: CYP2B6 poor metabolizer genotype was associated with greater weight gain after switch from efavirenz- to INSTI-containing ART, but results were inconsistent. Weight gain in this setting is likely complex and multifactorial.


Subject(s)
Anti-HIV Agents , HIV Infections , HIV Integrase Inhibitors , Humans , Male , Female , Cytochrome P-450 CYP2B6/genetics , Pharmacogenetics , HIV Integrase Inhibitors/therapeutic use , Benzoxazines/adverse effects , HIV Infections/drug therapy , HIV Infections/genetics , Weight Gain/genetics , RNA/therapeutic use , Anti-HIV Agents/adverse effects
4.
Trends Genet ; 36(11): 857-867, 2020 11.
Article in English | MEDLINE | ID: mdl-32773169

ABSTRACT

One of the forerunners that pioneered the revolution of high-throughput genomic technologies is the genotyping microarray technology, which can genotype millions of single-nucleotide variants simultaneously. Owing to apparent benefits, such as high speed, low cost, and high throughput, the genotyping array has gained lasting applications in genome-wide association studies (GWAS) and thus accumulated an enormous amount of data. Empowered by continuous manufacturing upgrades and analytical innovation, unconventional applications of genotyping array data have emerged to address more diverse genetic problems, holding promise of boosting genetic research into human diseases through the re-mining of the rich accumulated data. Here, we review several unconventional genotyping array analysis techniques that have been built on the idea of large-scale multivariant analysis and provide empirical application examples. These unconventional outcomes of genotyping arrays include polygenic score, runs of homozygosity (ROH)/heterozygosity ratio, distant pedigree computation, and mitochondrial DNA (mtDNA) copy number inference.


Subject(s)
Computational Biology/methods , Genome-Wide Association Study , Genome , Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Animals , Genomics , Genotype , Humans
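
Illustrative note: the polygenic score named in the abstract above is, in its simplest form, a weighted sum of effect-allele dosages. The sketch below is a minimal example under that assumption. The function name, the dictionary layout, and the choice to skip variants without a genotype are hypothetical and are not taken from the reviewed article.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: {variant_id: copies of the effect allele (0, 1, 2, or an imputed value)}
    weights: {variant_id: per-variant effect size, e.g. a GWAS log odds ratio}
    Variants with a weight but no genotype are skipped (an assumption made here
    for illustration only).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


if __name__ == "__main__":
    # Hypothetical variant IDs and effect sizes.
    example_dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
    example_weights = {"rs1": 0.2, "rs2": -0.1, "rs4": 0.5}
    print(polygenic_score(example_dosages, example_weights))
```

In practice the weights would come from an independent association study, and missing genotypes are often imputed rather than dropped.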
5.
Brief Bioinform ; 21(1): 338-347, 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-30475999

ABSTRACT

Expression quantitative trait loci (eQTLs) have been touted as the missing piece that can bridge the gap between genetic variants and phenotypes. Over the past decade, we have witnessed a sharp rise in efforts to identify and apply eQTLs. The successful application of eQTLs relies heavily on their reproducibility. The current eQTL databases such as Genotype-Tissue Expression (GTEx) were populated primarily with eQTLs derived from germline single nucleotide polymorphisms and normal tissue gene expression. The novel scenarios that employ eQTL models for prediction purposes often involve disease phenotypes characterized by altered gene expression. To evaluate eQTL reproducibility across diverse data sources and the effect of disease-specific gene expression alteration on eQTL identification, we conducted an eQTL study using 5178 samples from The Cancer Genome Atlas (TCGA). We found that the reproducibility of eQTLs between normal and tumor tissues was low in terms of the number of shared eQTLs. However, among the shared eQTLs, the effect directions were generally concordant. This suggests that the source of the gene expression (normal or tumor tissue) has a strong effect on the detectable eQTLs and the effect direction of the eQTLs. Additional analyses demonstrated good directional concordance of eQTLs between GTEx and TCGA. Furthermore, we found that multi-tissue eQTLs may exert opposite effects across multiple tissue types. In summary, our results suggest that eQTL prediction models need to carefully address tissue and disease dependency of eQTLs. Tissue-disease-specific eQTL databases can afford more accurate prediction models for future studies.

6.
Genomics ; 113(6): 3864-3871, 2021 11.
Article in English | MEDLINE | ID: mdl-34562567

ABSTRACT

RNA editing exerts critical impacts on numerous biological processes. While millions of RNA editing events have been identified in humans, many more are expected to be discovered. In this work, we constructed Convolutional Neural Network (CNN) models to predict human RNA editing events in both Alu regions and non-Alu regions. With a validation dataset resulting from CRISPR/Cas9 knockout of the ADAR1 enzyme, the validation accuracies reached 99.5% and 93.6% for Alu and non-Alu regions, respectively. We deployed our CNN models in a web service named EditPredict. EditPredict not only works on reference genome sequences but can also take into consideration single nucleotide variants in personal genomes. In addition to the human genome, EditPredict handles other model organisms including bumblebee, fruit fly, mouse, and squid genomes. EditPredict can be used stand-alone to predict novel RNA editing events, and it can be used to assist in filtering candidate RNA editing events detected from RNA-Seq data.


Subject(s)
Neural Networks, Computer , RNA Editing , Animals , Genome , RNA , RNA-Seq
7.
PLoS Comput Biol ; 16(6): e1007968, 2020 06.
Article in English | MEDLINE | ID: mdl-32511223

ABSTRACT

Very short tandem repeats bear substantial genetic, evolutionary, and pathological significance in genome analyses. Here, we compiled a census of tandem mono-nucleotide/di-nucleotide/tri-nucleotide repeats (MNRs/DNRs/TNRs) in GRCh38, which we term "polytracts" in general. Of the human genome, 144.4 million nucleotides (4.7%) are occupied by polytracts, and 0.47 million single nucleotides are identified as polytract hinges, i.e., break-points of tandem polytracts. Preliminary exploration of the census suggested polytract hinge sites and boundaries of AAC polytracts may bear a higher mapping error rate than other polytract regions. Further, we revealed landscapes of polytract enrichment with respect to nearly a hundred genomic features. We found MNRs, DNRs, and TNRs displayed noticeable differences in terms of locational enrichment for miscellaneous genomic features, especially RNA editing events. Non-canonical and C-to-U RNA-editing events are enriched inside and/or adjacent to MNRs, while all categories of RNA-editing events are under-represented in DNRs. A-to-I RNA-editing events are generally under-represented in polytracts. The selective enrichment of non-canonical RNA-editing events within MNR adjacency provides negative evidence against their authenticity. To enable similar locational enrichment analyses in relation to polytracts, we developed a software tool, Polytrap, which can handle 11 reference genomes. Additionally, we compiled polytracts of four model organisms into a Track Hub which can be integrated into the UCSC Genome Browser as an official track for convenient visualization of polytracts.


Subject(s)
DNA/genetics , Genome, Human , Microsatellite Repeats/genetics , RNA/genetics , Humans , RNA Editing , Software
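
Illustrative note: the census of mono-, di-, and tri-nucleotide tandem repeats described in the abstract above can be approximated on a small sequence with a regular-expression scan. The sketch below is not the Polytrap implementation. The minimum copy numbers in `MIN_COPIES` and the function name are assumptions made for illustration, since the abstract does not state the thresholds used.

```python
import re

# Hypothetical minimum copy numbers per repeat-unit length (1, 2, 3 nt).
MIN_COPIES = {1: 6, 2: 4, 3: 3}


def find_polytracts(seq, min_copies=MIN_COPIES):
    """Return sorted (start, end, unit, copies) tuples for 1-3 nt tandem repeats.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, copies in min_copies.items():
        # One unit followed by at least (copies - 1) further identical units.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip homopolymer units such as "AA" or "AAA"; the 1-nt scan covers them.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            hits.append((m.start(), m.end(), unit, (m.end() - m.start()) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + "A" * 8 + "CG" + "CA" * 4 + "CTT"
    for hit in find_polytracts(demo):
        print(hit)
```

A genome-scale census would need streaming over chromosomes and explicit handling of adjacent tracts (the "hinges" in the abstract), which this sketch does not attempt.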
8.
Brief Bioinform ; 19(5): 765-775, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28334151

ABSTRACT

Illumina genotyping arrays have powered thousands of large-scale genome-wide association studies over the past decade. Yet, because of the tremendous volume and complicated genetic assumptions of Illumina genotyping data, processing and quality control (QC) of these data remain a challenge. Thorough QC ensures the accurate identification of single-nucleotide polymorphisms and is required for the correct interpretation of genetic association results. By processing genotyping data on >100 000 subjects from >10 major Illumina genotyping arrays, we have accumulated extensive experience in handling some of the most peculiar scenarios related to the processing and QC of Illumina genotyping data. Here, we describe strategies for processing Illumina genotyping data from the raw data to an analysis-ready format, and we elaborate on the QC procedures required at each processing step. High-quality Illumina genotyping data sets can be obtained by following our detailed QC strategies.


Subject(s)
Genotyping Techniques/methods , Genotyping Techniques/standards , Polymorphism, Single Nucleotide , Algorithms , Cluster Analysis , Computational Biology/methods , Female , Gene Frequency , Genome-Wide Association Study/methods , Genome-Wide Association Study/statistics & numerical data , Genotype , Genotyping Techniques/statistics & numerical data , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Male , Models, Genetic , Oligonucleotide Array Sequence Analysis , Quality Control , Racial Groups/genetics , Software
9.
Brief Bioinform ; 19(6): 1247-1255, 2018 11 27.
Article in English | MEDLINE | ID: mdl-28605403

ABSTRACT

Power/sample size (power) analysis estimates the likelihood of successfully finding statistical significance in a data set. There has been a growing recognition of the importance of power analysis in the proper design of experiments. Power analysis is complex, yet necessary for the success of large studies. It is important to design a study that produces statistically accurate and reliable results. Power computation methods have been well established for both microarray-based gene expression studies and genotyping microarray-based genome-wide association studies. High-throughput sequencing (HTS) has greatly enhanced our ability to conduct biomedical studies at the highest possible resolution (per nucleotide). However, the complexity of power computations is much greater for sequencing data than for the simpler genotyping array data. Research on methods of power computation for HTS-based studies has recently been conducted but is not yet well known or widely used. In this article, we describe the power computation methods that are currently available for a range of HTS-based studies, including DNA sequencing, RNA-sequencing, microbiome sequencing and chromatin immunoprecipitation sequencing. Most importantly, we review the methods of power analysis for several types of sequencing data and guide the reader to the relevant methods for each data type.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Chromatin Immunoprecipitation , Genome-Wide Association Study , Heterozygote , Humans , Microbiota , Mutation , Poisson Distribution , Sequence Analysis, RNA/methods
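
Illustrative note: to make the idea of a power computation concrete, the sketch below uses the textbook normal approximation for a two-sided, two-sample comparison of means. This is a generic calculation, not one of the HTS-specific methods reviewed in the article above, which typically rely on count-based models. The function names and default values are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (difference / common SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_for_power(effect_size, target=0.8, alpha=0.05):
    """Smallest per-group sample size reaching the target power (simple search)."""
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    n = 2
    while power_two_sample(effect_size, n, alpha) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(round(power_two_sample(0.5, 64), 3))
    print(n_for_power(0.5))
```

For sequencing studies the same logic applies, but the test statistic, dispersion, and multiple-testing correction change the calculation substantially.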
10.
Genomics ; 111(4): 950-957, 2019 07.
Article in English | MEDLINE | ID: mdl-29902512

ABSTRACT

Genotyping arrays characterize genome-wide SNPs for a study cohort and were the primary technology behind genome-wide association studies over the last decade. The Cancer Genome Atlas (TCGA) is one of the largest cancer consortium studies, and it collected genotyping data for all of its participants. Using TCGA SNP data genotyped using the Affymetrix 6.0 SNP array from 12,064 samples, we conducted comprehensive comparisons across DNA sources (tumor tissue, normal tissue, and blood) and sample storage protocols (formalin-fixed paraffin-embedded (FFPE) vs. freshly frozen (FF)), examining genotypes, transition/transversion ratios, and mutation catalogues. During the analysis, we made important observations relevant to data quality issues. SNP concordance was excellent between blood and normal tissues, and slightly lower between blood and tumor tissue due to potential somatic mutations in the tumors. The observed poor SNP concordance between FFPE and FF samples suggested a batch effect. The transition/transversion ratio, a metric commonly used for quality control purposes in exome sequencing projects, appeared less applicable for genotyping array data due to the whole-genome coverage built into the array design. Moreover, there were substantially more loss of heterozygosity events than gain of heterozygosity events when comparing tumors relative to normal tissues and blood. This might be a consequence of extensive copy number deletions in tumors. In summary, our thorough evaluation calls for more adequate quality control practices and provides guidelines for improved application of TCGA genotyping data.


Subject(s)
Genotyping Techniques/methods , Neoplasms/genetics , Tissue Array Analysis/methods , Databases, Genetic/standards , Genotyping Techniques/standards , Humans , Polymorphism, Single Nucleotide , Tissue Array Analysis/standards
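
Illustrative note: the transition/transversion ratio mentioned in the abstract above counts purine-to-purine or pyrimidine-to-pyrimidine substitutions (transitions) against purine-pyrimidine substitutions (transversions). The sketch below shows that definition on a list of (ref, alt) pairs. The function names and input layout are assumptions, not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """True when both alleles are purines or both are pyrimidines."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Ts/Tv over an iterable of (ref, alt) single-base pairs.

    Pairs that are identical or contain a non-ACGT symbol are ignored.
    Returns None when there are no transversions.
    """
    ts = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt or ref not in BASES or alt not in BASES:
            continue
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else None


if __name__ == "__main__":
    calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
    print(ts_tv_ratio(calls))
```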
12.
BMC Genomics ; 20(1): 167, 2019 Mar 04.
Article in English | MEDLINE | ID: mdl-30832569

ABSTRACT

BACKGROUND: Deep learning has achieved tremendous success in numerous artificial intelligence applications and is unsurprisingly penetrating into various biomedical domains. High-throughput omics data in the form of molecular profile matrices, such as transcriptomes and metabolomes, have long existed as a valuable resource for facilitating diagnosis of patient statuses/stages. It is timely and imperative to compare deep learning neural networks against classical machine learning methods in the setting of matrix-formed omics data in terms of classification accuracy and robustness. RESULTS: Using 37 high-throughput omics datasets, covering transcriptomes and metabolomes, we evaluated the classification power of deep learning compared to traditional machine learning methods. Representative deep learning methods, Multi-Layer Perceptrons (MLP) and Convolutional Neural Networks (CNN), were deployed and explored in seeking optimal architectures for the best classification performance. Together with five classical supervised classification methods (Linear Discriminant Analysis, Multinomial Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machine), MLP and CNN were comparatively tested on the 37 datasets to predict disease stages or to discriminate diseased samples from normal samples. MLPs achieved the highest overall accuracy among all methods tested. More thorough analyses revealed that single hidden layer MLPs with ample hidden units outperformed deeper MLPs. Furthermore, MLP was one of the most robust methods against imbalanced class composition and inaccurate class labels. CONCLUSION: Our results indicate that shallow MLPs (of one or two hidden layers) with ample hidden neurons are sufficient to achieve superior and robust classification performance in exploiting numerical matrix-formed omics data for diagnostic purposes. Specific observations regarding optimal network width, class imbalance tolerance, and inaccurate labeling tolerance will inform future improvement of neural network applications on functional genomics data.


Subject(s)
Deep Learning/trends , Gene Expression Profiling/statistics & numerical data , Machine Learning/trends , Neural Networks, Computer , Algorithms , Artificial Intelligence/statistics & numerical data , Bayes Theorem , Deep Learning/statistics & numerical data , Gene Expression Profiling/methods , Humans , Logistic Models , Machine Learning/statistics & numerical data , Metabolome/genetics , Support Vector Machine/statistics & numerical data , Support Vector Machine/trends
13.
Crit Care Med ; 47(8): 1065-1071, 2019 08.
Article in English | MEDLINE | ID: mdl-31306254

ABSTRACT

OBJECTIVES: Studies suggest that mitochondrial dysfunction underlies some forms of sepsis-induced organ failure. We sought to test the hypothesis that variations in mitochondrial DNA haplogroup affect susceptibility to sepsis-associated delirium, a common manifestation of acute brain dysfunction during sepsis. DESIGN: Retrospective cohort study. SETTING: Medical and surgical ICUs at a large tertiary care center. PATIENTS: Caucasian and African American adults with sepsis. MEASUREMENTS AND MAIN RESULTS: We determined each patient's mitochondrial DNA haplogroup using single-nucleotide polymorphism genotyping data in a DNA databank and extracted outcomes from linked electronic medical records. We then used zero-inflated negative binomial regression to analyze age-adjusted associations between mitochondrial DNA haplogroups and duration of delirium, identified using the Confusion Assessment Method for the ICU. Eight hundred ten patients accounted for 958 sepsis admissions, with 802 (84%) by Caucasians and 156 (16%) by African Americans. In total, 795 patient admissions (83%) involved one or more days of delirium. The 7% of Caucasians belonging to mitochondrial DNA haplogroup clade IWX experienced more delirium than the 49% in haplogroup H, the most common Caucasian haplogroup (age-adjusted rate ratio for delirium 1.36; 95% CI, 1.13-1.64; p = 0.001). Alternatively, among African Americans the 24% in haplogroup L2 experienced less delirium than those in haplogroup L3, the most common African haplogroup (adjusted rate ratio for delirium 0.60; 95% CI, 0.38-0.94; p = 0.03). CONCLUSIONS: Variations in mitochondrial DNA are associated with development of and protection from delirium in Caucasians and African Americans during sepsis. Future studies are now required to determine whether mitochondrial DNA and mitochondrial dysfunction contribute to the pathogenesis of delirium during sepsis so that targeted treatments can be developed.


Subject(s)
Black or African American/genetics , DNA, Mitochondrial/genetics , Haplotypes/genetics , Sepsis-Associated Encephalopathy/genetics , White People/genetics , Adult , Critical Illness , Female , Humans , Male , Middle Aged , Polymerase Chain Reaction , Retrospective Studies , Sequence Analysis, DNA
14.
PLoS Pathog ; 13(2): e1006220, 2017 02.
Article in English | MEDLINE | ID: mdl-28241052

ABSTRACT

Ethnic groups can display differential genetic susceptibility to infectious diseases. The arthropod-borne viral dengue disease is one such disease, with empirical and limited genetic evidence showing that African ancestry may be protective against the haemorrhagic phenotype. Global ancestry analysis based on high-throughput genotyping in admixed populations can be used to test this hypothesis, while admixture mapping can map candidate protective genes. A Cuban dengue fever cohort was genotyped using a 2.5 million SNP chip. Global ancestry was ascertained through ADMIXTURE and used in a fine-matched corrected association study, while local ancestry was inferred by the RFMix algorithm. The expression of candidate genes was evaluated by RT-PCR in a Cuban dengue patient cohort and gene set enrichment analysis was performed in a Thai dengue transcriptome. OSBPL10 and RXRA candidate genes were identified, with most significant SNPs placed in inferred weak enhancers, promoters and lncRNAs. OSBPL10 had significantly lower expression in Africans than Europeans, while for RXRA several SNPs may differentially regulate its transcription between Africans and Europeans. Their expression was confirmed to change through dengue disease progression in Cuban patients and to vary with disease severity in a Thai transcriptome dataset. These genes interact in the LXR/RXR activation pathway that integrates lipid metabolism and immune functions, being a key player in dengue virus entrance into cells, its replication therein and in cytokine production. Knockdown of OSBPL10 expression in THP-1 cells by two shRNAs followed by DENV2 infection tests led to a significant reduction in DENV replication, being a direct functional proof that the lower OSBPL10 expression profile in Africans protects this ancestry against dengue disease.


Subject(s)
Lipid Metabolism/genetics , Receptors, Steroid/genetics , Retinoid X Receptor alpha/genetics , Severe Dengue/genetics , Black People/genetics , Cuba/ethnology , Genetic Predisposition to Disease , Genome-Wide Association Study , Genotype , Humans , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Severe Dengue/ethnology
15.
Clin Infect Dis ; 67(5): 778-784, 2018 08 16.
Article in English | MEDLINE | ID: mdl-29481608

ABSTRACT

Background: Age-related gait speed decline is accelerated in men with human immunodeficiency virus (HIV). Mitochondrial genetic variation is associated with frailty and mortality in the general population and may provide insight into mechanisms of functional decline in people aging with HIV. Methods: Gait speed was assessed semiannually in the Multicenter AIDS Cohort Study. Mitochondrial DNA (mtDNA) haplogroups were extracted from genome-wide genotyping data, classifying men aged ≥50 years into 5 groups: mtDNA haplogroup H, J, T, Uk, and other. Differences in gait speed by haplogroups were assessed as rate of gait speed decline per year, probability of slow gait speed (<1.0 m/s), and hazard of slow gait using multivariable linear mixed-effects models, mixed-effects logistic regression models, and the Andersen-Gill model, controlling for hepatitis C virus infection, previous AIDS diagnosis, thymidine analogue exposure, education, body composition, smoking, and peripheral neuropathy. Age was further controlled for in the mixed-effects logistic regression models. Results: A total of 455 HIV-positive white men aged ≥50 years contributed 3283 person-years of follow-up. Among them, 70% had achieved HIV viral suppression. In fully adjusted models, individuals with haplogroup J had more rapid decline in gait speed (adjusted slopes, 0.018 m/s/year vs 0.011 m/s/year, P for interaction = 0.012) and increased risk of developing slow gait (adjusted odds ratio, 2.97; 95% confidence interval, 1.24-7.08) compared to those with other haplogroups. Conclusions: Among older, HIV-infected men, mtDNA haplogroup J was an independent risk factor for more rapid age-related gait speed decline.


Subject(s)
Aging , DNA, Mitochondrial/genetics , Genetic Variation , HIV Infections/complications , Walking Speed , Age Factors , Aging/genetics , Body Composition , Cohort Studies , Haplotypes , Humans , Logistic Models , Male , Middle Aged , Odds Ratio , Risk Factors , Sexual and Gender Minorities
16.
Hum Mol Genet ; 25(5): 1031-41, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26740552

ABSTRACT

With a combined carrier frequency of 1:200, heteroplasmic mitochondrial DNA (mtDNA) mutations cause human disease in ∼1:5000 of the population. Rapid shifts in the level of heteroplasmy seen within a single generation contribute to the wide range in the severity of clinical phenotypes seen in families transmitting mtDNA disease, consistent with a genetic bottleneck during transmission. Although preliminary evidence from human pedigrees points towards a random drift process underlying the shifting heteroplasmy, some reports describe differences in segregation pattern between different mtDNA mutations. However, based on limited observations and with no direct comparisons, it is not clear whether these observations simply reflect pedigree ascertainment and publication bias. To address this issue, we studied 577 mother-child pairs transmitting the m.11778G>A, m.3460G>A, m.8344A>G, m.8993T>G/C and m.3243A>G mtDNA mutations. Our analysis controlled for inter-assay differences, inter-laboratory variation and ascertainment bias. We found no evidence of selection during transmission but show that different mtDNA mutations segregate at different rates in human pedigrees. m.8993T>G/C segregated significantly faster than m.11778G>A, m.8344A>G and m.3243A>G, consistent with a tighter mtDNA genetic bottleneck in m.8993T>G/C pedigrees. Our observations support the existence of different genetic bottlenecks primarily determined by the underlying mtDNA mutation, explaining the different inheritance patterns observed in human pedigrees transmitting pathogenic mtDNA mutations.


Subject(s)
DNA, Mitochondrial/genetics , Inheritance Patterns , Mitochondrial Diseases/genetics , Models, Genetic , Point Mutation , Bayes Theorem , Child , Female , Humans , Mitochondrial Diseases/pathology , Pedigree , Phenotype , Polymorphism, Restriction Fragment Length , Publication Bias
17.
Brief Bioinform ; 17(2): 224-32, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26249222

ABSTRACT

The rapid progress in high-throughput sequencing has significantly enriched our capacity for studying the mitochondrial DNA (mtDNA). In addition to performing specific mitochondrial targeted sequencing, an increasingly popular alternative approach is using the off-target reads from exome sequencing to infer mtDNA variants, including single nucleotide polymorphisms (SNPs) and heteroplasmy. However, the effectiveness and practicality of this approach have not been tested. Recently, RNAseq data have also been suggested as a good source for alternative data mining, but whether mitochondrial variants can be detected from RNAseq data has not been validated. We designed a study to evaluate the practicability of mtDNA variant detection using exome and RNA sequencing data. Five breast cancer cell lines were sequenced through mitochondrial targeted, exome, and RNA sequencing. Mitochondrial targeted sequencing was used as the gold standard to compute the validation and false discovery rates of SNP and heteroplasmy detection in exome and RNAseq data. We found that exome and RNA sequencing can accurately detect mitochondrial SNPs. However, the lower false discovery rate makes exome sequencing a better choice for heteroplasmy detection than RNAseq. Furthermore, we examined three alignment strategies and found that aligning reads directly to the mitochondrial reference genome or aligning reads to the nuclear and mitochondrial reference genomes simultaneously produced the best results, and that aligning to the nuclear genome first and afterwards to the mitochondrial genome performed poorly. In conclusion, our study provides important guidelines for future studies that intend to use either exome sequencing or RNAseq data to infer mitochondrial SNPs and heteroplasmy.


Subject(s)
Breast Neoplasms/genetics , DNA, Mitochondrial/genetics , Exome/genetics , Polymorphism, Single Nucleotide/genetics , RNA, Neoplasm/genetics , Sequence Analysis, RNA/methods , Algorithms , Base Sequence , Cell Line, Tumor , Genetic Variation/genetics , Humans , Molecular Sequence Data , Neoplastic Cells, Circulating , Reproducibility of Results , Sensitivity and Specificity
18.
Bioinformatics ; 33(15): 2399-2401, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28402386

ABSTRACT

SUMMARY: After the introduction of high-throughput sequencing, genotyping arrays continue to be a viable source for conducting large-scale genetic studies. Currently, Illumina is one of the largest genotyping array manufacturers. One technical issue that has always plagued the post-processing of Illumina genotyping array data is the strand definition. Against convention, Illumina uses its own definition of strand, which is inconsistent with the standard reference forward and reverse definition. This issue has been a major obstacle in the consistency of reporting, meta-analysis and correct interpretation of phenotype association results. To date, the strand issue has not been adequately addressed, prompting us to develop StrandScript, a tool that can convert all genotyping data generated from Illumina genotyping arrays to the reference forward strand. StrandScript works independently of the Illumina array version and is future-proof for newer Illumina array designs. Furthermore, StrandScript can examine an Illumina genotyping array manifest file and can detect all problematic SNPs, including SNPs with a wrong RS ID and SNPs with mismatched probe sequences. Here, we introduce StrandScript's design and development, and demonstrate its effectiveness using real genotyping data. AVAILABILITY AND IMPLEMENTATION: https://github.com/seasky002002/Strandscript. CONTACT: yan.guo.1@vanderbilt.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Software , Humans , Sequence Analysis, DNA/methods
19.
Hum Reprod ; 33(7): 1331-1341, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29850888

ABSTRACT

STUDY QUESTION: Does germline selection (besides random genetic drift) play a role during the transmission of heteroplasmic pathogenic mitochondrial DNA (mtDNA) mutations in humans? SUMMARY ANSWER: We conclude that inheritance of mtDNA is mutation-specific and governed by a combination of random genetic drift and negative and/or positive selection. WHAT IS KNOWN ALREADY: mtDNA inherits maternally through a genetic bottleneck, but the underlying mechanisms are largely unknown. Although random genetic drift is recognized as an important mechanism, selection mechanisms are thought to play a role as well. STUDY DESIGN, SIZE, DURATION: We determined the mtDNA mutation loads in 160 available oocytes, zygotes, and blastomeres of five carriers of the m.3243A>G mutation, one carrier of the m.8993T>G mutation, and one carrier of the m.14487T>C mutation. PARTICIPANTS/MATERIALS, SETTING, METHODS: Mutation loads were determined in PGD samples using PCR assays and analysed mathematically to test for random sampling effects. In addition, a meta-analysis has been performed on mutation load transmission data in the literature to confirm the results of the PGD samples. MAIN RESULTS AND THE ROLE OF CHANCE: By applying the Kimura distribution, which assumes random mechanisms, we found that mtDNA segregation patterns could be explained by variable bottleneck sizes among all our carriers (moment estimates ranging from 10 to 145). Marked differences in the bottleneck size would determine the probability that a carrier produces offspring with mutations markedly different than her own. We investigated whether bottleneck sizes might also be influenced by non-random mechanisms. We noted a consistent absence of high mutation loads in all our m.3243A>G carriers, indicating non-random events. To test this, we fitted a standard and a truncated Kimura distribution to the m.3243A>G segregation data. 
A Kimura distribution truncated at 76.5% heteroplasmy has a significantly better fit (P-value = 0.005) than the standard Kimura distribution. For the m.8993T>G mutation, we suspect a skewed mutation load distribution in the offspring. To test this hypothesis, we performed a meta-analysis on published blood mutation levels of offspring-mother (O-M) transmission for the m.3243A>G and m.8993T>G mutations. This analysis revealed some evidence that the O-M ratios for the m.8993T>G mutation are different from zero (P-value <0.001), while for the m.3243A>G mutation there was little evidence that the O-M ratios are non-zero. Lastly, for the m.14487T>C mutation, where the whole range of mutation loads was represented, we found no indications for selective events during its transmission. LARGE SCALE DATA: All data are included in the Results section of this article. LIMITATIONS, REASON FOR CAUTION: The availability of human material for the mutations is scarce, requiring additional samples to confirm our findings. WIDER IMPLICATIONS OF THE FINDINGS: Our data show that non-random mechanisms are involved during mtDNA segregation. We aimed to provide the mechanisms underlying these selection events. One explanation for selection against high m.3243A>G mutation loads could be, as previously reported, a pronounced oxidative phosphorylation (OXPHOS) deficiency at high mutation loads, which prohibits oogenesis (e.g. progression through meiosis). No maximum mutation loads of the m.8993T>G mutation seem to exist, as the OXPHOS deficiency is less severe, even at levels close to 100%. In contrast, high mutation loads seem to be favoured, probably because they lead to an increased mitochondrial membrane potential (MMP), a hallmark on which healthy mitochondria are being selected. This hypothesis could provide a possible explanation for the skewed segregation pattern observed. 
Our findings are corroborated by the segregation pattern of the m.14487T>C mutation, which does not affect OXPHOS and MMP significantly, and its transmission is therefore predominantly determined by random genetic drift. Our conclusion is that mutation-specific selection mechanisms occur during mtDNA inheritance, which has implications for PGD and mitochondrial replacement therapy. STUDY FUNDING/COMPETING INTEREST(S): This work has been funded by GROW-School of Oncology and Developmental Biology. The authors declare no competing interests.
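The "moment estimates" of bottleneck size mentioned in the abstract above can be illustrated with a short Python sketch. It is a method-of-moments calculation under a pure random-drift assumption and is not the authors' fitting procedure: it does not fit a standard or truncated Kimura distribution. The function name `drift_moment_estimates` is hypothetical. The sketch assumes the commonly used relation Var(p) = p0(1 - p0)(1 - b) for the Kimura parameter b, and, as a further simplification, a single binomial sampling step so that the bottleneck size is approximated by N = p0(1 - p0) / Var(p).

```python
from statistics import mean, pvariance


def drift_moment_estimates(loads):
    """Moment estimates from mutation loads given as fractions in [0, 1].

    Assumes pure random drift (no selection). Returns (p0, b, n), where
    p0 is the mean load, b is the Kimura-style parameter obtained from
    Var = p0 * (1 - p0) * (1 - b), and n is the bottleneck size under a
    single binomial sampling step, n = p0 * (1 - p0) / Var.
    """
    p0 = mean(loads)
    var = pvariance(loads)      # population variance of the observed loads
    max_var = p0 * (1 - p0)     # variance if every cell were fixed at 0 or 1
    if max_var == 0 or var == 0:
        raise ValueError("loads must vary and must not all be 0 or all be 1")
    b = 1 - var / max_var
    n = max_var / var
    return p0, b, n


# Example call with made-up loads from one hypothetical carrier.
p0, b, n = drift_moment_estimates([0.2, 0.4, 0.6, 0.8])
```

A small estimated n corresponds to a tight bottleneck and therefore a wide spread of mutation loads among the oocytes or embryos of one carrier. Such an estimate only describes the observed variance; it cannot by itself distinguish drift from the selection effects the abstract reports.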


Subject(s)
Blastomeres/metabolism , DNA, Mitochondrial/genetics , Germ-Line Mutation , Oocytes/metabolism , Adult , DNA, Mitochondrial/metabolism , Female , Germ Cells/metabolism , Humans , Male , Oxidative Phosphorylation
20.
Genomics ; 2017 Sep 29.
Article in English | MEDLINE | ID: mdl-28970049

ABSTRACT

The human mitochondrial genome has been extensively studied for its function and disease associations. Utilizing five types of high-throughput sequencing data on ten breast cancer patients (total N=50), we examined several aspects of the mitochondrial genome that have not been thoroughly studied, including the occurrence of tri-allelic heteroplasmy, the difference between DNA and RNA, and the variants' association with polynucleotide tracts. We validated four previously reported tri-allelic positions and identified 23 additional ones. Furthermore, we detected 18 single nucleotide and seven InDel differences between DNA and RNA. Previous studies have suggested that some of these differences are caused by post-transcriptional methylation. The rest can be attributed to RNA editing, polyadenylation or sequencing errors. Most importantly, we found that the tri-allelic positions, and differences between DNA and RNA, are strongly associated with polynucleotide tracts in the mitochondrial genome, suggesting DNA instability or difficulty sequencing around the polynucleotide tract regions.
