Results 1 - 20 of 37
1.
bioRxiv ; 2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38712066

ABSTRACT

The evolution of gene expression responses is a critical component of adaptation to variable environments. Predicting how DNA sequence influences expression is challenging because the genotype-to-phenotype map is not well resolved for cis-regulatory elements, transcription factor binding, regulatory interactions, and epigenetic features, not to mention how these factors respond to environment. We tested whether flexible machine learning models could learn some of the underlying cis-regulatory genotype-to-phenotype map, using cold-responsive transcriptome profiles in 5 diverse Arabidopsis thaliana accessions. We first tested for evidence that cis-regulation plays a role in environmental response, finding 14 and 15 motifs that were significantly enriched within the up- and down-stream regions of cold-responsive differentially regulated genes (DEGs). We next applied convolutional neural networks (CNNs), which learn de novo cis-regulatory motifs in DNA sequences, to predict expression response to environment. We found that CNNs predicted differential expression with moderate accuracy, with evidence that predictions were hindered by the biological complexity of regulation and the large potential regulatory code. Overall, DEGs between specific environments can be predicted based on variation in cis-regulatory sequences, although more information needs to be incorporated and better models may be required.
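As a rough illustration of what "learning motifs with a CNN" operates on, the sketch below one-hot encodes a DNA string and slides a single convolution-style filter along it. It is not the study's model: the motif string, the test sequence, and the function names `one_hot` and `scan` are assumptions chosen only for illustration, and a real CNN would learn its filter weights from data instead of copying them from a known motif.

```python
# Illustrative sketch only: one-hot encoding plus a single convolution-style
# motif scan, the basic operation a CNN's first layer applies to DNA sequence.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def scan(seq, kernel):
    """Slide a (width x 4) kernel along the sequence; return one score per window."""
    encoded = one_hot(seq)
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        scores.append(sum(w * k
                          for row_w, row_k in zip(window, kernel)
                          for w, k in zip(row_w, row_k)))
    return scores

# Hypothetical filter: weights copied from an arbitrary 5-bp motif string.
MOTIF = "CCGAC"
kernel = one_hot(MOTIF)

# Made-up sequence containing the motif once.
scores = scan("TTACCGACATGG", kernel)
best = max(range(len(scores)), key=scores.__getitem__)  # window with the top score
```

A window that matches the filter exactly scores the full motif width, which is why `best` points at the start of the embedded motif.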

2.
Hum Reprod ; 2024 May 30.
Article in English | MEDLINE | ID: mdl-38815977

ABSTRACT

STUDY QUESTION: Can a genome-wide association study (GWAS) meta-analysis, including a large sample of young premenopausal women from a founder population from Northern Finland, identify novel genetic variants for circulating anti-Müllerian hormone (AMH) levels and provide insights into single-nucleotide polymorphism enrichment in different biological pathways and tissues involved in AMH regulation? SUMMARY ANSWER: The meta-analysis identified a total of six loci associated with AMH levels at P < 5 × 10^-8, three of which were novel in or near CHEK2, BMP4, and EIF4EBP1, as well as highlighted significant enrichment in renal system vasculature morphogenesis, and the pituitary gland as the top associated tissue in tissue enrichment analysis. WHAT IS KNOWN ALREADY: AMH is expressed by preantral and small antral stage ovarian follicles in women, and variation in age-specific circulating AMH levels has been associated with several health conditions. However, the biological mechanisms underlying the association between health conditions and AMH levels are not yet fully understood. Previous GWAS have identified loci associated with AMH levels in pre-menopausal women, in or near MCM8, AMH, TEX41, and CDCA7. STUDY DESIGN, SIZE, DURATION: We performed a GWAS meta-analysis for circulating AMH level measurements in 9668 pre-menopausal women. PARTICIPANTS/MATERIALS, SETTING, METHODS: We performed a GWAS meta-analysis in which we combined 2619 AMH measurements (at age 31 years) from a prospective founder population cohort (Northern Finland Birth Cohort 1966, NFBC1966) with a previous GWAS meta-analysis that included 7049 pre-menopausal women (age range 15-48 years) (N = 9668). NFBC1966 AMH measurements were quantified using an automated assay.
We annotated the genetic variants, combined different data layers to prioritize potential candidate genes, described significant pathways and tissues enriched by the GWAS signals, identified plausible regulatory roles using colocalization analysis, and leveraged publicly available summary statistics to assess genetic and phenotypic correlations with multiple traits. MAIN RESULTS AND THE ROLE OF CHANCE: Three novel genome-wide significant loci were identified. One of these is in complete linkage disequilibrium with c.1100delC in CHEK2, which is found to be 4-fold enriched in the Finnish population compared to other European populations. We propose a plausible regulatory effect of some of the GWAS variants linked to AMH, as they colocalize with GWAS signals associated with gene expression levels of BMP4, TEX41, and EIF4EBP1. Gene set analysis highlighted significant enrichment in renal system vasculature morphogenesis, and tissue enrichment analysis ranked the pituitary gland as the top association. LARGE SCALE DATA: The GWAS meta-analysis summary statistics are available for download from the GWAS Catalogue with accession number GCST90428625. LIMITATIONS, REASONS FOR CAUTION: This study only included women of European ancestry and the lack of sufficiently sized relevant tissue data in gene expression datasets hinders the assessment of potential regulatory effects in reproductive tissues. WIDER IMPLICATIONS OF THE FINDINGS: Our results highlight the increased power of founder populations and larger sample sizes to boost the discovery of novel trait-associated variants underlying variation in AMH levels, which aided the characterization of GWAS signal enrichment in different biological pathways and plausible genetic regulatory effects linked with AMH level variation for the first time.
STUDY FUNDING/COMPETING INTEREST(S): This work has received funding from the European Union's Horizon 2020 Research and Innovation Programme under the MATER Marie Sklodowska-Curie Grant Agreement No. 813707 and Oulu University Scholarship Foundation and Paulon Säätiö Foundation (N.P.-G.), Academy of Finland, Sigrid Jusélius Foundation, Novo Nordisk, University of Oulu, Roche Diagnostics (T.T.P.). This work was supported by the Estonian Research Council Grant 1911 (R.M.). J.R. was supported by the European Union's Horizon 2020 Research and Innovation Program under Grant Agreements No. 874739 (LongITools), 824989 (EUCAN-Connect), 848158 (EarlyCause), and 733206 (LifeCycle). U.V. was supported by the Estonian Research Council grant PRG (PRG1291). The NFBC1966 received financial support from University of Oulu Grant No. 24000692, Oulu University Hospital Grant No. 24301140, and ERDF European Regional Development Fund Grant No. 539/2010 A31592. T.T.P. has received grants from Roche, Perkin Elmer, and honoraria for scientific presentations from Gedeon Richter, Exeltis, Astellas, Roche, Stragen, Astra Zeneca, Merck, MSD, Ferring, Duodecim, and Ajaton Terveys. For all other authors, there are no competing interests.
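The abstract above combines a new cohort with an earlier meta-analysis and applies the P < 5 × 10^-8 threshold. It does not say which combination method was used, so the sketch below shows one common choice, an inverse-variance-weighted fixed-effect model, purely as an assumption. The effect sizes, standard errors, and the function name `fixed_effect_meta` are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect combination of per-cohort estimates."""
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under a normal approximation
    return beta, se, z, p

# Hypothetical per-cohort effect sizes and standard errors for a single variant.
beta, se, z, p = fixed_effect_meta(betas=[0.10, 0.14], ses=[0.04, 0.02])

# Conventional genome-wide significance threshold quoted in the abstract.
genome_wide_significant = p < 5e-8
```

The more precise cohort (smaller standard error) dominates the pooled estimate, which is the usual argument for adding well-powered cohorts to an existing meta-analysis.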

3.
G3 (Bethesda) ; 14(6)2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38656424

ABSTRACT

Identifying genuine polymorphic variants is a significant challenge in sequence data analysis, although detecting low-frequency variants in sequence data is essential for estimating demographic parameters and investigating genetic processes, such as selection, within populations. Arbuscular mycorrhizal (AM) fungi are multinucleate organisms, in which individual nuclei collectively operate as a population, and the extent of genetic variation across nuclei has long been an area of scientific interest. In this study, we investigated the patterns of polymorphism discovery and the alternate allele frequency distribution by comparing polymorphism discovery in 2 distinct genomic sequence datasets of the AM fungus model species, Rhizophagus irregularis strain DAOM197198. The 2 datasets used in this study are publicly available and were generated either from pooled spores and hyphae or amplified single nuclei from a single spore. We also estimated the intraorganismal variation within the DAOM197198 strain. Our results showed that the 2 datasets exhibited different frequency patterns for discovered variants. The whole-organism dataset showed a distribution spanning low-, intermediate-, and high-frequency variants, whereas the single-nucleus dataset predominantly featured low-frequency variants with smaller proportions in intermediate and high frequencies. Furthermore, single nucleotide polymorphism density estimates within both the whole organism and individual nuclei confirmed the low intraorganismal variation of the DAOM197198 strain and that most variants are rare. Our study highlights the methodological challenges associated with detecting low-frequency variants in AM fungal whole-genome sequence data and demonstrates that alternate alleles can be reliably identified in single nuclei of AM fungi.


Subject(s)
Glomeromycota , Mycorrhizae , Mycorrhizae/genetics , Glomeromycota/genetics , Fungal Genome , Single Nucleotide Polymorphism , Gene Frequency , Genetic Variation , Cell Nucleus/genetics , Fungi
4.
Mol Syst Biol ; 20(4): 362-373, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38355920

ABSTRACT

Unraveling the genetic sources of gene expression variation is essential to better understand the origins of phenotypic diversity in natural populations. Genome-wide association studies have identified thousands of variants involved in gene expression variation; however, the variants detected explain only part of the heritability. In fact, variants such as low-frequency and structural variants (SVs) are poorly captured in association studies. To assess the impact of these variants on gene expression variation, we explored a half-diallel panel composed of 323 hybrids originating from pairwise crosses of 26 natural Saccharomyces cerevisiae isolates. Using short- and long-read sequencing strategies, we established an exhaustive catalog of single nucleotide polymorphisms (SNPs) and SVs for this panel. Combining this dataset with the transcriptomes of all hybrids, we comprehensively mapped SNPs and SVs associated with gene expression variation. While SVs impact gene expression variation, SNPs exhibit a higher effect size with an overrepresentation of low-frequency variants compared to common ones. These results reinforce the importance of dissecting the heritability of complex traits with a comprehensive catalog of genetic variants at the population level.
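To make the half-diallel design concrete, the sketch below enumerates every unordered pair of 26 parents. The isolate labels and the function name `half_diallel` are placeholders, since the abstract does not list strain names. A complete half-diallel of 26 parents yields 325 crosses, so the 323 hybrids reported would be consistent with two crosses being absent; that reading is an inference from the numbers, not something the abstract states.

```python
from itertools import combinations

def half_diallel(parents):
    """All unordered pairs of distinct parents (no reciprocal or selfed crosses)."""
    return list(combinations(parents, 2))

# Placeholder isolate labels; the real strain names are not given in the abstract.
isolates = [f"isolate_{i:02d}" for i in range(1, 27)]

crosses = half_diallel(isolates)
n_possible = len(crosses)       # 26 * 25 / 2
n_missing = n_possible - 323    # crosses not among the 323 reported hybrids
```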


Subject(s)
Genome-Wide Association Study , Saccharomyces cerevisiae , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Gene Expression , Single Nucleotide Polymorphism/genetics , Genetic Variation
5.
Int J Mol Sci ; 24(18)2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37762110

ABSTRACT

Whole-exome sequencing (WES) in families with an unexplained tendency for venous thromboembolism (VTE) may favor detection of low-frequency variants in genes with known contribution to hemostasis or associated with VTE-related phenotypes. WES analysis in six family members, three of whom were affected by documented VTE, filtered for MAF < 0.04 in 192 candidate genes, revealed 22 heterozygous (16 missense and six synonymous) variants in patients. Functional prediction by multi-component bioinformatics tools, implemented by a database/literature search, including ClinVar annotation and QTL analysis, prioritized 12 missense variants, three of which (CRP Leu61Pro, F2 Asn514Lys and NQO1 Arg139Trp) were present in all patients, and the frequent functional variants FGB Arg478Lys and IL1A Ala114Ser. Combinations of prioritized variants in each patient were used to infer functional protein interactions. Different interaction patterns, supported by high-quality evidence, included eight proteins intertwined in the "acute phase" (CRP, F2, SERPINA1 and IL1A) and/or in the "fibrinogen complex" (CRP, F2, PLAT, THBS1, VWF and FGB) significantly enriched terms. In a wide group of candidate genes, this approach highlighted six low-frequency variants (CRP Leu61Pro, F2 Asn514Lys, SERPINA1 Arg63Cys, THBS1 Asp901Glu, VWF Arg1399His and PLAT Arg164Trp), five of which were top ranked for predicted deleteriousness, which in different combinations may contribute to disease susceptibility in members of this family.
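The first filtering step described above (MAF < 0.04 within a candidate gene list) can be sketched as follows. Only the 0.04 cutoff comes from the abstract; the gene labels, variant records, field names, and the function name `filter_variants` are hypothetical stand-ins, and the study's actual pipeline is not described at this level of detail.

```python
MAF_CUTOFF = 0.04  # threshold stated in the abstract

def filter_variants(variants, candidate_genes, maf_cutoff=MAF_CUTOFF):
    """Keep variants in candidate genes whose minor allele frequency is below the cutoff."""
    return [v for v in variants
            if v["gene"] in candidate_genes and v["maf"] < maf_cutoff]

# Entirely made-up genes and variants, for illustration only.
candidate_genes = {"GENE_A", "GENE_B"}
variants = [
    {"gene": "GENE_A", "change": "missense_1", "maf": 0.001},
    {"gene": "GENE_B", "change": "missense_2", "maf": 0.020},
    {"gene": "GENE_B", "change": "common_1", "maf": 0.300},    # too frequent
    {"gene": "GENE_C", "change": "missense_3", "maf": 0.001},  # not a candidate gene
]

kept = filter_variants(variants, candidate_genes)
```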


Subject(s)
Venous Thromboembolism , Humans , Venous Thromboembolism/genetics , Exome Sequencing , von Willebrand Factor/genetics , Regulator Genes , Computational Biology
6.
Infect Dis Rep ; 15(4): 436-444, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37623048

ABSTRACT

Low-frequency mutations associated with drug resistance have been related to virologic failure in subjects with no history of pre-treatment and recent HIV diagnosis. In total, 78 antiretroviral treatment (ART)-naïve subjects with a recent HIV diagnosis were selected and followed up with CD4+ T-lymphocyte counts and viral load tests to detect virologic failure. We sequenced the basal samples retrospectively using next-generation sequencing (NGS), looking for low-frequency mutations that had not been detected before using the Sanger sequencing method (SSM) and describing the response to ART. Twenty-two subjects developed virologic failure (VF), and thirteen of them had at least one drug-resistance mutation associated with Reverse Transcriptase Inhibitors (RTIs) and Protease Inhibitors (PIs) at frequency levels ≤ 1%, not detected previously in their basal genotyping test. No resistance mutations to Integrase Strand Transfer Inhibitors (INSTIs) were observed. We identified a possible cause of VF in ART-naïve subjects with low-frequency mutations detected. To our knowledge, this is the first evaluation of pre-existing drug resistance for HIV-1 minority variants carried out on ART-naïve people living with HIV/AIDS (PLWHA) by analyzing the HIV-1 pol gene using NGS in the country.

7.
bioRxiv ; 2023 Jul 25.
Article in English | MEDLINE | ID: mdl-37503053

ABSTRACT

Unraveling the genetic sources of gene expression variation is essential to better understand the origins of phenotypic diversity in natural populations. Genome-wide association studies have identified thousands of variants involved in gene expression variation; however, the variants detected explain only part of the heritability. In fact, variants such as low-frequency and structural variants (SVs) are poorly captured in association studies. To assess the impact of these variants on gene expression variation, we explored a half-diallel panel composed of 323 hybrids originating from pairwise crosses of 26 natural Saccharomyces cerevisiae isolates. Using short- and long-read sequencing strategies, we established an exhaustive catalog of single nucleotide polymorphisms (SNPs) and SVs for this panel. Combining this dataset with the transcriptomes of all hybrids, we comprehensively mapped SNPs and SVs associated with gene expression variation. While SVs impact gene expression variation, SNPs exhibit a higher effect size with an overrepresentation of low-frequency variants compared to common ones. These results reinforce the importance of dissecting the heritability of complex traits with a comprehensive catalog of genetic variants at the population level.

8.
Microb Genom ; 8(9)2022 09.
Article in English | MEDLINE | ID: mdl-36169645

ABSTRACT

Influenza viruses exhibit considerable diversity between hosts. Additionally, different quasispecies can be found within the same host. High-throughput sequencing technologies can be used to sequence a patient-derived virus population at sufficient depths to identify low-frequency variants (LFV) present in a quasispecies, but many challenges remain for reliable LFV detection because of experimental errors introduced during sample preparation and sequencing. High genomic copy numbers and extensive sequencing depths are required to differentiate false positive from real LFV, especially at low allelic frequencies (AFs). This study proposes a general approach for identifying LFV in patient-derived samples obtained during routine surveillance. Firstly, validated thresholds were determined for LFV detection, whilst balancing both the cost and feasibility of reliable LFV detection in clinical samples. Using a genetically well-defined population of influenza A viruses, thresholds of at least 10^4 genomes per microlitre and AF of ≥5% were established as detection limits. Secondly, a subset of 59 retained influenza A (H3N2) samples from the 2016-2017 Belgian influenza season was composed. Thirdly, as a proof of concept for the added value of LFV for routine influenza monitoring, potential associations between patient data and whole genome sequencing data were investigated. A significant association was found between a high prevalence of LFV and disease severity. This study provides a general methodology for influenza LFV detection, which can also be adopted by other national influenza reference centres and for other viruses such as SARS-CoV-2. Additionally, this study suggests that the current relevance of LFV for routine influenza surveillance programmes might be undervalued.
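The two detection limits reported above (at least 10^4 genomes per microlitre and AF ≥ 5%) translate naturally into a two-stage filter: first on the sample, then on each site. The sketch below is one possible reading, not the authors' pipeline. The site tuples, the upper bound `max_af` that separates minority from majority variants, and the function names are assumptions.

```python
MIN_GENOMES_PER_UL = 1e4  # sample-level threshold from the abstract
MIN_AF = 0.05             # allele-frequency threshold from the abstract

def allele_frequency(alt_reads, depth):
    """Fraction of reads supporting the alternate allele."""
    return alt_reads / depth if depth else 0.0

def call_lfv(genomes_per_ul, sites, max_af=0.5):
    """Positions with MIN_AF <= AF < max_af, or nothing if the sample has too few genomes.

    max_af is an assumed upper bound for calling a variant 'low-frequency'.
    """
    if genomes_per_ul < MIN_GENOMES_PER_UL:
        return []
    return [pos for pos, alt, depth in sites
            if MIN_AF <= allele_frequency(alt, depth) < max_af]

# Made-up (position, alt_reads, depth) records.
sites = [(101, 30, 1000), (250, 80, 1000), (777, 900, 1000)]

passing = call_lfv(2e4, sites)   # enough genome copies
rejected = call_lfv(5e3, sites)  # below the copy-number threshold
```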


Subject(s)
COVID-19 , Human Influenza , Viral Genome , Humans , Influenza A Virus H3N2 Subtype/genetics , Human Influenza/epidemiology , SARS-CoV-2
9.
Cancer Cell ; 40(10): 1223-1239.e6, 2022 10 10.
Article in English | MEDLINE | ID: mdl-36113475

ABSTRACT

We present the largest whole-genome sequencing (WGS) study of non-small cell lung cancer (NSCLC) to date among 6,004 individuals of Chinese ancestry, coupled with 23,049 individuals genotyped by SNP array. We construct a high-quality haplotype reference panel for imputation and identify 20 common and low-frequency loci (minor allele frequency [MAF] ≥ 0.5%), including five loci that have never been reported before. For rare loss-of-function (LoF) variants (MAF < 0.5%), we identify BRCA2 and 18 other cancer predisposition genes that affect 5.29% of individuals with NSCLC, and 98.91% (181 of 183) of LoF variants have not been linked previously to NSCLC risk. Promoter variants of BRCA2 also have a substantial effect on NSCLC risk, and their prevalence is comparable with BRCA2 LoF variants. The associations are validated in an independent case-control study including 4,410 individuals and a prospective cohort study including 23,826 individuals. Our findings not only provide a high-quality reference panel for future array-based association studies but depict the whole picture of rare pathogenic variants for NSCLC.


Subject(s)
Non-Small-Cell Lung Carcinoma , Lung Neoplasms , Non-Small-Cell Lung Carcinoma/genetics , Case-Control Studies , China/epidemiology , Genetic Predisposition to Disease , Genome-Wide Association Study , Genotype , Humans , Lung Neoplasms/genetics , Single Nucleotide Polymorphism , Prospective Studies
10.
Adv Exp Med Biol ; 1361: 37-54, 2022.
Article in English | MEDLINE | ID: mdl-35230682

ABSTRACT

Re-sequencing of the human genome by next-generation sequencing (NGS) has been widely applied to discover pathogenic genetic variants and/or causative genes accounting for various types of diseases including cancers. The advances in NGS have allowed the sequencing of the entire genome of patients and identification of disease-associated variants in a reasonable timeframe and cost. The core of the variant identification relies on accurate variant calling and annotation. Numerous algorithms have been developed to elucidate the repertoire of somatic and germline variants. Each algorithm has its own distinct strengths, weaknesses, and limitations due to the difference in the statistical modeling approach adopted and read information utilized. Accurate variant calling remains challenging due to the presence of sequencing artifacts and read misalignments. All of these can lead to the discordance of the variant calling results and even misinterpretation of the discovery. For somatic variant detection, multiple factors including chromosomal abnormalities, tumor heterogeneity, tumor-normal cross contaminations, unbalanced tumor/normal sample coverage, and variants with low allele frequencies add even more layers of complexity to accurate variant identification. Given the discordances and difficulties, ensemble approaches have emerged by harmonizing information from different algorithms to improve variant calling performance. In this chapter, we first introduce the general scheme of variant calling algorithms and potential challenges at distinct stages. We next review the existing workflows of variant calling and annotation, and finally explore the strategies deployed by different callers as well as their strengths and caveats. Overall, NGS-based variant identification with careful consideration allows reliable detection of pathogenic variant and candidate variant selection for precision medicine.
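The ensemble idea mentioned above, harmonizing the output of several callers, is often implemented in its simplest form as a support count: keep a variant only if enough callers report it. The sketch below shows that simple voting scheme under stated assumptions. The caller names, variant tuples, and the `min_support` default are hypothetical, and published ensemble methods may use more elaborate weighting or machine-learning models.

```python
from collections import Counter

def ensemble_calls(calls_by_caller, min_support=2):
    """Keep variants reported by at least `min_support` callers."""
    support = Counter(v for calls in calls_by_caller.values() for v in set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)

# Made-up calls as (chromosome, position, ref, alt) from three hypothetical callers.
calls = {
    "caller_a": [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")],
    "caller_b": [("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")],
    "caller_c": [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")],
}

consensus = ensemble_calls(calls)                 # supported by two or more callers
unanimous = ensemble_calls(calls, min_support=3)  # supported by all three
```

Raising `min_support` trades sensitivity for precision, which is the same tension the chapter describes for individual callers.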


Subject(s)
Human Genome , High-Throughput Nucleotide Sequencing , Algorithms , Germ Cells , High-Throughput Nucleotide Sequencing/methods , Humans , Statistical Models , Software
11.
Neurobiol Aging ; 110: 106-112, 2022 02.
Article in English | MEDLINE | ID: mdl-34635350

ABSTRACT

NUS1 has recently been identified as a candidate gene for Parkinson's disease (PD). Few studies have examined the association of NUS1 variants with PD susceptibility and phenotypes. In the first cohort, whole-exome sequencing was performed to identify variants in NUS1 exon-coding and exon-intron regions in 1542 cases and 1625 controls. In total, 13 variants were detected, of which 10 were rare variants and 3 were low-frequency variants. Burden analysis showed that rare NUS1 variants were significantly enriched in PD (p = 0.016). We also performed a meta-analysis based on previous studies and our own to correlate NUS1 mutations with PD susceptibility. Integrating our previous cohort (3210 cases and 2807 controls) and the first cohort identified a significant association of rs539668656 with PD risk (odds ratio (OR) = 2.82, p = 0.016). The genotype-phenotype association analysis showed that carrying rare variants or rs539668656 was significantly associated with earlier onset age, depression, emotional impairment and severe disease condition. Our results support the role of NUS1 rare variants and rs539668656 in PD susceptibility and phenotype.
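A burden-style comparison collapses all qualifying rare variants in a gene into a single carrier/non-carrier status and then compares carrier counts between cases and controls. The sketch below computes an odds ratio with a Woolf (log-scale) 95% confidence interval from such a 2 × 2 table. The abstract does not give carrier counts or name the burden test used, so every number here and the function name `odds_ratio_ci` are invented for illustration.

```python
import math

def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio and Woolf 95% CI for carrier status in cases versus controls."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Invented counts: 20 carriers among 1000 cases, 8 carriers among 1000 controls.
odds_ratio, low, high = odds_ratio_ci(20, 1000, 8, 1000)
```

An interval that excludes 1 indicates enrichment at the 5% level under this approximation; with very small carrier counts an exact test would usually be preferred.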


Subject(s)
Gene Frequency/genetics , Genetic Association Studies , Genetic Predisposition to Disease/genetics , Mutation/genetics , Parkinson Disease/genetics , Phenotype , Cell Surface Receptors/genetics , Age of Onset , Cohort Studies , Exons/genetics , Female , Humans , Introns/genetics , Male , Parkinson Disease/psychology , Patient Acuity , Risk , Exome Sequencing
12.
Curr Issues Mol Biol ; 43(3): 1778-1793, 2021 Oct 27.
Article in English | MEDLINE | ID: mdl-34889895

ABSTRACT

Multiple Sclerosis (MS) is a complex multifactorial autoimmune disease, whose sex- and age-adjusted prevalence in Sardinia (Italy) is among the highest worldwide. To date, 233 loci have been associated with MS and almost 20% of risk heritability is attributable to common genetic variants, but many low-frequency and rare variants remain to be discovered. Here, we aimed to contribute to the understanding of the genetic basis of MS by investigating potentially functional rare variants. To this end, we analyzed thirteen multiplex Sardinian families with Immunochip genotyping data. For five families, Whole Exome Sequencing (WES) data were also available. Firstly, we performed a non-parametric Homozygosity Haplotype analysis for identifying the Region from Common Ancestor (RCA). Then, on these potential disease-linked RCA, we searched for the presence of rare variants shared by the affected individuals by analyzing WES data. We found: (i) a variant (43181034 T > G) in the splicing region on exon 27 of CUL9; (ii) a variant (50245517 A > C) in the splicing region on exon 16 of ATP9A; (iii) a non-synonymous variant (43223539 A > C) on exon 9 of TTBK1; (iv) a non-synonymous variant (42976917 A > C) on exon 9 of PPP2R5D; and (v) a variant (109859349-109859354) in the 3'UTR of MYO16.


Subject(s)
Exome Sequencing , Genetic Predisposition to Disease , Genetic Variation , Haplotypes , Homozygote , Multiple Sclerosis/diagnosis , Multiple Sclerosis/genetics , Alleles , Female , Genome-Wide Association Study , Humans , Italy , Male , Pedigree , Single Nucleotide Polymorphism
13.
Int J Mol Sci ; 22(18)2021 Sep 13.
Article in English | MEDLINE | ID: mdl-34576031

ABSTRACT

TREM2 is among the most well-known Alzheimer's disease (AD) risk genes; however, the functional roles of its AD-associated variants remain to be elucidated, and most known risk alleles are low-frequency variants whose investigation is challenging. Here, we utilized a splicing-guided aggregation method in which multiple low-frequency TREM2 variants were bundled together to investigate the functional impact of those variants on alternative splicing in AD. We analyzed whole genome sequencing (WGS) and RNA-seq data generated from cognitively normal elderly controls (CN) and AD patients in two independent cohorts, representing three regions in the frontal lobe of the human brain: the dorsolateral prefrontal cortex (CN = 213 and AD = 376), frontal pole (CN = 72 and AD = 175), and inferior frontal (CN = 63 and AD = 157). We observed an exon skipping event in the second exon of TREM2, with that exon tending to be more frequently skipped (p = 0.0012) in individuals having at least one low-frequency variant that caused loss-of-function for a splicing regulatory element. In addition, genes differentially expressed between AD patients with high vs. low skipping of the second exon (i.e., loss of a TREM2 functional domain) were significantly enriched in immune-related pathways. Our splicing-guided aggregation method thus provides new insight into the regulation of alternative splicing of the second exon of TREM2 by low-frequency variants and could be a useful tool for further exploring the potential molecular mechanisms of multiple, disease-associated, low-frequency variants.


Subject(s)
Alternative Splicing/genetics , Alzheimer Disease/genetics , Genetic Predisposition to Disease , Membrane Glycoproteins/genetics , Immunologic Receptors/genetics , Aged , Alzheimer Disease/pathology , Brain/metabolism , Brain/pathology , Exons/genetics , Female , Gene Frequency/genetics , Genetic Variation/genetics , Humans , Male , RNA Splicing/genetics , RNA-Seq , Nucleic Acid Regulatory Sequences/genetics , Whole Genome Sequencing
14.
Zhonghua Zhong Liu Za Zhi ; 43(7): 801-805, 2021 Jul 23.
Article in Chinese | MEDLINE | ID: mdl-34289576

ABSTRACT

Objective: To analyze the association between low-frequency variants of the ARID1A gene and primary liver cancer using a latent class model. Methods: The low-frequency variants of the ARID1A gene were combined according to different functional areas, and the combined variables were analyzed using the latent class model to obtain the latent variables. Logistic regression was then used to analyze the association between low-frequency variants of the ARID1A gene and primary liver cancer. Results: The low-frequency variants of the ARID1A gene were divided into three classes by the latent class model. Class 1 was mainly the unmutated population, accounting for 94.2% (2 454/2 603). Class 2 was mainly transcriptional regulatory domain mutation, accounting for 4.8% (124/2 603). Class 3 was predominantly exon mutation, accounting for about 1.0% (27/2 603). Using class 1 as a reference, it was found that mutations in the transcriptional regulatory domain could reduce the risk of liver cancer (OR=0.601, 95% CI=0.364-0.992, P=0.046). Conclusion: The latent class model can identify low-frequency variants of genes associated with liver cancer and can be extended to more genetic association studies of low-frequency variants related to complex diseases.
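The reported odds ratio, confidence interval, and P value can be checked against each other. For a logistic regression coefficient, the 95% interval on the log-odds scale spans about ±1.96 standard errors, so the standard error, Wald z statistic, and two-sided p-value can be recovered from the interval alone. The sketch below does this for the figures quoted in the abstract. It assumes a Wald interval was reported, which the abstract does not state, and the function name `p_from_or_ci` is hypothetical.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, Wald z, and two-sided p-value from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value, normal approximation
    return se, z, p

# Figures quoted in the abstract: OR=0.601, 95% CI=0.364-0.992.
se, z, p = p_from_or_ci(0.601, 0.364, 0.992)
```

Under that assumption the recovered p-value lands close to the reported P=0.046, so the three quoted figures are mutually consistent.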


Subject(s)
Liver Neoplasms , Nuclear Proteins , DNA-Binding Proteins , Humans , Latent Class Analysis , Liver Neoplasms/genetics , Mutation , Nuclear Proteins/genetics , Transcription Factors/genetics
15.
Mol Cell Biochem ; 476(7): 2703-2718, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33666829

ABSTRACT

The zinc transporter 8 (ZnT8) plays an essential role in zinc homeostasis inside pancreatic β cells; its function is related to the stabilization of the insulin hexameric form. Genome-wide association studies (GWAS) have established both positive and negative relationships of ZnT8 variants with type 2 diabetes mellitus (T2DM), exposing a dual and controversial role. The first hypotheses about its role in T2DM indicated a higher risk of developing T2DM for loss of function; nevertheless, recent GWAS of ZnT8 loss-of-function mutations in humans have shown protection against T2DM. With regard to the ZnT8 role in T2DM, most studies have focused on rodent models and common high-risk variants; however, considerable differences between human and rodent models have been found and the new approaches have included lower-frequency variants as a tool to clarify gene functions, allowing a better understanding of the disease and offering possible therapeutic targets. Therefore, this review will discuss the physiological effects of the ZnT8 variants associated with a major and lower risk of T2DM, emphasizing the low- and rare-frequency variants.


Subject(s)
Type 2 Diabetes Mellitus , Zinc Transporter 8 , Animals , Type 2 Diabetes Mellitus/genetics , Type 2 Diabetes Mellitus/metabolism , Genome-Wide Association Study , Humans , Zinc Transporter 8/deficiency , Zinc Transporter 8/metabolism
16.
Mol Ecol Resour ; 21(4): 1216-1229, 2021 May.
Article in English | MEDLINE | ID: mdl-33534960

ABSTRACT

Population genomics is a fast-developing discipline with promising applications in a growing number of life sciences fields. Advances in sequencing technologies and bioinformatics tools allow population genomics to exploit genome-wide information to identify the molecular variants underlying traits of interest and the evolutionary forces that modulate these variants through space and time. However, the cost of genomic analyses of multiple populations is still too high to address them through individual genome sequencing. Pooling individuals for sequencing can be a more effective strategy in Single Nucleotide Polymorphism (SNP) detection and allele frequency estimation because of a higher total coverage. However, compared to individual sequencing, SNP calling from pools has the additional difficulty of distinguishing rare variants from sequencing errors, which is often avoided by establishing a minimum threshold allele frequency for the analysis. Finding an optimal balance between minimizing information loss and reducing sequencing costs is essential to ensure the success of population genomics studies. Here, we have benchmarked the performance of SNP callers for Pool-seq data, based on different approaches, under different conditions, and using computer simulations and real data. We found that SNP caller performance varied for allele frequencies up to 0.35. We also found that SNP callers based on Bayesian (SNAPE-pooled) or maximum likelihood (MAPGD) approaches outperform the two heuristic callers tested (VarScan and PoolSNP), in terms of the balance between sensitivity and FDR both in simulated and sequencing data. Our results will help inform the selection of the most appropriate SNP caller not only for large-scale population studies but also in cases where the Pool-seq strategy is the only option, such as in metagenomic or polyploid studies.
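The benchmark above scores callers by the balance between sensitivity and false discovery rate (FDR), and notes that a minimum allele-frequency threshold is often used to suppress sequencing errors. The sketch below shows how those two metrics respond to such a threshold on a toy call set. The positions, allele frequencies, truth set, and function names are all made up, and this is not the evaluation code used in the study.

```python
def sensitivity_and_fdr(called, truth):
    """Sensitivity = TP / (TP + FN); FDR = FP / (TP + FP)."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    sensitivity = tp / (tp + fn) if truth else 0.0
    fdr = fp / (tp + fp) if called else 0.0
    return sensitivity, fdr

def apply_min_af(calls, min_af):
    """Keep only called positions whose estimated allele frequency reaches min_af."""
    return {pos for pos, af in calls.items() if af >= min_af}

# Made-up pooled calls (position -> estimated allele frequency) and true SNP positions.
calls = {10: 0.30, 20: 0.12, 30: 0.02, 98: 0.015, 99: 0.01}
truth = {10, 20, 30, 40}

no_threshold = sensitivity_and_fdr(apply_min_af(calls, 0.0), truth)
with_threshold = sensitivity_and_fdr(apply_min_af(calls, 0.05), truth)
```

In this toy case the threshold removes both false positives but also drops a true rare SNP, which is the information loss the abstract refers to.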


Subject(s)
Gene Frequency , High-Throughput Nucleotide Sequencing , Single Nucleotide Polymorphism , Bayes Theorem , Computer Simulation , Likelihood Functions
17.
BMC Bioinformatics ; 21(1): 96, 2020 Mar 04.
Article in English | MEDLINE | ID: mdl-32131723

ABSTRACT

BACKGROUND: Duplex sequencing is the most accurate approach for identification of sequence variants present at very low frequencies. Its power comes from pooling together multiple descendants of both strands of original DNA molecules, which allows distinguishing true nucleotide substitutions from PCR amplification and sequencing artifacts. This strategy comes at a cost: sequencing the same molecule multiple times increases dynamic range but significantly diminishes coverage, making whole genome duplex sequencing prohibitively expensive. Furthermore, every duplex experiment produces a substantial proportion of singleton reads that cannot be used in the analysis and are thrown away. RESULTS: In this paper we demonstrate that a significant fraction of these reads contains PCR or sequencing errors within duplex tags. Correction of such errors allows "reuniting" these reads with their respective families, increasing the output of the method and making it more cost effective. CONCLUSIONS: We combine an error correction strategy with a number of algorithmic improvements in a new version of the duplex analysis software, Du Novo 2.0. It is written in Python, C, AWK, and Bash. It is open source and readily available through Galaxy, Bioconda, and Github: https://github.com/galaxyproject/dunovo.
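The idea of reuniting singleton reads with their families can be illustrated with a simple tag-correction rule: if a tag seen only once differs from exactly one multi-read family tag by a small number of mismatches, reassign it to that family. This is a generic Hamming-distance sketch under stated assumptions and is not Du Novo's actual algorithm. The tags, the `max_mismatches` default, and the function names are hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))

def reunite_singletons(tags, max_mismatches=1):
    """Map each singleton tag to the unique multi-read family tag within max_mismatches."""
    counts = Counter(tags)
    families = [t for t, n in counts.items() if n > 1]
    corrected = {}
    for tag, n in counts.items():
        if n > 1:
            continue
        matches = [f for f in families
                   if len(f) == len(tag) and hamming(tag, f) <= max_mismatches]
        if len(matches) == 1:  # leave ambiguous or unmatched singletons alone
            corrected[tag] = matches[0]
    return corrected

# Made-up tags: two families, one singleton with a single-base error, one unrelated singleton.
tags = ["AACCGGTT"] * 3 + ["AACCGGTA"] + ["TTTTCCCC"] * 2 + ["GGGGAAAA"]
corrected = reunite_singletons(tags)
```

Requiring a unique match is a conservative choice: a singleton equally close to two families is left uncorrected instead of being assigned arbitrarily.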


Subject(s)
User-Computer Interface , Algorithms , DNA/chemistry , DNA/metabolism , Humans , Sequence Alignment , DNA Sequence Analysis
18.
Elife ; 8, 2019 Dec 04.
Article in English | MEDLINE | ID: mdl-31799931

ABSTRACT

Rare genetic variants in yeast explain a large amount of phenotypic variation in a complex trait like growth.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Multifactorial Inheritance
19.
Elife ; 8, 2019 Oct 24.
Article in English | MEDLINE | ID: mdl-31647416

ABSTRACT

Genome-wide association studies (GWAS) make it possible to dissect complex traits and map genetic variants, but the mapped variants often explain relatively little of the heritability. One potential reason is the preponderance of undetected low-frequency variants. To increase their allele frequency and assess their phenotypic impact in a population, we generated a diallel panel of 3025 yeast hybrids, derived from pairwise crosses between natural isolates, and examined a large number of traits. Parental versus hybrid regression analysis showed that while most phenotypic variance is explained by additivity, a third is governed by non-additive effects, with complete dominance having a key role. By performing GWAS on the diallel panel, we found that associated variants with low frequency in the initial population are overrepresented, and that they explain a fraction of the phenotypic variance and have effect sizes similar to those of common variants. Overall, we highlighted the relevance of low-frequency variants to phenotypic variation.


Subject(s)
Genetic Variation , Fungal Genome , Quantitative Trait Loci , Heritable Quantitative Trait , Saccharomyces cerevisiae/genetics , Alleles , Biological Evolution , Chimera , Chromosome Mapping , Genetic Crosses , Phenotype , Saccharomyces cerevisiae/metabolism , Genetic Selection
20.
Front Genet ; 10: 573, 2019.
Article in English | MEDLINE | ID: mdl-31297130

ABSTRACT

In light of the complex nature of multiple sclerosis (MS) and the recently estimated contribution of low-frequency variants to disease, decoding its genetic risk components requires novel variant prioritization strategies. By reviewing MS Genome Wide Association Studies (GWAS), we selected 107 candidate loci marked by intragenic single nucleotide polymorphisms (SNPs) with a remarkable association (p-value ≤ 5 × 10⁻⁶). A whole exome sequencing (WES)-based pilot study of SNPs with minor allele frequency (MAF) ≤ 0.04, conducted in three Italian families, revealed 15 exonic low-frequency SNPs with affected parent-child transmission. These variants were detected in 65/120 Italian unrelated MS patients, also in combination (22 patients). Compared with databases (controls gnomAD, dbSNP150, ExAC, Tuscany-1000 Genome), the allelic frequencies of C6orf10 rs16870005 and IL2RA rs12722600 were significantly higher (i.e., controls gnomAD, p = 9.89 × 10⁻⁷ and p < 1 × 10⁻²⁰). TET2 rs61744960 and TRAF3 rs138943371 frequencies were also significantly higher, except in Tuscany-1000 Genome. Interestingly, the association of C6orf10 rs16870005 (Ala431Thr) with MS did not depend on its linkage disequilibrium with the HLA-DRB1 locus. Sequencing of the C6orf10 3' region in the MS cohort revealed 14 rare mutations (10 not previously reported). Four variants were null, and significantly more frequent than in the databases. Further, the C6orf10 rare variants were observed in combinations, both intra-locus and with other low-frequency SNPs. The C6orf10 Ser389Xfr was found homozygous in a patient with early-onset MS. Taking into account the potentially functional impact of the identified exonic variants, their expression in combination at the protein level could provide functional insights into the heterogeneous pathogenetic mechanisms contributing to MS.
