ABSTRACT
Metabolic traits are heritable phenotypes widely used in assessing the risk of various diseases. We conduct a genome-wide association study (GWAS) of nine metabolic traits (including glycemic, lipid, and liver enzyme levels) in 125,872 Korean subjects genotyped with the Korea Biobank Array. Following meta-analysis with GWAS from Biobank Japan, we identify 144 novel signals (MAF ≥ 1%), of which 57.0% are replicated in UK Biobank. Additionally, we discover 66 rare (MAF < 1%) variants, 94.4% of which coincide with common loci, adding to allelic series. Although rare variants contribute little to overall trait variance, in carriers they lead to a substantial loss of predictive accuracy in polygenic predictions of disease risk based on common variants alone. We capture groups with up to 16-fold variation in type 2 diabetes (T2D) prevalence by integrating genetic risk scores of fasting plasma glucose and T2D with the I349F rare protective variant. This study highlights the need to consider the joint contribution of both common and rare variants to inherited risk of metabolic traits and related diseases.
Subjects
Type 2 Diabetes Mellitus , Genome-Wide Association Study , Humans , Type 2 Diabetes Mellitus/genetics , Phenotype , Asian People/genetics , Blood Glucose/genetics , Single Nucleotide Polymorphism , Genetic Variation , Genetic Predisposition to Disease
ABSTRACT
Several reports have suggested that genetic susceptibility contributes to the development and progression of diabetic retinopathy. We aimed to identify genetic loci that confer susceptibility to diabetic retinopathy in Japanese patients with type 2 diabetes. We analysed 5 790 508 single nucleotide polymorphisms (SNPs) in 8880 Japanese patients with type 2 diabetes (4839 retinopathy cases and 4041 controls), as well as in 2217 independent Japanese patients with type 2 diabetes (693 retinopathy cases and 1524 controls). The results of these two genome-wide association studies (GWAS) were combined with an inverse variance meta-analysis (Stage-1), followed by de novo genotyping for the candidate SNP loci (P < 1.0 × 10⁻⁴) in an independent case-control study (Stage-2, 2260 cases and 723 controls). After combining the association data (Stages 1 and 2) using meta-analysis, the associations of two loci reached a genome-wide significance level: rs12630354 near STT3B on chromosome 3, P = 1.62 × 10⁻⁹, odds ratio (OR) = 1.17, 95% confidence interval (CI) 1.11-1.23, and rs140508424 within PALM2 on chromosome 9, P = 4.19 × 10⁻⁸, OR = 1.61, 95% CI 1.36-1.91. However, the association of these two loci was not replicated in Korean, European or African American populations. Gene-based analysis using Stage-1 GWAS data identified a gene-level association of EHD3 with susceptibility to diabetic retinopathy (P = 2.17 × 10⁻⁶). In conclusion, we identified two novel SNP loci, STT3B and PALM2, and a novel gene, EHD3, that confer susceptibility to diabetic retinopathy; however, further replication studies are required to validate these associations.
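As an illustration of the inverse variance meta-analysis mentioned above, the following is a minimal sketch of the standard fixed-effect estimator, not the authors' actual pipeline. The function name `inverse_variance_meta` and the example effect sizes are hypothetical; inputs are assumed to be per-study log odds ratios with their standard errors.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors of those estimates
    Returns (pooled_beta, pooled_se, z_score, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, p_value


# Hypothetical log odds ratios and standard errors for one SNP in two studies.
beta, se, z, p = inverse_variance_meta([0.15, 0.20], [0.03, 0.06])
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, P = {p:.2e}")
```

The more precise study (smaller standard error) dominates the pooled estimate, which is why the pooled effect lies closer to 0.15 than to 0.20 in this made-up example.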
Subjects
Type 2 Diabetes Mellitus/genetics , Diabetic Retinopathy/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism , Alleles , Asian People/genetics , Type 2 Diabetes Mellitus/complications , Type 2 Diabetes Mellitus/ethnology , Diabetic Retinopathy/ethnology , Diabetic Retinopathy/etiology , Gene Frequency , Genetic Predisposition to Disease/ethnology , Genotype , Hexosyltransferases/genetics , Humans , Japan , Membrane Proteins/genetics , Meta-Analysis as Topic , Phosphoproteins/genetics
ABSTRACT
INTRODUCTION: Obesity is a growing global health concern and is highly associated with an increased risk of metabolic diseases, including type 2 diabetes. We aimed to discover new differential DNA methylation patterns predisposing to obesity and to prioritize surrogate epigenetic markers in Koreans. RESEARCH DESIGN AND METHODS: We performed multistage epigenome-wide analyses to identify differentially methylated CpGs in obesity using the Illumina HumanMethylationEPIC array (EPIC). Forty-eight CpGs showed significant differences across three phases: 902 whole blood DNAs from two cohorts (phase 1: n=450, phase 2: n=377) and a hospital-based sample (phase 3: n=75). Samples from phase 3 participants were used to examine whether the 48 CpGs were significant in fat tissue and influenced gene expression. Furthermore, we investigated the epigenetic effect of CpG loci in childhood obesity (n=94). RESULTS: Seven of the 48 CpGs exhibited similar changes in fat tissue along with gene expression changes. In particular, a hypomethylated CpG (cg13424229) in the GATA1 transcription factor cluster of the CPA3 promoter was related to increased CPA3 expression and showed a consistent effect in childhood obesity. Interestingly, subsequent analysis using RNA sequencing data from 21 preadipocytes and 26 adipocytes suggested CPA3 as a potential obesity-related gene. Moreover, expression patterns from RNA sequencing and public Gene Expression Omnibus data showed a correlation between CPA3 and both type 2 diabetes (T2D) and asthma. CONCLUSIONS: Our findings prioritize influential genes in obesity and provide new evidence for the role of CPA3 in linking obesity, T2D, and asthma.
Subjects
DNA Methylation , Type 2 Diabetes Mellitus , CpG Islands/genetics , DNA Methylation/genetics , Type 2 Diabetes Mellitus/genetics , Epigenome , Genome-Wide Association Study , Humans , Inflammation/genetics , Obesity/genetics , Nucleic Acid Regulatory Sequences , Republic of Korea
ABSTRACT
This study investigated whether DNA methylation of the promoter region positively or negatively regulates tissue-specific genes (TSGs) and whether it correlates with disease pathophysiology. We assessed tissue specificity metrics in five human tissues using sequencing-based approaches, including 52 whole genome bisulfite sequencing (WGBS), 52 RNA-seq, and 144 chromatin immunoprecipitation sequencing (ChIP-seq) datasets. A correlation analysis was performed between gene expression and DNA methylation levels of the TSG promoter region. The TSG enrichment analyses were conducted in the gene-disease association network (DisGeNET). The epigenomic association analyses of CpGs in enriched TSG promoters were performed using 1986 Infinium MethylationEPIC array datasets. The correlation analysis showed significant associations between promoter methylation and the expression of 449 TSGs. A disease enrichment analysis showed that diabetes- and obesity-related diseases were highly ranked. In an epigenomic association analysis based on obesity, 62 CpGs showed statistical significance. Among them, three obesity-related CpGs were newly identified and replicated with statistical significance in independent data. In particular, a CpG (cg17075888 of PDK4), considered a potential therapeutic target, was associated with complex diseases, including obesity and type 2 diabetes. The methylation changes in a substantial number of the TSG promoters showed a significant association with metabolic diseases. Collectively, our findings provide strong evidence of the relationship between tissue-specific patterns of epigenetic changes and metabolic diseases.
Subjects
DNA Methylation , Type 2 Diabetes Mellitus/genetics , Obesity/genetics , Transcriptome , Animals , CpG Islands , Genetic Epigenesis , Gene Regulatory Networks , Human Genome , Humans , Organ Specificity/genetics , Genetic Promoter Regions , Whole Genome Sequencing
ABSTRACT
Genomic analysis begins with de novo assembly of short-read fragments in order to reconstruct full-length base sequences without exploiting a reference genome sequence. Then, in the annotation step, gene locations are identified within the base sequences, and the structures and functions of these genes are determined. Recently, a wide range of powerful tools have been developed and published for whole-genome analysis, enabling even individual researchers in small laboratories to perform whole-genome analyses on their organisms of interest. However, these analytical tools are generally complex and use diverse algorithms, parameter setting methods, and input formats; thus, it remains difficult for individual researchers to select, utilize, and combine these tools to obtain their final results. To resolve these issues, we have developed a genome analysis pipeline (GAAP) for semiautomated, iterative, and high-throughput analysis of whole-genome data. This pipeline is designed to perform read correction, de novo genome (transcriptome) assembly, gene prediction, and functional annotation using a range of proven tools and databases. We aim to assist non-IT researchers by describing each stage of analysis in detail and discussing current approaches. We also provide practical advice on how to access and use the bioinformatics tools and databases and how to implement the provided suggestions. Whole-genome analysis of Toxocara canis is used as a case study to show intermediate results at each stage, demonstrating the practicality of the proposed method.
Subjects
Nucleic Acid Databases , Helminth Genome , Molecular Sequence Annotation , Toxocara canis/genetics , Whole Genome Sequencing , Animals , Genomics
ABSTRACT
Copy number variations (CNVs) are structural variants associated with human diseases. Recent studies have identified disease-related genes by extracting rare de novo and transmitted CNVs from exome sequencing data. The need for more efficient and accurate detection methods has increased, but CNV detection remains a challenging problem because of coverage biases and the sparse, small-sized, and noncontinuous nature of exome sequencing. In this study, we developed a new CNV detection method, ExCNVSS, based on read coverage depth evaluation and scale-space filtering to resolve these problems. We also developed ExCNVSS_noRatio, a version of ExCNVSS for cases in which only test data are available as input and no matched control is required. To evaluate the performance of our method, we tested it with 11 different simulated data sets and data from 10 real HapMap samples. The results demonstrated that ExCNVSS outperformed three other state-of-the-art methods and that our method corrected for coverage biases and detected CNVs of all sizes even without matched control data.
Subjects
DNA Copy Number Variations , Exome , High-Throughput Nucleotide Sequencing , Genetic Models
ABSTRACT
This study aimed to construct a draft genome of the adult female worm Toxocara canis using next-generation sequencing (NGS) and de novo assembly, and to find new genes after annotation using functional genomics tools. Using an NGS machine, we produced DNA read data of T. canis. The de novo assembly of the read data was performed using SOAPdenovo. RNA read data were assembled using Trinity. Structural annotation, homology search, functional annotation, classification of protein domains, and KEGG pathway analysis were carried out. In addition, recently developed tools such as MAKER, PASA, Evidence Modeler, and Blast2GO were used. For the scaffold DNA obtained, the N50 was 108,950 bp and the overall length was 341,776,187 bp. The N50 of the transcriptome was 940 bp, and its length was 53,046,952 bp. The GC content of the entire genome was 39.3%. The total number of genes was 20,178, and the total number of protein sequences was 22,358. Of the 22,358 protein sequences, 4,992 were newly observed in T. canis. The following previously unknown proteins were found: E3 ubiquitin-protein ligase cbl-b and antigen T-cell receptor, zeta chain for T-cell and B-cell regulation; endoprotease bli-4 for cuticle metabolism; mucin 12Ea and polymorphic mucin variant C6/1/40r2.1 for mucin production; tropomodulin-family protein and ryanodine receptor calcium release channels for muscle movement. We were able to find new hypothetical polypeptide sequences unique to T. canis, and the findings of this study can serve as a basis for extending our biological understanding of T. canis.
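For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed from a list of scaffold or transcript lengths. The function name `n50` and the toy lengths are hypothetical, and assembly tools may differ in details such as minimum-length filtering.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths in bp (illustrative only, not data from the study).
print(n50([2, 3, 4, 5, 6]))
```

Because the longest sequences are counted first, a few long scaffolds can raise N50 considerably even when most scaffolds are short, so the statistic is usually read together with the total assembly length.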