1.
Nucleic Acids Res ; 2020 May 27.
Article in English | MEDLINE | ID: mdl-32459350

ABSTRACT

Functional crosstalk between histone modifications and chromatin remodeling has emerged as a key regulatory mode of transcriptional control during cell fate decisions, but the underlying mechanisms are not fully understood. Here we discover an HRP2-DPF3a-BAF epigenetic pathway that coordinates methylated histone H3 lysine 36 (H3K36me) and ATP-dependent chromatin remodeling to regulate chromatin dynamics and gene transcription during myogenic differentiation. Using siRNA screening targeting epigenetic modifiers, we identify hepatoma-derived growth factor-related protein 2 (HRP2) as a key regulator of myogenesis. Knockout of HRP2 in mice leads to impaired muscle regeneration. Mechanistically, through its HIV integrase binding domain (IBD), HRP2 associates with the BRG1/BRM-associated factor (BAF) chromatin remodeling complex by interacting directly with the BAF45c (DPF3a) subunit. Through its Pro-Trp-Trp-Pro (PWWP) domain, HRP2 preferentially binds to H3K36me2. Consistent with the biochemical studies, ChIP-seq analyses show that HRP2 colocalizes with DPF3a across the genome and that the recruitment of HRP2/DPF3a to chromatin is dependent on H3K36me2. Integrative transcriptomic and cistromic analyses, coupled with ATAC-seq, reveal that HRP2 and DPF3a activate myogenic genes by increasing chromatin accessibility through recruitment of BRG1, the ATPase subunit of the BAF complex. Taken together, these results illuminate a key role for the HRP2-DPF3a-BAF complex in the epigenetic coordination of gene transcription during myogenic differentiation.

2.
Cell Death Differ ; 27(3): 1052-1066, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31358914

ABSTRACT

The initiation and transduction of DNA damage response (DDR) occur in the context of chromatin, and modifications as well as the structure of chromatin are crucial for DDR signaling. How the profound chromatin alterations are confined to DNA lesions by epigenetic factors remains largely unclear. Here, we discover that JMJD6, a Jumonji C domain-containing protein, is recruited to DNA double-strand breaks (DSBs) after microirradiation. JMJD6 controls the spreading of histone ubiquitination, as well as the subsequent accumulation of repair proteins and transcriptional silencing around DSBs, but does not regulate the initial DNA damage sensing. Furthermore, JMJD6 deficiency results in promotion of the efficiency of nonhomologous end joining (NHEJ) and homologous recombination (HR), rapid cell-cycle checkpoint recovery, and enhanced survival after irradiation. Regarding the mechanism involved, we demonstrate that JMJD6, independently of its catalytic activity, interacts with SIRT1 and recruits it to chromatin to downregulate H4K16ac around DSBs. Our study reveals JMJD6 as a modulator of the epigenome around DNA lesions, and adds to the understanding of the role of epigenetic factors in DNA damage response.

3.
Nucleic Acids Res ; 48(D1): D807-D816, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31691819

ABSTRACT

Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.
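
The "credible set" idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes each variant at a locus already has a posterior probability of being causal (as a fine-mapping tool would report) and uses the common convention of taking the smallest set of variants whose cumulative probability reaches 95%. The function name, variant IDs, and probabilities are hypothetical and are not taken from CAUSALdb or the tools it runs.

```python
from typing import Dict, List


def credible_set(pip: Dict[str, float], coverage: float = 0.95) -> List[str]:
    """Return the smallest set of variants whose normalized posterior
    probabilities sum to at least `coverage`."""
    total = sum(pip.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(pip.items(), key=lambda kv: kv[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for variant, p in ranked:
        selected.append(variant)
        cumulative += p / total
        if cumulative >= coverage:
            break
    return selected


# Hypothetical variant IDs and probabilities, for illustration only.
example = {"rs_a": 0.70, "rs_b": 0.20, "rs_c": 0.06, "rs_d": 0.04}
print(credible_set(example))
```

A smaller credible set means the association signal has been localized more precisely; a locus whose set contains dozens of variants remains poorly resolved.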

4.
Nucleic Acids Res ; 48(D1): D983-D991, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31598699

ABSTRACT

Recent advances in genome sequencing and functional genomic profiling have promoted many large-scale quantitative trait locus (QTL) studies, which connect genotypes with tissue/cell type-specific cellular functions from transcriptional to post-translational level. However, no comprehensive resource can perform QTL lookup across multiple molecular phenotypes and investigate the potential cascade effect of functional variants. We developed a versatile resource, named QTLbase, for interpreting the possible molecular functions of genetic variants, as well as their tissue/cell-type specificity. Overall, QTLbase has five key functions: (i) curating and compiling genome-wide QTL summary statistics for 13 human molecular traits from 233 independent studies; (ii) mapping QTL-relevant tissue/cell types to 78 unified terms according to a standard anatomogram; (iii) normalizing variant and trait information uniformly, yielding >170 million significant QTLs; (iv) providing a rich web client that enables phenome- and tissue-wise visualization; and (v) integrating the most comprehensive genomic features and functional predictions to annotate the potential QTL mechanisms. QTLbase provides a one-stop shop for QTL retrieval and comparison across multiple tissues and multiple layers of molecular complexity, and will greatly help researchers interrogate the biological mechanism of causal variants and guide the direction of functional validation. QTLbase is freely available at http://mulinlab.org/qtlbase.

5.
Mol Psychiatry ; 25(3): 517-529, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31827248

ABSTRACT

The Chinese Imaging Genetics (CHIMGEN) study establishes the largest Chinese neuroimaging genetics cohort and aims to identify genetic and environmental factors and their interactions that are associated with neuroimaging and behavioral phenotypes. This study prospectively collected genomic, neuroimaging, environmental, and behavioral data from more than 7000 healthy Chinese Han participants aged 18-30 years. As a pioneer of large-sample neuroimaging genetics cohorts of non-Caucasian populations, this cohort can provide new insights into ethnic differences in genetic-neuroimaging associations by being compared with Caucasian cohorts. In addition to micro-environmental measurements, this study also collects hundreds of quantitative macro-environmental measurements from remote sensing and national survey databases based on the locations of each participant from birth to present, which will facilitate discoveries of new environmental factors associated with neuroimaging phenotypes. With lifespan environmental measurements, this study can also provide insights on the macro-environmental exposures that affect the human brain as well as their timing and mechanisms of action.

6.
Brief Bioinform ; 2019 Nov 21.
Article in English | MEDLINE | ID: mdl-31750520

ABSTRACT

In clinical cancer treatment, genomic alterations often affect the response of patients to anticancer drugs. Studies have shown that molecular features of tumors could be biomarkers predictive of sensitivity or resistance to anticancer agents, but the identification of actionable mutations is often constrained by the incomplete understanding of cancer genomes. Recent progress in next-generation sequencing technology has greatly facilitated the extensive molecular characterization of tumors and promoted precision medicine in cancers. More and more clinical studies, cancer cell line studies, CRISPR screening studies, and patient-derived model studies have been performed to identify potential actionable mutations predictive of drug response, providing rich resources of molecularly and pharmacologically profiled cancer samples at different levels. Such an abundance of data also enables the development of various computational models and algorithms to solve the problems of drug sensitivity prediction, biomarker identification and in silico drug prioritization through the integration of multiomics data. Here, we review the recent development of methods and resources that identify mutation-dependent effects for cancer treatment in clinical studies, functional genomics studies and computational studies, and discuss the remaining gaps and future directions in this area.

7.
Nucleic Acids Res ; 47(21): e134, 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-31511901

ABSTRACT

Predicting the functional or pathogenic regulatory variants in the human non-coding genome facilitates the interpretation of disease causation. While numerous prediction methods are available, their performance is inconsistent or restricted to specific tasks, which creates demand for a comprehensive integration of those methods. Here, we compile whole genome base-wise aggregations, regBase, that incorporate the largest set of prediction scores. Building on different assumptions of causality, we train three composite models to score functional, pathogenic and cancer driver non-coding regulatory variants respectively. We demonstrate the superior and stable performance of our models using independent benchmarks and show great success in fine-mapping causal regulatory variants at specific loci or at base-wise resolution. We believe that the regBase database, together with the three composite models, will be useful in different areas of human genetic studies, such as annotation-based causal variant fine-mapping, pathogenic variant discovery and cancer driver mutation identification. regBase is freely available at https://github.com/mulinlab/regBase.

8.
Nucleic Acids Res ; 47(16): e96, 2019 Sep 19.
Article in English | MEDLINE | ID: mdl-31287869

ABSTRACT

Genomic identification of driver mutations and genes in cancer cells is critical for precision medicine. Due to the difficulty of modelling the distribution of background mutation counts, existing statistical methods are often underpowered to discriminate cancer-driver genes from passenger genes. Here we propose a novel statistical approach, weighted iterative zero-truncated negative-binomial regression (WITER, http://grass.cgs.hku.hk/limx/witer or KGGSeq, http://grass.cgs.hku.hk/limx/kggseq/), to detect cancer-driver genes showing an excess of somatic mutations. By fitting the distribution of background mutation counts properly, this approach works well even in small or moderate samples. Compared to alternative methods, it detected more significant and cancer-consensus genes in most tested cancers. Applying this approach, we estimated 229 driver genes in 26 different types of cancers. In silico validation confirmed 78% of predicted genes as likely known drivers and many other genes as very likely new drivers for corresponding cancers. The technical advances of WITER enable the detection of driver genes in TCGA datasets as small as 30 subjects and the rescue of more genes missed by alternative tools in moderate or small samples.


Subjects
Gene Expression Regulation, Neoplastic; Genomics/statistics & numerical data; Neoplasm Proteins/genetics; Neoplasms/diagnosis; Oncogenes; Software; Benchmarking; Computer Simulation; Genomics/methods; Humans; Internet; Mutation; Neoplasm Proteins/classification; Neoplasm Proteins/metabolism; Neoplasms/classification; Neoplasms/genetics; Regression Analysis; Sample Size
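
The zero-truncated negative binomial distribution named in the WITER abstract can be sketched as follows. This is only the textbook distribution (a negative binomial conditioned on a count of at least one), written with a mean/dispersion parameterization chosen here for illustration. It is not the weighted iterative regression procedure itself, and the function names and parameter values are assumptions.

```python
import math


def nb_logpmf(k: int, mu: float, size: float) -> float:
    """Log-probability of count k under a negative binomial with mean `mu`
    and dispersion parameter `size` (variance = mu + mu**2 / size)."""
    return (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )


def ztnb_logpmf(k: int, mu: float, size: float) -> float:
    """Log-probability of k under the zero-truncated negative binomial,
    i.e. the negative binomial conditioned on k >= 1."""
    if k < 1:
        raise ValueError("zero-truncated counts must be >= 1")
    log_p0 = nb_logpmf(0, mu, size)
    # Subtract log(1 - P(Y = 0)); expm1 keeps this stable when P(Y = 0) is tiny.
    return nb_logpmf(k, mu, size) - math.log(-math.expm1(log_p0))


def ztnb_upper_tail(k: int, mu: float, size: float) -> float:
    """P(Y >= k | Y >= 1), obtained as 1 minus the probability mass below k."""
    below = sum(math.exp(ztnb_logpmf(j, mu, size)) for j in range(1, k))
    return max(0.0, 1.0 - below)


# Illustrative values only: a gene with 9 mutations against a background
# whose (untruncated) mean is 2.0 with dispersion 1.5.
print(ztnb_upper_tail(9, mu=2.0, size=1.5))
```

Truncating at zero matters when genes with no observed mutations are absent from the input: the remaining counts are then modelled without assuming how many zero-count genes there were.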
9.
Sci Adv ; 5(6): eaaw3593, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31183407

ABSTRACT

Positive transcription elongation factor b (P-TEFb) functions as a central regulator of transcription elongation. Activation of P-TEFb occurs through its dissociation from the transcriptionally inactive P-TEFb/HEXIM1/7SK snRNP complex. However, the mechanisms of signal-regulated P-TEFb activation and its roles in human diseases remain largely unknown. Here, we demonstrate that cAMP-PKA signaling disrupts the inactive P-TEFb/HEXIM1/7SK snRNP complex by PKA-mediated phosphorylation of HEXIM1 at serine-158. The cAMP pathway plays central roles in the development of autosomal dominant polycystic kidney disease (ADPKD), and we show that P-TEFb is hyperactivated in mouse and human ADPKD kidneys. Genetic activation of P-TEFb promotes cyst formation in a zebrafish ADPKD model, while pharmacological inhibition of P-TEFb attenuates cyst development by suppressing the pathological gene expression program in ADPKD mice. Our study therefore elucidates a mechanism by which P-TEFb activation by cAMP-PKA signaling promotes cystogenesis in ADPKD.

10.
Thyroid ; 29(6): 809-823, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30924726

ABSTRACT

Background: Anaplastic thyroid carcinoma (ATC) is one of the most aggressive malignancies, with no effective treatment currently available. The molecular mechanisms of ATC carcinogenesis remain poorly understood. The objective of this study was to investigate the mechanisms and functions of super-enhancer (SE)-driven oncogenic transcriptional addiction in the progression of ATC and identify new drug targets for ATC treatments. Methods: High-throughput chemical screening was performed to identify new drugs inhibiting ATC cell growth. Cell viability assay, colony formation analysis, cell-cycle analysis, and animal study were used to examine the effects of drug treatments on ATC progression. Chromatin immunoprecipitation sequencing was conducted to establish a SE landscape of ATC. Integrative analysis of RNA sequencing, chromatin immunoprecipitation sequencing, and CRISPR/Cas9-mediated gene editing was used to identify THZ1 target genes. Drug combination analysis was performed to assess drug synergy. Patient samples were analyzed to evaluate candidate biomarkers of prognosis in ATC. Results: THZ1, a covalent inhibitor of cyclin-dependent kinase 7 (CDK7), was identified as a potent anti-ATC compound by high-throughput chemical screening. ATC cells, but not papillary thyroid carcinoma cells, are exceptionally sensitive to CDK7 inhibition. An integrative analysis of both gene expression profiles and SE features revealed that the SE-mediated oncogenic transcriptional amplification mediates the vulnerability of ATC cells to THZ1 treatment. Combining this integrative analysis with functional assays led to the discovery of a number of novel cancer genes of ATC, including PPP1R15A, SMG9, and KLF2. Inhibition of PPP1R15A with Guanabenz or Sephin1 greatly suppresses ATC growth. Significantly, the expression level of PPP1R15A is correlated with CDK7 expression in ATC tissue samples. Elevated expression of PPP1R15A and CDK7 are both associated with poor clinical prognosis in ATC patients. Importantly, CDK7 or PPP1R15A inhibition sensitizes ATC cells to conventional chemotherapy. Conclusions: Taken together, these findings demonstrate transcriptional addiction in ATC pathobiology and identify CDK7 and PPP1R15A as potential biomarkers and therapeutic targets for ATC.

11.
Nucleic Acids Res ; 46(W1): W114-W120, 2018 Jul 02.
Article in English | MEDLINE | ID: mdl-29771388

ABSTRACT

Genome-wide association studies have identified thousands of susceptibility loci for many human complex traits, and yet for most of these associations the true causal variants remain unknown. Tissue/cell type-specific prediction and prioritization of non-coding regulatory variants will facilitate the identification of causal variants and underlying pathogenic mechanisms for particular complex diseases and traits. By leveraging recent large-scale functional genomics/epigenomics data, we develop an intuitive web server, GWAS4D (http://mulinlab.tmu.edu.cn/gwas4d or http://mulinlab.org/gwas4d), that systematically evaluates GWAS signals and identifies context-specific regulatory variants. The updated web server includes six major features: it (i) updates the regulatory variant prioritization method with our new algorithm; (ii) incorporates 127 tissue/cell type-specific epigenomes; (iii) integrates motifs of 1480 transcriptional regulators from 13 public resources; (iv) uniformly processes Hi-C data and generates significant interactions at 5 kb resolution across 60 tissues/cell types; (v) adds comprehensive non-coding variant functional annotations; and (vi) provides a highly interactive visualization function for SNP-target interaction. Using a GWAS fine-mapped set for 161 coronary artery disease risk loci, we demonstrate that GWAS4D is able to efficiently prioritize disease-causal regulatory variants.


Subjects
Genetic Diseases, Inborn; Genome-Wide Association Study; Quantitative Trait Loci/genetics; Software; Computational Biology/trends; Genomics/methods; Humans; Polymorphism, Single Nucleotide/genetics
12.
Bioinformatics ; 34(18): 3145-3150, 2018 Sep 15.
Article in English | MEDLINE | ID: mdl-29718103

ABSTRACT

Motivation: Many recent studies have shown that single nucleotide polymorphisms (SNPs) affect gene expression and contribute to the development of complex traits/diseases in a tissue context-dependent manner. However, little is known about the influence of haplotypes, which reflect the interaction effect between SNPs, on gene expression and complex traits. Results: In the present study, we first propose a regulatory region guided eQTL haplotype association analysis approach, and then systematically investigate the expression quantitative trait loci (eQTL) haplotypes in 20 different tissues using this approach. The approach is designed to reduce the computational burden by using regulatory predictions for candidate SNP selection and multiple testing corrections on non-independent haplotypes. The application results in multiple tissues showed that haplotype-based eQTLs not only increased the number of eQTL genes in a tissue-specific manner, but were also enriched in loci associated with complex traits in a tissue-matched manner. In addition, we found that tag SNPs of eQTL haplotypes from whole blood were selectively enriched in certain combinations of regulatory elements (e.g. promoters and enhancers) according to predicted chromatin states. In summary, this eQTL haplotype detection approach, together with the application results, sheds light on the synergistic effect of sequence variants on gene expression and their susceptibility to complex diseases. Availability and implementation: The executable application 'eHaplo' is implemented in Java and is publicly available at http://grass.cgs.hku.hk/limx/ehaplo/. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Haplotypes; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Genome-Wide Association Study; Multifactorial Inheritance; Phenotype
13.
Nucleic Acids Res ; 45(W1): W215-W221, 2017 Jul 03.
Article in English | MEDLINE | ID: mdl-28482068

ABSTRACT

Cancer therapies have experienced rapid progress in recent years, with a number of novel small-molecule kinase inhibitors and monoclonal antibodies now being widely used to treat various types of human cancers. During cancer treatments, mutations can have important effects on drug sensitivity. However, the relationship between tumor genomic profiles and the effectiveness of cancer drugs remains elusive. We introduce the Mutation To Cancer Therapy Scan (mTCTScan) web server (http://jjwanglab.org/mTCTScan), which can systematically analyze mutations affecting cancer drug sensitivity based on individual genomic profiles. The platform was developed by leveraging the latest knowledge on mutation-cancer drug sensitivity associations and the results from large-scale chemical screening using human cancer cell lines. Using an evidence-based scoring scheme based on current integrative evidence, mTCTScan is able to prioritize mutations according to their associations with cancer drugs and preclinical compounds. It can also show related drugs/compounds with sensitivity classification by considering the context of the entire genomic profile. In addition, mTCTScan incorporates comprehensive filtering functions and cancer-related annotations to better interpret mutation effects and their association with cancer drugs. This platform will greatly benefit both researchers and clinicians in interrogating mechanisms of mutation-dependent drug response, which will have a significant impact on cancer precision medicine.


Subjects
Drug Resistance, Neoplasm/genetics; Mutation; Software; Antineoplastic Agents/pharmacology; Cell Line, Tumor; Genomics; Humans; Internet; Molecular Sequence Annotation; Neoplasms/genetics
14.
Nucleic Acids Res ; 45(10): 5653-5665, 2017 Jun 02.
Article in English | MEDLINE | ID: mdl-28472449

ABSTRACT

Competing endogenous RNAs (ceRNAs) are RNA molecules that sequester shared microRNAs (miRNAs), thereby affecting the expression of other targets of the miRNAs. Whether genetic variants in a ceRNA can affect its biological function and disease development is still an open question. Here we identified a large number of genetic variants that are associated with ceRNA function using Geuvadis RNA-seq data for 462 individuals from the 1000 Genomes Project. We call these loci competing endogenous RNA expression quantitative trait loci or 'cerQTL', and found that a large number of them were unexplored in conventional eQTL mapping. We identified many cerQTLs that have undergone recent positive selection in different human populations, and showed that single nucleotide polymorphisms in gene 3′ UTRs at the miRNA seed binding regions can simultaneously regulate gene expression changes in both cis and trans by the ceRNA mechanism. We also discovered that cerQTLs are significantly enriched in trait/disease-associated variants reported from genome-wide association studies in the miRNA binding sites, suggesting that disease susceptibilities could be attributed to ceRNA regulation. Further in vitro functional experiments demonstrated that a cerQTL, rs11540855, can regulate ceRNA function. These results provide a comprehensive catalog of functional non-coding regulatory variants that may be responsible for ceRNA crosstalk at the post-transcriptional level.


Subjects
Gene Expression Regulation; Gene Regulatory Networks; Genome, Human; MicroRNAs/genetics; Quantitative Trait Loci; RNA, Untranslated/genetics; 3' Untranslated Regions; Base Pairing; Binding Sites; Chromosome Mapping; Genome-Wide Association Study; Humans; MicroRNAs/metabolism; Polymorphism, Single Nucleotide; RNA, Untranslated/metabolism
15.
Sci Rep ; 7: 46204, 2017 Apr 10.
Article in English | MEDLINE | ID: mdl-28393844

ABSTRACT

Accumulating data from genome-wide association studies (GWAS) have provided a collection of novel candidate genes associated with complex diseases, such as atherosclerosis. We identified an atherosclerosis-associated single-nucleotide polymorphism (SNP) located in the intron of the long noncoding RNA (lncRNA) LINC00305 by searching the GWAS database. Although the function of LINC00305 is unknown, we found that LINC00305 expression is enriched in atherosclerotic plaques and monocytes. Overexpression of LINC00305 promoted the expression of inflammation-associated genes in THP-1 cells and reduced the expression of contractile markers in co-cultured human aortic smooth muscle cells (HASMCs). We showed that overexpression of LINC00305 activated nuclear factor-kappa B (NF-κB) and that inhibition of NF-κB abolished LINC00305-mediated activation of cytokine expression. Mechanistically, LINC00305 interacted with lipocalin-1 interacting membrane receptor (LIMR), enhanced the interaction of LIMR and aryl-hydrocarbon receptor repressor (AHRR), and promoted protein expression as well as nuclear localization of AHRR. Moreover, LINC00305 activated NF-κB exclusively in the presence of LIMR and AHRR. In light of these findings, we propose that LINC00305 promotes monocyte inflammation by facilitating LIMR and AHRR cooperation and AHRR activation, which eventually activates NF-κB, thereby inducing HASMC phenotype switching.


Subjects
Basic Helix-Loop-Helix Transcription Factors/metabolism; Inflammation/genetics; Inflammation/pathology; Monocytes/metabolism; Monocytes/pathology; NF-kappa B/metabolism; RNA, Long Noncoding/metabolism; Repressor Proteins/metabolism; Aorta/pathology; Atherosclerosis/genetics; Atherosclerosis/pathology; Basic Helix-Loop-Helix Transcription Factors/genetics; Cell Line; Cell Nucleus/metabolism; Genome-Wide Association Study; Humans; Myocytes, Smooth Muscle/metabolism; Phenotype; Protein Transport; RNA, Long Noncoding/genetics; Receptors, Cell Surface/metabolism; Repressor Proteins/genetics; Signal Transduction/genetics; Up-Regulation
16.
Genome Biol ; 18(1): 52, 2017 Mar 16.
Article in English | MEDLINE | ID: mdl-28302177

ABSTRACT

It remains challenging to predict regulatory variants in particular tissues or cell types due to highly context-specific gene regulation. By connecting large-scale epigenomic profiles to expression quantitative trait loci (eQTLs) in a wide range of human tissues/cell types, we identify critical chromatin features that predict variant regulatory potential. We present cepip, a joint likelihood framework, for estimating a variant's regulatory probability in a context-dependent manner. Our method exhibits significant GWAS signal enrichment and is superior to existing cell type-specific methods. Furthermore, using phenotypically relevant epigenomes to weight the GWAS single-nucleotide polymorphisms, we improve the statistical power of the gene-based association test.


Subjects
Epigenesis, Genetic; Epigenomics/methods; Gene Expression Regulation; Genetic Predisposition to Disease; Genetic Variation; Genome-Wide Association Study/methods; Chromatin/genetics; Cluster Analysis; Gene Expression; Histones/metabolism; Humans; Organ Specificity/genetics; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci
17.
Nucleic Acids Res ; 45(9): e75, 2017 May 19.
Article in English | MEDLINE | ID: mdl-28115622

ABSTRACT

Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for comprehensive downstream analysis. In the present study, we first proposed three novel algorithms (sequence gap-filled gene feature annotation, bit-block encoded genotypes, and sectional fast access to text lines) to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In tests with whole genome sequencing data from the 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SNPEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less of the space to flexibly represent phased or unphased genotypes with multiple alleles and calculated genotypic correlation over 1000 times faster.


Subjects
Algorithms; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans
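
The general idea behind bit-level genotype encoding, mentioned in the KGGSeq abstract, can be sketched as follows. The example packs unphased biallelic genotype codes into two bits each, so four genotypes share one byte. The code assignment and function names are assumptions made for illustration and do not reproduce KGGSeq's actual bit-block format, which the abstract says also covers phased and multi-allelic genotypes.

```python
from typing import List

# Assumed 2-bit codes (not KGGSeq's actual layout): 0, 1, 2 = copies of the
# alternate allele; 3 = missing genotype.
MISSING = 3


def pack_genotypes(genotypes: List[int]) -> bytes:
    """Pack genotype codes (0-3) into bytes, four genotypes per byte."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        # Each genotype occupies two bits at offset 0, 2, 4 or 6 in its byte.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed: bytes, n: int) -> List[int]:
    """Recover the first n genotype codes from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


codes = [0, 1, 2, MISSING, 2, 1]
blob = pack_genotypes(codes)
print(len(blob), unpack_genotypes(blob, len(codes)))
```

Besides saving space, a packed layout lets counts and correlations be computed with bitwise operations on whole machine words rather than one genotype at a time.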
18.
Bioinformatics ; 32(18): 2729-36, 2016 Sep 15.
Article in English | MEDLINE | ID: mdl-27273672

ABSTRACT

MOTIVATION: Prediction and prioritization of human non-coding regulatory variants is critical for understanding the regulatory mechanisms of disease pathogenesis and promoting personalized medicine. Existing tools utilize functional genomics data and evolutionary information to evaluate the pathogenicity or regulatory functions of non-coding variants. However, different algorithms lead to inconsistent and even conflicting predictions. Combining multiple methods may increase accuracy in regulatory variant prediction. RESULTS: Here, we compiled an integrative resource for predictions from eight different tools on functional annotation of non-coding variants. We further developed a composite strategy to integrate multiple predictions and computed the composite likelihood of a given variant being a regulatory variant. Benchmarked by multiple independent causal variant datasets, we demonstrated that our composite model significantly improves the prediction performance. AVAILABILITY AND IMPLEMENTATION: We implemented our model and scoring procedure as a tool, named PRVCS, which is freely available for academic and non-profit use at http://jjwanglab.org/PRVCS. CONTACT: wang.junwen@mayo.edu, jliu@stat.harvard.edu, or limx54@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Models, Theoretical; Molecular Sequence Annotation; Software; Biological Evolution; Genetic Variation; Humans; RNA, Untranslated
19.
Nucleic Acids Res ; 44(D1): D869-76, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26615194

ABSTRACT

Genome-wide association studies (GWASs), now a routine approach to study single-nucleotide polymorphism (SNP)-trait association, have uncovered over ten thousand significant trait/disease associated SNPs (TASs). Here, we updated GWASdb (GWASdb v2, http://jjwanglab.org/gwasdb), which provides comprehensive data curation and knowledge integration for GWAS TASs. These updates include: (i) up to August 2015, we collected 2479 unique publications from PubMed and other resources; (ii) we further curated moderate SNP-trait associations (P-value < 1.0 × 10^-3) from each original publication, and generated a total of 252,530 unique TASs in all GWASdb v2 collected studies; (iii) we manually mapped 1610 GWAS traits to 501 Human Phenotype Ontology (HPO) terms, 435 Disease Ontology (DO) terms and 228 Disease Ontology Lite (DOLite) terms, and for each ontology term we also predicted the putative causal genes; (iv) we curated the detailed sub-populations and related sample size for each study; (v) importantly, we performed extensive functional annotation for each TAS by incorporating gene-based information, ENCODE ChIP-seq assays, eQTL, population haplotype, functional prediction across multiple biological domains, evolutionary signals and disease-related annotation; (vi) additionally, we compiled a SNP-drug response association dataset for 650 pharmacogenetic studies involving 257 drugs in this update; (vii) last, we improved the user interface of the website.


Subjects
Databases, Genetic; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Biological Ontologies; Disease/genetics; Genes; Humans; Molecular Sequence Annotation
20.
Nat Genet ; 47(8): 856-60, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26121088

ABSTRACT

Over a quarter of drugs that enter clinical development fail because they are ineffective. Growing insight into genes that influence human disease may affect how drug targets and indications are selected. However, there is little guidance about how much weight should be given to genetic evidence in making these key decisions. To answer this question, we investigated how well the current archive of genetic evidence predicts drug mechanisms. We found that, among well-studied indications, the proportion of drug mechanisms with direct genetic support increases significantly across the drug development pipeline, from 2.0% at the preclinical stage to 8.2% among mechanisms for approved drugs, and varies dramatically among disease areas. We estimate that selecting genetically supported targets could double the success rate in clinical development. Therefore, using the growing wealth of human genetic data to select the best targets and indications should have a measurable impact on the successful development of new drugs.


Subjects
Drug Approval/statistics & numerical data; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/statistics & numerical data; Polymorphism, Single Nucleotide; Chromosome Mapping; Databases, Genetic/statistics & numerical data; Genetic Association Studies/statistics & numerical data; Genetics, Medical/methods; Genetics, Medical/statistics & numerical data; Humans; Linkage Disequilibrium; Medical Subject Headings/statistics & numerical data; Molecular Targeted Therapy/statistics & numerical data