Results 1 - 20 of 2,208
1.
Nat Commun ; 11(1): 4703, 2020 09 17.
Article in English | MEDLINE | ID: mdl-32943643

ABSTRACT

Deep learning models have shown great promise in predicting regulatory effects from DNA sequence, but their informativeness for human complex diseases is not fully understood. Here, we evaluate genome-wide SNP annotations from two previous deep learning models, DeepSEA and Basenji, by applying stratified LD score regression to 41 diseases and traits (average N = 320K), conditioning on a broad set of coding, conserved and regulatory annotations. We aggregated annotations across all (respectively blood or brain) tissues/cell-types in meta-analyses across all (respectively 11 blood or 8 brain) traits. The annotations were highly enriched for disease heritability, but produced only limited conditionally significant results: non-tissue-specific and brain-specific Basenji-H3K4me3 for all traits and brain traits respectively. We conclude that deep learning models have yet to achieve their full potential to provide considerable unique information for complex disease, and that their conditional informativeness for disease cannot be inferred from their accuracy in predicting regulatory annotations.


Subjects
Deep Learning , Disease/genetics , Molecular Sequence Annotation , Alleles , Genetic Predisposition to Disease , Human Genome , Genome-Wide Association Study , Histones/genetics , Humans , Linkage Disequilibrium , Genetic Models , Phenotype , Single Nucleotide Polymorphism
2.
Adv Exp Med Biol ; 1255: 1-6, 2020.
Article in English | MEDLINE | ID: mdl-32949386

ABSTRACT

Clinical single-cell biomedicine is an emerging discipline that integrates single-cell RNA and DNA sequencing, proteomics, and functions with clinical phenomes, therapeutic responses, and prognosis. It is of great value to discover disease-, phenome-, and therapy-specific diagnostic biomarkers and therapeutic targets on the basis of the principle of clinical single-cell biomedicine. This book reviews the roles of single-cell sequencing and methylation in diseases and explores disease-specific alterations of single-cell sequencing and methylation, focusing especially on potential applications of methodologies for human single-cell sequencing and methylation, on potential correlations between those changes and pulmonary diseases, and on potential roles of signaling pathways that cause heterogeneous cellular responses during treatment. This book also emphasizes the importance of methodologies in clinical practice and application, the potential of perspectives, challenges and solutions, and the significance of single-cell preparation standardization. Alterations of DNA and RNA methylation, demethylation in lung diseases, and a deep knowledge of the regulation and function of target gene methylation for diagnosing and treating diseases at an early stage are also provided. Importantly, this book aims to apply the measurement of single-cell sequencing and methylation to clinical diagnosis and treatment, to understand the clinical value of those parameters, and to highlight and foresee the potential value of applying single-cell sequencing in non-cancer diseases.


Subjects
DNA Methylation , Disease/genetics , Sequence Analysis , Single-Cell Analysis , DNA/genetics , DNA/metabolism , Humans , Proteomics , RNA/genetics , RNA/metabolism
3.
BMC Bioinformatics ; 21(1): 339, 2020 Jul 31.
Article in English | MEDLINE | ID: mdl-32736513

ABSTRACT

BACKGROUND: It has been widely accepted that long non-coding RNAs (lncRNAs) play important roles in the development and progression of human diseases. Many association prediction models have been proposed for predicting lncRNA functions and identifying potential lncRNA-disease associations. Nevertheless, little effort has been made to measure lncRNA functional similarity, which is an essential part of association prediction models. RESULTS: In this study, we present an lncRNA functional similarity calculation model, IDSSIM for short, based on an improved disease semantic similarity method. Its highlight is the introduction of an information content contribution factor into the semantic value calculation, which takes into account both the hierarchical structures of disease directed acyclic graphs and the disease specificities. IDSSIM and three state-of-the-art models, i.e., LNCSIM1, LNCSIM2, and ILNCSIM, were evaluated by feeding their disease semantic similarity matrices and lncRNA functional similarity matrices, together with the corresponding matrices of human lncRNA-disease associations from either the lncRNADisease database or the MNDR database, into the association prediction method WKNKN for lncRNA-disease association prediction. In addition, case studies of breast cancer and adenocarcinoma were performed to validate the effectiveness of IDSSIM.
CONCLUSIONS: Results demonstrated that, in terms of ROC curves and AUC values, IDSSIM is superior to the compared models and can effectively improve the accuracy of disease semantic similarity, thereby increasing the association prediction ability of the IDSSIM-WKNKN model. In the case studies, most of the potential disease-associated lncRNAs predicted by IDSSIM could be confirmed by databases and the literature, implying that IDSSIM can serve as a promising tool for predicting lncRNA functions, identifying potential lncRNA-disease associations, and pre-screening candidate lncRNAs for biological experiments. The IDSSIM code, all experimental data and prediction results are available online at https://github.com/CDMB-lab/IDSSIM .


Subjects
Algorithms , Computational Biology/methods , Disease/genetics , Genetic Models , Long Noncoding RNA/genetics , Semantics , Adenocarcinoma/genetics , Area Under Curve , Breast Neoplasms/genetics , Genetic Databases , Female , Humans , ROC Curve
4.
Stud Health Technol Inform ; 270: 267-271, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570388

ABSTRACT

Information relevant to pharmacogenomics studies is spread across several open databases, which makes it difficult to synthesize the available data. Within the PractikPharma project, several databases were integrated into PGxLOD, a resource dedicated to the generation and verification of pharmacogenomic influence on drug responses. The Comparative Toxicogenomics Database (CTD) describes the toxic effects of many chemicals on living species based on the literature. Since drugs are a particular kind of chemical and side effects are a particular kind of toxic effect, we aimed to extract information from CTD that matches drug side effects in the human species.


Subjects
Disease/etiology , Drug-Related Side Effects and Adverse Reactions , Hazardous Substances/toxicity , Pharmacogenetics , Toxicogenetics , Factual Databases , Disease/genetics , Humans , Research , Systems Integration
5.
Nucleic Acids Res ; 48(W1): W193-W199, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32459338

ABSTRACT

A current challenge in genomics is to interpret non-coding regions and their role in transcriptional regulation of possibly distant target genes. Genome-wide association studies show that a large proportion of genomic variants are found in those non-coding regions, but their mechanisms of gene regulation are often unknown. An additional challenge is to reliably identify the target genes of the regulatory regions, which is an essential step in understanding their impact on gene expression. Here we present the EpiRegio web server, a resource of regulatory elements (REMs). REMs are genomic regions that exhibit variations in their chromatin accessibility profile associated with changes in expression of their target genes. EpiRegio incorporates both epigenomic and gene expression data for various human primary cell types and tissues, providing an integrated view of REMs in the genome. Our web server allows the analysis of genes and their associated REMs, including the REM's activity and its estimated cell type-specific contribution to its target gene's expression. Further, it is possible to explore genomic regions for their regulatory potential and to investigate overlapping REMs, thereby dissecting regions of large epigenomic complexity. EpiRegio allows programmatic access through a REST API and is freely available at https://epiregio.de/.


Subjects
Transcriptional Regulatory Elements , Software , Chromatin Immunoprecipitation Sequencing , Disease/genetics , Gene Expression Regulation , Humans , Transcription Factors/metabolism
6.
Nucleic Acids Res ; 48(W1): W147-W153, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32469063

ABSTRACT

Significant efforts have been invested into understanding and predicting the molecular consequences of mutations in protein coding regions; however, nearly all approaches have been developed using globular, soluble proteins. These methods have been shown to translate poorly to studying the effects of mutations in membrane proteins. To fill this gap, here we report mCSM-membrane, a user-friendly web server that can be used to analyse the impacts of mutations on membrane protein stability and the likelihood of them being disease associated. mCSM-membrane derives from our well-established mutation modelling approach that uses graph-based signatures to model protein geometry and physicochemical properties for supervised learning. Our stability predictor achieved correlations of up to 0.72 and 0.67 (on cross validation and blind tests, respectively), while our pathogenicity predictor achieved a Matthews correlation coefficient (MCC) of up to 0.77 and 0.73, outperforming previously described methods both in predicting changes in stability and in identifying pathogenic variants. mCSM-membrane will be an invaluable and dedicated resource for investigating the effects of single-point mutations on membrane proteins through a freely available, user-friendly web server at http://biosig.unimelb.edu.au/mcsm_membrane.
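
The Matthews correlation coefficient quoted for the pathogenicity predictor can be computed from a binary confusion matrix. The following is a minimal sketch of that standard formula only; the function name `matthews_corrcoef_binary` and the label encoding (1 = pathogenic, 0 = benign) are assumptions made for illustration and are not taken from mCSM-membrane.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient for labels encoded as 0/1.

    Hypothetical helper for illustration; 1 is treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy when the two classes are imbalanced.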


Subjects
Membrane Proteins/chemistry , Membrane Proteins/genetics , Missense Mutation , Software , Disease/genetics , Humans , Point Mutation , Protein Stability , Protein Structural Homology
7.
Nat Commun ; 11(1): 2073, 2020 04 29.
Article in English | MEDLINE | ID: mdl-32350270

ABSTRACT

Functional variomics provides the foundation for personalized medicine by linking genetic variation to disease expression, outcome and treatment, yet its utility is dependent on appropriate assays to evaluate mutation impact on protein function. To fully assess the effects of 106 missense and nonsense variants of PTEN associated with autism spectrum disorder, somatic cancer and PTEN hamartoma syndrome (PHTS), we take a deep phenotypic profiling approach using 18 assays in 5 model systems spanning diverse cellular environments ranging from molecular function to neuronal morphogenesis and behavior. Variants inducing instability occur across the protein, resulting in partial-to-complete loss-of-function (LoF), which is well correlated across models. However, assays are selectively sensitive to variants located in substrate binding and catalytic domains, which exhibit complete LoF or dominant negativity independent of effects on stability. Our results indicate that full characterization of variant impact requires assays sensitive to instability and a range of protein functions.


Subjects
Disease/genetics , Genetic Models , Missense Mutation/genetics , PTEN Phosphohydrolase/genetics , Animals , Animal Behavior , Caenorhabditis elegans/physiology , Cultured Cells , Dendrites/physiology , Drosophila/genetics , Drosophila/growth & development , Enzyme Assays , HEK293 Cells , Humans , Neoplasms/genetics , Nervous System/growth & development , Phosphorylation , Protein Stability , Proto-Oncogene Proteins c-akt/metabolism , Pyramidal Cells/metabolism , Sprague-Dawley Rats , Saccharomyces cerevisiae/metabolism
9.
PLoS Comput Biol ; 16(5): e1007775, 2020 05.
Article in English | MEDLINE | ID: mdl-32413045

ABSTRACT

The human genome harbors a variety of genetic variations. Single-nucleotide changes that alter amino acids in protein-coding regions are one of the major causes of human phenotypic variation and diseases. These single-amino acid variations (SAVs) are routinely found in whole genome and exome sequencing. Evaluating the functional impact of such genomic alterations is crucial for diagnosis of genetic disorders. We developed DeepSAV, a deep-learning convolutional neural network to differentiate disease-causing and benign SAVs based on a variety of protein sequence, structural and functional properties. Our method outperforms most stand-alone programs, and the version incorporating population and gene-level information (DeepSAV+PG) has predictive power similar to that of some of the best available methods. We transformed DeepSAV scores of rare SAVs in the human population into a quantity termed "mutation severity measure" for each human protein-coding gene. It reflects a gene's tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. By this measure, genes implicated in cancer, autism, and viral interaction are found to be intolerant to mutations, while genes associated with a number of other diseases are scored as tolerant. Among known disease-associated genes, those that are mutation-intolerant are likely to function in development and signal transduction pathways, while those that are mutation-tolerant tend to encode metabolic and mitochondrial proteins.


Subjects
Disease/genetics , Forecasting/methods , Human Genome/genetics , Alleles , Amino Acid Sequence/genetics , Computational Biology/methods , Deep Learning , Gene Regulatory Networks/genetics , Humans , Mutation/genetics , Missense Mutation/genetics , Nerve Net , Open Reading Frames/genetics , Sequence Analysis/methods , Whole Exome Sequencing/methods
10.
PLoS One ; 15(5): e0233438, 2020.
Article in English | MEDLINE | ID: mdl-32459809

ABSTRACT

Researchers and clinicians face a significant challenge in keeping up to date with the rapid rate at which new associations between genetic mutations and diseases are reported. To remedy this problem, this research mined the ClinicalTrials.gov corpus to extract relevant biological insights, produce unique reports to summarize findings, and make the meta-data available via APIs. An automated text-analysis pipeline performed the following tasks: parsing the ClinicalTrials.gov files, extracting and analyzing mutations from the corpus, mapping clinical trials to the Human Phenotype Ontology (HPO), and finding associations between clinical trials and HPO nodes. Unique reports were created for each mutation (SNPs and protein mutations) mentioned in the corpus, as well as for each clinical trial that references a mutation. These reports, which have been run over multiple time points, along with APIs to access meta-data, are freely available at http://snpminertrials.com. Additionally, HPO was used to normalize disease terms and associate clinical trials with relevant genes. The creation of the pipeline and reports, the association of clinical trials with HPO terms, and the insights, public repository, and APIs produced are all novel in this work. The freely available resources present relevant biological information and novel insights between biomedical entities in a robust and accessible manner, mitigating the challenge of staying informed about new associations between mutations, genes, and diseases.
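
The mutation-extraction step of such a pipeline can be illustrated with simple pattern matching. The sketch below is an assumption-laden toy and not the pipeline described in the abstract: the regular expressions, the function name `extract_mutations`, and the restriction to dbSNP-style rsIDs and one-letter protein substitutions (for example V600E) are all hypothetical choices made for illustration.

```python
import re

# dbSNP-style identifiers such as rs1801133 (matched case-insensitively).
RSID_PATTERN = re.compile(r"\brs\d+\b", re.IGNORECASE)

# One-letter amino-acid substitutions such as V600E: residue, position, residue.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PROTEIN_MUTATION_PATTERN = re.compile(
    rf"\b[{AMINO_ACIDS}]\d+[{AMINO_ACIDS}]\b"
)


def extract_mutations(text):
    """Return the sorted, de-duplicated mutation mentions found in free text."""
    rsids = sorted({m.group(0).lower() for m in RSID_PATTERN.finditer(text)})
    protein = sorted({m.group(0) for m in PROTEIN_MUTATION_PATTERN.finditer(text)})
    return {"rsids": rsids, "protein_mutations": protein}
```

Patterns this simple will produce false positives on tokens that merely look like substitutions, so a real system would need additional context checks and normalization against gene and variant databases.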


Subjects
Clinical Trials as Topic , Data Mining/methods , Mutation , Biological Ontologies , Disease/genetics , Humans , Internet , Phenotype , Terminology as Topic
11.
Nucleic Acids Res ; 48(W1): W313-W320, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32421816

ABSTRACT

Among the diverse locations of the breakpoints (BPs) of structural variants (SVs), the breakpoints of fusion genes (FGs) are located in gene bodies. This broken gene context provides functional clues for studying disease genesis. Many tumorigenic fusion genes have retained or lost functional or regulatory domains, and these features affect tumorigenesis. Full annotation of fusion genes, aided by a visualization tool based on the two gene bodies, will help in studying the functional aspects of fusion genes. To date, no specialized tool for effectively visualizing the functional features of fusion genes has been available. In this study, we built FGviewer, a tool for visualizing functional features of human fusion genes, which is available at https://ccsmweb.uth.edu/FGviewer. FGviewer takes as input fusion gene symbols, breakpoint information, or structural variants from whole-genome sequence (WGS) data. For any combination of gene pairs/breakpoints involved in fusion genes, users can search the functional/regulatory aspects of the fusion gene at three bio-molecular levels (DNA, RNA, and protein) and one clinical level (pathogenic). FGviewer will be a unique online tool in disease research communities.


Subjects
Gene Fusion , Software , Amino Acid Sequence , Computer Graphics , Disease/genetics , Humans , Transcription Factors/metabolism , Whole Genome Sequencing
12.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subjects
Disease/genetics , Genetic Variation , Medical Genetics/standards , Population Genetics/standards , Human Genome/genetics , Continental Population Groups/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Single Nucleotide Polymorphism/genetics , Reference Standards , Genetic Selection , Whole Genome Sequencing
13.
Nature ; 581(7809): 452-458, 2020 05.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases.
Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
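
One plausible reading of a "proportion expressed across transcripts" style metric is the share of a gene's total isoform expression that comes from the transcripts carrying the variant, averaged over tissues. The sketch below encodes that reading only; the abstract does not give the formula, so the normalization, the per-tissue averaging, the function names, and the toy expression values are all assumptions.

```python
def proportion_expressed(transcript_tpm, transcripts_with_variant):
    """Fraction of a gene's expression carried by transcripts containing the variant.

    transcript_tpm: dict mapping transcript ID -> expression (e.g. TPM) in one tissue.
    transcripts_with_variant: set of transcript IDs on which the variant is annotated.
    """
    total = sum(transcript_tpm.values())
    if total == 0:
        # Gene not expressed in this tissue; treat the proportion as zero.
        return 0.0
    affected = sum(
        tpm for tx, tpm in transcript_tpm.items() if tx in transcripts_with_variant
    )
    return affected / total


def mean_proportion_across_tissues(tpm_by_tissue, transcripts_with_variant):
    """Average the per-tissue proportion over all tissues supplied."""
    values = [
        proportion_expressed(tpm, transcripts_with_variant)
        for tpm in tpm_by_tissue.values()
    ]
    return sum(values) / len(values) if values else 0.0
```

Under this reading, a pLoF variant that falls only on a weakly expressed isoform receives a low value and could be flagged as a likely annotation artifact, whereas a variant on the dominant isoform receives a value near 1.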


Subjects
Disease/genetics , Haploinsufficiency/genetics , Loss of Function Mutation/genetics , Molecular Sequence Annotation , Genetic Transcription , Transcriptome/genetics , Autism Spectrum Disorder/genetics , Datasets as Topic , Developmental Disabilities/genetics , Exons/genetics , Female , Genotype , Humans , Intellectual Disability/genetics , Male , Molecular Sequence Annotation/standards , Poisson Distribution , Messenger RNA/analysis , Messenger RNA/genetics , Rare Diseases/diagnosis , Rare Diseases/genetics , Reproducibility of Results , Whole Exome Sequencing
14.
Am J Hum Genet ; 106(6): 748-763, 2020 06 04.
Article in English | MEDLINE | ID: mdl-32442411

ABSTRACT

The identification of causal variants and mechanisms underlying complex disease traits in humans is important for the progress of human disease genetics; this requires finding strategies to detect functional regulatory variants in disease-relevant cell types. To achieve this, we collected genetic and transcriptomic data from the aortic endothelial cells of up to 157 donors and four epigenomic phenotypes in up to 44 human donors representing individuals of both sexes and three major ancestries. We found thousands of expression quantitative trait loci (eQTLs) at all ranges of effect sizes not detected by the Genotype-Tissue Expression (GTEx) Project in human tissues, showing that novel biological relationships unique to endothelial cells (ECs) are enriched in this dataset. Epigenetic profiling enabled discovery of over 3,000 regulatory elements whose activity is modulated by genetic variants that most frequently mutated ETS, AP-1, and NF-kB binding motifs, implicating these motifs as governors of EC regulation. Using CRISPR interference (CRISPRi), allele-specific reporter assays, and chromatin conformation capture, we validated candidate enhancer variants located up to 750 kb from their target genes, VEGFC, FGD6, and KIF26B. Regulatory SNPs identified were enriched in coronary artery disease (CAD) loci, and this result has specific implications for PECAM-1, FES, and AXL. We also found significant roles for EC regulatory variants in modifying the traits pulse pressure, blood protein levels, and monocyte count. Lastly, we present two unlinked SNPs in the promoter of MFAP2 that exhibit pleiotropic effects on human disease traits. Together, these findings support the possibility that genetic predisposition for complex disease is manifested through the endothelium.


Subjects
Disease/genetics , Endothelial Cells/metabolism , Genetic Enhancer Elements/genetics , Gene Expression Regulation/genetics , Genetic Variation/genetics , Alleles , Genetic Epigenesis/genetics , Female , Humans , Kinesin/genetics , Male , Mutation , NF-kappa B/metabolism , Single Nucleotide Polymorphism/genetics , Proto-Oncogene Protein c-ets-1/metabolism , Quantitative Trait Loci/genetics , Transcription Factor AP-1/metabolism , Transcriptional Regulator ERG/metabolism , Vascular Endothelial Growth Factor C/genetics
15.
BMC Bioinformatics ; 21(1): 180, 2020 May 11.
Article in English | MEDLINE | ID: mdl-32393162

ABSTRACT

BACKGROUND: In recent years, increasing evidence has indicated that long non-coding RNAs (lncRNAs) are deeply involved in a wide range of human biological pathways. The mutations and disorders of lncRNAs are closely associated with many human diseases. Therefore, it is of great importance to predict potential associations between lncRNAs and complex diseases for the diagnosis and cure of complex diseases. However, the functional mechanisms of the majority of lncRNAs remain unclear. As a result, it remains a great challenge to predict potential associations between lncRNAs and diseases. RESULTS: Here, we propose a new method to predict potential lncRNA-disease associations. First, we constructed a bipartite network based on known associations between diseases and lncRNAs/protein coding genes. Then the cluster association scores were calculated to evaluate the strength of the inner relationships between disease clusters and gene clusters. Finally, the gene-disease association scores were defined based on disease-gene cluster association scores and used to measure the strength of potential gene-disease associations. CONCLUSIONS: Leave-one-out cross validation (LOOCV) and 5-fold cross validation tests were implemented to evaluate the performance of our method. Our method achieved reliable performance in the LOOCV (AUCs of 0.8169 and 0.8410 based on Yang's dataset and Lnc2cancer 2.0 database, respectively) and in 5-fold cross validation (AUCs of 0.7573 and 0.8198 based on Yang's dataset and Lnc2cancer 2.0 database, respectively), which were significantly higher than those of the other three comparative methods. Furthermore, our method is simple and efficient. Only the known gene-disease associations are exploited in a graph manner, and further new gene-disease associations can be easily incorporated in our model. The results for melanoma and ovarian cancer have been verified by other studies.
The case studies indicated that our method can provide informative clues for further investigation.
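
The AUC values reported for the cross-validation experiments can be read as the probability that a held-out true association receives a higher score than a non-association. The following sketch computes AUC directly from that pairwise definition; it is a generic illustration, not the authors' evaluation code, and the function name `auc_from_scores` and the 0/1 label encoding are assumptions.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 values (1 = known association held out for testing).
    scores: predicted association scores, aligned with labels.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))
```

This quadratic-time form is fine for small examples; for large score matrices a rank-based computation gives the same value more efficiently.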


Subjects
Computational Biology/methods , Disease/genetics , Genetic Predisposition to Disease , Long Noncoding RNA/genetics , Algorithms , Area Under Curve , Cluster Analysis , Humans , Neoplasms/genetics
16.
Adv Exp Med Biol ; 1253: 3-55, 2020.
Article in English | MEDLINE | ID: mdl-32445090

ABSTRACT

Epigenetic mechanisms, which include DNA methylation (and demethylation), histone modifications, and non-coding RNAs such as microRNAs (miRNAs), can produce heritable phenotypic changes without a change in DNA sequence. Disruption of gene expression patterns that are governed by epigenetics can result in autoimmune diseases, cancers, and various other maladies. Compared to the numerous studies that have focused on the field of genetics, research on epigenetics is fairly recent. In contrast to genetic changes, which are difficult to reverse, epigenetic aberrations can be pharmaceutically reversible. The emerging tools of epigenetics can be used as preventive, diagnostic, and therapeutic markers. With the development of drugs that target the specific epigenetic mechanisms involved in the regulation of gene expression, the development and utilization of epigenetic tools is an appropriate and effective approach that can be clinically applied to the treatment of various diseases.


Subjects
Disease/genetics , Genetic Epigenesis , Health , Animals , DNA Methylation , Epigenomics , Histone Code , Humans , MicroRNAs
17.
Adv Exp Med Biol ; 1253: 57-94, 2020.
Article in English | MEDLINE | ID: mdl-32445091

ABSTRACT

The study of epigenetics has its roots in the study of organism change over time and response to environmental change, although over the past several decades the definition has been formalized to include heritable alterations in gene expression that are not a result of alterations in underlying DNA sequence. In this chapter, we discuss first the history and milestones in the 100+ years of epigenetic study, including early discoveries of DNA methylation, histone posttranslational modification, and noncoding RNA. We then discuss how epigenetics has changed the way that we think of both health and disease, offering as examples studies examining the epigenetic contributions to aging, including the recent development of an epigenetic "clock", and explore how antiaging therapies may work through epigenetic modifications. We then discuss a nonpathogenic role for epigenetics in the clinic: epigenetic biomarkers. We conclude by offering two examples of modern state-of-the-art integrated multi-omics studies of epigenetics in disease pathogenesis, one which sought to capture shared mechanisms among multiple diseases, and another which used epigenetic big data to better understand the pathogenesis of a single tissue from one disease.


Subjects
Disease/genetics , Genetic Epigenesis , Epigenomics , Animals , DNA Methylation , Histone Code , Humans , Untranslated RNA
18.
BMC Bioinformatics ; 21(1): 176, 2020 May 04.
Article in English | MEDLINE | ID: mdl-32366225

ABSTRACT

BACKGROUND: As regulators of gene expression, microRNAs (miRNAs) are increasingly recognized as critical biomarkers of human diseases. To date, a series of computational methods have been proposed to predict new miRNA-disease associations based on similarity measurements. Different categories of miRNA features are applied in these methods for miRNA-miRNA similarity calculation. Benchmarking tests on these miRNA similarity measures are warranted to assess their effectiveness and robustness. RESULTS: In this study, 5 categories of features, i.e. miRNA sequences, miRNA expression profiles in cell lines, miRNA expression profiles in tissues, gene ontology (GO) annotations of miRNA target genes and Medical Subject Headings (MeSH) terms of miRNA-associated diseases, are collected, and similarity values between miRNAs are quantified in each of these feature spaces. We systematically compare the 5 similarities from multiple statistical views. Furthermore, we adopt a rule-based inference method to test their performance on miRNA-disease association predictions with the similarity measurements. A comprehensive comparison is made based on leave-one-out cross-validations and a case study. Experimental results demonstrate that the similarity measurement using MeSH terms performs best among the 5 measurements. It should be noted that the other 4 measurements can also achieve reliable prediction performance. The best-performing similarity measurement is used for new miRNA-disease association predictions and the inferred results are released for further biomedical screening. CONCLUSIONS: Our study suggests that all 5 features, even though some are restricted by data availability, are useful information for inferring novel miRNA-disease associations. However, biased prediction results might be produced by GO- and MeSH-based similarity measurements due to incomplete feature spaces. Similarity fusion may help produce more reliable prediction results.
We expect that future studies will provide more detailed insight into the 5 feature spaces and widen our understanding of disease pathogenesis.


Subjects
Disease/genetics , MicroRNAs/genetics , Algorithms , Biomarkers/analysis , Computational Biology/methods , Gene Ontology , Humans , MicroRNAs/metabolism , Prognosis
19.
PLoS Genet ; 16(4): e1008663, 2020 04.
Article in English | MEDLINE | ID: mdl-32243438

ABSTRACT

Previous studies have surveyed the potential impact of loss-of-function (LoF) variants and identified LoF-tolerant protein-coding genes. However, the tolerance of human genomes to losing enhancers has not yet been evaluated. Here we present the catalog of LoF-tolerant enhancers using structural variants from whole-genome sequences. Using a conservative approach, we estimate that individual human genomes possess at least 28 LoF-tolerant enhancers on average. We assessed the properties of LoF-tolerant enhancers in a unified regulatory network constructed by integrating tissue-specific enhancers and gene-gene interactions. We find that LoF-tolerant enhancers tend to be more tissue-specific and regulate fewer and more dispensable genes relative to other enhancers. They are enriched in immune-related cells while enhancers with low LoF-tolerance are enriched in kidney and brain/neuronal stem cells. We developed a supervised learning approach to predict the LoF-tolerance of all enhancers, which achieved an area under the receiver operating characteristic curve (AUROC) of 98%. We predict that 3,519 more enhancers are likely to be tolerant to LoF and that 129 enhancers have low LoF-tolerance. Our predictions are supported by a known set of disease enhancers and novel deletions from PacBio sequencing. The LoF-tolerance scores provided here will serve as an important reference for disease studies.


Subjects
Enhancer Elements, Genetic/genetics , Genome, Human/genetics , Loss of Function Mutation , Conserved Sequence , Disease/genetics , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Organ Specificity/genetics , ROC Curve , Reproducibility of Results , Supervised Machine Learning
20.
Am J Hum Genet ; 106(5): 611-622, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32275883

ABSTRACT

Population-scale biobanks that combine genetic data and high-dimensional phenotyping for a large number of participants provide an exciting opportunity to perform genome-wide association studies (GWAS) to identify genetic variants associated with diverse quantitative traits and diseases. A major challenge for GWAS in population biobanks is ascertaining disease cases from heterogeneous data sources such as hospital records, digital questionnaire responses, or interviews. In this study, we use genetic parameters, including genetic correlation, to evaluate whether GWAS performed using cases in the UK Biobank ascertained from hospital records, questionnaire responses, and family history of disease implicate similar disease genetics across a range of effect sizes. We find that hospital record and questionnaire GWAS largely identify similar genetic effects for many complex phenotypes and that combining both phenotyping methods improves power to detect genetic associations. We also show that family history GWAS, using cases ascertained on family history of disease, agrees with combined hospital record and questionnaire GWAS and that family history GWAS has better power to detect genetic associations for some phenotypes. Overall, this work demonstrates that digital phenotyping and unstructured phenotype data can be combined with structured data such as hospital records to identify cases for GWAS in biobanks and improve the ability of such studies to identify genetic associations.


Subjects
Disease/genetics , Genome-Wide Association Study , Phenotype , Asthma/genetics , Databases, Factual , Female , Genetics, Medical , Genotype , Humans , Male , Neoplasms/genetics , United Kingdom