1.
Cell ; 186(20): 4386-4403.e29, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37774678

ABSTRACT

Altered microglial states affect neuroinflammation, neurodegeneration, and disease but remain poorly understood. Here, we report 194,000 single-nucleus microglial transcriptomes and epigenomes across 443 human subjects and diverse Alzheimer's disease (AD) pathological phenotypes. We annotate 12 microglial transcriptional states, including AD-dysregulated homeostatic, inflammatory, and lipid-processing states. We identify 1,542 AD-differentially-expressed genes, including both microglia-state-specific and disease-stage-specific alterations. By integrating epigenomic, transcriptomic, and motif information, we infer upstream regulators of microglial cell states, gene-regulatory networks, enhancer-gene links, and transcription-factor-driven microglial state transitions. We demonstrate that ectopic expression of our predicted homeostatic-state activators induces homeostatic features in human iPSC-derived microglia-like cells, while inhibiting activators of inflammation can block inflammatory progression. Lastly, we pinpoint the expression of AD-risk genes in microglial states and differential expression of AD-risk genes and their regulators during AD progression. Overall, we provide insights underlying microglial states, including state-specific and AD-stage-specific microglial alterations at unprecedented resolution.


Subjects
Alzheimer Disease, Microglia, Humans, Alzheimer Disease/genetics, Alzheimer Disease/pathology, Gene Expression Regulation, Inflammation/pathology, Microglia/metabolism, Transcription Factors/metabolism, Transcriptome, Epigenome
2.
Cell ; 186(20): 4422-4437.e21, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37774680

ABSTRACT

Recent work has identified dozens of non-coding loci for Alzheimer's disease (AD) risk, but their mechanisms and AD transcriptional regulatory circuitry are poorly understood. Here, we profile epigenomic and transcriptomic landscapes of 850,000 nuclei from prefrontal cortexes of 92 individuals with and without AD to build a map of the brain regulome, including epigenomic profiles, transcriptional regulators, co-accessibility modules, and peak-to-gene links in a cell-type-specific manner. We develop methods for multimodal integration and detecting regulatory modules using peak-to-gene linking. We show AD risk loci are enriched in microglial enhancers and for specific TFs including SPI1, ELF2, and RUNX1. We detect 9,628 cell-type-specific ATAC-QTL loci, which we integrate alongside peak-to-gene links to prioritize AD variant regulatory circuits. We report differential accessibility of regulatory modules in late AD in glia and in early AD in neurons. Strikingly, late-stage AD brains show global epigenome dysregulation indicative of epigenome erosion and cell identity loss.


Subjects
Alzheimer Disease, Brain, Gene Expression Regulation, Humans, Alzheimer Disease/genetics, Alzheimer Disease/pathology, Brain/pathology, Epigenome, Epigenomics, Genome-Wide Association Study
3.
Nature ; 590(7845): 300-307, 2021 02.
Article in English | MEDLINE | ID: mdl-33536621

ABSTRACT

Annotating the molecular basis of human disease remains an unsolved challenge, as 93% of disease loci are non-coding and gene-regulatory annotations are highly incomplete [1-3]. Here we present EpiMap, a compendium comprising 10,000 epigenomic maps across 800 samples, which we used to define chromatin states, high-resolution enhancers, enhancer modules, upstream regulators and downstream target genes. We used this resource to annotate 30,000 genetic loci that were associated with 540 traits [4], predicting trait-relevant tissues, putative causal nucleotide variants in enriched tissue enhancers and candidate tissue-specific target genes for each. We partitioned multifactorial traits into tissue-specific contributing factors with distinct functional enrichments and disease comorbidity patterns, and revealed both single-factor monotropic and multifactor pleiotropic loci. Top-scoring loci frequently had multiple predicted driver variants, converging through multiple enhancers with a common target gene, multiple genes in common tissues, or multiple genes and multiple tissues, indicating extensive pleiotropy. Our results demonstrate the importance of dense, rich, high-resolution epigenomic annotations for the investigation of complex traits.


Subjects
Disease/genetics, Genetic Epigenesis/genetics, Epigenomics, Gene Regulatory Networks/genetics, Genetic Loci/genetics, Chromatin/genetics, Genetic Enhancer Elements/genetics, Female, Genome-Wide Association Study, Humans, Male, Multifactorial Inheritance/genetics, Organ Specificity/genetics, Reproducibility of Results
4.
Brief Bioinform ; 20(4): 1222-1237, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29220512

ABSTRACT

MOTIVATION: Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. RESULTS: We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. AVAILABILITY: The source code of the benchmarking tool is available as Supplementary Materials.


Subjects
Computational Biology/methods, Statistical Models, DNA Sequence Analysis/statistics & numerical data, Algorithms, Nucleic Acid Databases/statistics & numerical data, High-Throughput Nucleotide Sequencing/statistics & numerical data, Humans, Markov Chains, Sequence Alignment/statistics & numerical data
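
The abstract of record 4 describes statistics "calculated on two k-mer histograms representing two sequences". A minimal Python sketch of that idea follows. The function names, the choice of Manhattan distance and histogram intersection, and the final product with the length difference are illustrative assumptions; they are not taken from the paper's benchmarking tool and do not reproduce its 33 statistics.

```python
from collections import Counter


def kmer_histogram(seq, k):
    """Count every overlapping k-mer in seq (k-mers with non-ACGT letters are skipped)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def manhattan(h1, h2):
    """Sum of absolute count differences; 0 means identical histograms."""
    return sum(abs(h1[w] - h2[w]) for w in set(h1) | set(h2))


def intersection(h1, h2):
    """Number of k-mer occurrences shared by both histograms (a similarity)."""
    return sum(min(h1[w], h2[w]) for w in set(h1) | set(h2))


def length_difference(a, b):
    """Absolute difference in sequence length."""
    return abs(len(a) - len(b))


def paired_statistic(a, b, k):
    """Hypothetical multiplicative combination: Manhattan distance times length difference."""
    return manhattan(kmer_histogram(a, k), kmer_histogram(b, k)) * length_difference(a, b)
```

Building each histogram takes time linear in the sequence length, which is the source of the speed advantage over quadratic alignment that the abstract refers to.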
5.
Bioinformatics ; 36(2): 380-387, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31287494

ABSTRACT

MOTIVATION: Simple tandem repeats, microsatellites in particular, have regulatory functions, links to several diseases and applications in biotechnology. There is an immediate need for an accurate tool for detecting microsatellites in newly sequenced genomes. The currently available tools are either sensitive or specific but not both; some tools require adjusting parameters manually. RESULTS: We propose Look4TRs, the first application of self-supervised hidden Markov models to discovering microsatellites. Look4TRs adapts itself to the input genomes, balancing high sensitivity and low false positive rate. It auto-calibrates itself. We evaluated Look4TRs on 26 eukaryotic genomes. Based on F measure, which combines sensitivity and false positive rate, Look4TRs outperformed TRF and MISA, the most widely used tools, by 78 and 84%. Look4TRs outperformed the second and the third best tools, MsDetector and Tantan, by 17 and 34%. On eight bacterial genomes, Look4TRs outperformed the second and the third best tools by 27 and 137%. AVAILABILITY AND IMPLEMENTATION: https://github.com/TulsaBioinformaticsToolsmith/Look4TRs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics, Eukaryota, Bacterial Genome, Microsatellite Repeats, Software
6.
Nucleic Acids Res ; 46(14): e83, 2018 08 21.
Article in English | MEDLINE | ID: mdl-29718317

ABSTRACT

Sequence clustering is a fundamental step in analyzing DNA sequences. Widely-used software tools for sequence clustering utilize greedy approaches that are not guaranteed to produce the best results. These tools are sensitive to one parameter that determines the similarity among sequences in a cluster. Oftentimes, a biologist may not know the exact sequence similarity. Therefore, clusters produced by these tools are unlikely to match the real clusters in the data if the provided parameter is inaccurate. To overcome this limitation, we adapted the mean shift algorithm, an unsupervised machine-learning algorithm, which has been used successfully thousands of times in fields such as image processing and computer vision. The theory behind the mean shift algorithm, unlike the greedy approaches, guarantees convergence to the modes, i.e. the cluster centers. Here we describe the first application of the mean shift algorithm to clustering DNA sequences. MeShClust is one of the few applications of the mean shift algorithm in bioinformatics. Further, we applied supervised machine learning to predict the identity score produced by global alignment using alignment-free methods. We demonstrate MeShClust's ability to cluster DNA sequences with high accuracy even when the sequence similarity parameter provided by the user is not very accurate.


Subjects
DNA Sequence Analysis/methods, Software, Algorithms, Cluster Analysis, Viral Genome, Microbiota/genetics
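
The abstract of record 6 relies on the mean shift algorithm converging to modes that serve as cluster centers. The Python sketch below shows the generic algorithm on one-dimensional numbers with a flat kernel, only to make that convergence concrete. It is not MeShClust's implementation, which works on DNA sequences; the function name, the `bandwidth` parameter and the mode-merging rule are assumptions made for illustration.

```python
def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Generic flat-kernel mean shift on 1-D points.

    Returns (centers, labels): the discovered modes and, for each input
    point, the index of the mode it converged to.
    """
    modes = []
    for p in points:
        x = float(p)
        for _ in range(max_iter):
            # Average all points inside the window centered on x.
            window = [q for q in points if abs(q - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Merge modes that ended up within one bandwidth of each other.
    centers, labels = [], []
    for x in modes:
        for idx, c in enumerate(centers):
            if abs(c - x) <= bandwidth:
                labels.append(idx)
                break
        else:
            centers.append(x)
            labels.append(len(centers) - 1)
    return centers, labels
```

Note that the number of clusters is not supplied by the caller; it falls out of where the shifted points settle, which is the property the abstract contrasts with greedy, threshold-driven tools.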
7.
Sci Transl Med ; 16(737): eadf4601, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38446899

ABSTRACT

Patients with cancer undergoing chemotherapy frequently experience a neurological condition known as chemotherapy-related cognitive impairment, or "chemobrain," which can persist for the remainder of their lives. Despite the growing prevalence of chemobrain, both its underlying mechanisms and treatment strategies remain poorly understood. Recent findings suggest that chemobrain shares several characteristics with neurodegenerative diseases, including chronic neuroinflammation, DNA damage, and synaptic loss. We investigated whether a noninvasive sensory stimulation treatment we term gamma entrainment using sensory stimuli (GENUS), which has been shown to alleviate aberrant immune and synaptic pathologies in mouse models of neurodegeneration, could also mitigate chemobrain phenotypes in mice administered a chemotherapeutic drug. When administered concurrently with the chemotherapeutic agent cisplatin, GENUS alleviated cisplatin-induced brain pathology, promoted oligodendrocyte survival, and improved cognitive function in a mouse model of chemobrain. These effects persisted for up to 105 days after GENUS treatment, suggesting the potential for long-lasting benefits. However, when administered to mice 90 days after chemotherapy, GENUS treatment only provided limited benefits, indicating that it was most effective when used to prevent the progression of chemobrain pathology. Furthermore, we demonstrated that the effects of GENUS in mice were not limited to cisplatin-induced chemobrain but also extended to methotrexate-induced chemobrain. Collectively, these findings suggest that GENUS may represent a versatile approach for treating chemobrain induced by different chemotherapy agents.


Subjects
Chemotherapy-Related Cognitive Impairment, Cognitive Dysfunction, Humans, Animals, Mice, Cisplatin/adverse effects, Cognition, DNA Damage, Animal Disease Models
8.
bioRxiv ; 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38014075

ABSTRACT

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

9.
NAR Genom Bioinform ; 3(1): lqab001, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33554117

ABSTRACT

Pairwise global alignment is a fundamental step in sequence analysis. Optimal alignment algorithms are quadratic and therefore slow, especially on long sequences. In many applications that involve large sequence datasets, all that is needed is calculating the identity scores (the percentage of identical nucleotides in an optimal alignment, including gaps, of two sequences); there is no need for visualizing how every two sequences are aligned. For these applications, we propose Identity, which produces global identity scores for a large number of pairs of DNA sequences using alignment-free methods and self-supervised general linear models. For the first time, the new tool can predict pairwise identity scores in linear time and space. On two large-scale sequence databases, Identity provided the best compromise between sensitivity and precision while being faster than BLAST, Mash, MUMmer4 and USEARCH by 2-80 times. Identity was the best performing tool when searching for low-identity matches. While constructing phylogenetic trees from about 6000 transcripts, the tree due to the scores reported by Identity was the closest to the reference tree (in contrast to andi, FSWM and Mash). Identity is capable of producing pairwise identity scores of millions-of-nucleotides-long bacterial genomes; this task cannot be accomplished by any global-alignment-based tool. Availability: https://github.com/BioinformaticsToolsmith/Identity.
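
The identity score defined in the abstract of record 9 (identical nucleotides divided by the length of an optimal global alignment, gaps included) can be computed exactly with a quadratic dynamic program. The Python sketch below does that with a Needleman-Wunsch style alignment; it is the slow baseline the abstract argues against, not the Identity tool's alignment-free predictor. The scoring values (`match=1`, `mismatch=-1`, `gap=-1`) and the function name are assumptions chosen for illustration.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Identity of one optimal global alignment: matches / alignment columns."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting matches and alignment columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    # Two empty sequences have no columns; 0.0 is returned by convention here.
    return matches / columns if columns else 0.0
```

Both the table fill and its memory grow with the product of the two sequence lengths, which is why this approach does not scale to the genome-length inputs mentioned in the abstract.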

10.
Genome Biol ; 20(1): 144, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31345254

ABSTRACT

BACKGROUND: Alignment-free (AF) sequence comparison is attracting persistent interest driven by data-intensive applications. Hence, many AF procedures have been proposed in recent years, but a lack of a clearly defined benchmarking consensus hampers their performance assessment. RESULTS: Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events. CONCLUSION: The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.


Subjects
Sequence Analysis, Benchmarking, Horizontal Gene Transfer, Internet, Phylogeny, Nucleic Acid Regulatory Sequences, Sequence Alignment, Protein Sequence Analysis, Software