Results 1 - 20 of 179
1.
Int J Mol Sci ; 25(9)2024 May 03.
Article in English | MEDLINE | ID: mdl-38732207

ABSTRACT

Prediction of binding sites for transcription factors is important to understand how the latter regulate gene expression and how this regulation can be modulated for therapeutic purposes. A consistent number of references address this issue with different approaches, Machine Learning being one of the most successful. Nevertheless, we note that many such approaches fail to propose a robust and meaningful method to embed the genetic data under analysis. We try to overcome this problem by proposing a bidirectional transformer-based encoder, empowered by bidirectional long-short term memory layers and with a capsule layer responsible for the final prediction. To evaluate the efficiency of the proposed approach, we use benchmark ChIP-seq datasets of five cell lines available in the ENCODE repository (A549, GM12878, Hep-G2, H1-hESC, and Hela). The results show that the proposed method can predict TFBS within the five different cell lines very well; moreover, cross-cell predictions provide satisfactory results as well. Experiments conducted across cell lines are reinforced by the analysis of five additional lines used only to test the model trained using the others. The results confirm that prediction across cell lines remains very high, allowing an extensive cross-transcription factor analysis to be performed from which several indications of interest for molecular biology may be drawn.


Subject(s)
Deep Learning , Transcription Factors , Humans , Transcription Factors/metabolism , Transcription Factors/genetics , Binding Sites , Computational Biology/methods , HeLa Cells , Protein Binding , Chromatin Immunoprecipitation Sequencing/methods , Cell Line
2.
Nat Commun ; 15(1): 3606, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38697975

ABSTRACT

Amyotrophic Lateral Sclerosis (ALS), like many other neurodegenerative diseases, is highly heritable, but with only a small fraction of cases explained by monogenic disease alleles. To better understand sporadic ALS, we report epigenomic profiles, as measured by ATAC-seq, of motor neuron cultures derived from a diverse group of 380 ALS patients and 80 healthy controls. We find that chromatin accessibility is heavily influenced by sex, the iPSC cell type of origin, ancestry, and the inherent variance arising from sequencing. Once these covariates are corrected for, we are able to identify ALS-specific signals in the data. Additionally, we find that the ATAC-seq data is able to predict ALS disease progression rates with similar accuracy to methods based on biomarkers and clinical status. These results suggest that iPSC-derived motor neurons recapitulate important disease-relevant epigenomic changes.


Subject(s)
Amyotrophic Lateral Sclerosis , Induced Pluripotent Stem Cells , Motor Neurons , Humans , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/pathology , Amyotrophic Lateral Sclerosis/metabolism , Induced Pluripotent Stem Cells/metabolism , Motor Neurons/metabolism , Motor Neurons/pathology , Male , Female , Middle Aged , Case-Control Studies , Chromatin/metabolism , Chromatin/genetics , Aged , Epigenomics/methods , Chromatin Immunoprecipitation Sequencing/methods , Disease Progression , Epigenesis, Genetic
3.
Sci Rep ; 14(1): 9275, 2024 04 23.
Article in English | MEDLINE | ID: mdl-38654130

ABSTRACT

Transcription factors (TFs) are crucial epigenetic regulators, which enable cells to dynamically adjust gene expression in response to environmental signals. Computational procedures like digital genomic footprinting on chromatin accessibility assays such as ATAC-seq can be used to identify bound TFs on a genome-wide scale. This method utilizes short regions of low accessibility signal caused by steric hindrance of DNA-bound proteins, called footprints (FPs), which are combined with motif databases for TF identification. However, while over 1600 TFs have been described in the human genome, only ~700 of these have a known binding motif. Thus, a substantial number of FPs without overlap to a known DNA motif are normally discarded from FP analysis. In addition, the FP method is restricted to organisms with a substantial number of known TF motifs. Here we present DENIS (DE Novo motIf diScovery), a framework to generate and systematically investigate the potential of de novo TF motif discovery from FPs. DENIS includes functionality (1) to isolate FPs without binding motifs, (2) to perform de novo motif generation and (3) to characterize novel motifs. Here, we show that the framework rediscovers artificially removed TF motifs, quantifies de novo motif usage in an example dataset of early embryonic development, and is able to analyze and uncover TF activity in organisms lacking canonical motifs. The latter task is exemplified by an investigation of a scATAC-seq dataset in zebrafish which covers different cell types during hematopoiesis.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Nucleotide Motifs , Transcription Factors , Zebrafish , Transcription Factors/metabolism , Transcription Factors/genetics , Animals , Zebrafish/genetics , Zebrafish/metabolism , Chromatin Immunoprecipitation Sequencing/methods , Humans , Binding Sites , Protein Binding , DNA Footprinting/methods , Computational Biology/methods , Chromatin/metabolism , Chromatin/genetics
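
Step (1) of the DENIS abstract above, isolating footprints that do not overlap a known motif, can be illustrated with a simple interval filter. This is only a minimal sketch and not the DENIS implementation: the function names, the tuple layout `(chrom, start, end)`, and the half-open coordinate convention are all assumptions made for illustration.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def footprints_without_motif(footprints, motif_sites):
    """Return footprints that overlap no known motif site on the same chromosome.

    Both inputs are lists of (chrom, start, end) tuples (hypothetical layout).
    """
    # Group motif sites by chromosome so each footprint is only compared
    # against sites that could possibly overlap it.
    by_chrom = {}
    for chrom, start, end in motif_sites:
        by_chrom.setdefault(chrom, []).append((start, end))

    unexplained = []
    for chrom, start, end in footprints:
        sites = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, s, e) for s, e in sites):
            unexplained.append((chrom, start, end))
    return unexplained


# Toy data: only the first footprint overlaps a known motif site.
footprints = [("chr1", 100, 120), ("chr1", 500, 525), ("chr2", 100, 120)]
motif_sites = [("chr1", 110, 130), ("chr2", 300, 320)]
orphan_footprints = footprints_without_motif(footprints, motif_sites)
```

The linear scan per footprint is fine for a toy example; a genome-scale run would more likely use sorted intervals or an interval tree.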
4.
Nucleic Acids Res ; 52(9): e46, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38647069

ABSTRACT

SifiNet is a robust and accurate computational pipeline for identifying distinct gene sets, extracting and annotating cellular subpopulations, and elucidating intrinsic relationships among these subpopulations. Uniquely, SifiNet bypasses the cell clustering stage, commonly integrated into other cellular annotation pipelines, thereby circumventing potential inaccuracies in clustering that may compromise subsequent analyses. Consequently, SifiNet has demonstrated superior performance in multiple experimental datasets compared with other state-of-the-art methods. SifiNet can analyze both single-cell RNA and ATAC sequencing data, thereby rendering comprehensive multi-omic cellular profiles. It is conveniently available as an open-source R package.


Subject(s)
Single-Cell Analysis , Software , Single-Cell Analysis/methods , Humans , Molecular Sequence Annotation , Algorithms , Computational Biology/methods , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Chromatin Immunoprecipitation Sequencing/methods , Cluster Analysis
5.
Nucleic Acids Res ; 52(8): 4137-4150, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38572749

ABSTRACT

DNA motifs are crucial patterns in gene regulation. DNA-binding proteins (DBPs), including transcription factors, can bind to specific DNA motifs to regulate gene expression and other cellular activities. Past studies suggest that DNA shape features could be subtly involved in DNA-DBP interactions. Therefore, the shape motif annotations based on intrinsic DNA topology can deepen the understanding of DNA-DBP binding. Nevertheless, high-throughput tools for DNA shape motif discovery that incorporate multiple features altogether remain insufficient. To address it, we propose a series of methods to discover non-redundant DNA shape motifs with the generalization to multiple motifs in multiple shape features. Specifically, an existing Gibbs sampling method is generalized to multiple DNA motif discovery with multiple shape features. Meanwhile, an expectation-maximization (EM) method and a hybrid method coupling EM with Gibbs sampling are proposed and developed with promising performance, convergence capability, and efficiency. The discovered DNA shape motif instances reveal insights into low-signal ChIP-seq peak summits, complementing the existing sequence motif discovery works. Additionally, our modelling captures the potential interplays across multiple DNA shape features. We provide a valuable platform of tools for DNA shape motif discovery. An R package is built for open accessibility and long-lasting impact: https://zenodo.org/doi/10.5281/zenodo.10558980.


Subject(s)
DNA , Nucleotide Motifs , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/genetics , Algorithms , Nucleic Acid Conformation , Chromatin Immunoprecipitation Sequencing/methods , Binding Sites , Transcription Factors/metabolism , Transcription Factors/genetics , Transcription Factors/chemistry , Humans , Protein Binding
6.
Methods ; 226: 151-160, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38670416

ABSTRACT

Chromatin loops are of crucial importance for the regulation of gene transcription. Cohesin is a chromatin-associated protein that mediates chromatin interactions through loop extrusion. Cohesin-mediated chromatin interactions have strong cell-type specificity, posing a challenge for predicting chromatin loops. Existing computational methods perform poorly in predicting cell-type-specific chromatin loops. To address this issue, we propose a random forest model to predict cell-type-specific cohesin-mediated chromatin loops based on chromatin states identified by ChromHMM and the occupancy of related factors. Our results show that chromatin state is responsible for the cell-type specificity of loops. Using only chromatin states as features, the model achieved high accuracy in predicting cell-type-specific loops between two cell types and can be applied to different cell types. Furthermore, when chromatin states are combined with the occurrence frequency of CTCF, RAD21, YY1, and H3K27ac ChIP-seq peaks, more accurate prediction can be achieved. Our feature extraction method provides novel insights into predicting cell-type-specific chromatin loops and reveals the relationship between chromatin state and chromatin loop formation.


Subject(s)
CCCTC-Binding Factor , Cell Cycle Proteins , Chromatin , Chromosomal Proteins, Non-Histone , Cohesins , Chromosomal Proteins, Non-Histone/metabolism , Chromosomal Proteins, Non-Histone/genetics , Cell Cycle Proteins/metabolism , Cell Cycle Proteins/genetics , Chromatin/metabolism , Chromatin/genetics , Humans , CCCTC-Binding Factor/metabolism , CCCTC-Binding Factor/genetics , YY1 Transcription Factor/metabolism , YY1 Transcription Factor/genetics , Nuclear Proteins/metabolism , Nuclear Proteins/genetics , Computational Biology/methods , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/genetics , Histones/metabolism , Histones/genetics , Phosphoproteins/metabolism , Phosphoproteins/genetics , Chromatin Immunoprecipitation Sequencing/methods
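
The abstract above uses chromatin states as model features but does not say how they are encoded. One plausible encoding, shown here purely as an assumption, is the fraction of each loop anchor covered by each ChromHMM state, concatenated over the two anchors. The function names, state labels, and segment layout below are hypothetical; the resulting vectors could then be passed to any random forest implementation.

```python
def state_fractions(anchor, segments, states):
    """Fraction of one anchor covered by each chromatin state.

    anchor   -- (start, end), half-open, on a single chromosome
    segments -- list of (start, end, state) on that chromosome
    states   -- ordered state labels defining the feature order
    """
    start, end = anchor
    length = end - start
    covered = {state: 0 for state in states}
    for seg_start, seg_end, state in segments:
        # Number of bases shared by the segment and the anchor.
        overlap = min(end, seg_end) - max(start, seg_start)
        if overlap > 0 and state in covered:
            covered[state] += overlap
    return [covered[state] / length for state in states]


def loop_features(left_anchor, right_anchor, segments, states):
    """Concatenate the state fractions of both anchors into one vector."""
    return (state_fractions(left_anchor, segments, states)
            + state_fractions(right_anchor, segments, states))


# Toy segmentation with three made-up state labels.
states = ["Enh", "Quies", "Tx"]
segments = [(0, 400, "Enh"), (400, 1000, "Quies"), (1000, 1500, "Tx")]
features = loop_features((200, 1200), (1000, 1500), segments, states)
```

Each anchor contributes one value per state, so a loop is described by `2 * len(states)` numbers; ChIP-seq peak counts for factors such as CTCF or RAD21 could be appended in the same way.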
7.
Nat Comput Sci ; 4(4): 285-298, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38600256

ABSTRACT

The single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) technology provides insight into gene regulation and epigenetic heterogeneity at single-cell resolution, but cell annotation from scATAC-seq remains challenging due to high dimensionality and extreme sparsity within the data. Existing cell annotation methods mostly focus on the cell peak matrix without fully utilizing the underlying genomic sequence. Here we propose a method, SANGO, for accurate single-cell annotation by integrating genome sequences around the accessibility peaks within scATAC data. The genome sequences of peaks are encoded into low-dimensional embeddings, and then iteratively used to reconstruct the peak statistics of cells through a fully connected network. The learned weights are considered as regulatory modes to represent cells, and utilized to align the query cells and the annotated cells in the reference data through a graph transformer network for cell annotations. SANGO was demonstrated to consistently outperform competing methods on 55 paired scATAC-seq datasets across samples, platforms and tissues. SANGO was also shown to be able to detect unknown tumor cells through attention edge weights learned by the graph transformer. Moreover, from the annotated cells, we found cell-type-specific peaks that provide functional insights/biological signals through expression enrichment analysis, cis-regulatory chromatin interaction analysis and motif enrichment analysis.


Subject(s)
Chromatin , Single-Cell Analysis , Humans , Algorithms , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation Sequencing/methods , Computational Biology/methods , Genome/genetics , Genomics/methods , Neoplasms/genetics , Single-Cell Analysis/methods , Transposases/genetics , Transposases/metabolism
8.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38493346

ABSTRACT

Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) data provided new insights into the understanding of epigenetic heterogeneity and transcriptional regulation. With the increasing abundance of dataset resources, there is an urgent need to extract more useful information through high-quality data analysis methods specifically designed for scATAC-seq. However, analyzing scATAC-seq data poses challenges due to its near binarization, high sparsity and ultra-high dimensionality properties. Here, we proposed a novel network diffusion-based computational method to comprehensively analyze scATAC-seq data, named Single-Cell ATAC-seq Analysis via Network Refinement with Peaks Location Information (SCARP). SCARP formulates the Network Refinement diffusion method under the graph theory framework to aggregate information from different network orders, effectively compensating for missing signals in the scATAC-seq data. By incorporating distance information between adjacent peaks on the genome, SCARP also contributes to depicting the co-accessibility of peaks. These two innovations empower SCARP to obtain lower-dimensional representations for both cells and peaks more effectively. We have demonstrated through sufficient experiments that SCARP facilitated superior analyses of scATAC-seq data. Specifically, SCARP exhibited outstanding cell clustering performance, enabling better elucidation of cell heterogeneity and the discovery of new biologically significant cell subpopulations. Additionally, SCARP was also instrumental in portraying co-accessibility relationships of accessible regions and providing new insight into transcriptional regulation. Consequently, SCARP identified genes that were involved in key Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways related to diseases and predicted reliable cis-regulatory interactions. To sum up, our studies suggested that SCARP is a promising tool to comprehensively analyze the scATAC-seq data.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Chromatin , Chromatin Immunoprecipitation Sequencing/methods , Chromatin/genetics , Genome , Epigenomics , Data Analysis
9.
Genome Biol ; 24(1): 244, 2023 10 24.
Article in English | MEDLINE | ID: mdl-37875977

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) measures gene expression in single cells, while single-nucleus ATAC-sequencing (snATAC-seq) quantifies chromatin accessibility in single nuclei. These two data types provide complementary information for deciphering cell types and states. However, when analyzed individually, they sometimes produce conflicting results regarding cell type/state assignment. The power is compromised since the two modalities reflect the same underlying biology. Recently, it has become possible to measure both gene expression and chromatin accessibility from the same nucleus. Such paired data enable the direct modeling of the relationships between the two modalities. Given the availability of the vast amount of single-modality data, it is desirable to integrate the paired and unpaired single-modality datasets to gain a comprehensive view of the cellular complexity. RESULTS: We benchmark nine existing single-cell multi-omic data integration methods. Specifically, we evaluate to what extent the multiome data provide additional guidance for analyzing the existing single-modality data, and whether these methods uncover peak-gene associations from single-modality data. Our results indicate that multiome data are helpful for annotating single-modality data. However, we emphasize that the availability of an adequate number of nuclei in the multiome dataset is crucial for achieving accurate cell type annotation. Insufficient representation of nuclei may compromise the reliability of the annotations. Additionally, when generating a multiome dataset, the number of cells is more important than sequencing depth for cell type annotation. CONCLUSIONS: Seurat v4 is the best currently available platform for integrating scRNA-seq, snATAC-seq, and multiome data even in the presence of complex batch effects.


Subject(s)
Benchmarking , Chromatin Immunoprecipitation Sequencing , Chromatin Immunoprecipitation Sequencing/methods , Reproducibility of Results , Single-Cell Gene Expression Analysis , Algorithms , Chromatin/genetics , Single-Cell Analysis/methods , Sequence Analysis, RNA
10.
Curr Protoc ; 3(10): e909, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37830781

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a widely employed technique for investigating protein-DNA interactions. However, the absence of a standardized and clear workflow necessitates researchers to independently assemble methodologies from diverse resources. This lack of uniformity hampers reproducibility and makes version control a complex endeavor, thereby limiting the accessibility of ChIP-seq analyses to individuals with extensive training in bioinformatics. In light of these challenges, we have developed an executable protocol that addresses these limitations. Our protocol encompasses all aspects of ChIP-seq analysis, ranging from quality control of raw reads to peak calling and downstream functional analyses. We have implemented two distinct approaches for peak calling, providing researchers with flexibility to choose the most suitable method for their specific experimental needs. This protocol will contribute to the scientific community by providing a standardized and clear resource that will enhance the reproducibility and accessibility of ChIP-seq analyses. © 2023 Wiley Periodicals LLC. Basic Protocol: ChIP-seq analysis workflow Alternative Protocol: Call differentially enriched peaks by using MACS3.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Genomics , Humans , Chromatin Immunoprecipitation Sequencing/methods , Reproducibility of Results , Genomics/methods , Chromatin Immunoprecipitation/methods , DNA/genetics
11.
Nat Commun ; 14(1): 6045, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37770437

ABSTRACT

Single-cell multi-omics data integration aims to reduce the omics difference while keeping the cell type difference. However, it is daunting to model and distinguish the two differences due to cell heterogeneity. Namely, even cells of the same omics and type would have various features, making the two differences less significant. In this work, we reveal that instead of being an interference, cell heterogeneity could be exploited to improve data integration. Specifically, we observe that the omics difference varies in cells, and cells with smaller omics differences are easier to integrate. Hence, unlike most existing works that homogeneously treat and integrate all cells, we propose a multi-omics data integration method (dubbed scBridge) that integrates cells in a heterogeneous manner. In brief, scBridge iterates between i) identifying reliable scATAC-seq cells that have smaller omics differences, and ii) integrating reliable scATAC-seq cells with scRNA-seq data to narrow the omics gap, thus benefiting the integration of the remaining cells. Extensive experiments on seven multi-omics datasets demonstrate the superiority of scBridge compared with six representative baselines.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Single-Cell Gene Expression Analysis , Chromatin Immunoprecipitation Sequencing/methods , Single-Cell Analysis/methods , Multiomics
12.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37497729

ABSTRACT

Here, we present AtacAnnoR, a two-round annotation method for scATAC-seq data using well-annotated scRNA-seq data as reference. We evaluate AtacAnnoR's performance against six competing methods on 11 benchmark datasets. Our results show that AtacAnnoR achieves the highest mean accuracy and the highest mean balanced accuracy and performs particularly well when unpaired scRNA-seq data are used as the reference. Furthermore, AtacAnnoR implements a 'Combine and Discard' strategy to further improve annotation accuracy when annotations of multiple references are available. AtacAnnoR has been implemented in an R package and can be directly integrated into currently popular scATAC-seq analysis pipelines.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Single-Cell Analysis , Chromatin Immunoprecipitation Sequencing/methods , Single-Cell Analysis/methods , Benchmarking , Agriculture , Exome Sequencing , Sequence Analysis, RNA/methods
13.
J Vis Exp ; (193)2023 03 17.
Article in English | MEDLINE | ID: mdl-37010313

ABSTRACT

Histone post-translational modifications (PTMs) and other epigenetic modifications regulate the chromatin accessibility of genes to the transcriptional machinery, thus affecting an organism's capacity to respond to environmental stimuli. Chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq) has been widely utilized to identify and map protein-DNA interactions in the fields of epigenetics and gene regulation. However, the field of cnidarian epigenetics is hampered by a lack of applicable protocols, partly due to the unique features of model organisms such as the symbiotic sea anemone Exaiptasia diaphana, whose high water content and mucus amounts obstruct molecular methods. Here, a specialized ChIP procedure is presented, which facilitates the investigation of protein-DNA interactions in E. diaphana gene regulation. The cross-linking and chromatin extraction steps were optimized for efficient immunoprecipitation and then validated by performing ChIP using an antibody against the histone mark H3K4me3. Subsequently, the specificity and effectiveness of the ChIP assay were confirmed by measuring the relative occupancy of H3K4me3 around several constitutively activated gene loci using quantitative PCR and by next-generation sequencing for genome-wide scale analysis. This optimized ChIP protocol for the symbiotic sea anemone E. diaphana facilitates the investigation of the protein-DNA interactions involved in organismal responses to environmental changes that affect symbiotic cnidarians, such as corals.


Subject(s)
Sea Anemones , Animals , Sea Anemones/genetics , Chromatin/genetics , Chromatin Immunoprecipitation/methods , Chromatin Immunoprecipitation Sequencing/methods , DNA , High-Throughput Nucleotide Sequencing/methods
14.
Nat Commun ; 14(1): 1864, 2023 04 03.
Article in English | MEDLINE | ID: mdl-37012226

ABSTRACT

Computational cell type identification is a fundamental step in single-cell omics data analysis. Supervised celltyping methods have gained increasing popularity in single-cell RNA-seq data analysis because of their superior performance and the availability of high-quality reference datasets. Recent technological advances in profiling chromatin accessibility at single-cell resolution (scATAC-seq) have brought new insights to the understanding of epigenetic heterogeneity. With the continuous accumulation of scATAC-seq datasets, a supervised celltyping method specifically designed for scATAC-seq is urgently needed. Here we develop Cellcano, a computational method based on a two-round supervised learning algorithm to identify cell types from scATAC-seq data. The method alleviates the distributional shift between reference and target data and improves the prediction performance. After systematically benchmarking Cellcano on 50 well-designed celltyping tasks from various datasets, we show that Cellcano is accurate, robust, and computationally efficient. Cellcano is well-documented and freely available at https://marvinquiet.github.io/Cellcano/ .


Subject(s)
Chromatin Immunoprecipitation Sequencing , Chromatin , Chromatin Immunoprecipitation Sequencing/methods , Chromatin/genetics , Algorithms , Epigenomics , Single-Cell Analysis/methods
15.
BMC Genomics ; 24(1): 171, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37016279

ABSTRACT

Chromatin immunoprecipitation (ChIP) is an antibody-based approach that is frequently utilized in chromatin biology and epigenetics. Experimental variability caused by the unpredictable amount of usable input from samples and by undefined antibody titers in ChIP reactions remains to be addressed. Here, we introduce a simple and quick method to quantify chromatin inputs and demonstrate its utility for normalizing antibody amounts to the optimal titer in individual ChIP reactions. As a proof of concept, we utilized ChIP-seq validated antibodies against the key enhancer mark, acetylation of histone H3 on lysine 27 (H3K27ac), in the experiments. The results indicate that the titration-based normalization of antibody amounts improves assay outcomes, including the consistency among samples both within and across experiments, for a broad range of input amounts.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Histones , Chromatin Immunoprecipitation Sequencing/methods , Chromatin Immunoprecipitation/methods , Histones/genetics , Chromatin , Antibodies
16.
Int J Mol Sci ; 24(5)2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36902216

ABSTRACT

Recent advances in the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, providing deeper insights into cellular states and dynamics. However, few research efforts have been dedicated to modeling the relationship between regulatory grammars and single-cell chromatin accessibility and incorporating different analysis scenarios of scATAC-seq data into a general framework. To this end, we propose a unified deep learning framework based on the ProdDep Transformer Encoder, dubbed PROTRAIT, for scATAC-seq data analysis. Specifically motivated by the deep language model, PROTRAIT leverages the ProdDep Transformer Encoder to capture the syntax of transcription factor (TF)-DNA binding motifs from scATAC-seq peaks for predicting single-cell chromatin accessibility and learning single-cell embedding. Based on cell embedding, PROTRAIT annotates cell types using the Louvain algorithm. Furthermore, according to the likely noise identified in raw scATAC-seq data, PROTRAIT denoises these values based on predicted chromatin accessibility. In addition, PROTRAIT employs differential accessibility analysis to infer TF activity at single-cell and single-nucleotide resolution. Extensive experiments based on the Buenrostro2018 dataset validate the effectiveness of PROTRAIT for chromatin accessibility prediction, cell type annotation, and scATAC-seq data denoising, therein outperforming current approaches in terms of different evaluation metrics. In addition, we confirm the consistency between the inferred TF activity and the literature. We also demonstrate the scalability of PROTRAIT to analyze datasets containing over one million cells.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Deep Learning , Chromatin Immunoprecipitation Sequencing/methods , Chromatin , Gene Expression Regulation , Transcription Factors/metabolism , Single-Cell Analysis/methods
17.
STAR Protoc ; 4(1): 101991, 2023 03 17.
Article in English | MEDLINE | ID: mdl-36607812

ABSTRACT

Computational pipelines for chromatin immunoprecipitation sequencing analysis can neglect colocalization events that occur in a mere subset of the genome. Here, we detail a streamlined approach for assessing colocalization of chromatin-bound proteins using the bedGraph2Cluster and PanChIP algorithms. Using histone modifications as an example, bedGraph2Cluster performs clustering analysis on chromatin binding patterns of target proteins. PanChIP then compares these clusters with a reference library of chromatin binding patterns and measures the overlap in peaks, capturing the heterogeneity in chromatin binding and colocalization patterns. For complete details on the use and execution of this protocol, please refer to Sanidas et al. (2022).


Subject(s)
Chromatin , High-Throughput Nucleotide Sequencing , Chromatin/genetics , Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing/methods , Chromatin Immunoprecipitation Sequencing/methods , Genome
18.
BMC Genomics ; 24(1): 43, 2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36698077

ABSTRACT

BACKGROUND: Epigenomic profiling assays such as ChIP-seq have been widely used to map the genome-wide enrichment profiles of chromatin-associated proteins and posttranslational histone modifications. Sequencing depth is a key parameter in experimental design and quality control. However, due to variable sequencing depth requirements across experimental conditions, it can be challenging to determine optimal sequencing depth, particularly for projects involving multiple targets or cell types. RESULTS: We developed the peaksat R package to provide target read depth estimates for epigenomic experiments based on the analysis of peak saturation curves. We applied peaksat to establish the distinctive read depth requirements for ChIP-seq studies of histone modifications in different cell lines. Using peaksat, we were able to estimate the target read depth required per library to obtain high-quality peak calls for downstream analysis. In addition, peaksat was applied to other sequence-enrichment methods including CUT&RUN and ATAC-seq. CONCLUSION: peaksat addresses a need for researchers to make informed decisions about whether their sequencing data has been generated to an adequate depth and subsequently sufficient meaningful peaks, and failing that, how many more reads would be required per library. peaksat is applicable to other sequence-based methods that include calling peaks in their analysis.


Subject(s)
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Chromatin Immunoprecipitation Sequencing/methods , Sequence Analysis, DNA/methods , Gene Library
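
The idea of estimating a target read depth from a peak saturation curve, as described in the peaksat abstract above, can be sketched as follows. This is not the peaksat method, and peaksat itself is an R package. The sketch assumes a hyperbolic saturation model, `peaks = p_max * d / (k + d)`, fitted by least squares on its double-reciprocal form; the model choice, function names, and the synthetic numbers are all assumptions for illustration.

```python
def fit_saturation(depths, peak_counts):
    """Fit peaks = p_max * d / (k + d) to (depth, peak count) pairs.

    Uses ordinary least squares on the linearized form
    1/peaks = 1/p_max + (k / p_max) * (1/d).
    Returns (p_max, k).
    """
    xs = [1.0 / d for d in depths]
    ys = [1.0 / p for p in peak_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    p_max = 1.0 / intercept   # plateau of the curve
    k = slope * p_max         # depth at which half of p_max is reached
    return p_max, k


def depth_for_fraction(k, fraction):
    """Read depth at which the fitted curve reaches `fraction` of p_max."""
    return k * fraction / (1.0 - fraction)


# Synthetic subsampling results generated from the model itself
# (true plateau 50,000 peaks, half-saturation at 10 million reads).
depths = [5e6, 10e6, 20e6, 40e6]
peak_counts = [50000 * d / (1e7 + d) for d in depths]

p_max, k = fit_saturation(depths, peak_counts)
target_depth = depth_for_fraction(k, 0.9)  # depth for 90% of the plateau
```

The double-reciprocal fit weights shallow subsamples heavily, so with noisy real data a nonlinear fit would probably be more robust; the linearized version is used here only to keep the sketch dependency-free.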
19.
Methods Mol Biol ; 2614: 313-348, 2023.
Article in English | MEDLINE | ID: mdl-36587133

ABSTRACT

Cancer cells within a tumor exhibit phenotypic plasticity that allows adaptation and survival in hostile tumor microenvironments. Reprogramming of epigenetic landscapes can support tumor progression within a specific microenvironment by influencing chromatin accessibility and modulating cell identity. The profiling of epigenetic landscapes within various tumor cell populations has significantly improved our understanding of tumor progression and plasticity. This protocol describes an integrated approach using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) optimized to profile genome-wide post-translational modifications of histone tails in tumors. Essential tools amenable to ChIP-seq to isolate tumor cell populations of interest from the tumor microenvironment are also presented to provide a comprehensive approach to perform heterogeneous epigenetic landscape profiling of the tumor microenvironment.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Neoplasms , Humans , Chromatin Immunoprecipitation Sequencing/methods , Tumor Microenvironment/genetics , Histones/genetics , Histones/metabolism , Chromatin/genetics , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Epigenesis, Genetic
20.
Nature ; 609(7926): 375-383, 2022 09.
Article in English | MEDLINE | ID: mdl-35978191

ABSTRACT

Cellular function in tissue is dependent on the local environment, requiring new methods for spatial mapping of biomolecules and cells in the tissue context [1]. The emergence of spatial transcriptomics has enabled genome-scale gene expression mapping [2-5], but the ability to capture spatial epigenetic information of tissue at the cellular level and genome scale is lacking. Here we describe a method for spatially resolved chromatin accessibility profiling of tissue sections using next-generation sequencing (spatial-ATAC-seq) by combining in situ Tn5 transposition chemistry [6] and microfluidic deterministic barcoding [5]. Profiling mouse embryos using spatial-ATAC-seq delineated tissue-region-specific epigenetic landscapes and identified gene regulators involved in the development of the central nervous system. Mapping the accessible genome in the mouse and human brain revealed the intricate arealization of brain regions. Applying spatial-ATAC-seq to tonsil tissue resolved the spatially distinct organization of immune cell types and states in lymphoid follicles and extrafollicular zones. This technology progresses spatial biology by enabling spatially resolved chromatin accessibility profiling to improve our understanding of cell identity, cell state and cell fate decision in relation to epigenetic underpinnings in development and disease.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin Immunoprecipitation Sequencing , Chromatin , Animals , Brain/metabolism , Cell Differentiation , Cell Lineage , Chromatin/genetics , Chromatin/metabolism , Chromatin Assembly and Disassembly/genetics , Chromatin Immunoprecipitation Sequencing/methods , Epigenomics , Gene Expression Profiling , Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Mice , Palatine Tonsil/cytology , Palatine Tonsil/immunology