Results 1 - 6 of 6
Methods Mol Biol ; 2351: 67-90, 2021.
Article in English | MEDLINE | ID: mdl-34382184


Cap Analysis of Gene Expression (CAGE) is a powerful method to identify Transcription Start Sites (TSSs) of capped RNAs while simultaneously measuring transcript expression levels. CAGE allows mapping at single-nucleotide resolution at all active promoters and enhancers. Large CAGE datasets have been produced over the years by individual laboratories and consortia, including the Encyclopedia of DNA Elements (ENCODE) and Functional Annotation of the Mammalian Genome (FANTOM) consortia. These datasets constitute an open resource for TSS annotation and gene expression analysis. Here, we provide an experimental protocol for the most recent CAGE method, Low Quantity (LQ) single strand (ss) CAGE, "LQ-ssCAGE", which enables cost-effective profiling of low-quantity RNA samples. LQ-ssCAGE is especially useful for samples derived from cells cultured in small volumes, from cellular compartments (such as nuclear RNA), or from developmental stages. We demonstrate the reproducibility and effectiveness of the method by constructing 240 LQ-ssCAGE libraries from 50 ng of RNA extracted from THP-1 cells and discover lowly expressed novel enhancer- and promoter-derived lncRNAs.

Computational Biology/methods; Enhancer Elements, Genetic; Promoter Regions, Genetic; RNA Caps; Transcription Initiation Site; Databases, Genetic; Gene Expression Regulation; Gene Library; High-Throughput Nucleotide Sequencing/methods; Molecular Sequence Annotation; Regulatory Sequences, Nucleic Acid; Reproducibility of Results; Workflow
Nat Commun ; 12(1): 3297, 2021 06 02.
Article in English | MEDLINE | ID: mdl-34078885


Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.

Microsatellite Repeats; Neural Networks, Computer; Neurodegenerative Diseases/genetics; Transcription Initiation Site; Transcription Initiation, Genetic; A549 Cells; Animals; Base Sequence; Computational Biology/methods; Deep Learning; Enhancer Elements, Genetic; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Mice; Neurodegenerative Diseases/diagnosis; Neurodegenerative Diseases/metabolism; Polymorphism, Genetic; Promoter Regions, Genetic
Methods Mol Biol ; 2120: 277-301, 2020.
Article in English | MEDLINE | ID: mdl-32124327


Cap analysis of gene expression (CAGE) is an approach to identify and monitor the activity (transcription initiation frequency) of transcription start sites (TSSs) at single base-pair resolution across the genome. It has been used effectively to identify active promoter and enhancer regions in cancer cells, with potential utility for identifying key factors for immunotherapy. Here, we give an overview of a series of CAGE protocols and describe the latest protocol, based on the Illumina sequencing platform, in detail; both experimental steps (see Subheadings 3.1-3.11) and computational processing steps (see Subheadings 3.12-3.20) are described.

Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Transcription Initiation Site; Transcriptional Activation; Animals; Gene Expression; Humans; Mice; Promoter Regions, Genetic
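
As a rough illustration of what "transcription initiation frequency at single base-pair resolution" means computationally, the sketch below counts the 5' ends of aligned CAGE tags per position and strand and scales the counts to tags per million. It is a minimal sketch, not the protocol's actual processing steps: the input layout (BED-like tuples with 0-based, half-open coordinates) and the function names `ctss_counts` and `to_tpm` are assumptions made for this example.

```python
from collections import Counter


def ctss_counts(alignments):
    """Count CAGE tag 5' ends per (chrom, position, strand).

    `alignments` is a hypothetical iterable of (chrom, start, end, strand)
    tuples using 0-based, half-open coordinates (BED-like).
    """
    counts = Counter()
    for chrom, start, end, strand in alignments:
        # The 5' end is the leftmost base on '+' and the rightmost base on '-'.
        five_prime = start if strand == "+" else end - 1
        counts[(chrom, five_prime, strand)] += 1
    return counts


def to_tpm(counts):
    """Scale raw counts to tags per million mapped tags."""
    total = sum(counts.values())
    return {site: n * 1_000_000 / total for site, n in counts.items()}


# Illustrative input only; these reads are made up for the example.
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 125, "+"),
    ("chr1", 101, 128, "+"),
    ("chr1", 180, 207, "-"),
]
counts = ctss_counts(reads)
tpm = to_tpm(counts)
```

Counting only the first base of each tag, rather than read coverage, is what keeps the signal at single-base resolution.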
Sci Data ; 4: 170112, 2017 08 29.
Article in English | MEDLINE | ID: mdl-28850106


In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued with CAGE library production using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, respectively, were identified and annotated to provide precise locations of known promoters as well as novel ones, and to quantify their activities.

Gene Expression Profiling; Genome; Animals; Gene Expression Regulation; Humans; Mice; Promoter Regions, Genetic; Species Specificity
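
The abstract above does not say how the non-overlapping peaks were called, so the following is only a simple distance-based stand-in, not the FANTOM5 method. It merges neighbouring TSS positions on one chromosome and strand into peaks and reports each peak's total signal and dominant position. The function name `cluster_ctss`, the `max_gap` threshold of 20 bp, and the input values are all assumptions chosen for illustration.

```python
def cluster_ctss(sites, max_gap=20):
    """Group TSS positions into non-overlapping peaks.

    `sites` is a hypothetical list of (position, count) pairs from a single
    chromosome and strand. Positions closer than `max_gap` bases to the end
    of the current peak are merged into it.
    Returns a list of (start, end, total_count, dominant_position).
    """
    peaks = []
    for pos, count in sorted(sites):
        if peaks and pos - peaks[-1]["end"] <= max_gap:
            peak = peaks[-1]
            peak["end"] = pos
            peak["total"] += count
            # Track the position with the highest count as the dominant TSS.
            if count > peak["top_count"]:
                peak["top_pos"], peak["top_count"] = pos, count
        else:
            peaks.append({"start": pos, "end": pos, "total": count,
                          "top_pos": pos, "top_count": count})
    return [(p["start"], p["end"], p["total"], p["top_pos"]) for p in peaks]


# Illustrative input only.
sites = [(100, 5), (103, 12), (110, 3), (500, 7), (505, 1)]
peaks = cluster_ctss(sites)
```

Because positions are processed in sorted order and each one joins at most one peak, the resulting peaks cannot overlap.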
Methods Mol Biol ; 1164: 67-85, 2014.
Article in English | MEDLINE | ID: mdl-24927836


Cap analysis of gene expression (CAGE) provides accurate high-throughput measurement of RNA expression. Through large-scale analysis of the 5' ends of transcripts, CAGE enables not only determination of transcription start sites but also prediction of promoter regions. Here we provide a protocol for the construction of no-amplification non-tagging CAGE libraries for Illumina next-generation sequencers (nAnT-iCAGE). We have excluded the commonly used PCR amplification and restriction enzyme cleavage steps to eliminate potential biases. As a result, we achieved a simpler, less biased preparation process.

DNA, Complementary/genetics; Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; RNA/genetics; Transcription Initiation Site; Alkaline Phosphatase/metabolism; Animals; Base Sequence; Biotinylation/methods; DNA, Complementary/isolation & purification; DNA, Complementary/metabolism; Exodeoxyribonucleases/metabolism; Gene Library; Humans; Mice; Promoter Regions, Genetic; RNA/metabolism; Reverse Transcription; Ribonuclease, Pancreatic/metabolism
Genome Res ; 24(4): 708-17, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24676093


CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent), second-generation sequencing platforms typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer, on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.

High-Throughput Nucleotide Sequencing/methods; RNA/genetics; Transcriptome/genetics; Gene Expression Profiling; Humans; Sequence Analysis, RNA/methods
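
One common way to check whether expression levels are "largely comparable across platforms", as the abstract above puts it, is to correlate log-transformed gene-level values from two platforms. The sketch below shows that idea only; it is not the statistical procedure used in the study. The function names, the pseudocount, and the expression values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_expression_correlation(platform_a, platform_b, pseudocount=1.0):
    """Correlate log2 expression over the genes measured on both platforms.

    `platform_a` and `platform_b` are hypothetical dicts mapping a gene
    name to a normalized expression value (for example, tags per million).
    """
    genes = sorted(set(platform_a) & set(platform_b))
    a = [math.log2(platform_a[g] + pseudocount) for g in genes]
    b = [math.log2(platform_b[g] + pseudocount) for g in genes]
    return pearson(a, b)


# Illustrative values only; not data from the study.
cage = {"GENE_A": 10.0, "GENE_B": 200.0, "GENE_C": 35.0, "GENE_D": 1.5}
rnaseq = {"GENE_A": 14.0, "GENE_B": 180.0, "GENE_C": 50.0, "GENE_E": 9.0}
r = log_expression_correlation(cage, rnaseq)
```

The log transform keeps a handful of highly expressed genes from dominating the coefficient, and the pseudocount avoids taking the logarithm of zero.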