1.
NAR Genom Bioinform ; 4(2): lqac029, 2022 Jun.
Article En | MEDLINE | ID: mdl-35387384

Non-biting midges (Chironomidae) are known to inhabit a wide range of environments, and certain species can tolerate extreme conditions in which other insects cannot survive. In particular, the sleeping chironomid Polypedilum vanderplanki is known for the remarkable ability of its larvae to withstand almost complete desiccation by entering a state called anhydrobiosis. Chromosome numbers in chironomids are higher than in other dipterans, and this extra genomic resource might facilitate rapid adaptation to novel environments. We used improved sequencing strategies to assemble a chromosome-level genome sequence for P. vanderplanki for deep comparative analysis of the genomic location of genes associated with desiccation tolerance. Using whole genome-based cross-species and intra-species analysis, we provide evidence for the unique functional specialization of Chromosome 4 through extensive acquisition of novel genes. In contrast to other insect genomes, in the sleeping chironomid a uniquely high degree of subfunctionalization in paralogous anhydrobiosis genes occurs in this chromosome, as well as pseudogenization in a highly duplicated gene family. Our findings suggest that Chromosome 4 in Polypedilum is a site of high genetic turnover, allowing it to act as a 'sandbox' for evolutionary experiments, thus facilitating the rapid adaptation of midges to harsh environments.

3.
Nat Commun ; 12(1): 3297, 2021 06 02.
Article En | MEDLINE | ID: mdl-34078885

Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.


Microsatellite Repeats , Neural Networks, Computer , Neurodegenerative Diseases/genetics , Transcription Initiation Site , Transcription Initiation, Genetic , A549 Cells , Animals , Base Sequence , Computational Biology/methods , Deep Learning , Enhancer Elements, Genetic , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Mice , Neurodegenerative Diseases/diagnosis , Neurodegenerative Diseases/metabolism , Polymorphism, Genetic , Promoter Regions, Genetic
4.
Methods Mol Biol ; 2120: 277-301, 2020.
Article En | MEDLINE | ID: mdl-32124327

Cap analysis of gene expression (CAGE) is an approach to identify and monitor the activity (transcription initiation frequency) of transcription start sites (TSSs) at single base-pair resolution across the genome. It has been effectively used to identify active promoter and enhancer regions in cancer cells, with potential utility for identifying key factors in immunotherapy. Here, we give an overview of a series of CAGE protocols and describe the latest protocol, based on the Illumina sequencing platform, in detail; both experimental steps (see Subheadings 3.1-3.11) and computational processing steps (see Subheadings 3.12-3.20) are described.


Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Transcription Initiation Site , Transcriptional Activation , Animals , Gene Expression , Humans , Mice , Promoter Regions, Genetic
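As a rough illustration of the single-base-resolution counting that CAGE relies on, the following Python sketch tallies the 5' ends of mapped tags per position and strand and normalizes the counts to tags per million. The BED-style input tuples, the function names, and the example reads are assumptions made for illustration only; they are not taken from the protocol described in the chapter.

```python
from collections import Counter


def count_tss(reads):
    """Count tag 5' ends per (chrom, position, strand).

    `reads` is an iterable of (chrom, start, end, strand) tuples using
    0-based, half-open coordinates (BED-style). This layout is an
    assumption for the sketch, not a format mandated by any CAGE pipeline.
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        # The 5' end is the leftmost base on '+' and the rightmost base on '-'.
        five_prime = start if strand == "+" else end - 1
        counts[(chrom, five_prime, strand)] += 1
    return counts


def tags_per_million(counts):
    """Scale raw 5'-end counts to tags per million mapped tags."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n * 1_000_000 / total for site, n in counts.items()}


# Hypothetical mapped tags, for illustration only.
reads = [
    ("chr1", 100, 150, "+"),
    ("chr1", 100, 140, "+"),
    ("chr1", 102, 150, "+"),
    ("chr1", 200, 250, "-"),
]
counts = count_tss(reads)
tpm = tags_per_million(counts)
```

Each key in `counts` is a candidate TSS, and its value approximates the transcription initiation frequency at that base. Real pipelines add steps such as read filtering and clustering of nearby positions into peaks, which are omitted here.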
5.
Methods Mol Biol ; 1164: 67-85, 2014.
Article En | MEDLINE | ID: mdl-24927836

Cap analysis of gene expression (CAGE) provides accurate high-throughput measurement of RNA expression. Through large-scale analysis of the 5' ends of transcripts, the CAGE method enables not only determination of transcription start sites but also prediction of promoter regions. Here we provide a protocol for the construction of no-amplification non-tagging CAGE libraries for Illumina next-generation sequencers (nAnT-iCAGE). We have excluded the commonly used PCR amplification and restriction enzyme cleavage steps to eliminate potential biases. As a result, we achieved a simpler, less biased preparation process.


DNA, Complementary/genetics , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Transcription Initiation Site , Alkaline Phosphatase/metabolism , Animals , Base Sequence , Biotinylation/methods , DNA, Complementary/isolation & purification , DNA, Complementary/metabolism , Exodeoxyribonucleases/metabolism , Gene Library , Humans , Mice , Promoter Regions, Genetic , RNA/metabolism , Reverse Transcription , Ribonuclease, Pancreatic/metabolism
6.
Genome Res ; 24(4): 708-17, 2014 Apr.
Article En | MEDLINE | ID: mdl-24676093

CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified, second-generation sequencing platforms (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent) typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer, on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will serve as a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.


High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Transcriptome/genetics , Gene Expression Profiling , Humans , Sequence Analysis, RNA/methods
...