ABSTRACT
As the volume of sequencing data grows, ever faster aligners are needed. Here, we introduce BRAT-nova, a completely rewritten and improved implementation of the mapping tool BRAT-BW for bisulfite-treated reads (BS-Seq). BRAT-nova is very fast and accurate. On the human genome, BRAT-nova is 2-7 times faster than state-of-the-art aligners, while maintaining the same percentage of uniquely mapped reads and space usage. On synthetic reads, BRAT-nova is 2-8 times faster than state-of-the-art aligners while maintaining similar mapping accuracy, methylation call accuracy, methylation level accuracy and space efficiency. AVAILABILITY AND IMPLEMENTATION: The software is available in the public domain at http://compbio.cs.ucr.edu/brat/. CONTACT: elenah@cs.ucr.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Sequence Alignment; Sequence Analysis, DNA; Software; Chromosome Mapping; DNA Methylation; Genome, Human; High-Throughput Nucleotide Sequencing; Humans
ABSTRACT
The partial purification of mouse mammary gland stem cells (MaSCs) using combinatorial cell surface markers (Lin(-)CD24(+)CD29(h)CD49f(h)) has improved our understanding of their role in normal development and breast tumorigenesis. Despite the significant improvement in MaSC enrichment, there is presently no methodology that adequately isolates pure MaSCs. Seeking new markers of MaSCs, we characterized the stem-like properties and expression signature of label-retaining cells from the mammary gland of mice expressing a controllable H2b-GFP transgene. In this system, the transgene expression can be repressed in a doxycycline-dependent fashion, allowing isolation of slowly dividing cells with retained nuclear GFP signal. Here, we show that H2b-GFP(h) cells reside within the predicted MaSC compartment and display greater mammary reconstitution unit frequency compared with H2b-GFP(neg) MaSCs. According to their transcriptome profile, H2b-GFP(h) MaSCs are enriched for pathways thought to play important roles in adult stem cells. We found Cd1d, a glycoprotein expressed on the surface of antigen-presenting cells, to be highly expressed by H2b-GFP(h) MaSCs, and isolation of Cd1d(+) MaSCs further improved the mammary reconstitution unit enrichment frequency to nearly a single-cell level. Additionally, we functionally characterized a set of MaSC-enriched genes, discovering factors controlling MaSC survival. Collectively, our data provide tools for isolating a more precisely defined population of MaSCs and point to potentially critical factors for MaSC maintenance.
Subject(s)
Biomarkers/metabolism; Cell Differentiation; Mammary Glands, Animal/cytology; Stem Cells/cytology; Stem Cells/metabolism; Animals; Antigens, CD1d/metabolism; Cell Membrane/metabolism; Cell Separation; Female; Gene Expression Profiling; Green Fluorescent Proteins/metabolism; Histones/metabolism; Mice; RNA, Small Interfering/metabolism; Staining and Labeling
ABSTRACT
In eukaryotic cells, chromatin reorganizes within promoters of active genes to allow the transcription machinery and various transcription factors to access DNA. In this model, promoter-specific transcription factors bind DNA to initiate the production of mRNA in a tightly regulated manner. In the case of the human malaria parasite, Plasmodium falciparum, specific transcription factors are apparently underrepresented with regard to the size of the genome, and mechanisms underlying transcriptional regulation are controversial. Here, we investigate the modulation of DNA accessibility by chromatin remodeling during the parasite infection cycle. We have generated genome-wide maps of nucleosome occupancy across the parasite erythrocytic cycle using two complementary assays: the formaldehyde-assisted isolation of regulatory elements to extract protein-free DNA (FAIRE) and the MNase-mediated purification of mononucleosomes to extract histone-bound DNA (MAINE), both coupled to high-throughput sequencing. We show that chromatin architecture undergoes drastic upheavals throughout the parasite's cycle, in contrast to the targeted chromatin reorganization usually observed in eukaryotes. Chromatin loosens after the invasion of the red blood cell and then repacks prior to the next cycle. Changes in nucleosome occupancy within promoter regions follow this genome-wide pattern, with a few exceptions such as the var genes involved in virulence and genes expressed at early stages of the cycle. We postulate that chromatin structure and nucleosome turnover control massive transcription during the erythrocytic cycle. Our results demonstrate that the processes driving gene expression in Plasmodium challenge the classical eukaryotic model of transcriptional regulation occurring mostly at the transcription initiation level.
Subject(s)
Gene Expression Regulation; Nucleosomes/genetics; Plasmodium falciparum/genetics; Transcription, Genetic/genetics; Chromatin Assembly and Disassembly/genetics; Chromosome Mapping; DNA, Protozoan/metabolism; Erythrocytes/metabolism; Erythrocytes/pathology; Genome, Protozoan; Humans; Nucleosomes/metabolism; Plasmodium falciparum/metabolism; Promoter Regions, Genetic
ABSTRACT
SUMMARY: We introduce BRAT-BW, a fast, accurate and memory-efficient tool that maps bisulfite-treated short reads (BS-seq) to a reference genome using the FM-index (Burrows-Wheeler transform). BRAT-BW is significantly more memory efficient and faster on longer reads than current state-of-the-art tools for BS-seq data, without compromising on accuracy. BRAT-BW is a part of a software suite for genome-wide single base-resolution methylation data analysis that supports single and paired-end reads and includes a tool for estimation of methylation level at each cytosine. AVAILABILITY: The software is available in the public domain at http://compbio.cs.ucr.edu/brat/.
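As a rough illustration of the FM-index idea mentioned in this abstract, the following Python sketch performs exact-match backward search over a Burrows-Wheeler transform. It is a minimal teaching example, not BRAT-BW's implementation: the function names `build_fm_index` and `backward_search` are hypothetical, the suffix array is built by naive sorting, and no bisulfite-specific handling is included.

```python
def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for `text`.

    Assumption: `text` does not contain the sentinel character "$".
    """
    text += "$"  # sentinel, lexicographically smaller than A, C, G, T
    # Naive suffix array: fine for a demo, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[ch] = number of characters in text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return the sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # current half-open suffix-array interval
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, c_table, occ = build_fm_index("ACGTACGT")
print(backward_search("CGT", sa, c_table, occ))
```

Each pattern character narrows the suffix-array interval in constant time, so lookup cost depends on read length rather than genome length. Production indexes typically also sample the Occ table and suffix array to save memory.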
Subject(s)
DNA Methylation; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Sulfites; Cytosine/metabolism; Genome
ABSTRACT
BACKGROUND: Despite extensive efforts to discover transcription factors and their binding sites in the human malaria parasite Plasmodium falciparum, only a few transcription factor binding motifs have been experimentally validated to date. As a consequence, gene regulation in P. falciparum is still poorly understood. There is now evidence that the chromatin architecture plays an important role in transcriptional control in malaria. RESULTS: We propose a methodology for discovering cis-regulatory elements that, for the first time, relies exclusively on dynamic chromatin remodeling data. Our method employs nucleosome positioning data collected at seven time points during the erythrocytic cycle of P. falciparum to discover putative DNA binding motifs and their transcription factor binding sites along with their associated clusters of target genes. Our approach yields 129 putative binding motifs within the promoter regions of known genes. About 75% of those are novel; the remainder are highly similar to experimentally validated binding motifs. About half of the binding motifs reported show statistically significant enrichment in functional gene sets and strong positional bias in the promoter region. CONCLUSION: Experimental results establish the principle that dynamic chromatin remodeling data can be used in lieu of gene expression data to discover binding motifs and their transcription factor binding sites. Our approach can be applied using only dynamic nucleosome positioning data, independent of any knowledge of gene function or expression.
Subject(s)
DNA/metabolism; Plasmodium falciparum/physiology; Animals; Erythrocytes/parasitology; Humans; Plasmodium falciparum/metabolism
ABSTRACT
SUMMARY: We present a new, accurate and efficient tool for mapping short reads obtained from the Illumina Genome Analyzer following sodium bisulfite conversion. Our tool, BRAT, supports single and paired-end reads and handles input files containing reads and mates of different lengths. BRAT is faster, maps more unique paired-end reads and has higher accuracy than existing programs. The software package includes tools to end-trim low-quality bases of the reads and to report nucleotide counts for mapped reads on the reference genome.
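To make the idea of bisulfite-aware mapping and per-cytosine counting concrete, here is a minimal Python sketch. It is not BRAT's algorithm: it assumes forward-strand reads only, takes alignment positions as given, and uses the hypothetical helper names `bs_matches` and `methylation_levels`. It relies on the standard property of bisulfite treatment that unmethylated cytosines read as T while methylated cytosines stay C.

```python
def bs_matches(read, ref, pos):
    """True if `read` aligns at ref[pos:], allowing a read T over a reference C.

    Assumption: forward strand only, no mismatches other than C->T conversion.
    """
    window = ref[pos:pos + len(read)]
    if len(window) < len(read):
        return False
    return all(r == g or (g == "C" and r == "T") for r, g in zip(read, window))


def methylation_levels(aligned_reads, ref):
    """Estimate methylation level C / (C + T) at each covered reference cytosine.

    `aligned_reads` is an iterable of (read, start_position) pairs that are
    assumed to have been mapped already.
    """
    counts = {}  # reference position -> [C count, T count]
    for read, pos in aligned_reads:
        for offset, base in enumerate(read):
            p = pos + offset
            if ref[p] == "C" and base in "CT":
                pair = counts.setdefault(p, [0, 0])
                pair[0 if base == "C" else 1] += 1
    return {p: c / (c + t) for p, (c, t) in counts.items()}


ref = "ACGTCA"
reads = [("ACGT", 0), ("ATGT", 0), ("GTTA", 2)]
assert all(bs_matches(read, ref, pos) for read, pos in reads)
print(methylation_levels(reads, ref))
```

In this toy input, the cytosine at position 1 is covered by one C and one T (level 0.5), and the cytosine at position 4 only by a T (level 0.0). A real pipeline would also handle the reverse strand, base qualities and sequencing errors.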
Subject(s)
Genomics/methods; Software; Sulfites/chemistry; Gene Expression Profiling/methods; Genome
ABSTRACT
Cytosine DNA methylation is an epigenetic mark in most eukaryotic cells that regulates numerous processes, including gene expression and stress responses. We performed a genome-wide analysis of DNA methylation in the human malaria parasite Plasmodium falciparum. We mapped the positions of methylated cytosines and identified a single functional DNA methyltransferase (Plasmodium falciparum DNA methyltransferase; PfDNMT) that may mediate these genomic modifications. These analyses revealed that the malaria genome is asymmetrically methylated and shares common features with undifferentiated plant and mammalian cells. Notably, core promoters are hypomethylated, and transcript levels correlate with intraexonic methylation. Additionally, there are sharp methylation transitions at nucleosome and exon-intron boundaries. These data suggest that DNA methylation could regulate virulence gene expression and transcription elongation. Furthermore, the broad range of action of DNA methylation and the uniqueness of PfDNMT suggest that the methylation pathway is a potential target for antimalarial strategies.
Subject(s)
DNA Methylation; DNA, Protozoan/chemistry; Genome, Protozoan; Plasmodium falciparum/genetics; Plasmodium falciparum/metabolism; Chromatography, Liquid; DNA, Protozoan/metabolism; DNA-Cytosine Methylases/metabolism; Epigenesis, Genetic; Erythrocytes/parasitology; Humans; Plasmodium falciparum/enzymology; Tandem Mass Spectrometry
ABSTRACT
Almost a decade after the publication of the complete sequence of the genome of the human malaria parasite Plasmodium falciparum, the mechanisms involved in gene regulation remain poorly understood. Like other eukaryotic organisms, P. falciparum's genomic DNA organizes into nucleosomes. Nucleosomes are the basic structural units of eukaryotic chromatin, and their regulation is known to play a key role in the regulation of gene expression. Despite its importance, the relationship between nucleosome positioning and gene regulation in the malaria parasite has only been investigated recently. Using two independent and complementary techniques followed by next-generation high-throughput sequencing, our laboratory recently generated a dynamic atlas of nucleosome-bound and nucleosome-free regions (NFRs) at single-nucleotide resolution throughout the parasite erythrocytic cycle. We have found evidence that genome-wide changes in nucleosome occupancy play a critical role in controlling the rigorous parasite replication in infected red blood cells. However, the role of nucleosome positioning at remarkable locations such as transcriptional start sites (TSSs) was not investigated. Here we show that a study of NFRs in experimentally determined TSSs and in silico-predicted promoters can provide deeper insights into how a transcriptionally permissive organization of chromatin can control the parasite's progression through its life cycle. We find that NFRs found at TSSs and core promoters are strongly associated with high levels of gene expression in asexual erythrocytic stages, whereas nucleosome-bound TSSs and promoters are associated with silent genes preferentially expressed in sexual stages. The implications in terms of regulatory evolution, adaptation of gene expression and their impact on the design of antimalarial strategies are discussed.