Results 1 - 14 of 14
1.
Sci Data ; 4: 170112, 2017 Aug 29.
Article in English | MEDLINE | ID: mdl-28850106

ABSTRACT

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, respectively, were identified and annotated to provide precise location of known promoters as well as novel ones, and to quantify their activities.


Subjects
Gene Expression Profiling , Genome , Animals , Gene Expression Regulation , Humans , Mice , Promoter Regions, Genetic , Species Specificity
2.
Genome Res ; 24(4): 708-17, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24676093

ABSTRACT

CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent), second-generation sequencing platforms typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer based on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.


Subjects
High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Transcriptome/genetics , Gene Expression Profiling , Humans , Sequence Analysis, RNA/methods
3.
PLoS One ; 7(1): e30809, 2012.
Article in English | MEDLINE | ID: mdl-22303458

ABSTRACT

BACKGROUND: Cap analysis of gene expression (CAGE) is a 5' sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single-molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour-intensive protocol. METHODOLOGY: In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 'HeliScope ready' CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility. CONCLUSIONS: We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally, we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation.


Subjects
DNA, Complementary/metabolism , Gene Expression Regulation , Sequence Analysis, DNA/instrumentation , Sequence Analysis, DNA/methods , Workflow , Animals , Automation , Base Sequence , DNA, Complementary/genetics , Gene Library , Genome, Human/genetics , Humans , Mice , Reproducibility of Results
4.
PLoS One ; 6(10): e25391, 2011.
Article in English | MEDLINE | ID: mdl-21984916

ABSTRACT

BACKGROUND: Mesothelioma is a highly malignant tumor that is primarily caused by occupational or environmental exposure to asbestos fibers. Despite worldwide restrictions on asbestos usage, further cases are expected as diagnosis is typically 20-40 years after exposure. Once diagnosed, there is a very poor prognosis, with a median survival of 9 months. Considering this, the development of early preclinical diagnostic markers may help improve clinical outcomes. METHODOLOGY: Expression microarrays on mesothelium and other tissues dissected from mice were used to identify candidate mesothelial lineage markers. Candidates were further tested by qRT-PCR and in situ hybridization across a mouse tissue panel. Two candidate biomarkers with the potential for secretion, uroplakin 3B (UPK3B) and leucine rich repeat neuronal 4 (LRRN4), and one commercialized mesothelioma marker, mesothelin (MSLN), were then chosen for validation across a panel of normal human primary cells, 16 established mesothelioma cell lines, 10 lung cancer lines, and a further set of 8 unrelated cancer cell lines. CONCLUSIONS: Within the primary cell panel, LRRN4 was only detected in primary mesothelial cells, but MSLN and UPK3B were also detected in other cell types. MSLN was detected in bronchial epithelial cells and alveolar epithelial cells and UPK3B was detected in retinal pigment epithelial cells and urothelial cells. Testing the cell line panel, MSLN was detected in 15 of the 16 mesothelioma cell lines, whereas LRRN4 was only detected in 8 and UPK3B in 6. Interestingly, MSLN levels appear to be upregulated in the mesothelioma lines compared to the primary mesothelial cells, while LRRN4 and UPK3B are either lost or down-regulated. Despite the higher fraction of mesothelioma lines positive for MSLN, it was also detected at high levels in 2 lung cancer lines and 3 other unrelated cancer lines derived from papillotubular adenocarcinoma, signet ring carcinoma and transitional cell carcinoma.


Subjects
Epithelial Cells/metabolism , Membrane Proteins/metabolism , Nerve Tissue Proteins/metabolism , Animals , Antibodies, Neoplasm/immunology , Biomarkers/metabolism , Cell Lineage , Cells, Cultured , Epithelial Cells/pathology , Epithelium/metabolism , Gene Expression Regulation , Humans , Immunohistochemistry , In Situ Hybridization , Lung/cytology , Lung/metabolism , Male , Membrane Proteins/genetics , Mesothelin , Mesothelioma/genetics , Mesothelioma/immunology , Mesothelioma/pathology , Mice , Mice, Inbred C57BL , Nerve Tissue Proteins/genetics , Oligonucleotide Array Sequence Analysis , Organ Specificity , Reverse Transcriptase Polymerase Chain Reaction , Uroplakin III/genetics , Uroplakin III/metabolism
5.
Genome Res ; 21(7): 1150-9, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21596820

ABSTRACT

We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better define transcription start sites (TSS) than known models, identify novel regions of transcription and alternative promoters, and find two major classes of TSS signal, sharp peaks and broad regions. However, using this protocol, we observe reproducible evidence of regulation at the much finer level of individual TSS positions. The libraries are quantitative over 5 orders of magnitude and highly reproducible (Pearson's correlation coefficient of 0.987). We have also scaled down the sample requirement to 5 µg of total RNA for a standard HeliScopeCAGE library and 100 ng for a low-quantity version. When the same RNA was run as 5-µg and 100-ng versions, the 100 ng was still able to detect expression for ∼60% of the 13,468 loci detected by a 5-µg library using the same threshold, allowing comparative analysis of even rare cell populations. Testing the protocol for differential gene expression measurements on triplicate HeLa and THP-1 samples, we find that the log fold change compared to Illumina microarray measurements is highly correlated (0.871). In addition, HeliScopeCAGE finds differential expression for thousands more loci including those with probes on the array. Finally, although the majority of tags are 5' associated, we also observe a low level of signal on exons that is useful for defining gene structures.


Subjects
Gene Expression Profiling/methods , Gene Expression , Oligonucleotide Array Sequence Analysis/methods , Chromosome Mapping , DNA, Complementary/genetics , Exons , Gene Library , HeLa Cells , Humans , Polymerase Chain Reaction , Promoter Regions, Genetic , Sequence Analysis, RNA/methods , Transcription Initiation Site , Transcription, Genetic
6.
Mol Immunol ; 47(14): 2295-302, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20573402

ABSTRACT

Gene regulatory networks in living cells are controlled by the interaction of multiple cell type-specific transcription regulators with DNA binding sites in target genes. Interferon regulatory factor 8 (IRF8), also known as interferon consensus sequence binding protein (ICSBP), is a transcription factor expressed predominantly in myeloid and lymphoid cell lineages. To find the functional direct target genes of IRF8, the gene expression profiles of siRNA knockdown samples and genome-wide binding locations by ChIP-chip were analyzed in THP-1 myelomonocytic leukemia cells. Consequently, 84 genes were identified as functional direct targets. The ETS family transcription factor PU.1, also known as SPI1, binds to IRF8 and regulates basal transcription in macrophages. Using the same approach, we identified 53 direct target genes of PU.1; these overlapped with 19 IRF8 targets. These 19 genes included key molecules of IFN signaling such as OAS1 and IRF9, but excluded other IFN-related genes amongst the IRF8 functional direct target genes. We suggest that IRF8 and PU.1 can have both combined and independent actions on different promoters in myeloid cells.


Subjects
Interferon Regulatory Factors/genetics , Interferon Regulatory Factors/metabolism , Base Sequence , Binding Sites/genetics , Cell Line , Chromatin Immunoprecipitation , Gene Expression Profiling , Gene Knockdown Techniques , Gene Regulatory Networks , Genetic Techniques , Humans , Models, Biological , Myeloid Cells/metabolism , Promoter Regions, Genetic , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins/metabolism , RNA, Small Interfering/genetics , Signal Transduction , Trans-Activators/genetics , Trans-Activators/metabolism
7.
Cell ; 140(5): 744-52, 2010 Mar 05.
Article in English | MEDLINE | ID: mdl-20211142

ABSTRACT

Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.


Subjects
Gene Expression Regulation , Gene Regulatory Networks , Transcription Factors/metabolism , Animals , Cell Differentiation , Evolution, Molecular , Humans , Mice , Monocytes/cytology , Organ Specificity , Smad3 Protein/metabolism , Trans-Activators/metabolism
8.
Genome Res ; 20(2): 257-64, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20051556

ABSTRACT

MicroRNAs (miRNAs) are short (20-23 nt) RNAs that are sequence-specific mediators of transcriptional and post-transcriptional regulation of gene expression. Modern high-throughput technologies enable deep sequencing of such RNA species on an unprecedented scale. We find that the analysis of small RNA deep-sequencing libraries can be affected by cross-mapping, in which RNA sequences originating from one locus are inadvertently mapped to another. Similar to cross-hybridization on microarrays, cross-mapping is prevalent among miRNAs, as they tend to occur in families, are similar or derived from repeat or structural RNAs, or are post-transcriptionally modified. Here, we develop a strategy to correct for cross-mapping, and apply it to the analysis of RNA editing in mature miRNAs. In contrast to previous reports, our analysis suggests that RNA editing in mature miRNAs is rare in animals.


Subjects
Gene Library , MicroRNAs/genetics , RNA Editing/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Animals , Base Sequence , High-Throughput Screening Assays , Humans , Mice , MicroRNAs/metabolism
9.
Nat Genet ; 41(5): 553-62, 2009 May.
Article in English | MEDLINE | ID: mdl-19377474

ABSTRACT

Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.


Subjects
Cell Differentiation/genetics , Cell Proliferation , Gene Regulatory Networks , Transcription, Genetic , Base Sequence , Cell Line , Gene Expression Profiling , Humans , Leukemia, Myeloid/genetics , Leukemia, Myeloid/metabolism , Models, Genetic , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic , RNA, Small Interfering/metabolism
10.
J Biol Chem ; 282(15): 11122-34, 2007 Apr 13.
Article in English | MEDLINE | ID: mdl-17308308

ABSTRACT

The survival of motor neuron (SMN) protein, responsible for the neurodegenerative disease spinal muscular atrophy (SMA), oligomerizes and forms a stable complex with seven other major components, the Gemin proteins. Besides the SMN protein, Gemin2 is a core protein that is essential for the formation of the SMN complex, although the mechanism by which it drives formation is unclear. We have found a novel interaction, a Gemin2 self-association, using the mammalian two-hybrid system and in vitro pull-down assays. Using in vitro dissociation assays, we also found that the self-interaction of the amino-terminal SMN protein, which was confirmed in this study, became stable in the presence of Gemin2. In addition, Gemin2 knockdown using small interfering RNA treatment revealed a drastic decrease in SMN oligomer formation and in the assembly activity of spliceosomal small nuclear ribonucleoprotein (snRNP). Taken together, these results indicate that Gemin2 plays an important role in snRNP assembly through the stabilization of the SMN oligomer/complex via novel self-interaction. Applying the results/techniques to amino-terminal SMN missense mutants that were recently identified from SMA patients, we successfully showed that amino-terminal self-association, Gemin2 binding, the stabilization effect of Gemin2, and snRNP assembly activity were all lowered in the mutant SMN(D44V), suggesting that instability of the amino-terminal SMN self-association may cause SMA in patients carrying this allele.


Subjects
Cyclic AMP Response Element-Binding Protein/metabolism , Nerve Tissue Proteins/metabolism , RNA-Binding Proteins/metabolism , Animals , Cyclic AMP Response Element-Binding Protein/genetics , HeLa Cells , Humans , Mice , Mutation/genetics , Nerve Tissue Proteins/genetics , Protein Binding , RNA-Binding Proteins/genetics , Ribonucleoproteins, Small Nuclear/metabolism , SMN Complex Proteins
12.
PLoS Genet ; 2(4): e62, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16683036

ABSTRACT

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.


Subjects
DNA, Complementary/genetics , Databases, Genetic , Mice/genetics , Transcription, Genetic , Animals , Automation , DNA, Complementary/chemistry , Genome
13.
Nat Genet ; 38(6): 626-35, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16645617

ABSTRACT

Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.


Subjects
Evolution, Molecular , Promoter Regions, Genetic , 3' Untranslated Regions , Animals , Base Sequence , DNA , Genome , Proteome , TATA Box
14.
Genome Biol ; 6(12): R98, 2005.
Article in English | MEDLINE | ID: mdl-16356270

ABSTRACT

BACKGROUND: Although 2,061 proteins of Pyrococcus horikoshii OT3, a hyperthermophilic archaeon, have been predicted from the recently completed genome sequence, the majority of proteins show no similarity to those from other organisms and are thus hypothetical proteins of unknown function. Because most proteins operate as parts of complexes to regulate biological processes, we systematically analyzed protein-protein interactions in Pyrococcus using the mammalian two-hybrid system to determine the function of the hypothetical proteins. RESULTS: We examined 960 soluble proteins from Pyrococcus and selected 107 interactions based on luciferase reporter activity, which was then evaluated using a computational approach to assess the reliability of the interactions. We also analyzed the expression of the assay samples by western blot, and a few interactions by in vitro pull-down assays. We identified 11 hetero-interactions that we considered to be located in the same operon, as observed in Helicobacter pylori. We annotated and classified proteins in the selected interactions according to their orthologous proteins. Many enzyme proteins showed self-interactions, similar to those seen in other organisms. CONCLUSION: We found 13 unannotated proteins that interacted with annotated proteins; this information is useful for predicting the functions of the hypothetical Pyrococcus proteins from the annotations of their interacting partners. Among the heterogeneous interactions, proteins were more likely to interact with proteins within the same ortholog class than with proteins of different classes. The analysis described here can provide global insights into the biological features of the protein-protein interactions in P. horikoshii.


Subjects
Protein Interaction Mapping , Pyrococcus horikoshii/metabolism , Genes, Archaeal/genetics , Genome, Archaeal/genetics , Multigene Family/genetics , Open Reading Frames/genetics , Protein Binding , Pyrococcus horikoshii/classification , Pyrococcus horikoshii/genetics