Search | BVS (Virtual Health Library) Infectious and Parasitic Diseases

1.

An atlas of combinatorial transcriptional regulation in mouse and man.

Ravasi, Timothy; Suzuki, Harukazu; Cannistraci, Carlo Vittorio; Katayama, Shintaro; Bajic, Vladimir B; Tan, Kai; Akalin, Altuna; Schmeier, Sebastian; Kanamori-Katayama, Mutsumi; Bertin, Nicolas; Carninci, Piero; Daub, Carsten O; Forrest, Alistair R R; Gough, Julian; Grimmond, Sean; Han, Jung-Hoon; Hashimoto, Takehiro; Hide, Winston; Hofmann, Oliver; Kamburov, Atanas; Kaur, Mandeep; Kawaji, Hideya; Kubosaki, Atsutaka; Lassmann, Timo; van Nimwegen, Erik; MacPherson, Cameron Ross; Ogawa, Chihiro; Radovanovic, Aleksandar; Schwartz, Ariel; Teasdale, Rohan D; Tegnér, Jesper; Lenhard, Boris; Teichmann, Sarah A; Arakawa, Takahiro; Ninomiya, Noriko; Murakami, Kayoko; Tagami, Michihira; Fukuda, Shiro; Imamura, Kengo; Kai, Chikatoshi; Ishihara, Ryoko; Kitazume, Yayoi; Kawai, Jun; Hume, David A; Ideker, Trey; Hayashizaki, Yoshihide.

Cell ; 140(5): 744-52, 2010 Mar 05.

Article in English | MEDLINE | ID: mdl-20211142

ABSTRACT

Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.

Subjects

Gene Expression Regulation; Gene Regulatory Networks; Transcription Factors/metabolism; Animals; Cell Differentiation; Evolution, Molecular; Humans; Mice; Monocytes/cytology; Organ Specificity; Smad3 Protein/metabolism; Trans-Activators/metabolism

2.

Genome-wide analysis of mammalian promoter architecture and evolution.

Carninci, Piero; Sandelin, Albin; Lenhard, Boris; Katayama, Shintaro; Shimokawa, Kazuro; Ponjavic, Jasmina; Semple, Colin A M; Taylor, Martin S; Engström, Pär G; Frith, Martin C; Forrest, Alistair R R; Alkema, Wynand B; Tan, Sin Lam; Plessy, Charles; Kodzius, Rimantas; Ravasi, Timothy; Kasukawa, Takeya; Fukuda, Shiro; Kanamori-Katayama, Mutsumi; Kitazume, Yayoi; Kawaji, Hideya; Kai, Chikatoshi; Nakamura, Mari; Konno, Hideaki; Nakano, Kenji; Mottagui-Tabar, Salim; Arner, Peter; Chesi, Alessandra; Gustincich, Stefano; Persichetti, Francesca; Suzuki, Harukazu; Grimmond, Sean M; Wells, Christine A; Orlando, Valerio; Wahlestedt, Claes; Liu, Edison T; Harbers, Matthias; Kawai, Jun; Bajic, Vladimir B; Hume, David A; Hayashizaki, Yoshihide.

Nat Genet ; 38(6): 626-35, 2006 Jun.

Article in English | MEDLINE | ID: mdl-16645617

ABSTRACT

Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.

Subjects

Evolution, Molecular; Promoter Regions, Genetic; 3' Untranslated Regions; Animals; Base Sequence; DNA; Genome; Proteome; TATA Box

3.

Assessment of adaptive evolution between wheat and rice as deduced from full-length common wheat cDNA sequence data and expression patterns.

Kawaura, Kanako; Mochida, Keiichi; Enju, Akiko; Totoki, Yasushi; Toyoda, Atsushi; Sakaki, Yoshiyuki; Kai, Chikatoshi; Kawai, Jun; Hayashizaki, Yoshihide; Seki, Motoaki; Shinozaki, Kazuo; Ogihara, Yasunari.

BMC Genomics ; 10: 271, 2009 Jun 18.

Article in English | MEDLINE | ID: mdl-19534823

ABSTRACT

BACKGROUND: Wheat is an allopolyploid plant that harbors a huge, complex genome. Therefore, accumulation of expressed sequence tags (ESTs) for wheat is becoming particularly important for functional genomics and molecular breeding. We prepared a comprehensive collection of ESTs from the various tissues that develop during the wheat life cycle and from tissues subjected to stress. We also examined their expression profiles in silico. As full-length cDNAs are indispensable to certify the collected ESTs and annotate the genes in the wheat genome, we performed a systematic survey and sequencing of the full-length cDNA clones. This sequence information is a valuable genetic resource for functional genomics and will enable carrying out comparative genomics in cereals. RESULTS: As part of the functional genomics and development of genomic wheat resources, we have generated a collection of full-length cDNAs from common wheat. By grouping the ESTs of recombinant clones randomly selected from the full-length cDNA library, we were able to sequence 6,162 independent clones with high accuracy. About 10% of the clones were wheat-unique genes, without any counterparts within the DNA database. Wheat clones that showed high homology to those of rice were selected in order to investigate their expression patterns in various tissues throughout the wheat life cycle and in response to abiotic-stress treatments. To assess the variability of genes that have evolved differently in wheat and rice, we calculated the substitution rate (Ka/Ks) of the counterparts in wheat and rice. Genes that were preferentially expressed in certain tissues or treatments had higher Ka/Ks values than those in other tissues and treatments, which suggests that the genes with the higher variability expressed in these tissues are under adaptive selection.
CONCLUSION: We have generated a high-quality full-length cDNA resource for common wheat, which is essential for continuation of the ongoing curation and annotation of the wheat genome. The data for each clone's expression in various tissues and stress treatments and its variability in wheat and rice as a result of their diversification are valuable tools for functional genomics in wheat and for comparative genomics in cereals.

Subjects

Adaptation, Biological/genetics; Evolution, Molecular; Oryza/genetics; Salt-Tolerant Plants/genetics; Triticum/genetics; DNA, Complementary/genetics; DNA, Plant/genetics; Expressed Sequence Tags; Gene Expression Profiling; Gene Expression Regulation, Plant; Gene Library; Genes, Plant; Genomics; Sequence Analysis, DNA; Stress, Physiological

4.

A method for similarity search of genomic positional expression using CAGE.

Seno, Shigeto; Takenaka, Yoichi; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Matsuda, Hideo.

PLoS Genet ; 2(4): e44, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683027

ABSTRACT

With the advancement of genome research, it is becoming clear that genes are not distributed on the genome in random order. Clusters of genes distributed at localized genome positions have been reported in several eukaryotes. Various correlations have been observed between the expressions of genes in adjacent or nearby positions along the chromosomes depending on tissue type and developmental stage. Moreover, in several cases, their transcripts, which control epigenetic transcription via processes such as transcriptional interference and genomic imprinting, occur in clusters. It is reasonable that genomic regions that have similar mechanisms show similar expression patterns and that the characteristics of expression in the same genomic regions differ depending on tissue type and developmental stage. In this study, we analyzed gene expression patterns using the cap analysis gene expression (CAGE) method for exploring systematic views of the mouse transcriptome. Counting the number of mapped CAGE tags for fixed-length regions allowed us to determine genomic expression levels. These expression levels were normalized, quantified, and converted into four types of descriptors, allowing the expression patterns along the genome to be represented by character strings. We analyzed them using dynamic programming in the same manner as for sequence analysis. We have developed a novel algorithm that provides a novel view of the genome from the perspective of genomic positional expression. In a similarity search of expression patterns across chromosomes and tissues, we found regions that had clusters of genes that showed expression patterns similar to each other depending on tissue type. Our results suggest the possibility that the regions that have sense-antisense transcription show similar expression patterns between forward and reverse strands.
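The pipeline outlined in this abstract (count CAGE tags in fixed-length regions, convert the normalized levels into four descriptors, then compare the resulting strings by dynamic programming) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' published method: the bin size, the normalization by the regional maximum, the cut-offs `low` and `mid`, the letters N/L/M/H, and the Smith-Waterman-style scoring values.

```python
def bin_tag_counts(tag_positions, region_start, region_end, bin_size):
    """Count mapped tags in consecutive fixed-length bins of [region_start, region_end)."""
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in tag_positions:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts


def to_descriptor_string(counts, low=0.1, mid=0.5):
    """Normalize counts by the regional maximum and map each bin to one of
    four letters (hypothetical alphabet): N = none, L = low, M = medium, H = high."""
    peak = max(counts, default=0) or 1  # avoid division by zero for empty regions
    letters = []
    for count in counts:
        frac = count / peak
        if frac == 0:
            letters.append("N")
        elif frac <= low:
            letters.append("L")
        elif frac <= mid:
            letters.append("M")
        else:
            letters.append("H")
    return "".join(letters)


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two descriptor strings
    (Smith-Waterman recurrence with a linear gap penalty; scores are assumed)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


if __name__ == "__main__":
    counts = bin_tag_counts([0, 5, 10, 15, 99], 0, 100, 10)
    pattern = to_descriptor_string(counts)
    print(counts, pattern, local_alignment_score(pattern, "HHNNM"))
```

Representing expression as a character string is what lets standard sequence-alignment machinery be reused; a real analysis would need tissue-aware normalization and a scoring scheme tuned to the descriptor alphabet.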

Subjects

Chromosome Mapping/methods; Genome; Mice/genetics; Transcription, Genetic; Algorithms; Animals; Base Composition; Gene Expression Regulation; Genome, Human; Humans; Macrophages/physiology; MicroRNAs/genetics; Models, Genetic; RNA, Untranslated/genetics

5.

Heterotachy in mammalian promoter evolution.

Taylor, Martin S; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Semple, Colin A M.

PLoS Genet ; 2(4): e30, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683025

ABSTRACT

We have surveyed the evolutionary trends of mammalian promoters and upstream sequences, utilising large sets of experimentally supported transcription start sites (TSSs). With 30,969 well-defined TSSs from mouse and 26,341 from human, there are sufficient numbers to draw statistically meaningful conclusions and to consider differences between promoter types. Unlike previous smaller studies, we have considered the effects of insertions, deletions, and transposable elements as well as nucleotide substitutions. The rate of promoter evolution relative to that of control sequences has not been consistent between lineages nor within lineages over time. The most pronounced manifestation of this heterotachy is the increased rate of evolution in primate promoters. This increase is seen across different classes of mutation, including substitutions and micro-indel events. We investigated the relationship between promoter and coding sequence selective constraint and suggest that they are generally uncorrelated. This analysis also identified a small number of mouse promoters associated with the immune response that are under positive selection in rodents. We demonstrate significant differences in divergence between functional promoter categories and identify a category of promoters, not associated with conventional protein-coding genes, that has the highest rates of divergence across mammals. We find that evolutionary rates vary both on a fine scale within mammalian promoters and also between different functional classes of promoters. The discovery of heterotachy in promoter evolution, in particular the accelerated evolution of primate promoters, has important implications for our understanding of human evolution and for strategies to detect primate-specific regulatory elements.

Subjects

Evolution, Molecular; Primates/genetics; Promoter Regions, Genetic; Transcription, Genetic; Animals; Base Sequence; Chromosome Mapping; DNA Transposable Elements; Genetic Engineering; Genetic Variation; Genome; Humans; Mice; Primates/anatomy & histology; Proteins/genetics; Sequence Analysis, DNA; Sequence Deletion

6.

A simple physical model predicts small exon length variations.

Chern, Tzu-Ming; van Nimwegen, Erik; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Zavolan, Mihaela.

PLoS Genet ; 2(4): e45, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683028

ABSTRACT

Among the most common splice variations are small exon length variations caused by the use of alternative donor or acceptor splice sites that are in very close proximity on the pre-mRNA. Among these, three-nucleotide variations at so-called NAGNAG tandem acceptor sites have recently attracted considerable attention, and it has been suggested that these variations are regulated and serve to fine-tune protein forms by the addition or removal of a single amino acid. In this paper we first show that in-frame exon length variations are generally overrepresented and that this overrepresentation can be quantitatively explained by the effect of nonsense-mediated decay. Our analysis allows us to estimate that about 50% of frame-shifted coding transcripts are targeted by nonsense-mediated decay. Second, we show that a simple physical model that assumes that the splicing machinery stochastically binds to nearby splice sites in proportion to the affinities of the sites correctly predicts the relative abundances of different small length variations at both boundaries. Finally, using the same simple physical model, we show that for NAGNAG sites, the difference in affinities of the neighboring sites for the splicing machinery accurately predicts whether splicing will occur only at the first site, splicing will occur only at the second site, or three-nucleotide splice variants are likely to occur. Our analysis thus suggests that small exon length variations are the result of stochastic binding of the spliceosome at neighboring splice sites. Small exon length variations occur when there are nearby alternative splice sites that have similar affinity for the splicing machinery.
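The core assumption of this model, that the splicing machinery binds nearby sites in proportion to their affinities, can be written down in a few lines. The Python sketch below is only an illustration of that proportionality: the function names, the example affinity values, and the `min_fraction` cut-off used to call a site effectively unused are all hypothetical and are not taken from the paper.

```python
def site_usage(affinities):
    """Expected usage fraction of each candidate splice site, assuming the
    splicing machinery binds each site in proportion to its affinity."""
    total = sum(affinities.values())
    if total <= 0:
        raise ValueError("at least one site must have positive affinity")
    return {site: affinity / total for site, affinity in affinities.items()}


def classify_nagnag(aff_first, aff_second, min_fraction=0.05):
    """Classify a tandem acceptor from the affinities of its two sites.

    min_fraction is an assumed detection threshold: an isoform expected
    below this fraction is treated as absent.
    """
    usage = site_usage({"first": aff_first, "second": aff_second})
    if usage["second"] < min_fraction:
        return "first site only"
    if usage["first"] < min_fraction:
        return "second site only"
    return "both sites (three-nucleotide variants expected)"


if __name__ == "__main__":
    print(site_usage({"upstream": 3.0, "downstream": 1.0}))
    print(classify_nagnag(99.0, 1.0))
    print(classify_nagnag(1.0, 1.0))
```

Under this reading, variants appear only when neighboring sites have comparable affinity, which matches the abstract's closing statement; how affinities are estimated from sequence is outside the scope of the sketch.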

Subjects

Exons/genetics; Genetic Variation; Models, Genetic; Animals; Chromosome Mapping; Gene Expression Regulation; Male; Mice; Muscle, Skeletal/physiology; Organ Specificity; Prostate/physiology; Transcription, Genetic

7.

Pseudo-messenger RNA: phantoms of the transcriptome.

Frith, Martin C; Wilming, Laurens G; Forrest, Alistair; Kawaji, Hideya; Tan, Sin Lam; Wahlestedt, Claes; Bajic, Vladimir B; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Bailey, Timothy L; Huminiecki, Lukasz.

PLoS Genet ; 2(4): e23, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683022

ABSTRACT

The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.

Subjects

RNA, Messenger/genetics; Transcription, Genetic; Animals; DNA Transposable Elements; Evolution, Molecular; Humans; Mice; Promoter Regions, Genetic; Proteins/genetics; Pseudogenes; Reproducibility of Results; Sequence Alignment

8.

Clusters of internally primed transcripts reveal novel long noncoding RNAs.

Furuno, Masaaki; Pang, Ken C; Ninomiya, Noriko; Fukuda, Shiro; Frith, Martin C; Bult, Carol; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Mattick, John S; Suzuki, Harukazu.

PLoS Genet ; 2(4): e37, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683026

ABSTRACT

Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (Kb), respectively. We surveyed the 102,801 FANTOM3 mouse cDNA clones and found that Air and Xist were present not as single, full-length transcripts but as a cluster of multiple, shorter cDNAs, which were unspliced, had little coding potential, and were most likely primed from internal adenine-rich regions within longer parental transcripts. We therefore conducted a genome-wide search for regional clusters of such cDNAs to find novel macro ncRNA candidates. Sixty-six regions were identified, each of which mapped outside known protein-coding loci and which had a mean length of 92 Kb. We detected several known long ncRNAs within these regions, supporting the basic rationale of our approach. In silico analysis showed that many regions had evidence of imprinting and/or antisense transcription. These regions were significantly associated with microRNAs and transcripts from the central nervous system. We selected eight novel regions for experimental validation by northern blot and RT-PCR and found that the majority represent previously unrecognized noncoding transcripts that are at least 10 Kb in size and predominantly localized in the nucleus. Taken together, the data not only identify multiple new ncRNAs but also suggest the existence of many more macro ncRNAs like Xist and Air.

Subjects

RNA, Untranslated/genetics; Transcription, Genetic; Animals; Computational Biology; DNA, Complementary/genetics; Expressed Sequence Tags; Gene Expression Regulation; Genome; Genome, Human; Humans; Mice; Multigene Family; RNA, Long Noncoding; Reverse Transcriptase Polymerase Chain Reaction

9.

Differential use of signal peptides and membrane domains is a common occurrence in the protein output of transcriptional units.

Davis, Melissa J; Hanson, Kelly A; Clark, Francis; Fink, J Lynn; Zhang, Fasheng; Kasukawa, Takeya; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Teasdale, Rohan D.

PLoS Genet ; 2(4): e46, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683029

ABSTRACT

Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi-spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane. Application of MemO, a membrane organization annotation pipeline, to the FANTOM3 Isoform Protein Sequence mouse protein set revealed that within the 8,032 transcriptional units (TUs) with multiple protein isoforms, 573 had variation in their use of signal peptides, 1,527 had variation in their use of transmembrane domains, and 615 generated protein isoforms from distinct membrane organization classes. The mechanisms underlying these transcript variations were analyzed. While TUs were identified encoding all pairwise combinations of membrane organization categories, the most common was conversion of membrane proteins to soluble proteins. Observed within our high-confidence set were 156 TUs predicted to generate both extracellular soluble and membrane proteins, and 217 TUs generating both intracellular soluble and membrane proteins. The differential use of endoplasmic reticulum signal peptides and transmembrane domains is a common occurrence within the variable protein output of TUs. The generation of protein isoforms that are targeted to multiple subcellular locations represents a major functional consequence of transcript variation within the mouse transcriptome.

Subjects

Membrane Proteins/genetics; Protein Sorting Signals/genetics; Transcription, Genetic; Animals; Genetic Variation; Protein Isoforms/genetics

10.

The abundance of short proteins in the mammalian proteome.

Frith, Martin C; Forrest, Alistair R; Nourbakhsh, Ehsan; Pang, Ken C; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Bailey, Timothy L; Grimmond, Sean M.

PLoS Genet ; 2(4): e52, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683031

ABSTRACT

Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and non-synonymous substitutions with the computer program CRITICA. This analysis confirms that there is no real discontinuity at length 100. Roughly 10% of mouse proteins are shorter than 100 aa, although the majority of these are variants of proteins longer than 100 aa. We identify many novel short proteins, including a "dark matter" subset containing ones that lack detectable homology to other known proteins. Translation assays confirm that some of these novel proteins can be translated and localised to the secretory pathway.

Subjects

Mice/genetics; Proteins/genetics; Proteome; Amino Acid Sequence; Animals; Artifacts; DNA, Complementary/genetics; Genetic Variation; Molecular Weight; Open Reading Frames; Protein Biosynthesis; Reproducibility of Results; Sequence Homology, Amino Acid

11.

Mice and men: their promoter properties.

Bajic, Vladimir B; Tan, Sin Lam; Christoffels, Alan; Schönbach, Christian; Lipovich, Leonard; Yang, Liang; Hofmann, Oliver; Kruger, Adele; Hide, Winston; Kai, Chikatoshi; Kawai, Jun; Hume, David A; Carninci, Piero; Hayashizaki, Yoshihide.

PLoS Genet ; 2(4): e54, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683032

ABSTRACT

Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools.

Subjects

Mice; Promoter Regions, Genetic; Transcription, Genetic; Animals; Mice/genetics; Base Composition; Databases, Nucleic Acid; Dinucleoside Phosphates; DNA, Complementary/genetics; Gene Library; TATA Box; Humans

12.

Complex Loci in human and mouse genomes.

Engström, Pär G; Suzuki, Harukazu; Ninomiya, Noriko; Akalin, Altuna; Sessa, Luca; Lavorgna, Giovanni; Brozzi, Alessandro; Luzi, Lucilla; Tan, Sin Lam; Yang, Liang; Kunarso, Galih; Ng, Edwin Lian-Chong; Batalov, Serge; Wahlestedt, Claes; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Wells, Christine; Bajic, Vladimir B; Orlando, Valerio; Reid, James F; Lenhard, Boris; Lipovich, Leonard.

PLoS Genet ; 2(4): e47, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683030

ABSTRACT

Mammalian genomes harbor a larger than expected number of complex loci, in which multiple genes are coupled by shared transcribed regions in antisense orientation and/or by bidirectional core promoters. To determine the incidence, functional significance, and evolutionary context of mammalian complex loci, we identified and characterized 5,248 cis-antisense pairs, 1,638 bidirectional promoters, and 1,153 chains of multiple cis-antisense and/or bidirectionally promoted pairs from 36,606 mouse transcriptional units (TUs), along with 6,141 cis-antisense pairs, 2,113 bidirectional promoters, and 1,480 chains from 42,887 human TUs. In both human and mouse, 25% of TUs resided in cis-antisense pairs, only 17% of which were conserved between the two organisms, indicating frequent species specificity of antisense gene arrangements. A sampling approach indicated that over 40% of all TUs might actually be in cis-antisense pairs, and that only a minority of these arrangements are likely to be conserved between human and mouse. Bidirectional promoters were characterized by variable transcriptional start sites and an identifiable midpoint at which overall sequence composition changed strand and the direction of transcriptional initiation switched. In microarray data covering a wide range of mouse tissues, genes in cis-antisense and bidirectionally promoted arrangement showed a higher probability of being coordinately expressed than random pairs of genes. In a case study on homeotic loci, we observed extensive transcription of nonconserved sequences on the noncoding strand, implying that the presence rather than the sequence of these transcripts is of functional importance. Complex loci are ubiquitous, host numerous nonconserved gene structures and lineage-specific exonification events, and may have a cis-regulatory impact on the member genes.
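The two pairwise arrangements named in this abstract, cis-antisense pairs (opposite-strand units sharing transcribed sequence) and bidirectional promoters (opposite-strand units with closely spaced, diverging start sites), reduce to simple interval tests. The Python sketch below shows one way to express them. The `TU` record, the 0-based half-open coordinates, and the `max_gap` of 1,000 bp are assumptions for illustration; the abstract does not state the criteria the authors actually applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TU:
    """A transcriptional unit (hypothetical record; 0-based, half-open coordinates)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def tss(self):
        # The start site is the 5' end: leftmost base on "+", rightmost on "-".
        return self.start if self.strand == "+" else self.end - 1


def is_cis_antisense(a, b):
    """True if the two units lie on opposite strands and their transcribed regions overlap."""
    return (
        a.chrom == b.chrom
        and a.strand != b.strand
        and a.start < b.end
        and b.start < a.end
    )


def is_bidirectional(a, b, max_gap=1000):
    """True if the two units are head-to-head (diverging) and their start sites
    lie within max_gap bases; max_gap is an assumed cut-off."""
    if a.chrom != b.chrom or a.strand == b.strand:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    gap = plus.tss - minus.tss  # positive only when the units diverge without overlapping
    return 0 < gap <= max_gap


if __name__ == "__main__":
    a = TU("A", "chr1", 100, 500, "+")
    b = TU("B", "chr1", 400, 900, "-")
    c = TU("C", "chr1", 1000, 2000, "+")
    d = TU("D", "chr1", 200, 900, "-")
    print(is_cis_antisense(a, b), is_bidirectional(c, d))
```

Chains of the kind the abstract counts could then be built as connected components of a graph whose edges are these pairwise relations, though that step is not shown here.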

Subjects

Chromosome Mapping; Genome; Mice; Animals; Mice/genetics; Base Pairing; DNA Primers; Genome, Human; Promoter Regions, Genetic; Reverse Transcriptase Polymerase Chain Reaction; Humans

13.

Transcript annotation in FANTOM3: mouse gene catalog based on physical cDNAs.

Maeda, Norihiro; Kasukawa, Takeya; Oyama, Rieko; Gough, Julian; Frith, Martin; Engström, Pär G; Lenhard, Boris; Aturaliya, Rajith N; Batalov, Serge; Beisel, Kirk W; Bult, Carol J; Fletcher, Colin F; Forrest, Alistair R R; Furuno, Masaaki; Hill, David; Itoh, Masayoshi; Kanamori-Katayama, Mutsumi; Katayama, Shintaro; Katoh, Masaru; Kawashima, Tsugumi; Quackenbush, John; Ravasi, Timothy; Ring, Brian Z; Shibata, Kazuhiro; Sugiura, Koji; Takenaka, Yoichi; Teasdale, Rohan D; Wells, Christine A; Zhu, Yunxia; Kai, Chikatoshi; Kawai, Jun; Hume, David A; Carninci, Piero; Hayashizaki, Yoshihide.

PLoS Genet ; 2(4): e62, 2006 Apr.

Article in English | MEDLINE | ID: mdl-16683036

ABSTRACT

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.

Subjects

DNA, Complementary/genetics; Databases, Genetic; Mice/genetics; Transcription, Genetic; Animals; Automation; DNA, Complementary/chemistry; Genome

14.

Hidden layers of human small RNAs.

Kawaji, Hideya; Nakamura, Mari; Takahashi, Yukari; Sandelin, Albin; Katayama, Shintaro; Fukuda, Shiro; Daub, Carsten O; Kai, Chikatoshi; Kawai, Jun; Yasuda, Jun; Carninci, Piero; Hayashizaki, Yoshihide.

BMC Genomics ; 9: 157, 2008 Apr 10.

Article in English | MEDLINE | ID: mdl-18402656

ABSTRACT

BACKGROUND: Small RNA attracts increasing interest based on the discovery of RNA silencing and the rapid progress of our understanding of these phenomena. Although recent studies suggest the possible existence of yet undiscovered types of small RNAs in higher organisms, many studies to profile small RNA have focused on miRNA and/or siRNA rather than on the exploration of additional classes of RNAs. RESULTS: Here, we explored human small RNAs by unbiased sequencing of RNAs with sizes of 19-40 nt. We provide substantial evidence for the existence of independent classes of small RNAs. Our data show that well-characterized non-coding RNAs, such as tRNA, snoRNA, and snRNA, are cleaved at sites specific to the class of ncRNA. In particular, tRNA cleavage is regulated depending on tRNA type and tissue expression. We also found small RNAs mapped to genomic regions that are transcribed in both directions by bidirectional promoters, indicating that the small RNAs are a product of dsRNA formation and its subsequent cleavage. Their partial similarity with ribosomal RNAs (rRNAs) suggests unrevealed functions of ribosomal DNA or interstitial rRNA. Further examination revealed six novel miRNAs. CONCLUSION: Our results underscore the complexity of the small RNA world and the biogenesis of small RNAs.

Subjects

Molecular Evolution, RNA/genetics, RNA/metabolism, Base Pairing, Base Sequence, Northern Blotting, Gene Library, Humans, Molecular Sequence Data, Multigene Family/genetics, RNA/classification, Sequence Alignment, RNA Sequence Analysis

15.

CAGE Basic/Analysis Databases: the CAGE resource for comprehensive promoter analysis.

Kawaji, Hideya; Kasukawa, Takeya; Fukuda, Shiro; Katayama, Shintaro; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide.

Nucleic Acids Res ; 34(Database issue): D632-6, 2006 Jan 01.

Article in English | MEDLINE | ID: mdl-16381948

ABSTRACT

Cap-analysis gene expression (CAGE) Basic and Analysis Databases store an original resource produced by CAGE, which measures expression levels of transcription starting sites by sequencing large amounts of transcript 5' ends, termed CAGE tags. Millions of human and mouse high-quality CAGE tags derived from different conditions in >20 tissues consisting of >250 RNA samples are essential for identification of novel promoters and promoter characterization in terms of expression profile. CAGE Basic Database is the primary database of the CAGE resource, covering RNA samples, CAGE libraries, CAGE clone and tag sequences, and so on. CAGE Analysis Database stores promoter-related information, such as counts of related transcripts, CpG islands and conserved genomic regions. It also provides expression profiles at base pair and promoter levels. Both databases are based on the same framework: CAGE tag starting sites, tag clusters for defining promoters, and transcriptional units (TUs). Their associations and TU attributes are available to find promoters of interest. These databases were provided for Functional Annotation Of Mouse 3 (FANTOM3), an international collaboration research project focusing on expanding the transcriptome and subsequent analyses. Access is now free for all users through the World Wide Web at http://fantom3.gsc.riken.jp/.

Subjects

Nucleic Acid Databases, Genetic Promoter Regions, Transcription Initiation Site, Genetic Transcription, Animals, Database Management Systems, Expressed Sequence Tags, Genomics, Humans, Internet, Mice, User-Computer Interface

16.

LOCATE: a mouse protein subcellular localization database.

Fink, J Lynn; Aturaliya, Rajith N; Davis, Melissa J; Zhang, Fasheng; Hanson, Kelly; Teasdale, Melvena S; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Teasdale, Rohan D.

Nucleic Acids Res ; 34(Database issue): D213-7, 2006 Jan 01.

Article in English | MEDLINE | ID: mdl-16381849

ABSTRACT

We present here LOCATE, a curated, web-accessible database that houses data describing the membrane organization and subcellular localization of proteins from the FANTOM3 Isoform Protein Sequence set. Membrane organization is predicted by the high-throughput, computational pipeline MemO. The subcellular locations of selected proteins from this set were determined by a high-throughput, immunofluorescence-based assay and by manually reviewing >1700 peer-reviewed publications. LOCATE represents the first effort to catalogue the experimentally verified subcellular location and membrane organization of mammalian proteins using a high-throughput approach and provides localization data for approximately 40% of the mouse proteome. It is available at http://locate.imb.uq.edu.au.

Subjects

Protein Databases, Membrane Proteins/analysis, Proteome/analysis, Animals, Internet, Membrane Proteins/chemistry, Mice, Protein Isoforms/analysis, Protein Isoforms/chemistry, Proteome/chemistry, User-Computer Interface

17.

Splicing bypasses 3' end formation signals to allow complex gene architectures.

Frith, Martin C; Carninci, Piero; Kai, Chikatoshi; Kawai, Jun; Bailey, Timothy L; Hayashizaki, Yoshihide; Mattick, John S.

Gene ; 403(1-2): 188-93, 2007 Nov 15.

Article in English | MEDLINE | ID: mdl-17897791

ABSTRACT

Many genes are arranged in complex overlapping and interlaced patterns in eukaryotic genomes. It is unclear whether or how such genes can avoid interference from each other's RNA processing signals and retain distinct identities. This puzzle applies particularly to 3' end formation sites, which inherently terminate the transcript, and thus act as boundaries between adjacent genes. We hypothesise that the transcript processing machinery can bypass 3' end formation sites by splicing out an intron surrounding the site. We confirm a prediction of this hypothesis: the likelihood of transcripts extending beyond 3' end sites depends on the strength of 3' end formation signals located in exons in the mature transcript, but not of those in introns that are spliced out of the transcript. This bypassing mechanism permits nested and interleaved gene architectures, as well as fusion transcripts that combine exons from adjacent genes.

Subjects

3' Untranslated Regions/genetics, Alternative Splicing/genetics, Genetic Models, Animals, Mammalian Chromosomes, Complementary DNA, Exons, Expressed Sequence Tags, Genome, Introns, Mice, Polyadenylation/genetics, Messenger RNA/metabolism, Genetic Transcription

18.

Diversity of Ca2+-activated K+ channel transcripts in inner ear hair cells.

Beisel, Kirk W; Rocha-Sanchez, Sonia M; Ziegenbein, Sylvia J; Morris, Ken A; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Davis, Robin L.

Gene ; 386(1-2): 11-23, 2007 Jan 15.

Article in English | MEDLINE | ID: mdl-17097837

ABSTRACT

Hair cells express a complement of ion channels, representing shared and distinct channels that confer distinct electrophysiological signatures for each cell. This diversity is generated by the use of alternative splicing in the alpha subunit, formation of heterotetrameric channels, and combinatorial association with beta subunits. These channels are thought to play a role in the tonotopic gradient observed in the mammalian cochlea. Mouse Kcnma1 transcripts, 5' and 3' ESTs, and genomic sequences were examined for the utilization of alternative splicing in the mouse transcriptome. Comparative genomic analyses investigated the conservation of KCNMA1 splice sites. Genomes of mouse, rat, human, opossum, chicken, frog and zebrafish established that the exon-intron structure and mechanism of KCNMA1 alternative splicing were highly conserved, with 6-7 splice sites being utilized. The murine Kcnma1 utilized 6 out of 7 potential splice sites. RT-PCR experiments using murine gene-specific oligonucleotide primers analyzed the scope and variety of Kcnma1 and Kcnmb1-4 expression profiles in the cochlea and inner ear hair cells. In the cochlea, splice variants were present representing sites 3, 4, 6, and 7, while site 1 was insertionless and site 2 utilized only exon 10. However, site 5 was not present. Detection of KCNMA1 transcripts and protein exhibited a quantitative longitudinal gradient with a reciprocal gradient found between inner and outer hair cells. Differential expression was also observed in the usage of the long form of the carboxy-terminus tail. These results suggest that a diversity of splice variants exists in rodent cochlear hair cells and that this diversity is similar to that observed for non-mammalian vertebrate hair cells, such as those of chicken and turtle.

Subjects

Gene Expression Profiling, Genetic Variation, Inner Auditory Hair Cells/metabolism, Large-Conductance Calcium-Activated Potassium Channel Alpha Subunits/genetics, Genetic Transcription, Alternative Splicing/genetics, Animals, Conserved Sequence, Humans, In Situ Hybridization, Large-Conductance Calcium-Activated Potassium Channel Alpha Subunits/biosynthesis, Mice, Rats

19.

Computational promoter analysis of mouse, rat and human antimicrobial peptide-coding genes.

Brahmachary, Manisha; Schönbach, Christian; Yang, Liang; Huang, Enli; Tan, Sin Lam; Chowdhary, Rajesh; Krishnan, S P T; Lin, Chin-Yo; Hume, David A; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Bajic, Vladimir B.

BMC Bioinformatics ; 7 Suppl 5: S8, 2006 Dec 18.

Article in English | MEDLINE | ID: mdl-17254313

ABSTRACT

BACKGROUND: Mammalian antimicrobial peptides (AMPs) are effectors of the innate immune response. A multitude of signals coming from pathways of mammalian pathogen/pattern recognition receptors and other proteins affect the expression of AMP-coding genes (AMPcgs). For many AMPcgs the promoter elements and transcription factors that control their tissue- and cell-specific expression have yet to be fully identified and characterized. RESULTS: Based upon the RIKEN full-length cDNA and public sequence data derived from human, mouse and rat, we identified 178 candidate AMP transcripts derived from 61 genes belonging to 29 AMP families. However, we were able to determine true orthologous relationships with 30 human and 15 rat sequences for only 31 mouse genes belonging to 22 AMP families. We screened the promoter regions of AMPcgs in the three species for motifs by an ab initio motif finding method and analyzed the derived promoter characteristics. Promoter models were developed for alpha-defensins, penk and zap AMP families. The results suggest a core set of transcription factors (TFs) that regulate the transcription of AMPcg families in mouse, rat and human. The three most frequent core TF groups include liver-, nervous system-specific and nuclear hormone receptors (NHRs). Out of 440 motifs analyzed, we found that three represent potentially novel TF-binding motifs enriched in promoters of AMPcgs, while the other four motifs appear to be species-specific. CONCLUSION: Our large-scale computational analysis of promoters of 22 families of AMPcgs across three mammalian species suggests that their key transcriptional regulators are likely to be TFs of the liver-, nervous system-specific and NHR groups. The computationally inferred promoter elements and potential TF binding motifs provide a rich resource for targeted experimental validation of TF binding and signaling studies that aim at the regulation of mouse, rat or human AMPcgs.

Subjects

Antimicrobial Cationic Peptides/genetics, Computational Biology/methods, Genetic Promoter Regions, DNA Sequence Analysis/methods, Animals, Binding Sites, Carrier Proteins/genetics, Enkephalins/genetics, Humans, Mice, Multigene Family/genetics, Protein Precursors/genetics, RNA-Binding Proteins, Rats, Transcription Factors/metabolism, alpha-Defensins/genetics

20.

PhosphoregDB: the tissue and sub-cellular distribution of mammalian protein kinases and phosphatases.

Forrest, Alistair R R; Taylor, Darrin F; Fink, J Lynn; Gongora, M Milena; Flegg, Cameron; Teasdale, Rohan D; Suzuki, Harukazu; Kanamori, Mutsumi; Kai, Chikatoshi; Hayashizaki, Yoshihide; Grimmond, Sean M.

BMC Bioinformatics ; 7: 82, 2006 Feb 20.

Article in English | MEDLINE | ID: mdl-16504016

ABSTRACT

BACKGROUND: Protein kinases and protein phosphatases are the fundamental components of phosphorylation-dependent protein regulatory systems. We have created a database for the protein kinase-like and phosphatase-like loci of mouse (http://phosphoreg.imb.uq.edu.au) that integrates protein sequence, interaction, classification and pathway information with the results of a systematic screen of their sub-cellular localization and tissue-specific expression data mined from the GNF tissue atlas of mouse. RESULTS: The database lets users query where a specific kinase or phosphatase is expressed at both the tissue and sub-cellular levels. Similarly, the interface allows the user to query by tissue, pathway or sub-cellular localization, to reveal which components are co-expressed or co-localized. A review of their expression reveals that 30% of these components are detected in all tissues tested while 70% show some level of tissue restriction. Hierarchical clustering of the expression data reveals that expression of these genes can be used to separate the samples into tissues of related lineage, including 3 larger clusters of nervous tissue, developing embryo and cells of the immune system. By overlaying the expression, sub-cellular localization and classification data we examine correlations between class, specificity and tissue restriction and show that tyrosine kinases are generally expressed in fewer tissues than serine/threonine kinases. CONCLUSION: Together these data demonstrate that cell-type-specific systems exist to regulate protein phosphorylation and that, for accurate modelling and for determination of enzyme-substrate relationships, the co-location of components needs to be considered.

Subjects

Computational Biology/methods, Protein Databases, Gene Expression Regulation, Phosphoric Monoester Hydrolases/biosynthesis, Protein Interaction Mapping/methods, Protein Kinases/biosynthesis, Amino Acid Sequence, Animals, Cell Cycle, Cell Lineage, Cluster Analysis, Cytoplasm/metabolism, Expressed Sequence Tags, HeLa Cells, Humans, Immune System, Immunoprecipitation, Internet, Mice, Molecular Sequence Data, Phosphorylation, Polymerase Chain Reaction, Genetic Promoter Regions, Signal Transduction, Substrate Specificity, Tissue Distribution, Transfection
