1.
Nucleic Acids Res ; 40(Database issue): D1202-10, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22140109

ABSTRACT

The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is a genome database for Arabidopsis thaliana, an important reference organism for many fundamental aspects of biology as well as basic and applied plant biology research. TAIR serves as a central access point for Arabidopsis data, annotates gene function and expression patterns using controlled vocabulary terms, and maintains and updates the A. thaliana genome assembly and annotation. TAIR also provides researchers with an extensive set of visualization and analysis tools. Recent developments include several new genome releases (TAIR8, TAIR9 and TAIR10) in which the A. thaliana assembly was updated, pseudogenes and transposon genes were re-annotated, and new data from proteomics and next generation transcriptome sequencing were incorporated into gene models and splice variants. Other highlights include progress on functional annotation of the genome and the release of several new tools including Textpresso for Arabidopsis which provides the capability to carry out full text searches on a large body of research literature.


Subjects
Arabidopsis/genetics , Genetic Databases , Plant Genes , Molecular Sequence Annotation , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Plant Genome , Software
2.
Nat Genet ; 34(1): 35-41, 2003 May.
Article in English | MEDLINE | ID: mdl-12679813

ABSTRACT

To verify the genome annotation and to create a resource to functionally characterize the proteome, we attempted to Gateway-clone all predicted protein-encoding open reading frames (ORFs), or the 'ORFeome,' of Caenorhabditis elegans. We successfully cloned approximately 12,000 ORFs (ORFeome 1.1), of which roughly 4,000 correspond to genes that are untouched by any cDNA or expressed-sequence tag (EST). More than 50% of predicted genes needed corrections in their intron-exon structures. Notably, approximately 11,000 C. elegans proteins can now be expressed under many conditions and characterized using various high-throughput strategies, including large-scale interactome mapping. We suggest that similar ORFeome projects will be valuable for other organisms, including humans.


Subjects
Caenorhabditis elegans/genetics , Genome , Alternative Splicing , Animals , Molecular Cloning , Complementary DNA/genetics , Helminth DNA/genetics , Genetic Databases , Exons , Expressed Sequence Tags , Gene Expression , Helminth Genes , Genomics , Helminth Proteins/genetics , Humans , Introns , Open Reading Frames , Proteome , Proteomics
3.
Nature ; 437(7062): 1173-8, 2005 Oct 20.
Article in English | MEDLINE | ID: mdl-16189514

ABSTRACT

Systematic mapping of protein-protein interactions, or 'interactome' mapping, was initiated in model organisms, starting with defined biological processes and then expanding to the scale of the proteome. Although far from complete, such maps have revealed global topological and dynamic features of interactome networks that relate to known biological properties, suggesting that a human interactome map will provide insight into development and disease mechanisms at a systems level. Here we describe an initial version of a proteome-scale map of human binary protein-protein interactions. Using a stringent, high-throughput yeast two-hybrid system, we tested pairwise interactions among the products of approximately 8,100 currently available Gateway-cloned open reading frames and detected approximately 2,800 interactions. This data set, called CCSB-HI1, has a verification rate of approximately 78% as revealed by an independent co-affinity purification assay, and correlates significantly with other biological attributes. The CCSB-HI1 data set increases by approximately 70% the set of available binary interactions within the tested space and reveals more than 300 new connections to over 100 disease-associated proteins. This work represents an important step towards a systematic and comprehensive human interactome project.


Subjects
Proteome/metabolism , Molecular Cloning , Humans , Open Reading Frames/genetics , Protein Binding , Proteome/genetics , RNA/genetics , RNA/metabolism , Saccharomyces cerevisiae/genetics , Two-Hybrid System Techniques
4.
Nucleic Acids Res ; 36(Database issue): D1009-14, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17986450

ABSTRACT

The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is the model organism database for the fully sequenced and intensively studied model plant Arabidopsis thaliana. Data in TAIR is derived in large part from manual curation of the Arabidopsis research literature and direct submissions from the research community. New developments at TAIR include the addition of the GBrowse genome viewer to the TAIR site, a redesigned home page, navigation structure and portal pages to make the site more intuitive and easier to use, the launch of several TAIR web services and a new genome annotation release (TAIR7) in April 2007. A combination of manual and computational methods were used to generate this release, which contains 27,029 protein-coding genes, 3889 pseudogenes or transposable elements and 1123 ncRNAs (32,041 genes in all, 37,019 gene models). A total of 681 new genes and 1002 new splice variants were added. Overall, 10,098 loci (one-third of all loci from the previous TAIR6 release) were updated for the TAIR7 release.


Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Genetic Databases , Alternative Splicing , Plant Genes , Plant Genome , Genomics , Internet , Untranslated RNA/genetics , User-Computer Interface , Controlled Vocabulary
5.
Nucleic Acids Res ; 31(1): 237-40, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12519990

ABSTRACT

WorfDB (Worm ORFeome DataBase; http://worfdb.dfci.harvard.edu) was created to integrate and disseminate the data from the cloning of the complete set of approximately 19 000 predicted protein-encoding Open Reading Frames (ORFs) of Caenorhabditis elegans (also referred to as the 'worm ORFeome'). WorfDB serves as a central data repository enabling the scientific community to search for availability and quality of cloned ORFs. So far, ORF sequence tags (OSTs) obtained for all individual clones have allowed exon structure corrections for approximately 3400 ORFs originally predicted by the C. elegans sequencing consortium. In addition, we now have OSTs for approximately 4300 predicted genes for which no ESTs were available. The database contains this OST information along with data pertinent to the cloning process. WorfDB could serve as a model database for other metazoan ORFeome cloning projects.


Subjects
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Nucleic Acid Databases , Open Reading Frames , Animals , Caenorhabditis elegans Proteins/biosynthesis , Genome , Information Storage and Retrieval
6.
Methods Mol Biol ; 1062: 65-96, 2014.
Article in English | MEDLINE | ID: mdl-24057361

ABSTRACT

The volume of Arabidopsis information has increased enormously in recent years as a result of the sequencing of the reference genome and other large-scale functional genomics projects. Much of the data is stored in public databases, where data are organized, analyzed, and made freely accessible to the research community. These databases are resources that researchers can utilize for making predictions and developing testable hypotheses. The methods in this chapter describe ways to access and utilize Arabidopsis data and genomic resources found in databases and stock centers.


Subjects
Arabidopsis/genetics , Genetic Databases , Amino Acid Sequence , Arabidopsis/metabolism , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Data Mining , Gene Ontology , Plant Genes , Metabolic Networks and Pathways , Molecular Sequence Annotation , Protein Conformation , Seeds/genetics
7.
Curr Protoc Bioinformatics ; Chapter 1: 1.11.1-1.11.51, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20521243

ABSTRACT

The Arabidopsis Information Resource (TAIR; http://arabidopsis.org) is a comprehensive Web resource of Arabidopsis biology for plant scientists. TAIR curates and integrates information about genes, proteins, gene function, gene expression, mutant phenotypes, biological materials such as clones and seed stocks, genetic markers, genetic and physical maps, biochemical pathways, genome organization, images of mutant plants, protein sub-cellular localizations, publications, and the research community. The various data types are extensively interconnected and can be accessed through a variety of Web-based search and display tools. This unit primarily focuses on some basic methods for searching, browsing, visualizing, and analyzing information about Arabidopsis genes and describes several new tools such as a new TAIR genome browser (GBrowse), and the TAIR synteny viewer (GBrowse_syn). We also describe how to use AraCyc for mining plant metabolic pathways.


Subjects
Arabidopsis/genetics , Computational Biology , Genetic Databases , Arabidopsis Proteins/genetics , Data Mining , Plant Genes
8.
Database (Oxford) ; 2010: baq001, 2010.
Article in English | MEDLINE | ID: mdl-20428316

ABSTRACT

Efforts to annotate the genomes of a wide variety of model organisms are currently carried out by sequencing centers, model organism databases and academic/institutional laboratories around the world. Different annotation methods and tools have been developed over time to meet the needs of biologists faced with the task of annotating biological data. While standardized methods are essential for consistent curation within each annotation group, methods and tools can differ between groups, especially when the groups are curating different organisms. Biocurators from several institutes met at the Third International Biocuration Conference in Berlin, Germany, April 2009 and hosted the 'Best Practices in Genome Annotation: Inference from Evidence' workshop to share their strategies, pipelines, standards and tools. This article documents the material presented in the workshop.

9.
Genomics ; 89(3): 307-15, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17207965

ABSTRACT

Complete sets of cloned protein-encoding open reading frames (ORFs), or ORFeomes, are essential tools for large-scale proteomics and systems biology studies. Here we describe human ORFeome version 3.1 (hORFeome v3.1), currently the largest publicly available resource of full-length human ORFs (available at ). Generated by Gateway recombinational cloning, this collection contains 12,212 ORFs, representing 10,214 human genes, and corresponds to a 51% expansion of the original hORFeome v1.1. An online human ORFeome database, hORFDB, was built and serves as the central repository for all cloned human ORFs (http://horfdb.dfci.harvard.edu). This expansion of the original ORFeome resource greatly increases the potential experimental search space for large-scale proteomics studies, which will lead to the generation of more comprehensive datasets.


Subjects
Nucleic Acid Databases , Human Genome , Open Reading Frames , Animals , Human Chromosomes , Molecular Cloning/methods , Complementary DNA , Genetic Predisposition to Disease , Humans , Internet , Proteomics , DNA Sequence Analysis
10.
Hum Mol Genet ; 15 Spec No 1: R31-43, 2006 Apr 15.
Article in English | MEDLINE | ID: mdl-16651367

ABSTRACT

cDNA clones have long been valuable reagents for studying the structure and function of proteins. With recent access to the entire human genome sequence, it has become possible and highly productive to compare the sequences of mRNAs to their genes, in order to validate the sequences and protein-coding annotations of each (1,2). Thus, well-characterized collections of human cDNAs are now playing an essential role in defining the structure and function of human genes and proteins. In this review, we will summarize the major collections of human cDNA clones, discuss some limitations common to most of these collections and describe several noteworthy proteomics applications, focusing on the detection and analysis of protein-protein interactions (PPI). These human cDNA collections contain principally two types of cDNA clones. The largest collections comprise cDNAs with full-length protein coding sequences (FL-CDS). Some but not all of these cDNA clones may represent the entire mRNA sequence, but many are missing considerable non-coding UTR sequence, usually at the 5' end. A second type of cDNA clone, a 'full-ORF' (F-ORF) expression clone, is one where the annotated protein-coding sequence, excised of 5' UTR and 3' UTR sequence, has been transferred to a vector designed to facilitate transfer to other vectors for protein expression.


Subjects
Molecular Cloning/methods , Gene Expression , Human Genome , Proteome/metabolism , Complementary DNA/chemistry , Complementary DNA/genetics , Humans , Biological Models , Genetic Models , Oligonucleotide Array Sequence Analysis/methods , Open Reading Frames , Proteome/genetics
11.
Genome Res ; 15(4): 577-82, 2005 Apr.
Article in English | MEDLINE | ID: mdl-15805498

ABSTRACT

The genome of Caenorhabditis elegans was the first animal genome to be sequenced. Although considerable effort has been devoted to annotating it, the standard WormBase annotation contains thousands of predicted genes for which there is no cDNA or EST evidence. We hypothesized that a more complete experimental annotation could be obtained by creating a more accurate gene-prediction program and then amplifying and sequencing predicted genes. Our approach was to adapt the TWINSCAN gene prediction system to C. elegans and C. briggsae and to improve its splice site and intron-length models. The resulting system has 60% sensitivity and 58% specificity in exact prediction of open reading frames (ORFs), and hence, proteins, the best results we are aware of for any multicellular organism. We then attempted to amplify, clone, and sequence 265 TWINSCAN-predicted ORFs that did not overlap WormBase gene annotations. The success rate was 55%, adding 146 genes that were completely absent from WormBase to the ORF clone collection (ORFeome). The same procedure had a 7% success rate on 90 WormBase "predicted" genes that do not overlap TWINSCAN predictions. These results indicate that the accuracy of WormBase could be significantly increased by replacing its partially curated predicted genes with TWINSCAN predictions. The technology described in this study will continue to drive the C. elegans ORFeome toward completion and contribute to the annotation of the three Caenorhabditis species currently being sequenced. The results also suggest that this technology can significantly improve our knowledge of the "parts list" for even the best-studied model organisms.


Subjects
Caenorhabditis elegans/genetics , Molecular Cloning/methods , Computational Biology/methods , Helminth Genes/genetics , Open Reading Frames/genetics , Animals , Genetic Databases , Genome , Genomics , Introns , Sensitivity and Specificity
12.
Genome Res ; 14(10B): 2064-9, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489327

ABSTRACT

The first version of the Caenorhabditis elegans ORFeome cloning project, based on release WS9 of WormBase (August 1999), provided experimental verifications for approximately 55% of predicted protein-encoding open reading frames (ORFs). The remaining 45% of predicted ORFs could not be cloned, possibly as a result of mispredicted gene boundaries. Since the release of WS9, gene predictions have improved continuously. To test the accuracy of evolving predictions, we attempted to PCR-amplify from a highly representative worm cDNA library and Gateway-clone approximately 4200 ORFs missed earlier and for which new predictions are available in WS100 (May 2003). In this set we successfully cloned 63% of ORFs with supporting experimental data ("touched" ORFs), and 42% of ORFs with no supporting experimental evidence ("untouched" ORFs). Approximately 2000 full-length ORFs were cloned in-frame, 13% of which were corrected in their exon/intron structure relative to WS100 predictions. In total, approximately 12,500 C. elegans ORFs are now available as Gateway Entry clones for various reverse proteomics applications (ORFeome v3.1). This work illustrates why the cloning of a complete C. elegans ORFeome, and likely the ORFeomes of other multicellular organisms, needs to be an iterative process that requires multiple rounds of experimental validation together with gradually improving gene predictions.


Subjects
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Computational Biology/methods , Helminth Genes/genetics , Genome , Open Reading Frames/genetics , Animals , Caenorhabditis elegans Proteins/metabolism , Molecular Cloning , Complementary DNA/genetics , Genetic Databases , Exons , Expressed Sequence Tags , Gene Expression , Genomics , Introns , Proteome , Proteomics , Software
13.
Genome Res ; 14(10B): 2169-75, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489340

ABSTRACT

An important aspect of the development of systems biology approaches in metazoans is the characterization of expression patterns of nearly all genes predicted from genome sequences. Such "localizome" maps should provide information on where (in what cells or tissues) and when (at what stage of development or under what conditions) genes are expressed. They should also indicate in what cellular compartments the corresponding proteins are localized. Caenorhabditis elegans is particularly suited for the development of a localizome map since all its 959 adult somatic cells can be visualized by microscopy, and its cell lineage has been completely described. Here we address one of the challenges of C. elegans localizome mapping projects: that of obtaining a genome-wide resource of C. elegans promoters needed to generate transgenic animals expressing localization markers such as the green fluorescent protein (GFP). To ensure high flexibility for future uses, we utilized the newly developed MultiSite Gateway system. We generated and validated "version 1.1" of the Promoterome: a resource of approximately 6000 C. elegans promoters. These promoters can be transferred easily into various Gateway Destination vectors to drive expression of markers such as GFP, alone (promoter::GFP constructs), or in fusion with protein-encoding open reading frames available in ORFeome resources (promoter::ORF::GFP).


Subjects
Caenorhabditis elegans/genetics , Helminth Genes , Open Reading Frames/physiology , Genetic Promoter Regions/genetics , Transcription Factors/physiology , Animals , Genetically Modified Animals , Molecular Cloning , Gene Expression , Green Fluorescent Proteins , Luminescent Proteins/genetics
14.
Genome Res ; 14(10B): 2128-35, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489335

ABSTRACT

The advent of systems biology necessitates the cloning of nearly entire sets of protein-encoding open reading frames (ORFs), or ORFeomes, to allow functional studies of the corresponding proteomes. Here, we describe the generation of a first version of the human ORFeome using a newly improved Gateway recombinational cloning approach. Using the Mammalian Gene Collection (MGC) resource as a starting point, we report the successful cloning of 8076 human ORFs, representing at least 7263 human genes, as mini-pools of PCR-amplified products. These were assembled into the human ORFeome version 1.1 (hORFeome v1.1) collection. After assessing the overall quality of this version, we describe the use of hORFeome v1.1 for heterologous protein expression in two different expression systems at proteome scale. The hORFeome v1.1 represents a central resource for the cloning of large sets of human ORFs in various settings for functional proteomics of many types, and will serve as the foundation for subsequent improved versions of the human ORFeome.


Subjects
Molecular Cloning , Genomics/methods , Open Reading Frames/genetics , Open Reading Frames/physiology , Proteomics , Gene Expression , Genetic Vectors , Humans , Recombinant Proteins/genetics , Recombinant Proteins/isolation & purification , Recombinant Proteins/metabolism
15.
Genome Res ; 14(10B): 2201-6, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489343

ABSTRACT

The bacteria of the Brucella genus are responsible for a worldwide zoonosis called brucellosis. They belong to the alpha-proteobacteria group, as many other bacteria that live in close association with a eukaryotic host. Importantly, the Brucellae are mainly intracellular pathogens, and the molecular mechanisms of their virulence are still poorly understood. Using the complete genome sequence of Brucella melitensis, we generated a database of protein-coding open reading frames (ORFs) and constructed an ORFeome library of 3091 Gateway Entry clones, each containing a defined ORF. This first version of the Brucella ORFeome (v1.1) provides the coding sequences in a user-friendly format amenable to high-throughput functional genomic and proteomic experiments, as the ORFs are conveniently transferable from the Entry clones to various Expression vectors by recombinational cloning. The cloning of the Brucella ORFeome v1.1 should help to provide a better understanding of the molecular mechanisms of virulence, including the identification of bacterial protein-protein interactions, but also interactions between bacterial effectors and their host's targets.


Subjects
Bacterial Proteins/genetics , Brucella melitensis/genetics , Bacterial Genome , Open Reading Frames/physiology , Bacterial Proteins/metabolism , Molecular Cloning , DNA Primers/chemistry , DNA Primers/genetics , Gene Expression , Plasmids , Polymerase Chain Reaction
16.
Science ; 303(5657): 540-3, 2004 Jan 23.
Article in English | MEDLINE | ID: mdl-14704431

ABSTRACT

To initiate studies on how protein-protein interaction (or "interactome") networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains approximately 5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.


Subjects
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/metabolism , Proteome/metabolism , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans Proteins/genetics , Computational Biology , Molecular Evolution , Helminth Genes , Genomics , Open Reading Frames , Phenotype , Protein Binding , Genetic Transcription , Two-Hybrid System Techniques