Results 1 - 20 of 26
1.
Nucleic Acids Res ; 50(D1): D898-D911, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718728

ABSTRACT

The Eukaryotic Pathogen, Vector and Host Informatics Resource (VEuPathDB, https://veupathdb.org) represents the 2019 merger of VectorBase with the EuPathDB projects. As a Bioinformatics Resource Center funded by the National Institutes of Health, with additional support from the Wellcome Trust, VEuPathDB supports >500 organisms comprising invertebrate vectors, eukaryotic pathogens (protists and fungi) and relevant free-living or non-pathogenic species or hosts. Designed to empower researchers with access to Omics data and bioinformatic analyses, VEuPathDB projects integrate >1700 pre-analysed datasets (and associated metadata) with advanced search capabilities, visualizations, and analysis tools in a graphic interface. Diverse data types are analysed with standardized workflows including an in-house OrthoMCL algorithm for predicting orthology. Comparisons are easily made across datasets, data types and organisms in this unique data mining platform. A new site-wide search facilitates access for both experienced and novice users. Upgraded infrastructure and workflows support numerous updates to the web interface, tools, searches and strategies, and Galaxy workspace where users can privately analyse their own data. Forthcoming upgrades include cloud-ready application architecture, expanded support for the Galaxy workspace, tools for interrogating host-pathogen interactions, improved interactions with affiliated databases (ClinEpiDB, MicrobiomeDB) and other scientific resources, and increased interoperability with the Bacterial & Viral BRC.


Subject(s)
Databases, Factual , Disease Vectors/classification , Host-Pathogen Interactions/genetics , Phenotype , User-Computer Interface , Animals , Apicomplexa/classification , Apicomplexa/genetics , Apicomplexa/pathogenicity , Bacteria/classification , Bacteria/genetics , Bacteria/pathogenicity , Communicable Diseases/microbiology , Communicable Diseases/parasitology , Communicable Diseases/pathology , Communicable Diseases/transmission , Computational Biology/methods , Data Mining/methods , Diplomonadida/classification , Diplomonadida/genetics , Diplomonadida/pathogenicity , Fungi/classification , Fungi/genetics , Fungi/pathogenicity , Humans , Insecta/classification , Insecta/genetics , Insecta/pathogenicity , Internet , Nematoda/classification , Nematoda/genetics , Nematoda/pathogenicity , Phylogeny , Virulence , Workflow
2.
Nucleic Acids Res ; 46(D1): D684-D691, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106667

ABSTRACT

MicrobiomeDB (http://microbiomeDB.org) is a data discovery and analysis platform that empowers researchers to fully leverage experimental variables to interrogate microbiome datasets. MicrobiomeDB was developed in collaboration with the Eukaryotic Pathogens Bioinformatics Resource Center (http://EuPathDB.org) and leverages the infrastructure and user interface of EuPathDB, which allows users to construct in silico experiments using an intuitive graphical 'strategy' approach. The current release of the database integrates microbial census data with sample details for nearly 14 000 samples originating from human, animal and environmental sources, including over 9000 samples from healthy human subjects in the Human Microbiome Project (http://portal.ihmpdcc.org/). Query results can be statistically analyzed and graphically visualized via interactive web applications launched directly in the browser, providing insight into microbial community diversity and allowing users to identify taxa associated with any experimental covariate.


Subject(s)
Data Mining/methods , Databases, Genetic , Microbiota , Systems Biology , Animals , Computer Simulation , Datasets as Topic , Environmental Microbiology , Genetic Variation , Humans , Internet , Mobile Applications , User-Computer Interface , Workflow
3.
Nucleic Acids Res ; 45(D1): D581-D591, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27903906

ABSTRACT

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host-pathogen interactions.


Subject(s)
Databases, Genetic , Eukaryota , Genomics/methods , Host-Pathogen Interactions/genetics , Metagenome , Metagenomics/methods , Software , Computational Biology/methods , DNA Copy Number Variations , Gene Expression Profiling , Proteomics , Web Browser
4.
Proteomics ; 15(15): 2618-28, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25867681

ABSTRACT

Proteomics data can supplement genome annotation efforts, for example being used to confirm gene models or correct gene annotation errors. Here, we present a large-scale proteogenomics study of two important apicomplexan pathogens: Toxoplasma gondii and Neospora caninum. We queried proteomics data against a panel of official and alternate gene models generated directly from RNA-Seq data, using several newly generated and some previously published MS datasets for this meta-analysis. We identified a total of 201 996 and 39 953 peptide-spectrum matches for T. gondii and N. caninum, respectively, at a 1% peptide FDR threshold. This equated to the identification of 30 494 distinct peptide sequences and 2921 proteins (matches to official gene models) for T. gondii, and 8911 peptides/1273 proteins for N. caninum following stringent protein-level thresholding. We have also identified 289 and 140 loci for T. gondii and N. caninum, respectively, which mapped to RNA-Seq-derived gene models used in our analysis and apparently absent from the official annotation (release 10 from EuPathDB) of these species. We present several examples in our study where the RNA-Seq evidence can help in correction of the current gene model and can help in discovery of potential new genes. The findings of this study have been integrated into EuPathDB. The data have been deposited to the ProteomeXchange with identifiers PXD000297 and PXD000298.


Subject(s)
Genomics/methods , Neospora/genetics , Neospora/metabolism , Proteomics/methods , Toxoplasma/genetics , Toxoplasma/metabolism , Amino Acid Sequence , Apicomplexa/genetics , Apicomplexa/metabolism , Databases, Genetic , Genes, Protozoan/genetics , Molecular Sequence Annotation/methods , Molecular Sequence Data , Peptides/genetics , Peptides/metabolism , Proteome/genetics , Proteome/metabolism , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Sequence Analysis, RNA/methods , Sequence Homology, Amino Acid , Tandem Mass Spectrometry/methods
5.
Nucleic Acids Res ; 41(Database issue): D684-91, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23175615

ABSTRACT

EuPathDB (http://eupathdb.org) resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.


Subject(s)
Databases, Genetic , Parasites/genetics , Animals , Genomics , Internet , Molecular Sequence Annotation , Phenotype , Piroplasmida/genetics , Polymorphism, Single Nucleotide , Proteomics , Quantitative Trait Loci , RNA Splice Sites , Sequence Analysis, RNA , Software
6.
Nucleic Acids Res ; 40(Database issue): D675-81, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22064857

ABSTRACT

FungiDB (http://FungiDB.org) is a functional genomic resource for pan-fungal genomes that was developed in partnership with the Eukaryotic Pathogen Bioinformatic resource center (http://EuPathDB.org). FungiDB uses the same infrastructure and user interface as EuPathDB, which allows for sophisticated and integrated searches to be performed using an intuitive graphical system. The current release of FungiDB contains genome sequence and annotation from 18 species spanning several fungal classes, including the Ascomycota classes, Eurotiomycetes, Sordariomycetes, Saccharomycetes and the Basidiomycota orders, Pucciniomycetes and Tremellomycetes, and the basal 'Zygomycete' lineage Mucormycotina. Additionally, FungiDB contains cell cycle microarray data, hyphal growth RNA-sequence data and yeast two hybrid interaction data. The underlying genomic sequence and annotation combined with functional data, additional data from the FungiDB standard analysis pipeline and the ability to leverage orthology provides a powerful resource for in silico experimentation.


Subject(s)
Databases, Genetic , Genome, Fungal , Genomics , Molecular Sequence Annotation , Software , Systems Integration
7.
Nucleic Acids Res ; 40(Database issue): D98-108, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22116062

ABSTRACT

GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies, and improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.


Subject(s)
Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Arthropods/genetics , Genome, Bacterial , Genome, Helminth , Genome, Protozoan , Internet , Vocabulary, Controlled
8.
Nucleic Acids Res ; 39(Database issue): D612-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20974635

ABSTRACT

AmoebaDB (http://AmoebaDB.org) and MicrosporidiaDB (http://MicrosporidiaDB.org) are new functional genomic databases serving the amoebozoa and microsporidia research communities, respectively. AmoebaDB contains the genomes of three Entamoeba species (E. dispar, E. invadens and E. histolytica) and microarray expression data for E. histolytica. MicrosporidiaDB contains the genomes of Encephalitozoon cuniculi, E. intestinalis and E. bieneusi. The databases belong to the National Institute of Allergy and Infectious Diseases (NIAID) funded EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center family of integrated databases and assume the same architectural and graphical design as other EuPathDB resources such as PlasmoDB and TriTrypDB. Importantly they utilize the graphical strategy builder that affords a database user the ability to ask complex multi-data-type questions with relative ease and versatility. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs, protein characteristics, phylogenetic relationships and functional data such as transcript (microarray and EST evidence) and protein expression data. Search strategies can be saved within a user's profile for future retrieval and may also be shared with other researchers using a unique strategy web address.


Subject(s)
Databases, Genetic , Encephalitozoon/genetics , Entamoeba/genetics , Genome, Fungal , Genome, Protozoan , Genomics
9.
Bioinformatics ; 27(18): 2518-28, 2011 Sep 15.
Article in English | MEDLINE | ID: mdl-21775302

ABSTRACT

MOTIVATION: A critical task in high-throughput sequencing is aligning millions of short reads to a reference genome. Alignment is especially complicated for RNA sequencing (RNA-Seq) because of RNA splicing. A number of RNA-Seq algorithms are available, and claim to align reads with high accuracy and efficiency while detecting splice junctions. RNA-Seq data are discrete in nature; therefore, with reasonable gene models and comparative metrics RNA-Seq data can be simulated to sufficient accuracy to enable meaningful benchmarking of alignment algorithms. The exercise to rigorously compare all viable published RNA-Seq algorithms has not been performed previously. RESULTS: We developed an RNA-Seq simulator that models the main impediments to RNA alignment, including alternative splicing, insertions, deletions, substitutions, sequencing errors and intron signal. We used this simulator to measure the accuracy and robustness of available algorithms at the base and junction levels. Additionally, we used reverse transcription-polymerase chain reaction (RT-PCR) and Sanger sequencing to validate the ability of the algorithms to detect novel transcript features such as novel exons and alternative splicing in RNA-Seq data from mouse retina. A pipeline based on BLAT was developed to explore the performance of established tools for this problem, and to compare it to the recently developed methods. This pipeline, the RNA-Seq Unified Mapper (RUM), performs comparably to the best current aligners and provides an advantageous combination of accuracy, speed and usability. AVAILABILITY: The RUM pipeline is distributed via the Amazon Cloud and for computing clusters using the Sun Grid Engine (http://cbil.upenn.edu/RUM). CONTACT: ggrant@pcbi.upenn.edu; epierce@mail.med.upenn.edu SUPPLEMENTARY INFORMATION: The RNA-Seq sequence reads described in the article are deposited at GEO, accession GSE26248.


Subject(s)
Sequence Analysis, RNA/methods , Algorithms , Animals , Base Sequence , Benchmarking , Cluster Analysis , Exons , Gene Library , Genome , High-Throughput Nucleotide Sequencing , Mice , Models, Genetic , Molecular Sequence Data , RNA/genetics , RNA Splicing , Sequence Alignment , Software
10.
Nucleic Acids Res ; 38(Database issue): D415-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19914931

ABSTRACT

EuPathDB (http://EuPathDB.org; formerly ApiDB) is an integrated database covering the eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera. The most recent release of EuPathDB includes updates and changes affecting data content, infrastructure and the user interface, improving data access and enhancing the user experience. EuPathDB currently supports more than 80 searches and the recently-implemented 'search strategy' system enables users to construct complex multi-step searches via a graphical interface. Search results are dynamically displayed as the strategy is constructed or modified, and can be downloaded, saved, revised, or shared with other database users.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Protozoan Infections/parasitology , Protozoan Proteins/genetics , Animals , Computational Biology/trends , Databases, Protein , Genome, Protozoan , Humans , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Protozoan Infections/genetics , Software
11.
Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Leishmania/genetics , Trypanosoma/genetics , Animals , Computational Biology/trends , Databases, Protein , Genome, Protozoan , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Protozoan Proteins/genetics , Software , User-Computer Interface
12.
Nucleic Acids Res ; 37(Database issue): D539-43, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18957442

ABSTRACT

PlasmoDB (http://PlasmoDB.org) is a functional genomic database for Plasmodium spp. that provides a resource for data analysis and visualization on a gene-by-gene or genome-wide scale. PlasmoDB belongs to a family of genomic resources that are housed under the EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center (BRC) umbrella. The latest release, PlasmoDB 5.5, contains numerous new data types from several broad categories: annotated genomes, evidence of transcription, proteomics evidence, protein function evidence, population biology and evolution. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop-down menus. Various results can then be combined with each other on the query history page. Search results can be downloaded with associated functional data, and registered users can store their query history for future retrieval or analysis.


Subject(s)
Databases, Genetic , Genome, Protozoan , Plasmodium/genetics , Animals , Genomics , Plasmodium/growth & development , Plasmodium/metabolism , Protozoan Proteins/genetics , Protozoan Proteins/physiology , Transcription, Genetic
13.
Nucleic Acids Res ; 37(Database issue): D526-30, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18824479

ABSTRACT

GiardiaDB (http://GiardiaDB.org) and TrichDB (http://TrichDB.org) house the genome databases for Giardia lamblia and Trichomonas vaginalis, respectively, and represent the latest additions to the EuPathDB (http://EuPathDB.org) family of functional genomic databases. GiardiaDB and TrichDB employ the same framework as other EuPathDB sites (CryptoDB, PlasmoDB and ToxoDB), supporting fully integrated and searchable databases. Genomic-scale data available via these resources may be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs and other protein characteristics. Functional queries may also be formulated, based on transcript and protein expression data from a variety of platforms. Phylogenetic relationships may also be interrogated. The ability to combine the results from independent queries, and to store queries and query results for future use facilitates complex, genome-wide mining of functional genomic data.


Subject(s)
Databases, Genetic , Giardia lamblia/genetics , Trichomonas vaginalis/genetics , Animals , Genome, Protozoan , Genomics , Software , Systems Integration
14.
Nucleic Acids Res ; 36(Database issue): D553-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18003657

ABSTRACT

ToxoDB (http://ToxoDB.org) is a genome and functional genomic database for the protozoan parasite Toxoplasma gondii. It incorporates the sequence and annotation of the T. gondii ME49 strain, as well as genome sequences for the GT1, VEG and RH (Chr Ia, Chr Ib) strains. Sequence information is integrated with various other genomic-scale data, including community annotation, ESTs, gene expression and proteomics data. ToxoDB has matured significantly since its initial release. Here we outline the numerous updates with respect to the data and increased functionality available on the website.


Subject(s)
Databases, Genetic , Genome, Protozoan , Toxoplasma/genetics , Animals , Gene Expression , Genomics , Internet , Proteomics , Protozoan Proteins/chemistry , Software , Systems Integration , Toxoplasma/metabolism
15.
Gates Open Res ; 3: 1661, 2019.
Article in English | MEDLINE | ID: mdl-32047873

ABSTRACT

The concept of open data has been gaining traction as a mechanism to increase data use, ensure that data are preserved over time, and accelerate discovery. While epidemiology data sets are increasingly deposited in databases and repositories, barriers to access still remain. ClinEpiDB was constructed as an open-access online resource for clinical and epidemiologic studies by leveraging the extensive web toolkit and infrastructure of the Eukaryotic Pathogen Database Resources (EuPathDB; a collection of databases covering 170+ eukaryotic pathogens, relevant related species, and select hosts) combined with a unified semantic web framework. Here we present an intuitive point-and-click website that allows users to visualize and subset data directly in the ClinEpiDB browser and immediately explore potential associations. Supporting study documentation aids contextualization, and data can be downloaded for advanced analyses. By facilitating access and interrogation of high-quality, large-scale data sets, ClinEpiDB aims to spur collaboration and discovery that improves global health.

16.
Nat Commun ; 7: 10147, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26738725

ABSTRACT

Toxoplasma gondii is among the most prevalent parasites worldwide, infecting many wild and domestic animals and causing zoonotic infections in humans. T. gondii differs substantially in its broad distribution from closely related parasites that typically have narrow, specialized host ranges. To elucidate the genetic basis for these differences, we compared the genomes of 62 globally distributed T. gondii isolates to several closely related coccidian parasites. Our findings reveal that tandem amplification and diversification of secretory pathogenesis determinants is the primary feature that distinguishes the closely related genomes of these biologically diverse parasites. We further show that the unusual population structure of T. gondii is characterized by clade-specific inheritance of large conserved haploblocks that are significantly enriched in tandemly clustered secretory pathogenesis determinants. The shared inheritance of these conserved haploblocks, which show a different ancestry than the genome as a whole, may thus influence transmission, host range and pathogenicity.


Subject(s)
Genome, Protozoan , Toxoplasma/genetics , Toxoplasma/pathogenicity , Conserved Sequence , DNA, Protozoan/genetics , Gene Expression Regulation/physiology , Phylogeny , Polymorphism, Single Nucleotide , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Synteny , Virulence
17.
Curr Protoc Bioinformatics ; Chapter 6: 6.12.1-6.12.19, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21901743

ABSTRACT

OrthoMCL is an algorithm for grouping proteins into ortholog groups based on their sequence similarity. OrthoMCL-DB is a public database that allows users to browse and view ortholog groups that were pre-computed using the OrthoMCL algorithm. Version 4 of this database contained 116,536 ortholog groups clustered from 1,270,853 proteins obtained from 88 eukaryotic genomes, 16 archaean genomes, and 34 bacterial genomes. Future versions of OrthoMCL-DB will include more proteomes as more genomes are sequenced. Here, we describe how you can group your proteins of interest into ortholog clusters using two different means provided by the OrthoMCL system. The OrthoMCL-DB Web site has a tool for uploading and grouping a set of protein sequences, typically representing a proteome. This method maps the uploaded proteins to existing groups in OrthoMCL-DB. Alternatively, if you have proteins from a set of genomes that need to be grouped, you can download, install, and run the stand-alone OrthoMCL software.


Subject(s)
Algorithms , Computational Biology/methods , Databases, Genetic , Proteins/classification , Proteome/classification , Software , Proteins/genetics , Proteome/genetics , Species Specificity
18.
Database (Oxford) ; 2011: bar027, 2011.
Article in English | MEDLINE | ID: mdl-21705364

ABSTRACT

Web sites associated with the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) have recently introduced a graphical user interface, the Strategies WDK, intended to make advanced searching and set and interval operations easy and accessible to all users. With a design guided by usability studies, the system helps motivate researchers to perform dynamic computational experiments and explore relationships across data sets. For example, PlasmoDB users seeking novel therapeutic targets may wish to locate putative enzymes that distinguish pathogens from their hosts, and that are expressed during appropriate developmental stages. When a researcher runs one of the approximately 100 searches available on the site, the search is presented as a first step in a strategy. The strategy is extended by running additional searches, which are combined with set operators (union, intersect or minus), or genomic interval operators (overlap, contains). A graphical display uses Venn diagrams to make the strategy's flow obvious. The interface facilitates interactive adjustment of the component searches with changes propagating forward through the strategy. Users may save their strategies, creating protocols that can be shared with colleagues. The strategy system has now been deployed on all EuPathDB databases, and successfully deployed by other projects. The Strategies WDK uses a configurable MVC architecture that is compatible with most genomics and biological warehouse databases, and is available for download at code.google.com/p/strategies-wdk. Database URL: www.eupathdb.org.


Subject(s)
Database Management Systems , Databases, Genetic , Genomics/methods , User-Computer Interface , Internet
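
The strategy mechanism described in the abstract of record 18 (a first search extended step by step, with each new search combined through a set operator or a genomic interval operator) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the names `Feature`, `combine_ids` and `combine_spans`, the gene IDs and the coordinates are invented for illustration and are not the Strategies WDK API or real PlasmoDB data.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Feature:
    """Hypothetical genomic feature with a half-open span [start, end)."""
    feature_id: str
    seq_id: str
    start: int
    end: int


def combine_ids(left: Set[str], right: Set[str], operator: str) -> Set[str]:
    """Combine two search results (sets of IDs) with a set operator."""
    if operator == "union":
        return left | right
    if operator == "intersect":
        return left & right
    if operator == "minus":
        return left - right
    raise ValueError(f"unknown set operator: {operator}")


def overlaps(a: Feature, b: Feature) -> bool:
    """True if the two spans share at least one position on the same sequence."""
    return a.seq_id == b.seq_id and a.start < b.end and b.start < a.end


def contains(a: Feature, b: Feature) -> bool:
    """True if span a fully encloses span b on the same sequence."""
    return a.seq_id == b.seq_id and a.start <= b.start and b.end <= a.end


def combine_spans(left: List[Feature], right: List[Feature],
                  operator: str) -> List[Feature]:
    """Keep features from `left` related to any feature in `right`."""
    tests = {"overlap": overlaps, "contains": contains}
    if operator not in tests:
        raise ValueError(f"unknown interval operator: {operator}")
    test = tests[operator]
    return [a for a in left if any(test(a, b) for b in right)]


# Hypothetical three-step strategy: putative enzymes, restricted to genes
# expressed at the stage of interest, minus genes with a host counterpart.
enzymes = {"gene_A", "gene_B", "gene_C"}
expressed = {"gene_B", "gene_C", "gene_D"}
host_like = {"gene_C"}

step2 = combine_ids(enzymes, expressed, "intersect")
candidates = combine_ids(step2, host_like, "minus")
print(sorted(candidates))  # ['gene_B']
```

Because each step returns an ordinary result set, changing an earlier search only requires re-running the later `combine_*` calls, which mirrors how the abstract describes changes propagating forward through a strategy.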
19.
PLoS One ; 3(10): e3563, 2008.
Article in English | MEDLINE | ID: mdl-18974779

ABSTRACT

Cryptochromes are blue light photoreceptors involved in development and circadian clock regulation. They are found in both eukaryotes and prokaryotes as light sensors. Long Hypocotyl in Far-Red 1 (HFR1) has been identified as a positive regulator and a possible transcription factor in both blue and far-red light signaling in plants. However, the gene targets that are regulated by HFR1 in cryptochrome 1 (cry1)-mediated blue light signaling have not been globally addressed. We examined the transcriptome profiles in a cry1- and HFR1-dependent manner in response to 1 hour of blue light. Strikingly, more than 70% of the genes induced by blue light in an HFR1-dependent manner were dependent on cry1, and vice versa. High overrepresentation of W-boxes and OCS elements was found in these genes, indicating that this strong cry1 and HFR1 co-regulation of gene expression is possibly through these two cis-elements. We also found that cry1 was required for maintaining the HFR1 protein level in blue light, and that the HFR1 protein level is strongly correlated with the global gene expression pattern. In summary, HFR1, which is fine-tuned by cry1, is crucial for regulating global gene expression in cry1-mediated early blue light signaling, especially for the function of genes containing W-boxes and OCS elements.


Subject(s)
Acclimatization/genetics , Arabidopsis Proteins/physiology , Arabidopsis/genetics , DNA-Binding Proteins/physiology , Flavoproteins/genetics , Gene Expression Regulation, Plant , Light , Nuclear Proteins/physiology , Arabidopsis/growth & development , Arabidopsis/physiology , Base Sequence , Color , Cryptochromes , DNA Transposable Elements/physiology , Flavoproteins/physiology , Gene Expression Profiling , Genes, Plant , Light Signal Transduction/genetics , Molecular Sequence Data , Mutagenesis, Insertional , NADPH-Ferrihemoprotein Reductase/genetics , Plants, Genetically Modified , Seedlings/genetics , Seedlings/physiology
20.
Genome Biol ; 9(7): R116, 2008.
Article in English | MEDLINE | ID: mdl-18644147

ABSTRACT

BACKGROUND: Although the genomes of many of the most important human and animal pathogens have now been sequenced, our understanding of the actual proteins expressed by these genomes and how well they predict protein sequence and expression is still deficient. We have used three complementary approaches (two-dimensional electrophoresis, gel-liquid chromatography linked tandem mass spectrometry and MudPIT) to analyze the proteome of Toxoplasma gondii, a parasite of medical and veterinary significance, and have developed a public repository for these data within ToxoDB, making for the first time proteomics data an integral part of this key genome resource. RESULTS: The draft genome for Toxoplasma predicts around 8,000 genes with varying degrees of confidence. Our data demonstrate how proteomics can inform these predictions and help discover new genes. We have identified nearly one-third (2,252) of all the predicted proteins, with 2,477 intron-spanning peptides providing supporting evidence for correct splice site annotation. Functional predictions for each protein and key pathways were determined from the proteome. Importantly, we show evidence for many proteins that match alternative gene models, or previously unpredicted genes. For example, approximately 15% of peptides matched more convincingly to alternative gene models. We also compared our data with existing transcriptional data in which we highlight apparent discrepancies between gene transcription and protein expression. CONCLUSION: Our data demonstrate the importance of protein data in expression profiling experiments and highlight the necessity of integrating proteomic with genomic data so that iterative refinements of both annotation and expression models are possible.


Subject(s)
Genome, Protozoan , Proteome/genetics , Protozoan Proteins/genetics , Toxoplasma/genetics , Animals , Chromatography, Liquid , Databases, Protein , Electrophoresis, Gel, Two-Dimensional , Expressed Sequence Tags/metabolism , Gene Expression , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Proteome/metabolism , Proteomics , Protozoan Proteins/analysis , Protozoan Proteins/metabolism , Tandem Mass Spectrometry , Toxoplasma/growth & development , Toxoplasma/metabolism