Results 1 - 16 of 16
1.
G3 (Bethesda) ; 3(3): 517-25, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23450226

ABSTRACT

Read mapping is a fundamental part of next-generation genomic research but is complicated by genome duplication in many plants. Categorizing DNA sequence reads into their respective genomes enables current methods to analyze polyploid genomes as if they were diploid. We present PolyCat, a pipeline for mapping and categorizing all types of next-generation sequence data produced from allopolyploid organisms. PolyCat uses GSNAP's single-nucleotide polymorphism (SNP)-tolerant mapping to minimize the mapping efficiency bias caused by SNPs between genomes. PolyCat then uses SNPs between genomes to categorize reads according to their respective genomes. Bisulfite-treated reads have a significant reduction in nucleotide complexity because nucleotide conversion events are confounded with transition substitutions. PolyCat includes special provisions to properly handle bisulfite-treated data. We demonstrate the functionality of PolyCat on allotetraploid cotton, Gossypium hirsutum, and create a functional SNP index for efficiently mapping sequence reads to the D-genome sequence of G. raimondii. PolyCat is appropriate for all allopolyploids and all types of next-generation genome analysis, including differential expression (RNA sequencing), differential methylation (bisulfite sequencing), differential DNA-protein binding (chromatin immunoprecipitation sequencing), and population diversity.


Subject(s)
Chromosome Mapping/methods, Plant DNA/analysis, Plant Genome, Polyploidy, Plant RNA/analysis, Software, Alleles, Base Sequence, Plant DNA/genetics, Diploidy, Gossypium/genetics, Phylogeny, Single Nucleotide Polymorphism, Plant RNA/genetics, Reproducibility of Results
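
The read-categorization idea in the PolyCat abstract (using SNPs between genomes to assign each mapped read to a genome) can be illustrated with a minimal Python sketch. This is not PolyCat's code. The SNP index layout, the function name `categorize_read`, and the result labels are assumptions made for illustration. Reads are assumed to be aligned without gaps, and the bisulfite handling is simplified to the forward strand, where an unconverted C versus a converted T cannot be told apart from a C/T SNP.

```python
from collections import Counter

# Hypothetical SNP index: reference position -> (A-genome allele, D-genome allele).
SNP_INDEX = {
    105: ("A", "G"),
    230: ("C", "T"),
}

def categorize_read(start, seq, snp_index, bisulfite=False):
    """Assign a gap-free read aligned at `start` to 'A', 'D',
    'conflict' (alleles from both genomes) or 'unassigned' (no informative SNP)."""
    votes = Counter()
    for pos, (a_allele, d_allele) in snp_index.items():
        offset = pos - start
        if not 0 <= offset < len(seq):
            continue  # this SNP is not covered by the read
        # Simplifying assumption: for bisulfite-treated reads, a C/T SNP is
        # indistinguishable from a C->T conversion, so it is not used as evidence.
        if bisulfite and {a_allele, d_allele} == {"C", "T"}:
            continue
        base = seq[offset]
        if base == a_allele:
            votes["A"] += 1
        elif base == d_allele:
            votes["D"] += 1
    if votes["A"] and votes["D"]:
        return "conflict"
    if votes["A"]:
        return "A"
    if votes["D"]:
        return "D"
    return "unassigned"
```

For example, `categorize_read(225, "AAAAATAAAA", SNP_INDEX)` finds the D-genome allele at position 230, while the same call with `bisulfite=True` ignores that C/T site and leaves the read unassigned.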
2.
Nucleic Acids Res ; 41(Database issue): D684-91, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23175615

ABSTRACT

EuPathDB (http://eupathdb.org) resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.


Subject(s)
Genetic Databases, Parasites/genetics, Animals, Genomics, Internet, Molecular Sequence Annotation, Phenotype, Piroplasmida/genetics, Single Nucleotide Polymorphism, Proteomics, Quantitative Trait Loci, RNA Splice Sites, RNA Sequence Analysis, Software
3.
Nature ; 492(7429): 423-7, 2012 Dec 20.
Article in English | MEDLINE | ID: mdl-23257886

ABSTRACT

Polyploidy often confers emergent properties, such as the higher fibre productivity and quality of tetraploid cottons than diploid cottons bred for the same environments. Here we show that an abrupt five- to sixfold ploidy increase approximately 60 million years (Myr) ago, and allopolyploidy reuniting divergent Gossypium genomes approximately 1-2 Myr ago, conferred about 30-36-fold duplication of ancestral angiosperm (flowering plant) genes in elite cottons (Gossypium hirsutum and Gossypium barbadense), genetic complexity equalled only by Brassica among sequenced angiosperms. Nascent fibre evolution, before allopolyploidy, is elucidated by comparison of spinnable-fibred Gossypium herbaceum A and non-spinnable Gossypium longicalyx F genomes to one another and the outgroup D genome of non-spinnable Gossypium raimondii. The sequence of a G. hirsutum A(t)D(t) (in which 't' indicates tetraploid) cultivar reveals many non-reciprocal DNA exchanges between subgenomes that may have contributed to phenotypic innovation and/or other emergent properties such as ecological adaptation by polyploids. Most DNA-level novelty in G. hirsutum recombines alleles from the D-genome progenitor native to its New World habitat and the Old World A-genome progenitor in which spinnable fibre evolved. Coordinated expression changes in proximal groups of functionally distinct genes, including a nuclear mitochondrial DNA block, may account for clusters of cotton-fibre quantitative trait loci affecting diverse traits. Opportunities abound for dissecting emergent properties of other polyploids, particularly angiosperms, by comparison to diploid progenitors and outgroups.


Subject(s)
Biological Evolution, Cotton Fiber, Plant Genome/genetics, Gossypium/genetics, Polyploidy, Alleles, Cacao/genetics, Plant Chromosomes/genetics, Diploidy, Gene Duplication/genetics, Plant Genes/genetics, Gossypium/classification, Molecular Sequence Annotation, Phylogeny, Vitis/genetics
4.
Nucleic Acids Res ; 39(Database issue): D612-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20974635

ABSTRACT

AmoebaDB (http://AmoebaDB.org) and MicrosporidiaDB (http://MicrosporidiaDB.org) are new functional genomic databases serving the amoebozoa and microsporidia research communities, respectively. AmoebaDB contains the genomes of three Entamoeba species (E. dispar, E. invadens and E. histolytica) and microarray expression data for E. histolytica. MicrosporidiaDB contains the genomes of Encephalitozoon cuniculi, E. intestinalis and Enterocytozoon bieneusi. The databases belong to the National Institute of Allergy and Infectious Diseases (NIAID)-funded EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center family of integrated databases and assume the same architectural and graphical design as other EuPathDB resources such as PlasmoDB and TriTrypDB. Importantly, they utilize the graphical strategy builder that affords a database user the ability to ask complex multi-data-type questions with relative ease and versatility. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs, protein characteristics, phylogenetic relationships and functional data such as transcript (microarray and EST evidence) and protein expression data. Search strategies can be saved within a user's profile for future retrieval and may also be shared with other researchers using a unique strategy web address.


Subject(s)
Genetic Databases, Encephalitozoon/genetics, Entamoeba/genetics, Fungal Genome, Protozoan Genome, Genomics
5.
Nucleic Acids Res ; 38(Database issue): D415-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19914931

ABSTRACT

EuPathDB (http://EuPathDB.org; formerly ApiDB) is an integrated database covering the eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera. The most recent release of EuPathDB includes updates and changes affecting data content, infrastructure and the user interface, improving data access and enhancing the user experience. EuPathDB currently supports more than 80 searches and the recently-implemented 'search strategy' system enables users to construct complex multi-step searches via a graphical interface. Search results are dynamically displayed as the strategy is constructed or modified, and can be downloaded, saved, revised, or shared with other database users.


Subject(s)
Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Protozoan Infections/parasitology, Protozoan Proteins/genetics, Animals, Computational Biology/trends, Protein Databases, Protozoan Genome, Humans, Information Storage and Retrieval/methods, Internet, Tertiary Protein Structure, Protozoan Infections/genetics, Software
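
The multi-step "search strategy" described in the EuPathDB abstract can be thought of as a chain of set operations over the gene lists returned by individual searches. The Python sketch below shows that idea only. It does not use or reproduce the EuPathDB software, and the gene IDs, search names, operation names and the `run_strategy` helper are all hypothetical.

```python
# Hypothetical result sets from three independent searches (gene IDs are made up).
has_signal_peptide = {"gene_001", "gene_002", "gene_005", "gene_008"}
expressed_in_stage_x = {"gene_002", "gene_003", "gene_005"}
has_human_ortholog = {"gene_005", "gene_009"}

# Assumed ways of combining the running result with the next search result.
OPERATIONS = {
    "intersect": lambda a, b: a & b,
    "union": lambda a, b: a | b,
    "minus": lambda a, b: a - b,
}

def run_strategy(first_step, later_steps):
    """Apply (operation, result_set) steps in order and return the
    intermediate result after every step, first step included."""
    results = [set(first_step)]
    for operation, other in later_steps:
        results.append(OPERATIONS[operation](results[-1], other))
    return results

steps = run_strategy(
    has_signal_peptide,
    [("intersect", expressed_in_stage_x), ("minus", has_human_ortholog)],
)
```

Keeping every intermediate result mirrors the abstract's point that results are displayed as the strategy is built or modified. Here the final step leaves only `gene_002`.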
6.
Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.


Subject(s)
Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Leishmania/genetics, Trypanosoma/genetics, Animals, Computational Biology/trends, Protein Databases, Protozoan Genome, Information Storage and Retrieval/methods, Internet, Tertiary Protein Structure, Protozoan Proteins/genetics, Software, User-Computer Interface
7.
Nature ; 457(7229): 551-6, 2009 Jan 29.
Article in English | MEDLINE | ID: mdl-19189423

ABSTRACT

Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.


Subject(s)
Molecular Evolution, Plant Genome/genetics, Poaceae/genetics, Sorghum/genetics, Arabidopsis/genetics, Plant Chromosomes/genetics, Gene Duplication, Plant Genes, Oryza/genetics, Populus/genetics, Genetic Recombination/genetics, Sequence Alignment, DNA Sequence Analysis, Sequence Deletion/genetics, Zea mays/genetics
8.
Nucleic Acids Res ; 37(Database issue): D526-30, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18824479

ABSTRACT

GiardiaDB (http://GiardiaDB.org) and TrichDB (http://TrichDB.org) house the genome databases for Giardia lamblia and Trichomonas vaginalis, respectively, and represent the latest additions to the EuPathDB (http://EuPathDB.org) family of functional genomic databases. GiardiaDB and TrichDB employ the same framework as other EuPathDB sites (CryptoDB, PlasmoDB and ToxoDB), supporting fully integrated and searchable databases. Genomic-scale data available via these resources may be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs and other protein characteristics. Functional queries may also be formulated, based on transcript and protein expression data from a variety of platforms. Phylogenetic relationships may also be interrogated. The ability to combine the results from independent queries, and to store queries and query results for future use facilitates complex, genome-wide mining of functional genomic data.


Subject(s)
Genetic Databases, Giardia lamblia/genetics, Trichomonas vaginalis/genetics, Animals, Protozoan Genome, Genomics, Software, Systems Integration
9.
Nucleic Acids Res ; 37(Database issue): D539-43, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18957442

ABSTRACT

PlasmoDB (http://PlasmoDB.org) is a functional genomic database for Plasmodium spp. that provides a resource for data analysis and visualization on a gene-by-gene or genome-wide scale. PlasmoDB belongs to a family of genomic resources that are housed under the EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center (BRC) umbrella. The latest release, PlasmoDB 5.5, contains numerous new data types from several broad categories: annotated genomes, evidence of transcription, proteomics evidence, protein function evidence, population biology and evolution. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop-down menus. Various results can then be combined with each other on the query history page. Search results can be downloaded with associated functional data and registered users can store their query history for future retrieval or analysis.


Subject(s)
Genetic Databases, Protozoan Genome, Plasmodium/genetics, Animals, Genomics, Plasmodium/growth & development, Plasmodium/metabolism, Protozoan Proteins/genetics, Protozoan Proteins/physiology, Genetic Transcription
10.
Theor Appl Genet ; 117(7): 1021-9, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18633591

ABSTRACT

Simple sequence repeats (SSRs) are abundant and frequently highly polymorphic in transcribed sequences and widely targeted for marker development in eukaryotes. Sunflower (Helianthus annuus) transcript assemblies were built and mined to identify SSRs and insertions-deletions (INDELs) for marker development, comparative mapping, and other genomics applications in sunflower. We describe the spectrum and frequency of SSRs identified in the sunflower EST database, a catalog of 16,643 EST-SSRs, a collection of 484 EST-SSR and 43 EST-INDEL markers developed from common sunflower ESTs, polymorphisms of the markers among the parents of several intraspecific and interspecific mapping populations, and the transferability of the markers to closely and distantly related species in the Compositae. Of 17,904 unigenes in the transcript assembly, 1,956 (10.9%) harbored one or more SSRs with repeat counts of n ≥ 5. EST-SSR markers were 1.6-fold more polymorphic among exotic than elite genotypes and 0.7-fold less polymorphic than non-genic SSR markers. Of 466 EST-SSR or INDEL markers screened for cross-species amplification and polymorphisms, 413 (88.6%) amplified alleles from one or more wild species (H. argophyllus, H. tuberosus, H. anomalus, H. paradoxus, and H. deserticola), whereas 69 (14.8%) amplified alleles from safflower (Carthamus tinctorius) and 67 (14.4%) amplified alleles from lettuce (Lactuca sativa); hence, only a fraction were transferable to distantly related genera in the Compositae, whereas most were transferable to wild relatives of H. annuus. Several thousand additional SSRs were identified in the EST database and supply a wealth of templates for EST-SSR marker development in sunflower.


Subject(s)
Expressed Sequence Tags, Helianthus/genetics, INDEL Mutation, Minisatellite Repeats, Genetic Polymorphism, Asteraceae/classification, Computational Biology, Genetic Databases, Genetic Markers, Species Specificity
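
The sunflower abstract's criterion of SSRs with a repeat count of at least five can be illustrated with a short regular-expression scan. The Python sketch below is not the authors' mining pipeline. The function name `find_ssrs`, the motif length range of 2 to 6 bases, and the restriction to perfect (uninterrupted) repeats are assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The pattern captures a short motif (shortest candidate tried first)
    and requires at least `min_repeats - 1` further copies right after it.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs reported as e.g. "AA" repeats
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

Run on `"GGC" + "AT" * 6 + "CCT" + "CAG" * 5 + "T"`, the scan reports a dinucleotide repeat `(3, "AT", 6)` and a trinucleotide repeat `(18, "CAG", 5)`. A run of only four `AT` copies falls below the threshold and is not reported.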
12.
BMC Genomics ; 8: 81, 2007 Mar 27.
Article in English | MEDLINE | ID: mdl-17389046

ABSTRACT

BACKGROUND: Microarrays offer a powerful tool for diverse applications in plant biology and crop improvement. Recently, two comprehensive assemblies of cotton ESTs were constructed based on three Gossypium species. Using these assemblies as templates, we describe the design and creation of a publicly available oligonucleotide array for cotton, useful for all four of the cultivated species. RESULTS: Synthetic oligonucleotide probes were generated from exemplar sequences of a global assembly of 211,397 cotton ESTs derived from >50 different cDNA libraries representing many different tissue types and tissue treatments. A total of 22,787 oligonucleotide probes are included on the arrays, optimized to target the diversity of the transcriptome and previously studied cotton genes, transcription factors, and genes with homology to Arabidopsis. A small portion of the oligonucleotides target unidentified protein coding sequences, thereby providing an element of gene discovery. Because many oligonucleotides were based on ESTs from fiber-specific cDNA libraries, the microarray has direct application for analysis of the fiber transcriptome. To illustrate the utility of the microarray, we hybridized labeled bud and leaf cDNAs from G. hirsutum and demonstrate technical consistency of results. CONCLUSION: The cotton oligonucleotide microarray provides a reproducible platform for transcription profiling in cotton, and is made publicly available through http://cottonevolution.info.


Subject(s)
Gene Expression Profiling, Gossypium/genetics, Oligonucleotide Array Sequence Analysis, Expressed Sequence Tags, Plant Genes, Nucleic Acid Hybridization
13.
Bioinformatics ; 21(23): 4307-8, 2005 Dec 01.
Article in English | MEDLINE | ID: mdl-16204347

ABSTRACT

SUMMARY: OxfordGrid is a web application and database schema for storing and interactively displaying genetic map data in a comparative dot-plot fashion. Its display is composed of a matrix of cells, each representing a pairwise comparison of mapped probe data for two linkage groups or chromosomes. These are arranged along the axes, with one forming grid columns and the other grid rows, and the degree and pattern of synteny/colinearity between the two linkage groups is manifested in the cell's dot density and structure. A mouse click over the selected grid cell launches an image map-based display for the selected cell. Both individual and linear groups of mapped probes can be selected and displayed. Also, configurable links can be used to access other web resources for mapped probe information. AVAILABILITY: OxfordGrid is implemented in C#/ASP.NET and the package, including MySQL schema creation scripts, is available at ftp://cggc.agtec.uga.edu/OxfordGrid/.


Subject(s)
Chromosome Mapping/methods, Computational Biology/methods, Algorithms, Statistical Data Interpretation, Database Management Systems, Genetic Databases, Genetic Linkage, Information Storage and Retrieval, Internet, Software, User-Computer Interface
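
The grid layout described in the OxfordGrid abstract (one cell per pair of linkage groups, with dots for probes mapped in both) can be sketched as a small data structure. OxfordGrid itself is a C#/ASP.NET application, so the Python below is only an illustration of the idea. The map format and the function names `build_grid` and `cell_density` are hypothetical.

```python
from collections import defaultdict

def build_grid(map_x, map_y):
    """Group probes mapped in both maps into grid cells.

    Each map is a dict: probe name -> (linkage group, position).
    A cell is keyed by (group in map_x, group in map_y) and holds one
    (position_x, position_y, probe) dot per shared probe.
    """
    grid = defaultdict(list)
    for probe, (group_x, pos_x) in map_x.items():
        if probe in map_y:
            group_y, pos_y = map_y[probe]
            grid[(group_x, group_y)].append((pos_x, pos_y, probe))
    return dict(grid)

def cell_density(grid):
    """Number of dots per cell, a rough proxy for shared synteny."""
    return {cell: len(dots) for cell, dots in grid.items()}
```

Cells with many dots falling along a diagonal would suggest colinear segments between the two linkage groups. Empty cells are simply absent from the returned dictionary.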
14.
Plant Physiol ; 139(2): 869-84, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16169961

ABSTRACT

Improved knowledge of the sorghum transcriptome will enhance basic understanding of how plants respond to stresses and serve as a source of genes of value to agriculture. Toward this goal, Sorghum bicolor L. Moench cDNA libraries were prepared from light- and dark-grown seedlings, drought-stressed plants, Colletotrichum-infected seedlings and plants, ovaries, embryos, and immature panicles. Other libraries were prepared with meristems from Sorghum propinquum (Kunth) Hitchc. that had been photoperiodically induced to flower, and with rhizomes from S. propinquum and johnsongrass (Sorghum halepense L. Pers.). A total of 117,682 expressed sequence tags (ESTs) were obtained representing both 3' and 5' sequences from about half that number of cDNA clones. A total of 16,801 unique transcripts, representing tentative UniScripts (TUs), were identified from 55,783 3' ESTs. Of these TUs, 9,032 are represented by two or more ESTs. Collectively, these libraries were predicted to contain a total of approximately 31,000 TUs. Individual libraries, however, were predicted to contain no more than about 6,000 to 9,000, with the exception of light-grown seedlings, which yielded an estimate of close to 13,000. In addition, each library exhibits about the same level of complexity with respect to both the number of TUs preferentially expressed in that library and the frequency with which two or more ESTs are found in only that library. These results indicate that the sorghum genome is expressed in a highly selective fashion in the individual organs and in response to the environmental conditions surveyed here. Close to 2,000 differentially expressed TUs were identified among the cDNA libraries examined, of which 775 were differentially expressed at a confidence level of 98%. From these 775 TUs, signature genes were identified defining drought, Colletotrichum infection, skotomorphogenesis (etiolation), ovary, immature panicle, and embryo.


Subject(s)
Plant Genes, Sorghum/genetics, Complementary DNA/genetics, Plant DNA/genetics, Expressed Sequence Tags, Plant Diseases/genetics, Sorghum/growth & development, Genetic Transcription
15.
Bioinformatics ; 21(9): 2126-7, 2005 May 01.
Article in English | MEDLINE | ID: mdl-15657101

ABSTRACT

SUMMARY: IntegratedMap is a Web application and database schema for storing and interactively displaying genetic map data. Its Web interface includes a menu for direct chromosome/linkage group selection, a search form for selection based on mapped object location and linkage group displays. An overview display provides convenient access to the full range of mapped and anchored object types with genetic locus details, such as numbers, types and names of mapped/anchored objects displayed in a compact scrollable list box that automatically updates based on selected map location and object type. Also, multilinkage group and localized map views are available along with links that can be configured for integration with other Web resources. AVAILABILITY: IntegratedMap is implemented in C#/ASP.NET and the package, including a MySQL schema creation script, is available from http://cggc.agtec.uga.edu/Data/download.asp


Subject(s)
Chromosome Mapping/methods, Database Management Systems, Genetic Databases, Internet, Linkage Disequilibrium/genetics, Software, User-Computer Interface, Algorithms, Computer Graphics, DNA Mutational Analysis/methods, Gene Frequency, Population Genetics/methods, Information Storage and Retrieval/methods, Oligonucleotide Array Sequence Analysis/methods, Single Nucleotide Polymorphism/genetics, Systems Integration
16.
Bioinformatics ; 21(5): 669-70, 2005 Mar 01.
Article in English | MEDLINE | ID: mdl-15374864

ABSTRACT

ESTminer is a Web application and database schema for interactive mining of expressed sequence tag (EST) contig and cluster datasets. The Web interface contains a query frame that allows the selection of contigs/clusters with specific cDNA library makeup or a threshold number of members. The results are displayed as color-coded tree nodes, where the color indicates the fractional size of each cDNA library component. The nodes are expandable, revealing library statistics as well as EST or contig members, with links to sequence data, GenBank records or user configurable links. Also, the interface allows 'queries within queries' where the result set of a query is further filtered by the subsequent query. AVAILABILITY: ESTminer is implemented in Java/JSP and the package, including MySQL and Oracle schema creation scripts, is available from http://cggc.agtec.uga.edu/Data/download.asp CONTACT: agingle@uga.edu.


Subject(s)
Contig Mapping/methods, Nucleic Acid Databases, Expressed Sequence Tags, Information Storage and Retrieval/methods, Internet, DNA Sequence Analysis/methods, User-Computer Interface, Algorithms, Cluster Analysis, Database Management Systems, Automated Pattern Recognition/methods, Sequence Alignment/methods, Software
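
The selection criteria named in the ESTminer abstract (a threshold number of members, a given cDNA library makeup, and "queries within queries") can be illustrated with a small filter function. ESTminer itself is a Java/JSP application backed by a database, so the Python below is only a sketch of the filtering logic. The record layout, library names and the functions `library_fractions` and `query` are hypothetical.

```python
# Hypothetical contig records: contig name -> number of member ESTs per cDNA library.
CONTIGS = {
    "contig_1": {"leaf": 6, "root": 2},
    "contig_2": {"leaf": 1, "root": 1, "ovary": 2},
    "contig_3": {"root": 9},
}

def library_fractions(members_by_library):
    """Fraction of a contig's ESTs contributed by each library."""
    total = sum(members_by_library.values())
    return {lib: n / total for lib, n in members_by_library.items()}

def query(contigs, min_members=1, library=None, min_fraction=0.0):
    """Keep contigs with at least `min_members` ESTs and, if `library`
    is given, at least `min_fraction` of their ESTs from that library."""
    selected = {}
    for name, libs in contigs.items():
        if sum(libs.values()) < min_members:
            continue
        if library is not None and library_fractions(libs).get(library, 0.0) < min_fraction:
            continue
        selected[name] = libs
    return selected

# A "query within a query": the second filter runs on the first result set.
large_contigs = query(CONTIGS, min_members=5)
root_rich = query(large_contigs, library="root", min_fraction=0.5)
```

Because `query` returns the same record layout it accepts, any result set can be passed back in as the input of the next filter. The fractions from `library_fractions` are the kind of value a color-coded display could be driven by.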