Results 1 - 20 of 26
1.
Bioinformatics ; 37(7): 1037-1038, 2021 05 17.
Article in English | MEDLINE | ID: mdl-32735312

ABSTRACT

SUMMARY: Currently, gene information available for Oryza sativa species is located in various online heterogeneous data sources. Moreover, methods of access are also diverse, mostly web-based and sometimes query APIs, which might not always be straightforward for domain experts. The challenge is to collect information quickly from these applications and combine it logically, to facilitate scientific research. We developed a Python package named PyRice, a unified programming API to access all supported databases at the same time with consistent output. The PyRice design is modular and implements a smart query system, which fits the computing resources to optimize the query speed. As a result, PyRice is easy to use and produces intuitive results. AVAILABILITY AND IMPLEMENTATION: https://github.com/SouthGreenPlatform/PyRice. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Oryza , Software , Databases, Factual , Information Storage and Retrieval , Oryza/genetics
2.
Brief Bioinform ; 20(2): 565-571, 2019 03 25.
Article in English | MEDLINE | ID: mdl-29659709

ABSTRACT

Improving productivity of the staple crops wheat and rice is essential to feed the growing global population, particularly in the context of a changing climate. However, current rates of yield gain are insufficient to support the predicted population growth. New approaches are required to accelerate the breeding process, and many of these are driven by the application of large-scale crop data. To leverage the substantial volumes and types of data that can be applied for precision breeding, the wheat and rice research communities are working towards the development of integrated systems to access and standardize the dispersed, heterogeneous available data. Here, we outline the initiatives of the International Wheat Information System (WheatIS) and the International Rice Informatics Consortium (IRIC) to establish Web-based single-access systems and data mining tools to make the available resources more accessible, drive discovery and accelerate the production of new crop varieties. We discuss the progress of WheatIS and IRIC towards unifying specialized wheat and rice databases and building custom software platforms to manage and interrogate these data. Single-access crop information systems will strengthen scientific collaboration, optimize the use of public research funds and help achieve the required yield gains in the two most important global food crops.


Subject(s)
Crops, Agricultural/growth & development , Information Systems , Oryza/growth & development , Triticum/growth & development
3.
Bioinformatics ; 35(20): 4147-4155, 2019 10 15.
Article in English | MEDLINE | ID: mdl-30903186

ABSTRACT

MOTIVATION: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. RESULTS: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written, that take advantage of the emerging support of BrAPI by many databases. AVAILABILITY AND IMPLEMENTATION: More information on BrAPI, including links to the specification, test suites, BrAPPs, and sample implementations is available at https://brapi.org/. The BrAPI specification and the developer tools are provided as free and open source.
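As an illustration of the "retrieving basic breeding data" use case described above, the following is a minimal sketch of how a client might address a BrAPI germplasm listing and read the JSON envelope it returns. The server root `https://example.org/brapi/v2` and the two sample records are made up for the example, and no network call is made; a real deployment publishes its own base URL and may expose a different version of the specification.

```python
import json
from urllib.parse import urlencode

# Hypothetical server root; real BrAPI servers publish their own base URL.
BASE_URL = "https://example.org/brapi/v2"


def germplasm_url(base_url, page=0, page_size=2):
    """Build a paginated URL for a BrAPI germplasm listing."""
    query = urlencode({"page": page, "pageSize": page_size})
    return f"{base_url}/germplasm?{query}"


def germplasm_names(response_text):
    """Extract germplasm names from a BrAPI-style JSON envelope."""
    payload = json.loads(response_text)
    return [item["germplasmName"] for item in payload["result"]["data"]]


# Canned response standing in for a live server (records are made up).
SAMPLE_RESPONSE = json.dumps({
    "metadata": {"pagination": {"currentPage": 0, "pageSize": 2,
                                "totalCount": 2, "totalPages": 1}},
    "result": {"data": [
        {"germplasmDbId": "g1", "germplasmName": "Nipponbare"},
        {"germplasmDbId": "g2", "germplasmName": "IR64"},
    ]},
})

print(germplasm_url(BASE_URL))
print(germplasm_names(SAMPLE_RESPONSE))
```

Because every compliant server wraps its payload in the same `metadata`/`result` envelope, the parsing function does not need to know which database produced the response.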


Subject(s)
Plant Breeding , Software , User-Computer Interface , Genomics
4.
Gigascience ; 13, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38832465

ABSTRACT

BACKGROUND: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to hundreds to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources. RESULTS: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs. CONCLUSIONS: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.


Subject(s)
Data Mining , Genome-Wide Association Study , Oryza , Quantitative Trait Loci , Oryza/genetics , Software , Epigenomics/methods , Computational Biology/methods , Polymorphism, Single Nucleotide , Genomics/methods , Genome, Plant , Chromosome Mapping , Databases, Genetic
5.
BMC Bioinformatics ; 14: 126, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23586394

ABSTRACT

BACKGROUND: In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. RESULTS: We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. CONCLUSIONS: BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.
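The idea of generating SPARQL queries automatically from an annotated view can be sketched as follows. This is not BioSemantic's code: the function `build_select`, the `example.org` vocabulary, and the column-to-predicate mapping are all hypothetical, chosen only to show how a mapping from relational columns to ontology predicates is enough to emit a query mechanically.

```python
def build_select(class_uri, columns):
    """Generate a SPARQL SELECT from a {variable: predicate URI} mapping."""
    variables = " ".join(f"?{name}" for name in columns)
    # One triple pattern for the row's class, then one per mapped column.
    patterns = [f"?s a <{class_uri}> ."]
    patterns += [f"?s <{pred}> ?{name} ." for name, pred in columns.items()]
    body = "\n  ".join(patterns)
    return f"SELECT {variables}\nWHERE {{\n  {body}\n}}"


# Hypothetical annotation of a relational "gene" table (URIs are made up).
GENE_CLASS = "http://example.org/vocab#Gene"
GENE_COLUMNS = {
    "name": "http://example.org/vocab#name",
    "chromosome": "http://example.org/vocab#chromosome",
}

query = build_select(GENE_CLASS, GENE_COLUMNS)
print(query)
```

Once the annotation step has produced such a mapping, adding a column to the view changes the generated query without any hand-written SPARQL.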


Subject(s)
Databases, Genetic , Software , Algorithms , Genome, Plant , Internet , Oryza/genetics , Semantics , Systems Integration , Vocabulary, Controlled
6.
BMC Plant Biol ; 13: 122, 2013 Aug 29.
Article in English | MEDLINE | ID: mdl-23987653

ABSTRACT

BACKGROUND: In crops, inflorescence complexity and the shape and size of the seed are among the most important characters that influence yield. For example, rice panicles vary considerably in the number and order of branches, elongation of the axis, and the shape and size of the seed. Manual low-throughput phenotyping methods are time-consuming, and the results are unreliable. However, high-throughput image analysis of the qualitative and quantitative traits of rice panicles is essential for understanding the diversity of the panicle as well as for breeding programs. RESULTS: This paper presents P-TRAP software (Panicle TRAit Phenotyping), a free open source application for high-throughput measurements of panicle architecture and seed-related traits. The software is written in Java and can be used with different platforms (the user-friendly Graphical User Interface (GUI) uses Netbeans Platform 7.3). The application offers three main tools: a tool for the analysis of panicle structure, a spikelet/grain counting tool, and a tool for the analysis of seed shape. The three tools can be used independently or simultaneously for analysis of the same image. Results are then reported in the Extensible Markup Language (XML) and Comma Separated Values (CSV) file formats. Images of rice panicles were used to evaluate the efficiency and robustness of the software. Compared to data obtained by manual processing, P-TRAP produced reliable results in a much shorter time. In addition, manual processing is not repeatable because dry panicles are vulnerable to damage. The software is very useful, practical and collects much more data than human operators. CONCLUSIONS: P-TRAP is new open-source software that automatically recognizes the structure of a panicle and the seeds on the panicle in digital images. The software processes and quantifies several traits related to panicle structure, detects and counts the grains, and measures their shape parameters. In short, P-TRAP offers both efficient results and a user-friendly environment for experiments. The experimental results showed very good accuracy compared with field operators, expert verification and well-known academic methods.


Subject(s)
Oryza/anatomy & histology , Oryza/growth & development , Software , Inflorescence/anatomy & histology , Inflorescence/growth & development , Phenotype , Quantitative Trait Loci , Seeds/anatomy & histology , Seeds/growth & development
7.
Plant Biotechnol J ; 10(5): 555-68, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22369597

ABSTRACT

We report here the molecular and phenotypic features of a library of 31,562 insertion lines generated in the model japonica cultivar Nipponbare of rice (Oryza sativa L.), called Oryza Tag Line (OTL). Sixteen thousand eight hundred and fourteen T-DNA and 12,410 Tos17 discrete insertion sites have been characterized in these lines. We estimate that 8686 predicted gene intervals (i.e. one-fourth to one-fifth of the estimated rice nontransposable element gene complement) are interrupted by sequence-indexed T-DNA (6563 genes) and/or Tos17 (2755 genes) inserts. Six hundred and forty-three genes are interrupted by both T-DNA and Tos17 inserts. High quality of the sequence indexation of the T2 seed samples was ascertained by several approaches. Field evaluation under agronomic conditions of 27,832 OTL has revealed that 18.2% exhibit at least one morphophysiological alteration in the T1 progeny plants. Screening 10,000 lines for altered response to inoculation by the fungal pathogen Magnaporthe oryzae identified 71 lines (0.7%) developing spontaneous lesions simulating disease mutants and 43 lines (0.4%) exhibiting an enhanced disease resistance or susceptibility. We show here that at least 3.5% (four of 114) of these alterations are tagged by the mutagens. The presence of allelic series of sequence-indexed mutations in a gene among OTL that exhibit a convergent phenotype clearly increases the chance of establishing a linkage between alterations and inserts. This convergence approach is illustrated by the identification of the rice ortholog of AtPHO2, the disruption of which causes a lesion-mimic phenotype owing to an over-accumulation of phosphate, in nine lines bearing allelic insertions.


Subject(s)
DNA, Bacterial , Gene Library , Mutagenesis, Insertional , Oryza/genetics , DNA, Plant/genetics , Genes, Plant , Magnaporthe/pathogenicity , Phenotype , Plant Diseases/genetics , Plasmids , Sequence Analysis, DNA , Transformation, Genetic
8.
Methods Mol Biol ; 2443: 415-427, 2022.
Article in English | MEDLINE | ID: mdl-35037218

ABSTRACT

Next-generation sequencing technologies have enabled high-density genotyping for large numbers of samples. SNP calling pipelines now produce up to millions of such markers, which then need to be filtered in various ways according to the type of analysis. One of the main challenges still lies in the management of an increasing volume of genotyping files that are difficult to handle for many applications. Here, we provide a practical guide for efficiently managing large genomic variation data using Gigwa, a user-friendly, scalable and versatile application that may be deployed either remotely on web servers or on a local machine.


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Genomics , Genotype , Genotyping Techniques , Polymorphism, Single Nucleotide
9.
Methods Mol Biol ; 2443: 527-540, 2022.
Article in English | MEDLINE | ID: mdl-35037225

ABSTRACT

Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of data in the agronomic domain. There is an urgent need to effectively integrate complementary information to understand the biological system in its entirety. We have developed AgroLD, a knowledge graph that exploits Semantic Web technology and some of the relevant standard domain ontologies to integrate information on plant species, thereby facilitating the formulation of new scientific hypotheses. This chapter outlines some integration results of the project, which initially focused on genomics, proteomics and phenomics.


Subject(s)
Genomics , Pattern Recognition, Automated , Databases, Factual , Genomics/methods , Plants/genetics , Proteomics
10.
Nucleic Acids Res ; 37(Database issue): D992-5, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19036791

ABSTRACT

OryGenesDB (http://orygenesdb.cirad.fr/index.html) is a database developed for rice reverse genetics. OryGenesDB contains FSTs (flanking sequence tags) of various mutagens and functional genomics data, collected from both international insertion collections and the literature. The current release of OryGenesDB contains 171,000 FSTs, and annotations divided among 10 specific categories, totaling 78 annotation layers. Several additional tools have been added to the main interface; these tools enable the user to retrieve FSTs and design probes to analyze insertion lines. The major innovation of OryGenesDB 2008, besides updating the data and tools, is a new tool, Orylink, which was developed to speed up rice functional genomics by taking advantage of the resources developed in two related databases, Oryza Tag Line and GreenPhylDB. Orylink was designed to field complex queries across these three databases and store both the queries and their results in an intuitive manner. Orylink offers a simple and powerful virtual workbench for functional genomics. Alternatively, the Web services developed for Orylink can be used independently of its Web interface, increasing the interoperability between these different bioinformatics applications.


Subject(s)
Databases, Genetic , Mutagenesis, Insertional , Oryza/genetics , DNA Primers , Genome, Plant , Genomics , Sequence Tagged Sites
11.
Genomics Inform ; 19(3): e27, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34638174

ABSTRACT

Due to the rapid evolution of high-throughput technologies, a tremendous amount of data is being produced in the biological domain, which poses a challenging task for information extraction and natural language understanding. Biological named entity recognition (NER) and named entity normalisation (NEN) are two common tasks aiming at identifying and linking biologically important entities such as genes or gene products mentioned in the literature to biological databases. In this paper, we present an updated version of OryzaGP, a gene and protein dataset for rice species created to help natural language processing (NLP) tools in processing NER and NEN tasks. To create the dataset, we selected more than 15,000 abstracts associated with articles previously curated for rice genes. We developed four dictionaries of gene and protein names associated with database identifiers. We used these dictionaries to annotate the dataset. We also annotated the dataset using pre-trained NLP models. Finally, we analysed the annotation results and discussed how to improve OryzaGP.
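Dictionary-based annotation of the kind described above can be sketched in a few lines. This is an illustrative toy, not the OryzaGP pipeline: the function `annotate` is hypothetical, the identifiers `GENE:0001` and `GENE:0002` are made up, and the dictionary holds only two gene symbols. Finding the mention is the NER step; attaching the identifier is the NEN step.

```python
import re

# Hypothetical dictionary: gene symbol -> database identifier (made up).
GENE_DICTIONARY = {
    "PROG1": "GENE:0001",
    "SH5": "GENE:0002",
}


def annotate(text, dictionary):
    """Return (start, end, surface form, identifier) for each dictionary hit."""
    # Try longer names first so a long name is preferred over its prefix.
    names = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return [(m.start(), m.end(), m.group(1), dictionary[m.group(1)])
            for m in pattern.finditer(text)]


sentence = "PROG1 and SH5 were selected during rice domestication."
annotations = annotate(sentence, GENE_DICTIONARY)
for start, end, surface, identifier in annotations:
    print(start, end, surface, identifier)
```

Exact string matching misses spelling variants and ambiguous symbols, which is one reason a dataset like this is also annotated with trained models.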

12.
Nucleic Acids Res ; 36(Database issue): D1022-7, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17947330

ABSTRACT

To organize data resulting from the phenotypic characterization of a library of 30,000 T-DNA enhancer trap (ET) insertion lines of rice (Oryza sativa L. cv. Nipponbare), we developed the Oryza Tag Line (OTL) database (http://urgi.versailles.inra.fr/OryzaTagLine/). OTL structure facilitates forward genetic search for specific phenotypes, putatively resulting from gene disruption, and/or for GUSA or GFP reporter gene expression patterns, reflecting ET-mediated endogenous gene detection. In the latest version, OTL gathers the detailed morpho-physiological alterations observed during field evaluation and specific screens in a first set of 13,928 lines. Detection of GUS or GFP activity in specific organ/tissues in a subset of the library is also provided. Search in OTL can be achieved through trait ontology category, organ and/or developmental stage, keywords, expression of reporter gene in specific organ/tissue as well as line identification number. OTL now contains the description of 9721 mutant phenotypic traits observed in 2636 lines and 1234 GUS or GFP expression patterns. Each insertion line is documented through generic passport data including production records, seed stocks and FST information. Of the 13,928 lines, 8004 and 6101 are characterized by at least one T-DNA and one Tos17 FST, respectively, which OTL links to the rice genome browser OryGenesDB.


Subject(s)
Databases, Genetic , Mutagenesis, Insertional , Oryza/genetics , Phenotype , DNA, Bacterial/genetics , Gene Library , Genes, Reporter , Internet , Mutation , Sequence Tagged Sites , User-Computer Interface
13.
Genomics Inform ; 18(2): e19, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32634873

ABSTRACT

In semantic annotation, semantic concepts are linked to natural language. Semantic annotation helps in boosting the ability to search and access resources and can be used in information retrieval systems to augment the queries from the user. In the research described in this paper, we aimed to identify ontological concepts in scientific text contained in spreadsheets. We developed a tool that can handle various types of spreadsheets. Furthermore, we used the NCBO Annotator API provided by BioPortal to enhance the semantic annotation functionality to cover spreadsheet data. Table2Annotation has strengths in certain criteria such as speed, error handling, and complex concept matching.

14.
Front Public Health ; 8: 563247, 2020.
Article in English | MEDLINE | ID: mdl-33072700

ABSTRACT

Since its emergence in China, the COVID-19 pandemic has spread rapidly around the world. Faced with this unknown disease, public health authorities were forced to experiment, in a short period of time, with various combinations of interventions at different scales. However, as the pandemic progresses, there is an urgent need for tools and methodologies to quickly analyze the effectiveness of responses against COVID-19 in different communities and contexts. In this perspective, computer modeling appears to be an invaluable lever as it allows for the in silico exploration of a range of intervention strategies prior to the potential field implementation phase. More specifically, we argue that, in order to take into account important dimensions of policy actions, such as the heterogeneity of the individual response or the spatial aspect of containment strategies, the branch of computer modeling known as agent-based modeling is of immense interest. We present in this paper an agent-based modeling framework called COVID-19 Modeling Kit (COMOKIT), designed to be generic, scalable and thus portable in a variety of social and geographical contexts. COMOKIT combines models of person-to-person and environmental transmission, a model of individual epidemiological status evolution, an agenda-based 1-h time step model of human mobility, and an intervention model. It is designed to be modular and flexible enough to allow modelers and users to represent different strategies and study their impacts in multiple social, epidemiological or economic scenarios. Several large-scale experiments are analyzed in this paper and allow us to show the potentialities of COMOKIT in terms of analysis and comparison of the impacts of public health policies in a realistic case study.
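To make the agent-based approach concrete, here is a deliberately tiny sketch of an hourly-step epidemic model with an intervention expressed as a reduction in contacts. It is not COMOKIT: the function `simulate` is hypothetical, all parameter values are arbitrary assumptions, agents mix at random rather than following agendas, and there is no spatial or environmental component.

```python
import random


def simulate(n_agents=200, n_hours=240, contacts_per_hour=2,
             p_transmit=0.05, infectious_hours=72, seed=0):
    """Toy agent-based S/I/R model with a 1-hour time step."""
    rng = random.Random(seed)
    state = ["S"] * n_agents          # one epidemiological status per agent
    hours_infected = [0] * n_agents
    state[0] = "I"                    # a single initial case
    for _hour in range(n_hours):
        infectious = [i for i, s in enumerate(state) if s == "I"]
        newly_infected = set()
        for i in infectious:
            # Each infectious agent meets a few random agents this hour.
            for _contact in range(contacts_per_hour):
                j = rng.randrange(n_agents)
                if state[j] == "S" and rng.random() < p_transmit:
                    newly_infected.add(j)
            hours_infected[i] += 1
            if hours_infected[i] >= infectious_hours:
                state[i] = "R"
        # Apply infections after the loop so they take effect next hour.
        for j in newly_infected:
            state[j] = "I"
    return {status: state.count(status) for status in "SIR"}


# Compare a baseline with an intervention that halves hourly contacts.
print("baseline:    ", simulate(contacts_per_hour=2))
print("intervention:", simulate(contacts_per_hour=1))
```

Because each agent carries its own state, heterogeneous behaviour (for example, only some agents complying with a measure) can be added per agent, which is the property the abstract highlights.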


Subject(s)
COVID-19 , Pandemics , China/epidemiology , Cities , Humans , SARS-CoV-2
15.
Genomics Inform ; 17(2): e17, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31307132

ABSTRACT

Text mining has become an important research method in biology, with its original purpose to extract biological entities, such as genes, proteins and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development, for plant molecular biology data, have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extract information from gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts, extracted from scientific papers focusing on the rice species, and was downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.

16.
Gigascience ; 8(5), 2019 05 01.
Article in English | MEDLINE | ID: mdl-31077313

ABSTRACT

BACKGROUND: The study of genetic variations is the basis of many research domains in biology. From genome structure to population dynamics, many applications involve the use of genetic variants. The advent of next-generation sequencing technologies led to such a flood of data that the daily work of scientists is often more focused on data management than data analysis. This mass of genotyping data poses several computational challenges in terms of storage, search, sharing, analysis, and visualization. While existing tools try to solve these challenges, few of them offer a comprehensive and scalable solution. RESULTS: Gigwa v2 is an easy-to-use, species-agnostic web application for managing and exploring high-density genotyping data. It can handle multiple databases and may be installed on a local computer or deployed as an online data portal. It supports various standard import and export formats, provides advanced filtering options, and offers means to visualize density charts or push selected data into various stand-alone or online tools. It implements 2 standard RESTful application programming interfaces, GA4GH, which is health-oriented, and BrAPI, which is breeding-oriented, thus offering wide possibilities of interaction with third-party applications. The project home page provides a list of live instances allowing users to test the system on public data (or reasonably sized user-provided data). CONCLUSIONS: This new version of Gigwa provides a more intuitive and more powerful way to explore large amounts of genotyping data by offering a scalable solution to search for genotype patterns, functional annotations, or more complex filtering. Furthermore, its user-friendliness and interoperability make it widely accessible to the life science community.


Subject(s)
Computational Biology , Genomics , Genotype , Software , Databases, Genetic , Genetic Variation/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Internet , Polymorphism, Single Nucleotide/genetics , User-Computer Interface
17.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31508797

ABSTRACT

MOTIVATION: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a rapid turnaround time that allows for selection before crosses need to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This makes it a challenge to organize information and utilize it in downstream analyses to support decisions made by breeders. In order to implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. RESULTS: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix. AVAILABILITY: http://gobiin1.bti.cornell.edu:6083/projects/GBM/repos/benchmarking/browse.
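Why orientation matters can be illustrated without any storage engine. The sketch below assumes a genotype matrix serialized row by row into one flat array and counts how many separate contiguous reads are needed to fetch all calls for a single marker. The function names and matrix dimensions are hypothetical, and counting contiguous runs is only a rough proxy for extraction cost, not the benchmark's methodology.

```python
def offsets_for_marker(marker, n_markers, n_samples, marker_major):
    """Flat-array offsets holding one marker's calls for every sample."""
    if marker_major:
        # Rows are markers: the calls sit next to each other.
        start = marker * n_samples
        return list(range(start, start + n_samples))
    # Rows are samples: one call per row, separated by a stride of n_markers.
    return [sample * n_markers + marker for sample in range(n_samples)]


def contiguous_runs(offsets):
    """Count runs of consecutive offsets (a proxy for separate reads)."""
    if not offsets:
        return 0
    runs = 1
    for previous, current in zip(offsets, offsets[1:]):
        if current != previous + 1:
            runs += 1
    return runs


N_MARKERS, N_SAMPLES = 1000, 50
for marker_major in (True, False):
    offsets = offsets_for_marker(3, N_MARKERS, N_SAMPLES, marker_major)
    layout = "marker-major" if marker_major else "sample-major"
    print(layout, contiguous_runs(offsets))
```

The situation reverses when the query asks for all markers of one sample, so a layout that is ideal for one access pattern is the worst case for the other.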


Subject(s)
Databases, Genetic , Genomics , Genotype , Genotyping Techniques , Information Storage and Retrieval , Software
18.
Gigascience ; 8(5), 2019 05 01.
Article in English | MEDLINE | ID: mdl-31107941

ABSTRACT

BACKGROUND: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer savvy rice researchers. FINDINGS: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer includes tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, population diversity, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant to Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. CONCLUSIONS: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.


Subject(s)
Databases, Genetic , Genomics/methods , Oryza/genetics , Plant Breeding/methods , Software , Seed Bank
19.
PLoS One ; 13(11): e0198270, 2018.
Article in English | MEDLINE | ID: mdl-30500839

ABSTRACT

Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopting an integrative research approach. We are facing an urgent need to effectively integrate and assimilate complementary datasets to understand the biological system as a whole. The Semantic Web offers technologies for the integration of heterogeneous data and their transformation into explicit knowledge thanks to ontologies. We have developed Agronomic Linked Data (AgroLD, www.agrold.org), a knowledge-based system relying on Semantic Web technologies and exploiting standard domain ontologies, to integrate data about plant species of high interest for the plant science community, e.g., rice, wheat, arabidopsis. We present some integration results of the project, which initially focused on genomics, proteomics and phenomics. AgroLD is now an RDF (Resource Description Framework) knowledge base of 100M triples created by annotating and integrating more than 50 datasets coming from 10 data sources (such as Gramene.org and TropGeneDB) with 10 ontologies (such as the Gene Ontology and Plant Trait Ontology). Our evaluation results show users appreciate the multiple query modes which support different use cases. AgroLD's objective is to offer a domain specific knowledge platform to solve complex biological and agronomical questions related to the implication of genes/proteins in, for instance, plant disease resistance or high yield traits. We expect the resolution of these questions to facilitate the formulation of new scientific hypotheses to be validated with a knowledge-oriented approach.


Subject(s)
Agriculture , Genomics , Knowledge Bases , Proteomics , Genome, Plant
20.
Curr Biol ; 28(14): 2274-2282.e6, 2018 07 23.
Article in English | MEDLINE | ID: mdl-29983312

ABSTRACT

African rice (Oryza glaberrima) was domesticated independently from Asian rice. The geographical origin of its domestication remains elusive. Using 246 new whole-genome sequences, we inferred the cradle of its domestication to be in the Inner Niger Delta. Domestication was preceded by a sharp decline of most wild populations that started more than 10,000 years ago. The wild population collapse occurred during the drying of the Sahara. This finding supports the hypothesis that depletion of wild resources in the Sahara triggered African rice domestication. African rice cultivation strongly expanded 2,000 years ago. During the last 5 centuries, a sharp decline of its cultivation coincided with the introduction of Asian rice in Africa. A gene, PROG1, associated with an erect plant architecture phenotype, showed convergent selection in two cultivated rice species, Oryza glaberrima from Africa and Oryza sativa from Asia. In contrast, a shattering gene, SH5, showed a selection signature during African rice domestication, but not during Asian rice domestication. Overall, our genomic data revealed a complex history of African rice domestication influenced by important climatic changes in the Saharan area, by the expansion of African agricultural society, and by recent replacement by another domesticated species.


Subject(s)
Crops, Agricultural/genetics , Domestication , Genome, Plant , Oryza/genetics , Africa , Climate Change , Population Dynamics