Search | Virtual Health Library (Biblioteca Virtual em Saúde)

Breedbase: a digital ecosystem for modern plant breeding.

Morales, Nicolas; Ogbonna, Alex C; Ellerbrock, Bryan J; Bauchet, Guillaume J; Tantikanjana, Titima; Tecle, Isaak Y; Powell, Adrian F; Lyon, David; Menda, Naama; Simoes, Christiano C; Saha, Surya; Hosmani, Prashant; Flores, Mirella; Panitz, Naftali; Preble, Ryan S; Agbona, Afolabi; Rabbi, Ismail; Kulakow, Peter; Peteti, Prasad; Kawuki, Robert; Esuma, Williams; Kanaabi, Micheal; Chelangat, Doreen M; Uba, Ezenwanyi; Olojede, Adeyemi; Onyeka, Joseph; Shah, Trushar; Karanja, Margaret; Egesi, Chiedozie; Tufan, Hale; Paterne, Agre; Asfaw, Asrat; Jannink, Jean-Luc; Wolfe, Marnin; Birkett, Clay L; Waring, David J; Hershberger, Jenna M; Gore, Michael A; Robbins, Kelly R; Rife, Trevor; Courtney, Chaney; Poland, Jesse; Arnaud, Elizabeth; Laporte, Marie-Angélique; Kulembeka, Heneriko; Salum, Kasele; Mrema, Emmanuel; Brown, Allan; Bayo, Stanley; Uwimana, Brigitte.

G3 (Bethesda) ; 12(7), 2022 Jul 06.

Article in English | MEDLINE | ID: mdl-35385099

ABSTRACT

Modern breeding methods integrate next-generation sequencing and phenomics to identify plants with the best characteristics and greatest genetic merit for use as parents in subsequent breeding cycles to ultimately create improved cultivars able to sustain high adoption rates by farmers. This data-driven approach hinges on strong foundations in data management, quality control, and analytics. Of crucial importance is a central database able to (1) track breeding materials, (2) store experimental evaluations, (3) record phenotypic measurements using consistent ontologies, (4) store genotypic information, and (5) implement algorithms for analysis, prediction, and selection decisions. Because of the complexity of the breeding process, breeding databases also tend to be complex, difficult, and expensive to implement and maintain. Here, we present a breeding database system, Breedbase (https://breedbase.org/, last accessed 4/18/2022). Originally initiated as Cassavabase (https://cassavabase.org/, last accessed 4/18/2022) with the NextGen Cassava project (https://www.nextgencassava.org/, last accessed 4/18/2022), and later developed into a crop-agnostic system, it is presently used for dozens of different crops and projects. The system is web based and is available as open source software. It is available on GitHub (https://github.com/solgenomics/, last accessed 4/18/2022) and packaged in a Docker image for deployment (https://hub.docker.com/u/breedbase, last accessed 4/18/2022). The Breedbase system enables breeding programs to better manage and leverage their data for decision making within a fully integrated digital ecosystem.

Subjects

Ecosystem; Plant Breeding; Algorithms; Crops, Agricultural/genetics; Software

High density genotype storage for plant breeding in the Chado schema of Breedbase.

Morales, Nicolas; Bauchet, Guillaume J; Tantikanjana, Titima; Powell, Adrian F; Ellerbrock, Bryan J; Tecle, Isaak Y; Mueller, Lukas A.

PLoS One ; 15(11): e0240059, 2020.

Article in English | MEDLINE | ID: mdl-33175872

ABSTRACT

Modern breeding programs routinely use genome-wide information for selecting individuals to advance. The large volumes of genotypic information required present a challenge for data storage and query efficiency. Major use cases require genotyping data to be linked with trait phenotyping data. In contrast to phenotyping data that are often stored in relational database schemas, next-generation genotyping data are traditionally stored in non-relational storage systems due to their extremely large scope. This study presents a novel data model implemented in Breedbase (https://breedbase.org/) for uniting relational phenotyping data and non-relational genotyping data within the open-source PostgreSQL database engine. Breedbase is an open-source web database designed to manage all of a breeder's informatics needs: management of field experiments, phenotypic and genotypic data collection and storage, and statistical analyses. The genotyping data are stored in a PostgreSQL data type known as binary JavaScript Object Notation (JSONb), where the JSON structures closely follow the Variant Call Format (VCF) data model. The Breedbase genotyping data model can handle different ploidy levels, structural variants, and any genotype encoded in VCF. JSONb is both compressed and indexed, resulting in a space- and time-efficient system. Furthermore, file caching maximizes data retrieval performance. Integration of all breeding data within the Chado database schema retains referential integrity that may be lost when genotyping and phenotyping data are stored in separate systems. Benchmarking demonstrates that the system is fast enough for computation of a genomic relationship matrix (GRM) and genome-wide association study (GWAS) for datasets involving 1,325 diploid Zea mays, 314 triploid Musa acuminata, and 924 diploid Manihot esculenta samples genotyped with 955,690, 142,119, and 287,952 genotyping-by-sequencing (GBS) markers, respectively.

Subjects

Databases, Genetic; Manihot/genetics; Musa/genetics; Zea mays/genetics; Data Analysis; Genotype; Plant Breeding; Plants
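
The abstract of this record describes genotypes stored as JSON documents that follow the VCF data model and used to compute a genomic relationship matrix (GRM). The following is a minimal sketch of that idea, not Breedbase's actual code or schema: the marker names, the per-sample document layout, and the helper functions are all hypothetical, and the GRM formula is the common VanRaden-style estimator, which the abstract does not name.

```python
import json
from typing import Dict, List, Optional

# Hypothetical per-sample genotype documents, loosely modeled on VCF fields.
# Marker names and key layout are assumptions made for illustration only.
SAMPLE_DOCS: Dict[str, str] = {
    "sample_A": '{"S1_100": {"GT": "0/0"}, "S1_200": {"GT": "0/1"}}',
    "sample_B": '{"S1_100": {"GT": "1/1"}, "S1_200": {"GT": "0/1"}}',
    "sample_C": '{"S1_100": {"GT": "0/1"}, "S1_200": {"GT": "1/1"}}',
}


def dosage(gt: str) -> Optional[int]:
    """Count non-reference alleles in a VCF GT string of any ploidy."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:  # missing call
        return None
    return sum(1 for allele in alleles if allele != "0")


def dosage_matrix(docs: Dict[str, str], markers: List[str]) -> List[List[Optional[int]]]:
    """Build a samples x markers dosage matrix, samples in sorted name order."""
    rows = []
    for name in sorted(docs):
        record = json.loads(docs[name])
        rows.append([dosage(record[m]["GT"]) if m in record else None
                     for m in markers])
    return rows


def grm(dosages: List[List[Optional[int]]], ploidy: int = 2) -> List[List[float]]:
    """VanRaden-style GRM: Z Z' / (ploidy * sum(p * (1 - p)))."""
    n, m = len(dosages), len(dosages[0])
    freqs = []
    for j in range(m):
        observed = [row[j] for row in dosages if row[j] is not None]
        freqs.append(sum(observed) / (ploidy * len(observed)) if observed else 0.0)
    # Center each dosage by its expected value; missing calls contribute 0.
    z = [[0.0 if row[j] is None else row[j] - ploidy * freqs[j] for j in range(m)]
         for row in dosages]
    denom = ploidy * sum(p * (1.0 - p) for p in freqs)
    if denom == 0.0:
        raise ValueError("all markers are monomorphic; GRM is undefined")
    return [[sum(a * b for a, b in zip(z[i], z[k])) / denom for k in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    matrix = dosage_matrix(SAMPLE_DOCS, ["S1_100", "S1_200"])
    for row in grm(matrix):
        print(["%.3f" % value for value in row])
```

Because `dosage` counts alleles rather than assuming two per call, the same function handles a triploid call such as `0/1/1`; pass the matching `ploidy` to `grm` in that case.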

The Sol Genomics Network (SGN)--from genotype to phenotype to breeding.

Fernandez-Pozo, Noe; Menda, Naama; Edwards, Jeremy D; Saha, Surya; Tecle, Isaak Y; Strickler, Susan R; Bombarely, Aureliano; Fisher-York, Thomas; Pujar, Anuradha; Foerster, Hartmut; Yan, Aimin; Mueller, Lukas A.

Nucleic Acids Res ; 43(Database issue): D1036-41, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25428362

ABSTRACT

The Sol Genomics Network (SGN, http://solgenomics.net) is a web portal with genomic and phenotypic data, and analysis tools for the Solanaceae family and close relatives. SGN hosts whole genome data for an increasing number of Solanaceae family members including tomato, potato, pepper, eggplant, tobacco and Nicotiana benthamiana. The database also stores loci and phenotype data, which researchers can upload and edit with user-friendly web interfaces. Tools such as BLAST, GBrowse and JBrowse for browsing genomes, expression and map data viewers, a locus community annotation system and a QTL analysis tool are available. A new tool, the SGN VIGS tool, was recently implemented to improve Virus-Induced Gene Silencing (VIGS) constructs. With the growing genomic and phenotypic data in the database, SGN is now advancing to develop new web-based breeding tools and implement the code and database structure for other species or clade-specific databases.

Subjects

Databases, Nucleic Acid; Genome, Plant; Solanaceae/genetics; Breeding; Crosses, Genetic; Genomics; Genotype; Internet; Phenotype; Solanaceae/metabolism

solGS: a web-based tool for genomic selection.

Tecle, Isaak Y; Edwards, Jeremy D; Menda, Naama; Egesi, Chiedozie; Rabbi, Ismail Y; Kulakow, Peter; Kawuki, Robert; Jannink, Jean-Luc; Mueller, Lukas A.

BMC Bioinformatics ; 15: 398, 2014 Dec 14.

Article in English | MEDLINE | ID: mdl-25495537

ABSTRACT

BACKGROUND: Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and statistical complexity, however, is a serious challenge in data management, analysis, and sharing. A bioinformatics infrastructure for data storage and access, and a user-friendly web-based tool for analysis and sharing output, are needed to make GS more practical for breeders. RESULTS: We have developed a web-based tool, called solGS, for predicting genomic estimated breeding values (GEBVs) of individuals, using a Ridge-Regression Best Linear Unbiased Predictor (RR-BLUP) model. It has an intuitive web interface for selecting a training population for modeling and estimating genomic estimated breeding values of selection candidates. It estimates phenotypic correlation and heritability of traits and selection indices of individuals. Raw data is stored in a generic database schema, Chado Natural Diversity, co-developed by multiple database groups. Analysis output is graphically visualized and can be interactively explored online or downloaded in text format. An instance of its implementation can be accessed at the NEXTGEN Cassava breeding database, http://cassavabase.org/solgs. CONCLUSIONS: solGS enables breeders to store raw data and estimate GEBVs of individuals online, in an intuitive and interactive workflow. It can be adapted to any breeding program.

Subjects

Breeding; Manihot/genetics; Software; Genomics; Internet; Manihot/physiology; Quantitative Trait Loci
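
The abstract of this record names an RR-BLUP model for predicting GEBVs from a training population. The following is a minimal sketch of ridge-regression marker-effect estimation, written in plain Python. It is not solGS's implementation: the function names are hypothetical, and the shrinkage parameter `lam` is supplied as a fixed input here, whereas in practice it would be derived from estimated variance components (for example by REML).

```python
from typing import List, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; increase lam")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def rr_blup(markers: List[List[float]], phenotypes: List[float],
            lam: float) -> Tuple[float, List[float], List[float]]:
    """Estimate marker effects u = (Z'Z + lam I)^-1 Z'(y - mean(y)).

    Z is the column-centered marker matrix of the training population.
    Returns (phenotype mean, marker column means, marker effects).
    """
    n, m = len(markers), len(markers[0])
    mu = sum(phenotypes) / n
    col_means = [sum(row[j] for row in markers) / n for j in range(m)]
    z = [[row[j] - col_means[j] for j in range(m)] for row in markers]
    lhs = [[sum(z[i][a] * z[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(m)] for a in range(m)]
    rhs = [sum(z[i][a] * (phenotypes[i] - mu) for i in range(n)) for a in range(m)]
    return mu, col_means, solve(lhs, rhs)


def gebv(candidates: List[List[float]], col_means: List[float],
         effects: List[float]) -> List[float]:
    """Predict breeding values of selection candidates from marker effects."""
    return [sum((x - c) * u for x, c, u in zip(row, col_means, effects))
            for row in candidates]


if __name__ == "__main__":
    # Toy training population: 4 individuals, 2 markers coded as 0/1/2 dosages.
    train_markers = [[0.0, 2.0], [1.0, 1.0], [2.0, 0.0], [2.0, 2.0]]
    train_pheno = [1.0, 2.0, 3.5, 4.0]
    mu, means, effects = rr_blup(train_markers, train_pheno, lam=1.0)
    print(mu, effects, gebv([[1.0, 2.0]], means, effects))
```

Larger values of `lam` shrink all marker effects toward zero, which is what keeps the system solvable when markers outnumber phenotyped individuals.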

The Sol Genomics Network (solgenomics.net): growing tomatoes using Perl.

Bombarely, Aureliano; Menda, Naama; Tecle, Isaak Y; Buels, Robert M; Strickler, Susan; Fischer-York, Thomas; Pujar, Anuradha; Leto, Jonathan; Gosselin, Joseph; Mueller, Lukas A.

Nucleic Acids Res ; 39(Database issue): D1149-55, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20935049

ABSTRACT

The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub (http://github.com). The database architecture combines SGN-specific schemas and the community-developed Chado schema (http://gmod.org/wiki/Chado) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/.

Subjects

Databases, Genetic; Genome, Plant; Solanum lycopersicum/genetics; Gene Expression Profiling; Genomics; Solanum lycopersicum/growth & development; Solanum lycopersicum/metabolism; Plant Proteins/genetics; Software

solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database.

Tecle, Isaak Y; Menda, Naama; Buels, Robert M; van der Knaap, Esther; Mueller, Lukas A.

BMC Bioinformatics ; 11: 525, 2010 Oct 21.

Article in English | MEDLINE | ID: mdl-20964836

ABSTRACT

BACKGROUND: A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. DESCRIPTION: The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, http://solgenomics.net/qtl/, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. CONCLUSIONS: solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.

Subjects

Genome, Plant; Genomics/methods; Quantitative Trait Loci/genetics; Software; Algorithms; Databases, Factual; Databases, Genetic; Phenotype; Solanaceae/genetics
