Search | BVS - Ministry of Health

1.

A PostgreSQL Tripal solution for large-scale genotypic and phenotypic data.

Sanderson, Lacey-Anne; Caron, Carolyn T; Tan, Reynold L; Bett, Kirstin E.

Database (Oxford) ; 2021, 2021 08 14.

Article in English | MEDLINE | ID: mdl-34389844

ABSTRACT

Researchers are seeking cost-effective solutions for management and analysis of large-scale genotypic and phenotypic data. Open-source software is uniquely positioned to fill this need through user-focused, crowd-sourced development. Tripal, an open-source toolkit for developing biological data web portals, uses the GMOD Chado database schema to achieve flexible, ontology-driven storage in PostgreSQL. Tripal also aids research-focused web portals in providing data according to findable, accessible, interoperable, reusable (FAIR) principles. We describe here a fully relational PostgreSQL solution to handle large-scale genotypic and phenotypic data that is implemented as a collection of freely available, open-source modules. These Tripal extension modules provide a holistic approach for importing, storage, display and analysis within a relational database schema. Furthermore, they embody the Tripal approach to FAIR data by providing multiple search tools and ensuring metadata is fully described and interoperable. Our solution focuses on data integrity, as well as optimizing performance to provide a fully functional system that is currently being used in the production of Tripal portals for crop species. We fully describe the implementation of our solution and discuss why a PostgreSQL-powered web portal provides an efficient environment for researcher-driven genotypic and phenotypic data analysis.

Subjects

Databases, Genetic , Software , Genotype , Metadata

2.

Tripal, a community update after 10 years of supporting open source, standards-based genetic, genomic and breeding databases.

Staton, Margaret; Cannon, Ethalinda; Sanderson, Lacey-Anne; Wegrzyn, Jill; Anderson, Tavis; Buehler, Sean; Cobo-Simón, Irene; Faaberg, Kay; Grau, Emily; Guignon, Valentin; Gunoskey, Jessica; Inderski, Blake; Jung, Sook; Lager, Kelly; Main, Dorrie; Poelchau, Monica; Ramnath, Risharde; Richter, Peter; West, Joe; Ficklin, Stephen.

Brief Bioinform ; 22(6), 2021 11 05.

Article in English | MEDLINE | ID: mdl-34251419

ABSTRACT

Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here, we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles.

Subjects

Breeding , Computational Biology/methods , Databases, Genetic , Genomics/methods , Plants/genetics , Software , Crops, Agricultural/genetics , Genetic Variation , Phylogeny , Plants/metabolism , Proteomics , Web Browser

3.

KnowPulse: A Web-Resource Focused on Diversity Data for Pulse Crop Improvement.

Sanderson, Lacey-Anne; Caron, Carolyn T; Tan, Reynold; Shen, Yichao; Liu, Ruobin; Bett, Kirstin E.

Front Plant Sci ; 10: 965, 2019.

Article in English | MEDLINE | ID: mdl-31428111

ABSTRACT

KnowPulse (https://knowpulse.usask.ca) is a breeder-focused web portal for pulse breeders and geneticists. With a focus on diversity data, KnowPulse provides information on genetic markers, sequence variants, phenotypic traits and germplasm for chickpea, common bean, field pea, faba bean, and lentil. Genotypic data is accessible through the genotype matrix tool, displayed as a marker-by-germplasm table of genotype calls specific to germplasm chosen by the researcher. It is also summarized on genetic marker and sequence variant pages. Phenotypic data is visualized in trait distribution plots: violin plots for quantitative data and histograms for qualitative data. These plots are accessible through trait, germplasm, and experiment pages, as well as through a single page search tool. KnowPulse is built using the open-source Tripal toolkit and utilizes open-source tools including, but not limited to, species-specific JBrowse instances, a BLAST interface, and whole-genome CViTjs visualizations. KnowPulse is constantly evolving with data and tools added as they become available. Full integration of genetic maps and quantitative trait loci is imminent, and the development of tools for examining structural variation is being explored.

4.

Tripal v3: an ontology-based toolkit for construction of FAIR biological community databases.

Spoor, Shawna; Cheng, Chun-Huai; Sanderson, Lacey-Anne; Condon, Bradford; Almsaeed, Abdullah; Chen, Ming; Bretaudeau, Anthony; Rasche, Helena; Jung, Sook; Main, Dorrie; Bett, Kirstin; Staton, Margaret; Wegrzyn, Jill L; Feltus, F Alex; Ficklin, Stephen P.

Database (Oxford) ; 2019, 2019 01 01.

Article in English | MEDLINE | ID: mdl-31328773

ABSTRACT

Community biological databases provide an important online resource for both public and private data, analysis tools and community engagement. These sites house genomic, transcriptomic, genetic, breeding and ancillary data for specific species, families or clades. Due to the complexity and increasing quantities of these data, construction of online resources is increasingly difficult, especially with limited funding and access to technical expertise. Furthermore, online repositories are expected to promote FAIR data principles (findable, accessible, interoperable and reusable), which presents additional challenges. The open-source Tripal database toolkit seeks to mitigate these challenges by creating both the software and an interactive community of developers for construction of online community databases. Additionally, through coordinated, distributed co-development, Tripal sites encourage community-wide sustainability. Here, we report the release of Tripal version 3 that improves data accessibility and data sharing through systematic use of controlled vocabularies (CVs). Tripal uses the community-developed Chado database as a default data store, but now provides tools to support other data stores, while ensuring that CVs remain the central organizational structure for the data. A new site developer can use Tripal to develop a basic site with little to no programming, with the ability to integrate other data types using extension modules and the Tripal application programming interface. A thorough online User's Guide and Developer's Handbook are available at http://tripal.info, providing download, installation and step-by-step setup instructions.

Subjects

Biota/genetics , Databases, Genetic , Information Dissemination , Internet , Software , Transcriptome , Genomics

5.

AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture.

Harper, Lisa; Campbell, Jacqueline; Cannon, Ethalinda K S; Jung, Sook; Poelchau, Monica; Walls, Ramona; Andorf, Carson; Arnaud, Elizabeth; Berardini, Tanya Z; Birkett, Clayton; Cannon, Steve; Carson, James; Condon, Bradford; Cooper, Laurel; Dunn, Nathan; Elsik, Christine G; Farmer, Andrew; Ficklin, Stephen P; Grant, David; Grau, Emily; Herndon, Nic; Hu, Zhi-Liang; Humann, Jodi; Jaiswal, Pankaj; Jonquet, Clement; Laporte, Marie-Angélique; Larmande, Pierre; Lazo, Gerard; McCarthy, Fiona; Menda, Naama; Mungall, Christopher J; Munoz-Torres, Monica C; Naithani, Sushma; Nelson, Rex; Nesdill, Daureen; Park, Carissa; Reecy, James; Reiser, Leonore; Sanderson, Lacey-Anne; Sen, Taner Z; Staton, Margaret; Subramaniam, Sabarinath; Tello-Ruiz, Marcela Karey; Unda, Victor; Unni, Deepak; Wang, Liya; Ware, Doreen; Wegrzyn, Jill; Williams, Jason; Woodhouse, Margaret.

Database (Oxford) ; 2018, 2018 01 01.

Article in English | MEDLINE | ID: mdl-30239679

ABSTRACT

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.

Subjects

Agriculture , Databases, Genetic , Genomics , Breeding , Gene Ontology , Metadata , Surveys and Questionnaires

6.

Gene-based SNP discovery in tepary bean (Phaseolus acutifolius) and common bean (P. vulgaris) for diversity analysis and comparative mapping.

Gujaria-Verma, Neha; Ramsay, Larissa; Sharpe, Andrew G; Sanderson, Lacey-Anne; Debouck, Daniel G; Tar'an, Bunyamin; Bett, Kirstin E.

BMC Genomics ; 17: 239, 2016 Mar 15.

Article in English | MEDLINE | ID: mdl-26979462

ABSTRACT

BACKGROUND: Common bean (Phaseolus vulgaris) is an important grain legume and there has been a recent resurgence in interest in its relative, tepary bean (P. acutifolius), owing to this species' ability to better withstand abiotic stresses. Genomic resources are scarce for this minor crop species and a better knowledge of the genome-level relationship between these two species would facilitate improvement in both. High-throughput genotyping has facilitated large-scale single nucleotide polymorphism (SNP) identification leading to the development of molecular markers with associated sequence information that can be used to place them in the context of a full genome assembly. RESULTS: Transcript-based SNPs were identified from six common bean and two tepary bean accessions and a subset were used to generate a 768-SNP Illumina GoldenGate assay for each species. The tepary bean assay was used to assess diversity in wild and cultivated tepary bean and to generate the first gene-based map of the tepary bean genome. Genotypic analyses of the diversity panel showed a clear separation between domesticated and cultivated tepary beans, two distinct groups within the domesticated types, and P. parvifolius was confirmed to be distinct. The genetic map of tepary bean was compared to the common bean genome assembly to demonstrate high levels of collinearity between the two species with differences limited to a few intra-chromosomal rearrangements. CONCLUSIONS: The development of the first set of genomic resources specifically for tepary bean has allowed for greater insight into the structure of this species and its relationship to its agriculturally more prominent relative, common bean. These resources will be helpful in the development of efficient breeding strategies for both species and will facilitate the introgression of agriculturally important traits from one crop into the other.

Subjects

Chromosome Mapping , Phaseolus/genetics , Polymorphism, Single Nucleotide , Gene Library , Genes, Plant , Genotyping Techniques , Phaseolus/classification , Phylogeny

7.

Gene-based SNP discovery and genetic mapping in pea.

Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G.

Theor Appl Genet ; 127(10): 2225-41, 2014 Oct.

Article in English | MEDLINE | ID: mdl-25119872

ABSTRACT

KEY MESSAGE: Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.

Subjects

Chromosome Mapping , Pisum sativum/genetics , Polymorphism, Single Nucleotide , DNA, Plant/genetics , Gene Library , Genome, Plant , Genotype , High-Throughput Nucleotide Sequencing , Lens Plant/genetics , Medicago truncatula/genetics , Sequence Analysis, DNA , Synteny , Transcriptome

8.

Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases.

Sanderson, Lacey-Anne; Ficklin, Stephen P; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A; Bett, Kirstin E; Main, Dorrie.

Database (Oxford) ; 2013: bat075, 2013.

Article in English | MEDLINE | ID: mdl-24163125

ABSTRACT

Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including 'Feature Map', 'Genetic', 'Publication', 'Project', 'Contact' and the 'Natural Diversity' modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. DATABASE URL: http://tripal.info/.

Subjects

Databases, Genetic , Genome/genetics , Genomics/methods , Genomics/standards , Internet , Software , Genetic Variation , Genotype , Plants/genetics , Publications , Reference Standards , Seeds/genetics , User-Computer Interface

9.

Ancient orphan crop joins modern era: gene-based SNP discovery and mapping in lentil.

Sharpe, Andrew G; Ramsay, Larissa; Sanderson, Lacey-Anne; Fedoruk, Michael J; Clarke, Wayne E; Li, Rong; Kagale, Sateesh; Vijayan, Perumal; Vandenberg, Albert; Bett, Kirstin E.

BMC Genomics ; 14: 192, 2013 Mar 18.

Article in English | MEDLINE | ID: mdl-23506258

ABSTRACT

BACKGROUND: The genus Lens comprises a range of closely related species within the galegoid clade of the Papilionoideae family. The clade includes other important crops (e.g. chickpea and pea) as well as a sequenced model legume (Medicago truncatula). Lentil is a global food crop increasing in importance in the Indian sub-continent and elsewhere due to its nutritional value and quick cooking time. Despite this importance there has been a dearth of genetic and genomic resources for the crop and this has limited the application of marker-assisted selection strategies in breeding. RESULTS: We describe here the development of a deep and diverse transcriptome resource for lentil using next generation sequencing technology. The generation of data in multiple cultivated (L. culinaris) and wild (L. ervoides) genotypes together with the utilization of a bioinformatics workflow enabled the identification of a large collection of SNPs and the subsequent development of a genotyping platform that was used to establish the first comprehensive genetic map of the L. culinaris genome. Extensive collinearity with M. truncatula was evident on the basis of sequence homology between mapped markers and the model genome and large translocations and inversions relative to M. truncatula were identified. An estimate for the time divergence of L. culinaris from L. ervoides and of both from M. truncatula was also calculated. CONCLUSIONS: The availability of the genomic and derived molecular marker resources presented here will help change lentil breeding strategies and lead to increased genetic gain in the future.

Subjects

Lens Plant/genetics , Genetic Linkage , Genomics , Genotype , Medicago truncatula/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, DNA

10.

The Chado Natural Diversity module: a new generic database schema for large-scale phenotyping and genotyping data.

Jung, Sook; Menda, Naama; Redmond, Seth; Buels, Robert M; Friesen, Maren; Bendana, Yuri; Sanderson, Lacey-Anne; Lapp, Hilmar; Lee, Taein; MacCallum, Bob; Bett, Kirstin E; Cain, Scott; Clements, Dave; Mueller, Lukas A; Main, Dorrie.

Database (Oxford) ; 2011: bar051, 2011.

Article in English | MEDLINE | ID: mdl-22120662

ABSTRACT

Linking phenotypic with genotypic diversity has become a major requirement for basic and applied genome-centric biological research. To meet this need, a comprehensive database backend for efficiently storing, querying and analyzing large experimental data sets is necessary. Chado, a generic, modular, community-based database schema is widely used in the biological community to store information associated with genome sequence data. To meet the need to also accommodate large-scale phenotyping and genotyping projects, a new Chado module called Natural Diversity has been developed. The module strictly adheres to the Chado remit of being generic and ontology driven. The flexibility of the new module is demonstrated in its capacity to store any type of experiment that either uses or generates specimens or stock organisms. Experiments may be grouped or structured hierarchically, whereas any kind of biological entity can be stored as the observed unit, from a specimen to be used in genotyping or phenotyping experiments, to a group of species collected in the field that will undergo further lab analysis. We describe details of the Natural Diversity module, including the design approach, the relational schema and use cases implemented in several databases.

Subjects

Biodiversity , Computational Biology/methods , Database Management Systems , Databases, Factual , Animals , Genotype , Internet , Phenotype , Plants

11.

Tripal: a construction toolkit for online genome databases.

Ficklin, Stephen P; Sanderson, Lacey-Anne; Cheng, Chun-Huai; Staton, Margaret E; Lee, Taein; Cho, Il-Hyung; Jung, Sook; Bett, Kirstin E; Main, Doreen.

Database (Oxford) ; 2011: bar044, 2011.

Article in English | MEDLINE | ID: mdl-21959868

ABSTRACT

As the availability, affordability and magnitude of genomics and genetics research increase, so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire for many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System, with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net.

Subjects

Computational Biology , Database Management Systems , Databases, Genetic , Genome , Internet , Data Mining
