
PanEffect: a pan-genome visualization tool for variant effects in maize.

Andorf, Carson M; Haley, Olivia C; Hayford, Rita K; Portwood, John L; Harding, Stephen; Sen, Shatabdi; Cannon, Ethalinda K; Gardiner, Jack M; Kim, Hye-Seon; Woodhouse, Margaret R.

Bioinformatics ; 40(2), 2024 02 01.

Article in English | MEDLINE | ID: mdl-38337024

ABSTRACT

SUMMARY: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement. AVAILABILITY AND IMPLEMENTATION: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).

Subjects

Databases, Genetic , Zea mays , Zea mays/genetics , Artificial Intelligence , Genome, Plant , Phenotype , Software
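The PanEffect abstract above describes each variant effect score as a log-likelihood ratio derived from a protein language model. The following is a minimal sketch of that arithmetic only, assuming a model that emits one logit per amino acid at a given position. The function names and the toy logits are hypothetical; this is not the PanEffect code and it does not call the ESM library.

```python
import math

# The 20 standard amino acids, in a fixed order used to index the logits.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Convert raw logits to log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def variant_effect_score(position_logits, wild_type, mutant):
    """Log-likelihood ratio: log p(mutant) - log p(wild type) at one position.

    Scores near zero suggest the substitution is about as plausible as the
    reference residue; strongly negative scores suggest a larger effect.
    """
    log_probs = log_softmax(position_logits)
    wt_index = AMINO_ACIDS.index(wild_type)
    mut_index = AMINO_ACIDS.index(mutant)
    return log_probs[mut_index] - log_probs[wt_index]


# Hypothetical logits for one position (illustrative values, not model output).
toy_logits = [0.0] * len(AMINO_ACIDS)
toy_logits[AMINO_ACIDS.index("L")] = 4.0   # reference residue favored
toy_logits[AMINO_ACIDS.index("I")] = 3.0   # similar residue, nearly as likely
toy_logits[AMINO_ACIDS.index("P")] = -2.0  # unlikely residue at this position

score_conservative = variant_effect_score(toy_logits, "L", "I")
score_disruptive = variant_effect_score(toy_logits, "L", "P")
print(score_conservative, score_disruptive)
```

Because both log-probabilities share the same normalizing constant, the ratio reduces to a difference of logits, so the conservative L-to-I substitution scores closer to zero than the L-to-P substitution.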

qTeller: a tool for comparative multi-genomic gene expression analysis.

Woodhouse, Margaret R; Sen, Shatabdi; Schott, David; Portwood, John L; Freeling, Michael; Walley, Justin W; Andorf, Carson M; Schnable, James C.

Bioinformatics ; 38(1): 236-242, 2021 12 22.

Article in English | MEDLINE | ID: mdl-34406385

ABSTRACT

MOTIVATION: Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes and meta-analyses. RESULTS: To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database and an optimized framework for adoption by other organisms' databases. AVAILABILITY AND IMPLEMENTATION: The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genome , Genomics , Software , Databases, Nucleic Acid , Zea mays/genetics , Gene Expression Profiling
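The qTeller abstract above mentions two-gene comparisons across multiple user-chosen tissues. As a rough sketch of what such a comparison involves, the snippet below pairs the expression values of two genes over a selected set of tissues and summarizes them with a Pearson correlation. The gene names, tissue names, and values are hypothetical, and the correlation step is an assumption added for illustration; the abstract does not state which statistics qTeller reports.

```python
import math

# Hypothetical expression table: gene -> tissue -> abundance (arbitrary units).
expression = {
    "geneA": {"root": 10.0, "leaf": 20.0, "ear": 30.0, "tassel": 5.0},
    "geneB": {"root": 1.0, "leaf": 2.0, "ear": 3.0, "tassel": 8.0},
}


def paired_expression(table, gene_x, gene_y, tissues):
    """Return (tissue, x, y) tuples for the tissues the user selected."""
    return [(t, table[gene_x][t], table[gene_y][t]) for t in tissues]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Compare the two genes over three chosen tissues only.
pairs = paired_expression(expression, "geneA", "geneB", ["root", "leaf", "ear"])
r = pearson([p[1] for p in pairs], [p[2] for p in pairs])
print(pairs, r)
```

Each tuple in `pairs` corresponds to one point on a two-gene scatter plot, so restricting the tissue list changes which points enter the comparison.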

Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications.

Sen, Shatabdi; Woodhouse, Margaret R; Portwood, John L; Andorf, Carson M.

Database (Oxford) ; 2023, 2023 11 06.

Article in English | MEDLINE | ID: mdl-37935586

ABSTRACT

The big-data analysis of complex data associated with maize genomes accelerates genetic research and improves agronomic traits. As a result, efforts have increased to integrate diverse datasets and extract meaning from these measurements. Machine learning models are a powerful tool for gaining knowledge from large and complex datasets. However, these models must be trained on high-quality features to succeed. Currently, there are no solutions to host maize multi-omics datasets with end-to-end solutions for evaluating and linking features to target gene annotations. Our work presents the Maize Feature Store (MFS), a versatile application that combines features built on complex data to facilitate exploration, modeling and analysis. Feature stores allow researchers to rapidly deploy machine learning applications by managing and providing access to frequently used features. We populated the MFS for the maize reference genome with over 14 000 gene-based features based on published genomic, transcriptomic, epigenomic, variomic and proteomic datasets. Using the MFS, we created an accurate pan-genome classification model with an AUC-ROC score of 0.87. The MFS is publicly available through the Maize Genetics and Genomics Database (MaizeGDB). Database URL: https://mfs.maizegdb.org/

Subjects

Multiomics , Zea mays , Zea mays/genetics , Databases, Genetic , Genomics , Machine Learning
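The Maize Feature Store abstract above reports model quality as an AUC-ROC score. For readers unfamiliar with the metric, here is a small sketch of how it can be computed from labels and predicted scores: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are made-up illustrative values and have no connection to the MFS model or its data.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels.
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for six genes (three positive, three negative).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_roc(labels, scores)
print(auc)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for a reported score such as 0.87.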

Maize protein structure resources at the maize genetics and genomics database.

Woodhouse, Margaret R; Portwood, John L; Sen, Shatabdi; Hayford, Rita K; Gardiner, Jack M; Cannon, Ethalinda K; Harper, Lisa C; Andorf, Carson M.

Genetics ; 224(1), 2023 05 04.

Article in English | MEDLINE | ID: mdl-36755109

ABSTRACT

Protein structures play an important role in bioinformatics, such as in predicting gene function or validating gene model annotation. However, determining protein structure was, until now, costly and time-consuming, which resulted in a structural biology bottleneck. With the release of programs such as AlphaFold and ESMFold, this bottleneck has been reduced by several orders of magnitude, permitting protein structural comparisons of entire genomes within reasonable timeframes. MaizeGDB has leveraged this technological breakthrough by offering several new tools to accelerate protein structural comparisons between maize and other plants as well as human and yeast outgroups. MaizeGDB also offers bulk downloads of these comparative protein structure data, along with predicted functional annotation information. In this way, MaizeGDB is poised to assist maize researchers in assessing functional homology, gene model annotation quality, and other information unavailable to maize scientists even a few years ago.

Subjects

User-Computer Interface , Zea mays , Humans , Zea mays/genetics , Zea mays/metabolism , Databases, Genetic , Computational Biology/methods , Genome, Plant , Molecular Sequence Annotation , Genomics/methods
