Results 1 - 10 of 10
1.
BMC Genomics ; 24(1): 107, 2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36899307

ABSTRACT

BACKGROUND: The advancement of sequencing technologies has made a plethora of whole-genome re-sequenced (WGRS) data publicly available. However, research utilizing the WGRS data without further processing is nearly impossible. To solve this problem, our research group has developed an interactive Allele Catalog Tool to enable researchers to explore the coding region allelic variation present in over 1,000 re-sequenced accessions each for soybean, Arabidopsis, and maize. RESULTS: The Allele Catalog Tool was designed originally with soybean genomic data and resources. The Allele Catalog datasets were generated using our variant calling pipeline (SnakyVC) and the Allele Catalog pipeline (AlleleCatalog). The variant calling pipeline was developed to process raw sequencing reads in parallel and generate Variant Call Format (VCF) files; the Allele Catalog pipeline takes the VCF files, performs imputation and functional effect prediction, and assembles alleles for each gene to generate curated Allele Catalog datasets. Both pipelines were used to generate the data panels (VCF files and Allele Catalog files), in which the accessions of the WGRS datasets were collected from various sources and currently represent over 1,000 diverse accessions each for soybean, Arabidopsis, and maize. The main features of the Allele Catalog Tool include data query, visualization of results, categorical filtering, and download functions. Queries are built from user input, and results are presented in tables: a summary by categorical description and the genotype results of the alleles for each gene. The categorical information is specific to each species; additionally, available detailed meta-information is provided in modal popups. The genotypic information contains the variant positions, reference or alternate genotypes, the functional effect classes, and the amino-acid changes of each accession. The results can also be downloaded for further research. CONCLUSIONS: The Allele Catalog Tool is a web-based tool that currently supports three species: soybean, Arabidopsis, and maize. The Soybean Allele Catalog Tool is hosted on the SoyKB website ( https://soykb.org/SoybeanAlleleCatalogTool/ ), while the Allele Catalog Tool for Arabidopsis and maize is hosted on the KBCommons website ( https://kbcommons.org/system/tools/AlleleCatalogTool/Zmays and https://kbcommons.org/system/tools/AlleleCatalogTool/Athaliana ). Researchers can use this tool to connect variant alleles of genes with meta-information of species.
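
The allele-assembly step described in this abstract can be illustrated with a minimal sketch. It is not the AlleleCatalog pipeline itself: the accession names, positions, base calls, and the function `assemble_alleles` are hypothetical, and imputation and effect prediction are left out. It only shows the idea of treating each accession's combined genotype across a gene's variant positions as one allele and grouping accessions by it.

```python
# Hypothetical toy data: genotype calls at three coding positions of one gene.
# Keys are accession names; values map variant position -> called base.
calls = {
    "acc1": {101: "A", 250: "G", 377: "T"},
    "acc2": {101: "A", 250: "C", 377: "T"},
    "acc3": {101: "A", 250: "G", 377: "T"},
}


def assemble_alleles(calls):
    """Group accessions by their combined genotype across a gene's variant positions."""
    positions = sorted({pos for geno in calls.values() for pos in geno})
    alleles = {}
    for accession, geno in calls.items():
        # "-" marks a position with no call for this accession
        allele = tuple(geno.get(pos, "-") for pos in positions)
        alleles.setdefault(allele, []).append(accession)
    return positions, alleles


positions, alleles = assemble_alleles(calls)
for allele, accessions in alleles.items():
    # One row per allele: genotype at each position, then the accessions carrying it
    print(dict(zip(positions, allele)), len(accessions), accessions)
```

Grouping by the full tuple of calls keeps the sketch independent of how many variant positions a gene has; a real catalog would also attach effect classes and amino-acid changes to each position.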


Subjects
Alleles , Arabidopsis , Data Mining , Datasets as Topic , Glycine max , Internet , Software , Zea mays , Mutation , Glycine max/genetics , Zea mays/genetics , Arabidopsis/genetics , Data Visualization , Genes, Plant/genetics , Pigmentation/genetics , Plant Dormancy/genetics , Gene Frequency , Amino Acid Substitution , Genotype , Metadata , Data Mining/methods
2.
Plant Physiol ; 188(1): 111-133, 2022 01 20.
Article in English | MEDLINE | ID: mdl-34618082

ABSTRACT

Maize (Zea mays) seeds are a good source of protein, despite being deficient in several essential amino acids. However, eliminating the highly abundant but poorly balanced seed storage proteins has revealed that the regulation of seed amino acids is complex and does not rely on only a handful of proteins. In this study, we used two complementary omics-based approaches to shed light on the genes and biological processes that underlie the regulation of seed amino acid composition. We first conducted a genome-wide association study to identify candidate genes involved in the natural variation of seed protein-bound amino acids. We then used weighted gene correlation network analysis to associate protein expression with seed amino acid composition dynamics during kernel development and maturation. We found that almost half of the proteome was significantly reduced during kernel development and maturation, including several translational machinery components such as ribosomal proteins, which strongly suggests translational reprogramming. The reduction was significantly associated with a decrease in several amino acids, including lysine and methionine, pointing to their role in shaping the seed amino acid composition. When we compared the candidate gene lists generated from both approaches, we found a nonrandom overlap of 80 genes. A functional analysis of these genes showed a tight interconnected cluster dominated by translational machinery genes, especially ribosomal proteins, further supporting the role of translation dynamics in shaping seed amino acid composition. These findings strongly suggest that seed biofortification strategies that target the translation machinery dynamics should be considered and explored further.
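
Whether an overlap between two candidate gene lists is nonrandom is commonly judged with a hypergeometric upper-tail probability. The abstract does not say which test the authors used, so the sketch below is an assumption, and the function name `overlap_p_value` and all of its inputs are hypothetical.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn at random from a universe
    that contains size_a 'marked' genes (hypergeometric upper tail)."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy check: two lists of 2 genes from a universe of 4 share both genes.
print(overlap_p_value(universe=4, size_a=2, size_b=2, overlap=2))
```

A small probability means an overlap of that size would rarely arise if the two lists were drawn independently from the same gene universe.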


Subjects
Amino Acids/metabolism , Protein Biosynthesis/drug effects , Seed Storage Proteins/genetics , Seed Storage Proteins/metabolism , Seeds/metabolism , Zea mays/genetics , Zea mays/metabolism , Amino Acids/genetics , Crops, Agricultural/genetics , Crops, Agricultural/metabolism , Gene Expression Regulation, Plant , Genes, Plant , Genetic Variation , Genome-Wide Association Study , Genomics , Genotype , Metabolomics , Phenotype , Seeds/genetics
3.
Physiol Plant ; 174(2): e13672, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35297059

ABSTRACT

Advances in next-generation sequencing and other high-throughput technologies have facilitated multiomics research, such as genomics, epigenomics, transcriptomics, proteomics, metabolomics, and phenomics. The resultant emerging multiomics data have brought new challenges as well as opportunities, as seen in the plant and agriculture science domains. We reviewed several bioinformatic and computational methods, models, and platforms, and we have highlighted some of our in-house developed efforts aimed at multiomics data analysis, integration, and management issues faced by the research community. A case study using multiomics datasets generated from our studies of maize nodal root growth under water deficit stress demonstrates the power of these datasets and some other publicly available tools. This analysis also sheds light on the landscape of such applied bioinformatic tools currently available for plant and crop science studies and introduces emerging trends and how they may affect the future.


Subjects
Computational Biology , Zea mays , Agriculture , Computational Biology/methods , Genomics/methods , Plants , Water , Zea mays/genetics
4.
Bioinformatics ; 36(17): 4655-4657, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32579187

ABSTRACT

MOTIVATION: Advanced publicly available sequencing data from large populations have enabled informative genome-wide association studies (GWAS) that associate SNPs with phenotypic traits of interest. Many publicly available tools able to perform GWAS have been developed in response to increased demand. However, these tools lack a comprehensive pipeline that includes both pre-GWAS analysis, such as outlier removal, data transformation and calculation of Best Linear Unbiased Predictions or Best Linear Unbiased Estimates, and post-GWAS analysis, such as haploblock analysis and candidate gene identification. RESULTS: Here, we present Holistic Analysis with Pre- and Post-Integration (HAPPI) GWAS, an open-source GWAS tool able to perform pre-GWAS, GWAS and post-GWAS analysis in an automated pipeline using the command-line interface. AVAILABILITY AND IMPLEMENTATION: HAPPI GWAS is written in R for Unix-like operating systems and is available on GitHub (https://github.com/Angelovici-Lab/HAPPI.GWAS.git). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
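
As an illustration of the kind of pre-GWAS step mentioned above, the sketch below removes phenotype outliers. HAPPI GWAS is written in R and the abstract does not state its outlier rule; Python and Tukey's 1.5 x IQR fences are used here purely as assumptions, and `remove_outliers` is a hypothetical name, not part of the tool.

```python
from statistics import quantiles


def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences).
    The rule and the default k are illustrative assumptions."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Toy phenotype vector with one extreme measurement
print(remove_outliers([1, 2, 3, 4, 5, 100]))
```

Filtering before the association step keeps a single extreme measurement from dominating the trait distribution; transformation and BLUP or BLUE calculation would follow in a full pipeline.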


Subjects
Genome-Wide Association Study , Software , Phenotype , Polymorphism, Single Nucleotide/genetics
5.
Plant Physiol ; 183(2): 483-500, 2020 06.
Article in English | MEDLINE | ID: mdl-32317360

ABSTRACT

Gln is a key player in plant metabolism. It is one of the major free amino acids that is transported into the developing seed and is central for nitrogen metabolism. However, Gln natural variation and its regulation and interaction with other metabolic processes in seeds remain poorly understood. To investigate the latter, we performed a metabolic genome-wide association study (mGWAS) of Gln-related traits measured from the dry seeds of the Arabidopsis (Arabidopsis thaliana) diversity panel using all potential ratios between Gln and the other members of the Glu family as traits. This semicombinatorial approach yielded multiple candidate genes that, upon further analysis, revealed an unexpected association between the aliphatic glucosinolates (GLS) and the Gln-related traits. This finding was confirmed by an independent quantitative trait loci mapping and statistical analysis of the relationships between the Gln-related traits and the presence of specific GLS in seeds. Moreover, an analysis of Arabidopsis mutants lacking GLS showed an extensive seed-specific impact on Gln levels and composition that manifested early in seed development. The elimination of GLS in seeds was associated with a large effect on seed nitrogen and sulfur homeostasis, which conceivably led to the Gln response. This finding indicates that both Gln and GLS play key roles in shaping the seed metabolic homeostasis. It also implies that select secondary metabolites might have key functions in primary seed metabolism. Finally, our study shows that an mGWAS performed on dry seeds can uncover key metabolic interactions that occur early in seed development.
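
The ratio-based traits described in this abstract can be sketched as follows. The exact set of ratios used in the study is not given here, so the construction below (Gln divided by the summed level of every non-empty subset of the other family members) is an assumption, and the metabolite levels and the name `gln_ratio_traits` are hypothetical.

```python
from itertools import combinations


def gln_ratio_traits(levels, target="Gln"):
    """Build one trait per non-empty subset of the other family members:
    target level divided by the summed level of that subset."""
    others = sorted(k for k in levels if k != target)
    traits = {}
    for r in range(1, len(others) + 1):
        for subset in combinations(others, r):
            denom = sum(levels[m] for m in subset)
            traits[f"{target}/({'+'.join(subset)})"] = levels[target] / denom
    return traits


# Hypothetical seed metabolite levels for one accession (arbitrary units)
levels = {"Gln": 2.0, "Glu": 4.0, "Pro": 1.0}
for name, value in gln_ratio_traits(levels).items():
    print(name, value)
```

Each derived ratio would then be treated as its own phenotype in the association analysis, which is why the number of traits grows quickly with the size of the family.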


Subjects
Genome-Wide Association Study/methods , Glucosinolates/metabolism , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Phenotype , Quantitative Trait Loci/genetics
6.
Genes (Basel) ; 14(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36672864

ABSTRACT

The genome-wide association study (GWAS) is a popular genomic approach that identifies genomic regions associated with a phenotype and, thus, aims to discover causative mutations (CM) in the genes underlying the phenotype. However, GWAS discoveries are limited by many factors and typically identify associated genomic regions without the further ability to compare the viability of candidate genes and actual CMs. Therefore, the current methodology is limited in its ability to identify CMs. In our recent work, we presented a novel approach to an empowered "GWAS to Genes" strategy that we named Synthetic phenotype to causative mutation (SP2CM). We established this strategy to identify CMs in soybean genes and developed a web-based tool for accuracy calculation (AccuTool) for a reference panel of soybean accessions. Here, we describe our further development of the tool, named AccuCalc, which extends its use to other species. We enhanced the tool for the analysis of datasets with a low-frequency distribution of a rare phenotype by automated formatting of a synthetic phenotype and added another accuracy-based GWAS evaluation criterion to the accuracy calculation. We designed AccuCalc as a Python package for GWAS data analysis that accepts any user-defined, species-independent variant calling format (vcf) or HapMap format (hmp) file as input data. AccuCalc saves analysis outputs in user-friendly tab-delimited formats and also offers visualization of the GWAS results as Manhattan plots accentuated by accuracy. Implemented in Python, AccuCalc is publicly available and, thus, can be used conveniently to apply the SP2CM strategy to any species.


Subjects
Genome-Wide Association Study , Genomics , Genome-Wide Association Study/methods , Genomics/methods , Genome , Phenotype , Mutation
7.
Front Genet ; 14: 1251382, 2023.
Article in English | MEDLINE | ID: mdl-37928239

ABSTRACT

The rapid growth of sequencing technology and its increasing popularity in biology-related research over the years have made whole genome re-sequencing (WGRS) data widely available. A large amount of WGRS data can bridge the knowledge gap between genomics and phenomics by providing an understanding of the genomic variations that can lead to phenotype changes. These genomic variations usually comprise allele and structural changes in DNA, and these changes can affect the regulatory mechanisms, causing changes in gene expression and altering the phenotypes of organisms. In this research work, we created the GenVarX toolset, which is backed by transcription factor binding sequence data in promoter regions, copy number variation data, SNP and Indel data, and phenotype data, and which can potentially provide insights about phenotypic differences and answer compelling questions in plant research. Analytics-wise, we have developed strategies to better utilize the WGRS data and mine the data using efficient data processing scripts, libraries, tools, and frameworks to create the interactive and visualization-enhanced GenVarX toolset that encompasses both promoter region and copy number variation analysis components. The main capabilities of the GenVarX toolset are to provide easy-to-use interfaces for users to perform queries, visualize data, and interact with the data. Based on different input windows on the user interface, users can provide inputs corresponding to each field and submit the information as a query. The data returned on the results page is usually displayed in a tabular fashion. In addition, interactive figures are also included in the toolset to facilitate the visualization of statistical results or tool outputs. Currently, the GenVarX toolset supports soybean, rice, and Arabidopsis. Researchers can access the soybean GenVarX toolset from SoyKB via https://soykb.org/SoybeanGenVarX/, and the rice and Arabidopsis GenVarX toolsets from the KBCommons web portal via https://kbcommons.org/system/tools/GenVarX/Osativa and https://kbcommons.org/system/tools/GenVarX/Athaliana, respectively.

8.
Front Genet ; 14: 1320652, 2023.
Article in English | MEDLINE | ID: mdl-38259621

ABSTRACT

Genome-to-phenome research in agriculture aims to improve crops through in silico predictions. The genome-wide association study (GWAS) is a powerful method for identifying genomic loci that underlie important traits. Because GWAS is a statistical method, increasing the sample quantity, data quality, or diversity of the GWAS dataset positively impacts its power. For more precise breeding, concrete candidate genes with exact functional variants must be discovered. Many post-GWAS methods have been developed to narrow down the associated genomic regions and, ideally, to predict candidate genes and causative mutations (CMs). Historical natural selection and breeding-related artificial selection both act to change the frequencies of different alleles of genes that control phenotypes. With higher diversity and more extensive GWAS datasets, there is an increased chance of multiple alleles with independent CMs in a single causal gene. This can be caused by the presence of samples from geographically isolated regions that arose during natural or artificial selection. This simple fact is a complicating factor in GWAS-driven discoveries. Currently, none of the existing association methods addresses this issue or the need to identify multiple alleles and, more specifically, the actual CMs. Therefore, we developed a tool that computes a score for a combination of variant positions in a single candidate gene and, based on the highest score, identifies the best number and combination of CMs. The tool is publicly available as a Python package on GitHub, and we further created a web-based Multiple Alleles discovery (MADis) tool that supports soybean and is hosted in SoyKB (https://soykb.org/SoybeanMADisTool/). We tested and validated the algorithm and presented the utilization of MADis in a pod pigmentation L1 gene case study with multiple CMs from natural or artificial selection. Finally, we identified a candidate gene for the pod color L2 locus and predicted the existence of multiple alleles that potentially cause loss of pod pigmentation. In this work, we show how a genomic analysis can be employed to explore the natural and artificial selection of multiple alleles and, thus, improve and accelerate crop breeding in agriculture.
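
The idea of scoring combinations of variant positions within one candidate gene can be sketched as an exhaustive search. This is not the MADis algorithm: the abstract does not define its score, so the agreement fraction used below, the rule that any alternate call in the combination predicts the mutant phenotype, and the names `best_combination` and `max_size` are all assumptions for illustration.

```python
from itertools import combinations


def best_combination(genotypes, phenotype, max_size=3):
    """Search combinations of variant positions in one gene. An accession is
    predicted 'mutant' (1) if it carries the alternate call (1) at ANY position
    in the combination; the score is the fraction of accessions whose
    prediction matches the observed binary phenotype."""
    variants = sorted(next(iter(genotypes.values())))
    best = (-1.0, ())
    for size in range(1, min(max_size, len(variants)) + 1):
        for combo in combinations(variants, size):
            correct = sum(
                int(any(genotypes[acc][v] for v in combo)) == phenotype[acc]
                for acc in genotypes
            )
            score = correct / len(genotypes)
            if score > best[0]:  # strict ">" keeps the smallest combination on ties
                best = (score, combo)
    return best


# Hypothetical toy panel: two independent mutations (p1, p2) each cause the phenotype
genotypes = {
    "acc1": {"p1": 1, "p2": 0, "p3": 0},
    "acc2": {"p1": 0, "p2": 1, "p3": 0},
    "acc3": {"p1": 0, "p2": 0, "p3": 0},
    "acc4": {"p1": 0, "p2": 0, "p3": 1},
}
phenotype = {"acc1": 1, "acc2": 1, "acc3": 0, "acc4": 0}
print(best_combination(genotypes, phenotype))
```

In the toy panel no single position explains every mutant accession, while the pair of independent mutations does, which is the situation the abstract describes for multiple alleles in one causal gene. Exhaustive search is only practical for small numbers of positions.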

9.
J Adv Res ; 42: 117-133, 2022 12.
Article in English | MEDLINE | ID: mdl-36513408

ABSTRACT

INTRODUCTION: Genome-Wide Association Studies (GWAS) identify tagging variants in the genome that are statistically associated with the phenotype because of their linkage disequilibrium (LD) relationship with the causative mutation (CM). When both low-density genotyped accession panels with phenotypes and resequenced data accession panels are available, tagging variants can assist with post-GWAS challenges in CM discovery. OBJECTIVES: Our objective was to identify additional GWAS evaluation criteria to assess correspondence between genomic variants and phenotypes, as well as enable deeper analysis of the localized landscape of association. METHODS: We used genomic variant positions as Synthetic phenotypes in GWAS that we named "Synthetic phenotype association study" (SPAS). The extreme case of SPAS is what we call an "Inverse GWAS" where we used CM positions of cloned soybean genes. We developed and validated the Accuracy concept as a measure of the correspondence between variant positions and phenotypes. RESULTS: The SPAS approach demonstrated that the genotype status of an associated variant used as a Synthetic phenotype enabled us to explore the relationships between tagging variants and CMs, and further, that utilizing CMs as Synthetic phenotypes in Inverse GWAS illuminated the landscape of association. We implemented the Accuracy calculation for a curated accession panel in an online Accuracy calculation tool (AccuTool) as a resource for gene identification in soybean. We demonstrated our concepts on three examples of soybean cloned genes. As a result of our findings, we devised an enhanced "GWAS to Genes" analysis (Synthetic phenotype to CM strategy, SP2CM). Using SP2CM, we identified a CM for a novel gene. CONCLUSION: The SP2CM strategy utilizing Synthetic phenotypes and the Accuracy calculation of correspondence provides crucial information to assist researchers in CM discovery. The impact of this work is a more effective evaluation of landscapes of GWAS associations.
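
The two ideas in this abstract, using a variant's genotype status as a Synthetic phenotype and measuring correspondence between variants and phenotypes, can be sketched on toy data. The abstract does not give the formula for Accuracy, so the simple agreement fraction below is a stand-in rather than the published metric; the function names, accession names, and genotype coding (1 = alternate, 0 = reference) are hypothetical.

```python
def synthetic_phenotype(genotypes, variant):
    """Use the genotype status at one variant as a binary phenotype."""
    return {acc: calls[variant] for acc, calls in genotypes.items()}


def agreement(genotypes, variant, phenotype):
    """Fraction of accessions whose genotype status at `variant` matches the
    binary phenotype (an assumed stand-in for the Accuracy measure)."""
    shared = [acc for acc in genotypes if acc in phenotype]
    matches = sum(genotypes[acc][variant] == phenotype[acc] for acc in shared)
    return matches / len(shared)


# Hypothetical panel of four accessions genotyped at two variants
genotypes = {
    "acc1": {"v1": 1, "v2": 1},
    "acc2": {"v1": 0, "v2": 1},
    "acc3": {"v1": 1, "v2": 0},
    "acc4": {"v1": 0, "v2": 0},
}
pheno = synthetic_phenotype(genotypes, "v1")
for variant in ("v1", "v2"):
    print(variant, agreement(genotypes, variant, pheno))
```

Scoring every variant in a region against a phenotype derived from one chosen variant shows which neighbors track it closely, which is one way to picture the landscape of association around a causative mutation.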


Subjects
Genome-Wide Association Study , Genomics , Phenotype , Linkage Disequilibrium , Genotype
10.
Front Plant Sci ; 13: 889066, 2022.
Article in English | MEDLINE | ID: mdl-35574141

ABSTRACT

Adaptation of soybean cultivars to the photoperiod in which they are grown is critical for optimizing plant yield. However, despite its importance, only the major loci conferring variation in flowering time and maturity of US soybean have been isolated. By contrast, over 200 genes contributing to floral induction in the model organism Arabidopsis thaliana have been described. In this work, putative alleles of a library of soybean orthologs of these Arabidopsis flowering genes were tested for their latitudinal distribution among elite soybean lines developed in the United States. Furthermore, variants comprising the alleles of genes with significant differences in latitudinal distribution were assessed for amino acid conservation across disparate genera to infer their impact on gene function. From these efforts, several candidate genes from various biological pathways were identified that are likely being exploited toward adaptation of US soybean to various maturity groups.
