1.
Bioinformatics ; 40(2), 2024 02 01.
Article in English | MEDLINE | ID: mdl-38337024

ABSTRACT

SUMMARY: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement. AVAILABILITY AND IMPLEMENTATION: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
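As an illustration of the scoring described above, the sketch below computes a log-likelihood ratio for a substitution from a model's per-position amino acid probabilities. The probabilities and the function name `llr_score` are hypothetical, and the convention log p(variant) - log p(reference) is an assumption; the abstract does not give the exact formula PanEffect uses.

```python
import math

def llr_score(log_probs, ref_aa, alt_aa):
    """Log-likelihood ratio of substituting ref_aa with alt_aa at one position.

    log_probs maps each amino acid to its log-probability at that position.
    Assumed convention: log p(alt) - log p(ref), so negative values mean the
    variant is less likely than the reference residue.
    """
    return log_probs[alt_aa] - log_probs[ref_aa]

# Hypothetical model output for a single position (not real ESM values).
probs = {"A": 0.60, "V": 0.30, "P": 0.10}
log_probs = {aa: math.log(p) for aa, p in probs.items()}

score_v = llr_score(log_probs, "A", "V")  # milder predicted effect
score_p = llr_score(log_probs, "A", "P")  # stronger predicted effect
```

Under this convention, a score near zero would correspond to the benign end of a heatmap and a strongly negative score to a potential functional consequence.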


Subjects
Databases, Genetic ; Zea mays ; Zea mays/genetics ; Artificial Intelligence ; Genome, Plant ; Phenotype ; Software
2.
BMC Genomics ; 25(1): 533, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816789

ABSTRACT

BACKGROUND: Environmental stress factors, such as biotic and abiotic stress, are becoming more common due to climate variability, significantly affecting global maize yield. Transcriptome profiling studies provide insights into the molecular mechanisms underlying stress response in maize, though the functions of many genes are still unknown. To enhance the functional annotation of maize-specific genes, MaizeGDB has outlined a data-driven approach with an emphasis on identifying genes and traits related to biotic and abiotic stress. RESULTS: We mapped high-quality RNA-Seq expression reads from 24 different publicly available datasets (17 abiotic and seven biotic studies) generated from the B73 cultivar to the recent version of the reference genome B73 (B73v5) and deduced stress-related functional annotation of maize gene models. We conducted a robust meta-analysis of the transcriptome profiles from the datasets to identify maize loci responsive to stress, identifying 3,230 differentially expressed genes (DEGs): 2,555 DEGs regulated in response to abiotic stress, 408 DEGs regulated during biotic stress, and 267 common DEGs (co-DEGs) that overlap between abiotic and biotic stress. We discovered hub genes from network analyses, and among the hub genes of the co-DEGs we identified a putative NAC domain transcription factor superfamily protein (Zm00001eb369060) IDP275, which previously responded to herbivory and drought stress. IDP275 was up-regulated in our analysis in response to eight different abiotic and four different biotic stresses. A gene set enrichment and pathway analysis of hub genes of the co-DEGs revealed hormone-mediated signaling processes and phenylpropanoid biosynthesis pathways, respectively. Using phylostratigraphic analysis, we also demonstrated how abiotic and biotic stress genes differentially evolve to adapt to changing environments. 
CONCLUSIONS: These results will help facilitate the functional annotation of multiple stress-response gene models in maize. Data can be accessed and downloaded at the Maize Genetics and Genomics Database (MaizeGDB).


Subjects
Molecular Sequence Annotation ; Stress, Physiological ; Transcriptome ; Zea mays ; Zea mays/genetics ; Stress, Physiological/genetics ; Gene Expression Regulation, Plant ; Gene Expression Profiling ; Genes, Plant
3.
BMC Plant Biol ; 21(1): 385, 2021 Aug 20.
Article in English | MEDLINE | ID: mdl-34416864

ABSTRACT

Research in the past decade has demonstrated that a single reference genome is not representative of a species' diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.


Subjects
Data Accuracy ; Data Collection/methods ; Databases as Topic ; Genome, Plant ; Genomics ; Zea mays/genetics ; Genetic Variation
4.
Nucleic Acids Res ; 47(D1): D1146-D1154, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407532

ABSTRACT

Since its 2015 update, MaizeGDB, the Maize Genetics and Genomics Database, has expanded to support the sequenced genomes of many maize inbred lines in addition to the B73 reference genome assembly. Curation and development efforts have targeted high quality datasets and tools to support maize trait analysis, germplasm analysis, genetic studies, and breeding. MaizeGDB hosts a wide range of data, with recent support for new data types including genome metadata, RNA-seq, proteomics, synteny, and large-scale diversity. To improve access to and visualization of these data types, several new tools have been implemented to: access large-scale maize diversity data (SNPversity), download and compare gene expression data (qTeller), visualize pedigree data (Pedigree Viewer), link genes with phenotype images (MaizeDIG), and enable flexible user-specified queries to the MaizeGDB database (MaizeMine). MaizeGDB also continues to be the community hub for maize research, coordinating activities and providing technical support to the maize research community. Here we report the changes MaizeGDB has made within the last three years to keep pace with recent software and research advances, as well as the pan-genomic landscape that cheaper and better sequencing technologies have made possible. MaizeGDB is accessible online at https://www.maizegdb.org.


Subjects
Computational Biology/methods ; Databases, Genetic ; Genome, Plant/genetics ; Genomics/methods ; Zea mays/genetics ; Gene Expression Regulation, Plant ; Genetic Variation ; Information Storage and Retrieval/methods ; Internet ; Polymorphism, Single Nucleotide ; Proteomics/methods ; User-Computer Interface ; Zea mays/metabolism
5.
BMC Genomics ; 21(1): 822, 2020 Nov 23.
Article in English | MEDLINE | ID: mdl-33228531

ABSTRACT

BACKGROUND: Large genotyping datasets have become commonplace due to efficient, cheap methods for SNP identification. Typical genotyping datasets may have thousands to millions of data points per accession, across tens to thousands of accessions. There is a need for tools to help rapidly explore such datasets, to assess characteristics such as overall differences between accessions and regional anomalies across the genome. RESULTS: We present GCViT (Genotype Comparison Visualization Tool), for visualizing and exploring large genotyping datasets. GCViT can be used to identify introgressions, conserved or divergent genomic regions, pedigrees, and other features for more detailed exploration. The program can be used online or as a local instance for whole genome visualization of resequencing or SNP array data. The program performs comparisons of variants among user-selected accessions to identify allele differences and similarities between accessions and a user-selected reference, providing visualizations through histogram, heatmap, or haplotype views. The resulting analyses and images can be exported in various formats. CONCLUSIONS: GCViT provides methods for interactively visualizing SNP data on a whole genome scale, and can produce publication-ready figures. It can be used in online or local installations. GCViT enables users to confirm or identify genomics regions of interest associated with particular traits. GCViT is freely available at https://github.com/LegumeFederation/gcvit . The 1.0 version described here is available at https://doi.org/10.5281/zenodo.4008713 .
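The comparison of an accession against a user-selected reference can be pictured with a small sketch like the one below, which counts differing genotype calls per genomic bin (the kind of tally a histogram or heatmap view might draw). This is not GCViT's code; the function name, the dictionary-based input, and the bin size are all assumptions made for illustration.

```python
from collections import Counter

def binned_differences(reference, accession, bin_size):
    """Count, per genomic bin, the positions where the accession differs
    from the reference.

    reference, accession: dicts mapping position (int) -> genotype call (str).
    Positions missing from the accession are skipped rather than counted.
    """
    counts = Counter()
    for pos, ref_call in reference.items():
        call = accession.get(pos)
        if call is not None and call != ref_call:
            counts[pos // bin_size] += 1
    return dict(counts)

# Hypothetical genotype calls on one chromosome.
ref = {100: "A", 250: "C", 1200: "G", 1800: "T"}
acc = {100: "A", 250: "T", 1200: "A", 1800: "C"}
diffs = binned_differences(ref, acc, bin_size=1000)
```

Here bin 0 holds one difference and bin 1 holds two; long runs of low-difference bins are the sort of pattern that could suggest a conserved or introgressed region.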


Subjects
Genome ; Genomics ; Software ; Genotype ; Haplotypes ; Polymorphism, Single Nucleotide
6.
BMC Genomics ; 20(1): 481, 2019 Jun 11.
Article in English | MEDLINE | ID: mdl-31185892

ABSTRACT

BACKGROUND: Due to the recent domestication of peanut from a single tetraploidization event, relatively little genetic diversity underlies the extensive morphological and agronomic diversity in peanut cultivars today. To broaden the genetic variation in future breeding programs, it is necessary to characterize germplasm accessions for new sources of variation and to leverage the power of genome-wide association studies (GWAS) to discover markers associated with traits of interest. We report an analysis of linkage disequilibrium (LD), population structure, and genetic diversity, and examine the ability of GWA to infer marker-trait associations in the U.S. peanut mini core collection genotyped with a 58 K SNP array. RESULTS: LD persists over long distances in the collection, with r² reaching half-decay at a distance of 3.78 Mb. Structure within the collection is best explained when separated into four or five groups (K = 4 and K = 5). At K = 4 and 5, accessions loosely clustered according to market type and subspecies, though with numerous exceptions. Out of 107 accessions, 43 clustered in correspondence to the main market type subgroup whereas 34 did not. The remaining 30 accessions had either missing taxonomic classification or were classified as mixed. Phylogenetic network analysis also clustered accessions into approximately five groups based on their genotypes, with loose correspondence to subspecies and market type. Genome wide association analysis was performed on these lines for 12 seed composition and quality traits. Significant marker associations were identified for arachidic and behenic fatty acid compositions, which despite having low bioavailability in peanut, have been reported to raise cholesterol levels in humans. Other traits such as blanchability showed consistent associations in multiple tests, with plausible candidate genes. 
CONCLUSIONS: Based on GWA, population structure as well as additional simulation results, we find that the primary limitations of this collection for GWAS are a small collection size, significant remaining structure/genetic similarity and long LD blocks that limit the resolution of association mapping. These results can be used to improve GWAS in peanut in future studies - for example, by increasing the size and reducing structure in the collections used for GWAS.


Subjects
Arachis/genetics ; Genetic Variation ; Linkage Disequilibrium ; Chromosomes, Plant/genetics ; Gene Frequency ; Genome-Wide Association Study ; Haplotypes ; Phylogeny ; Polymorphism, Single Nucleotide ; Population Dynamics
7.
Nucleic Acids Res ; 44(D1): D1181-8, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26546515

ABSTRACT

Legume Information System (LIS), at http://legumeinfo.org, is a genomic data portal (GDP) for the legume family. LIS provides access to genetic and genomic information for major crop and model legumes. With more than two-dozen domesticated legume species, there are numerous specialists working on particular species, and also numerous GDPs for these species. LIS has been redesigned in the last three years both to better integrate data sets across the crop and model legumes, and to better accommodate specialized GDPs that serve particular legume species. To integrate data sets, LIS provides genome and map viewers, holds synteny mappings among all sequenced legume species and provides a set of gene families to allow traversal among orthologous and paralogous sequences across the legumes. To better accommodate other specialized GDPs, LIS uses open-source GMOD components where possible, and advocates use of common data templates, formats, schemas and interfaces so that data collected by one legume research community are accessible across all legume GDPs, through similar interfaces and using common APIs. This federated model for the legumes is managed as part of the 'Legume Federation' project (accessible via http://legumefederation.org), which can be thought of as an umbrella project encompassing LIS and other legume GDPs.


Subjects
Databases, Genetic ; Fabaceae/genetics ; Fabaceae/classification ; Genome, Plant ; Genomics ; Internet ; Multigene Family ; Phylogeny ; Plant Proteins/chemistry ; Plant Proteins/genetics ; Protein Structure, Tertiary ; Quantitative Trait Loci ; Synteny
8.
Nucleic Acids Res ; 44(D1): D1195-201, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26432828

ABSTRACT

MaizeGDB is a highly curated, community-oriented database and informatics service to researchers focused on the crop plant and model organism Zea mays ssp. mays. Although some form of the maize community database has existed over the last 25 years, there have only been two major releases. In 1991, the original maize genetics database MaizeDB was created. In 2003, the combined contents of MaizeDB and the sequence data from ZmDB were made accessible as a single resource named MaizeGDB. Over the next decade, MaizeGDB became more sequence driven while still maintaining traditional maize genetics datasets. This enabled the project to meet the continued growing and evolving needs of the maize research community, yet the interface and underlying infrastructure remained unchanged. In 2015, the MaizeGDB team completed a multi-year effort to update the MaizeGDB resource by reorganizing existing data, upgrading hardware and infrastructure, creating new tools, incorporating new data types (including diversity data, expression data, gene models, and metabolic pathways), and developing and deploying a modern interface. In addition to coordinating a data resource, the MaizeGDB team coordinates activities and provides technical support to the maize research community. MaizeGDB is accessible online at http://www.maizegdb.org.


Subjects
Databases, Genetic ; Zea mays/genetics ; Gene Expression ; Genes, Plant ; Genetic Variation ; Genome, Plant ; Metabolic Networks and Pathways ; Models, Genetic ; Software ; User-Computer Interface ; Zea mays/metabolism
9.
Chromosoma ; 122(1-2): 67-75, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23223973

ABSTRACT

Knobs are conspicuous heterochromatic regions found on the chromosomes of maize and its relatives. The number, locations, and sizes of knobs vary dramatically, with most lines containing between four and eight knobs in mid-arm positions. Prior data suggest that some knobs may reduce recombination. However, comprehensive tests have not been carried out, primarily because most knobs have not been placed on the genetic map. We used fluorescent in situ hybridization and two recombinant inbred populations to map seven knobs and to accurately place three knobs from the B73 inbred on the genomic sequence assembly. The data show that knobs lie in gene-dense regions of the maize genome. Comparisons to 23 other recombinant inbred populations segregating for knobs at the same sites confirm that large knobs can locally reduce crossing over by as much as twofold on a cM/Mb scale. These effects do not extend beyond regions ~10 cM to either side of knobs and do not appear to affect linkage disequilibrium among genes within and near knob repeat regions of the B73 RefGen_v2 assembly.
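The cM/Mb scale mentioned above is a ratio of genetic distance to physical distance. A minimal sketch of that arithmetic follows; the interval sizes are invented for illustration and are not data from the study, and both function names are hypothetical.

```python
def recombination_rate(genetic_cm, physical_bp):
    """Return the recombination rate in cM per Mb for one interval."""
    return genetic_cm / (physical_bp / 1_000_000)

def fold_reduction(rate_without_knob, rate_with_knob):
    """How many times lower the rate is when a knob is present."""
    return rate_without_knob / rate_with_knob

# Hypothetical 10 Mb interval measured in two populations.
baseline = recombination_rate(genetic_cm=10.0, physical_bp=10_000_000)
near_knob = recombination_rate(genetic_cm=5.0, physical_bp=10_000_000)
reduction = fold_reduction(baseline, near_knob)
```

With these made-up numbers the interval recombines at 1.0 cM/Mb without the knob and 0.5 cM/Mb with it, which is what a twofold local reduction would look like.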


Subjects
Chromosomes, Plant/genetics ; Heterochromatin/genetics ; Recombination, Genetic ; DNA, Plant ; In Situ Hybridization, Fluorescence ; Zea mays
10.
Genetics ; 227(1), 2024 05 07.
Article in English | MEDLINE | ID: mdl-38577974

ABSTRACT

Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genomes within a clade, are more useful than single reference genomes for studying species diversity. This is especially true for a species like Zea mays, which has a particularly diverse and complex genome. Presenting pan-genome data, analyses, and visualization is challenging, especially for a diverse species, but more so when pan-genomic data is linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures as is the case at MaizeGDB. Here, we describe MaizeGDB's expansion to include the genic subset of the Zea pan-genome in a pan-gene data center featuring the maize genomes hosted at MaizeGDB, and the outgroup teosinte Zea genomes from the Pan-Andropogoneae project. The new data center offers a variety of browsing and visualization tools, including sequence alignment visualization, gene trees and other tools, to explore pan-genes in Zea that were calculated by the pipeline Pandagma. Combined, these data will help maize researchers study the complexity and diversity of Zea and use the comparative functions to validate pan-gene relationships for a selected gene model.


Subjects
Databases, Genetic ; Genome, Plant ; Genomics ; Zea mays ; Zea mays/genetics ; Genomics/methods ; Phylogeny
11.
Genetics ; 224(1), 2023 05 04.
Article in English | MEDLINE | ID: mdl-36755109

ABSTRACT

Protein structures play an important role in bioinformatics, such as in predicting gene function or validating gene model annotation. However, determining protein structure was, until now, costly and time-consuming, which resulted in a structural biology bottleneck. With the release of programs such as AlphaFold and ESMFold, this bottleneck has been reduced by several orders of magnitude, permitting protein structural comparisons of entire genomes within reasonable timeframes. MaizeGDB has leveraged this technological breakthrough by offering several new tools to accelerate protein structural comparisons between maize and other plants as well as human and yeast outgroups. MaizeGDB also offers bulk downloads of these comparative protein structure data, along with predicted functional annotation information. In this way, MaizeGDB is poised to assist maize researchers in assessing functional homology, gene model annotation quality, and other information unavailable to maize scientists even a few years ago.


Subjects
User-Computer Interface ; Zea mays ; Humans ; Zea mays/genetics ; Zea mays/metabolism ; Databases, Genetic ; Computational Biology/methods ; Genome, Plant ; Molecular Sequence Annotation ; Genomics/methods
12.
G3 (Bethesda) ; 12(1), 2022 01 04.
Article in English | MEDLINE | ID: mdl-34751378

ABSTRACT

The fatty acid composition of seed oil is a major determinant of the flavor, shelf-life, and nutritional quality of peanuts. Major QTLs controlling high oil content, high oleic content, and low linoleic content have been characterized in several seed oil crop species. Here, we employ genome-wide association approaches on a recently genotyped collection of 787 plant introduction accessions in the USDA peanut core collection, plus selected improved cultivars, to discover markers associated with the natural variation in fatty acid composition, and to explain the genetic control of fatty acid composition in seed oils. Overall, 251 single nucleotide polymorphisms (SNPs) had significant trait associations with the measured fatty acid components. Twelve SNPs were associated with two or three different traits. Of these loci with apparent pleiotropic effects, 10 were associated with both oleic (C18:1) and linoleic acid (C18:2) content at different positions in the genome. In all 10 cases, the favorable allele had opposite effects, increasing the concentration of oleic acid and lowering that of linoleic acid. The other traits with pleiotropic variant control were palmitic (C16:0), behenic (C22:0), lignoceric (C24:0), gadoleic (C20:1), total saturated, and total unsaturated fatty acid content. One hundred (100) of the significantly associated SNPs were located within 1000 kbp of 55 genes with fatty acid biosynthesis functional annotations. These genes encoded, among others, ACCase carboxyl transferase subunits and several fatty acid synthase II enzymes. With the exception of gadoleic (C20:1) and lignoceric (C24:0) acid content, which occur at relatively low abundance in cultivated peanuts, all traits had significant SNP interactions exceeding a stringent Bonferroni threshold (α = 1%). We detected 7682 pairwise SNP interactions affecting the relative abundance of fatty acid components in the seed oil. Of these, 627 SNP pairs had at least one SNP within 1000 kbp of a gene with fatty acid biosynthesis functional annotation. We evaluated 168 candidate genes underlying these SNP interactions. Functional enrichment and protein-to-protein interactions supported significant interactions (P-value < 1.0E-16) among the genes evaluated. These results show the complex nature of the biology and genes underlying the variation in seed oil fatty acid composition and contribute to an improved genotype-to-phenotype map for fatty acid variation in peanut seed oil.
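To make the Bonferroni threshold for pairwise interactions concrete, the sketch below divides a family-wise α by the number of SNP pairs tested. The SNP count of 1000 is a placeholder and not a figure from the study, and the function names are hypothetical; only α = 1% comes from the abstract.

```python
import math

def pairwise_tests(n_snps):
    """Number of distinct SNP pairs among n_snps markers."""
    return math.comb(n_snps, 2)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def is_significant(p_value, threshold):
    """True when a single test passes the corrected threshold."""
    return p_value < threshold

# Placeholder marker count; the abstract does not state how many SNPs were paired.
n_tests = pairwise_tests(1000)
threshold = bonferroni_threshold(0.01, n_tests)
```

Because the number of pairs grows quadratically with the number of markers, the per-test threshold becomes very small even for modest marker sets, which is why such a correction is described as stringent.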


Subjects
Arachis ; Fatty Acids ; Arachis/genetics ; Fatty Acids/genetics ; Genome-Wide Association Study ; Polymorphism, Single Nucleotide ; Quantitative Trait Loci ; Seeds/genetics
13.
Science ; 373(6555): 655-662, 2021 08 06.
Article in English | MEDLINE | ID: mdl-34353948

ABSTRACT

We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The number of pan-genes in these diverse genomes exceeds 103,000, with approximately a third found across all genotypes. The results demonstrate that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres revealed additional variation in major cytological landmarks. We show that combining structural variation with single-nucleotide polymorphisms can improve the power of quantitative mapping studies. We also document variation at the level of DNA methylation and demonstrate that unmethylated regions are enriched for cis-regulatory elements that contribute to phenotypic variation.


Subjects
Genome, Plant ; Molecular Sequence Annotation ; Zea mays/genetics ; Centromere/genetics ; Chromosome Mapping ; Chromosomes, Plant ; DNA Methylation ; Disease Resistance/genetics ; Genes, Plant ; Genetic Variation ; Genotype ; High-Throughput Nucleotide Sequencing ; Multifactorial Inheritance/genetics ; Phenotype ; Plant Diseases ; Polymorphism, Single Nucleotide ; Regulatory Sequences, Nucleic Acid ; Sequence Analysis, DNA ; Tetraploidy ; Transcriptome ; Whole Genome Sequencing
14.
Front Plant Sci ; 11: 592730, 2020.
Article in English | MEDLINE | ID: mdl-33193550

ABSTRACT

MaizeMine is the data mining resource of the Maize Genetics and Genomics Database (MaizeGDB; http://maizemine.maizegdb.org). It enables researchers to create and export customized annotation datasets that can be merged with their own research data for use in downstream analyses. MaizeMine uses the InterMine data warehousing system to integrate genomic sequences and gene annotations from the Zea mays B73 RefGen_v3 and B73 RefGen_v4 genome assemblies, Gene Ontology annotations, single nucleotide polymorphisms, protein annotations, homologs, pathways, and precomputed gene expression levels based on RNA-seq data from the Z. mays B73 Gene Expression Atlas. MaizeMine also provides database cross references between genes of alternative gene sets from Gramene and NCBI RefSeq. MaizeMine includes several search tools, including a keyword search, built-in template queries with intuitive search menus, and a QueryBuilder tool for creating custom queries. The Genomic Regions search tool executes queries based on lists of genome coordinates, and supports both the B73 RefGen_v3 and B73 RefGen_v4 assemblies. The List tool allows you to upload identifiers to create custom lists, perform set operations such as unions and intersections, and execute template queries with lists. When used with gene identifiers, the List tool automatically provides gene set enrichment for Gene Ontology (GO) and pathways, with a choice of statistical parameters and background gene sets. With the ability to save query outputs as lists that can be input to new queries, MaizeMine provides limitless possibilities for data integration and meta-analysis.
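The list set operations mentioned above behave like ordinary set algebra. The sketch below uses plain Python sets rather than the MaizeMine interface or any InterMine client, and the gene identifiers are invented placeholders.

```python
# Two hypothetical gene lists, e.g. saved outputs of two separate queries.
drought_genes = {"gene_1", "gene_2", "gene_3"}
herbivory_genes = {"gene_2", "gene_3", "gene_4"}

# Union: genes appearing in either list.
either = drought_genes | herbivory_genes

# Intersection: genes shared by both lists.
shared = drought_genes & herbivory_genes

# Difference: genes found only in the first list.
drought_only = drought_genes - herbivory_genes
```

The result of any of these operations is itself a list of identifiers, so it could in principle be fed into a further query, which is the chaining the abstract describes.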

15.
G3 (Bethesda) ; 10(11): 4013-4026, 2020 11 05.
Article in English | MEDLINE | ID: mdl-32887672

ABSTRACT

Cultivated peanut (Arachis hypogaea) is an important oil, food, and feed crop worldwide. The USDA peanut germplasm collection currently contains 8,982 accessions. In the 1990s, 812 accessions were selected as a core collection on the basis of phenotype and country of origin. The present study reports genotyping results for the entire available core collection. Each accession was genotyped with the Arachis_Axiom2 SNP array, yielding 14,430 high-quality, informative SNPs across the collection. Additionally, a subset of 253 accessions was replicated, using between two and five seeds per accession, to assess heterogeneity within these accessions. The genotypic diversity of the core is mostly captured in five genotypic clusters, which have some correspondence with botanical variety and market type. There is little genetic clustering by country of origin, reflecting peanut's rapid global dispersion in the 18th and 19th centuries. A genetic cluster associated with the hypogaea/aequatoriana/peruviana varieties, with accessions coming primarily from Bolivia, Peru, and Ecuador, is consistent with these having been the earliest landraces. The genetics, phenotypic characteristics, and biogeography are all consistent with previous reports of tetraploid peanut originating in Southeast Bolivia. Analysis of the genotype data indicates an early genetic radiation, followed by regional distribution of major genetic classes through South America, and then a global dissemination that retains much of the early genetic diversity in peanut. Comparison of the genotypic data relative to alleles from the diploid progenitors also indicates that subgenome exchanges, both large and small, have been major contributors to the genetic diversity in peanut.


Subjects
Arachis ; Genetic Variation ; Alleles ; Arachis/genetics ; Genotype ; Phylogeny
16.
Nat Genet ; 51(5): 877-884, 2019 05.
Article in English | MEDLINE | ID: mdl-31043755

ABSTRACT

Like many other crops, the cultivated peanut (Arachis hypogaea L.) is of hybrid origin and has a polyploid genome that contains essentially complete sets of chromosomes from two ancestral species. Here we report the genome sequence of peanut and show that after its polyploid origin, the genome has evolved through mobile-element activity, deletions and by the flow of genetic information between corresponding ancestral chromosomes (that is, homeologous recombination). Uniformity of patterns of homeologous recombination at the ends of chromosomes favors a single origin for cultivated peanut and its wild counterpart A. monticola. However, through much of the genome, homeologous recombination has created diversity. Using new polyploid hybrids made from the ancestral species, we show how this can generate phenotypic changes such as spontaneous changes in the color of the flowers. We suggest that diversity generated by these genetic mechanisms helped to favor the domestication of the polyploid A. hypogaea over other diploid Arachis species cultivated by humans.


Subjects
Arachis/genetics ; Arachis/classification ; Argentina ; Chromosomes, Plant/genetics ; Crops, Agricultural/genetics ; DNA Methylation ; DNA, Plant/genetics ; Domestication ; Evolution, Molecular ; Gene Expression Regulation, Plant ; Genetic Variation ; Genome, Plant ; Hybridization, Genetic ; Phenotype ; Polyploidy ; Recombination, Genetic ; Species Specificity ; Tetraploidy
17.
Front Genet ; 9: 454, 2018.
Article in English | MEDLINE | ID: mdl-30356760

ABSTRACT

The factors behind genome size evolution have been of great interest, considering that eukaryotic genomes vary in size by more than three orders of magnitude. Using a model of two wild peanut relatives, Arachis duranensis and Arachis ipaensis, in which one genome experienced large rearrangements, we find that the main determinant in genome size reduction is a set of inversions that occurred in A. duranensis, and subsequent net sequence removal in the inverted regions. We observe a general pattern in which sequence is lost more rapidly at newly distal (telomeric) regions than it is gained at newly proximal (pericentromeric) regions - resulting in net sequence loss in the inverted regions. The major driver of this process is recombination, determined by the chromosomal location. Any type of genomic rearrangement that exposes proximal regions to higher recombination rates can cause genome size reduction by this mechanism. In comparisons between A. duranensis and A. ipaensis, we find that the inversions all occurred in A. duranensis. Sequence loss in those regions was primarily due to removal of transposable elements. Illegitimate recombination is likely the major mechanism responsible for the sequence removal, rather than unequal intrastrand recombination. We also measure the relative rate of genome size reduction in these two Arachis diploids. We also test our model in other plant species and find that it applies in all cases examined, suggesting our model is widely applicable.

18.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30239679

ABSTRACT

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.


Subjects
Agriculture ; Databases, Genetic ; Genomics ; Breeding ; Gene Ontology ; Metadata ; Surveys and Questionnaires
19.
Nat Genet ; 50(9): 1282-1288, 2018 09.
Article in English | MEDLINE | ID: mdl-30061736

ABSTRACT

The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using short-read sequencing technologies. We show that significant structural heterogeneity exists in comparison to the B73 reference genome at multiple scales, from transposon composition and copy number variation to single-nucleotide polymorphisms. The generation of this reference genome enables accurate placement of thousands of Mutator (Mu) and Dissociation (Ds) transposable element insertions for reverse and forward genetics studies. Annotation of the genome has been achieved using RNA-seq analysis, differential nuclease sensitivity profiling and bisulfite sequencing to map open reading frames, open chromatin sites and DNA methylation profiles, respectively. Collectively, the resources developed here integrate W22 as a community reference genome for functional genomics and provide a foundation for the maize pan-genome.


Subjects
DNA Transposable Elements/genetics ; Genes, Plant/genetics ; Genome, Plant/genetics ; Zea mays/genetics ; Chromatin/genetics ; Chromosomes, Plant/genetics ; DNA Copy Number Variations/genetics ; DNA Methylation/genetics ; DNA, Plant/genetics ; Genomics/methods ; Open Reading Frames/genetics ; Sequence Analysis, DNA/methods
20.
Database (Oxford) ; 2017, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28605768

ABSTRACT

The Maize Genetics and Genomics Database (MaizeGDB) team prepared a survey to identify breeders' needs for visualizing pedigrees, diversity data and haplotypes in order to prioritize tool development and curation efforts at MaizeGDB. The survey was distributed to the maize research community on behalf of the Maize Genetics Executive Committee in Summer 2015. The survey garnered 48 responses from maize researchers, of which more than half were self-identified as breeders. The survey showed that the maize researchers considered their top priorities for visualization as: (i) displaying single nucleotide polymorphisms in a given region for a given list of lines, (ii) showing haplotypes for a given list of lines and (iii) presenting pedigree relationships visually. The survey also asked which populations would be most useful to display. The following two populations were on top of the list: (i) 3000 publicly available maize inbred lines used in Romay et al. (Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biol, 2013;14:R55) and (ii) maize lines with expired Plant Variety Protection Act (ex-PVP) certificates. Driven by this strong stakeholder input, MaizeGDB staff are currently working in four areas to improve its interface and web-based tools: (i) presenting immediate progenies of currently available stocks at the MaizeGDB Stock pages, (ii) displaying the most recent ex-PVP lines described in the Germplasm Resources Information Network (GRIN) on the MaizeGDB Stock pages, (iii) developing network views of pedigree relationships and (iv) visualizing genotypes from SNP-based diversity datasets. These survey results can help other biological databases to direct their efforts according to user preferences as they serve similar types of data sets for their communities. Database URL: https://www.maizegdb.org.


Subjects
Databases, Genetic ; Genetic Variation ; Haplotypes ; Molecular Sequence Annotation/methods ; User-Computer Interface ; Web Browser ; Zea mays/genetics ; Molecular Sequence Annotation/standards