Results 1 - 20 of 139
1.
bioRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38529488

ABSTRACT

The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.
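
As a rough illustration of the accuracy figure quoted in this abstract (this sketch is not part of the abstract itself): assuming the standard Phred convention, Q = -10 * log10(error rate), a quality of Q50 corresponds to one expected error per 100,000 bases, which is the 99.999% accuracy stated above. The function names are hypothetical and chosen only for this example.

```python
import math


def phred_to_error(q: float) -> float:
    """Per-base error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred quality score implied by a per-base accuracy in [0, 1)."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    # Q50 expressed as an error rate, and 99.999% accuracy expressed as a Q score.
    print(f"Q50 error rate: {phred_to_error(50):.0e}")
    print(f"99.999% accuracy: about Q{accuracy_to_phred(0.99999):.1f}")
```

The same conversion shows why each additional 10 Q points is a tenfold drop in error rate, for example Q40 versus Q50.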

2.
BMC Biol ; 22(1): 13, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273258

ABSTRACT

BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most widely used form of molecular genetic variation studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The genome analysis toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but unfortunately, high-performance computing versions of this tool have yet to become widely available and affordable. RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy with comparable results with previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~ 3000 resequenced rice accessions. The entire process took ~ 16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~ 2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq). CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improved the speed at which SNPs can be called. The workflow is widely applicable as demonstrated successfully for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a 25 multi-crop-reference genome data set produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and were found to reside within genes and open chromatin regions that are predicted to have functional consequences. 
Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture solution with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.


Subjects
Plant Genome, Single Nucleotide Polymorphism, Workflow, Plant Breeding, Software, High-Throughput Nucleotide Sequencing/methods
3.
Nucleic Acids Res ; 52(D1): D107-D114, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37992296

ABSTRACT

Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc) are EMBL-EBI's knowledgebases for gene and protein expression and localisation in bulk and at single cell level. These resources aim to allow users to investigate their expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell level anatomograms and cell type gene expression heatmaps.


Subjects
Genetic Databases, Gene Expression Profiling, Proteomics, Genotype, Metadata, Single-Cell Analysis, Internet, Humans, Animals
4.
Plant J ; 117(5): 1543-1557, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38100514

ABSTRACT

Mutant populations are crucial for functional genomics and discovering novel traits for crop breeding. Sorghum, a drought and heat-tolerant C4 species, requires a vast, large-scale, annotated, and sequenced mutant resource to enhance crop improvement through functional genomics research. Here, we report a sorghum large-scale sequenced mutant population with 9.5 million ethyl methanesulfonate (EMS)-induced mutations that covered 98% of sorghum's annotated genes using inbred line BTx623. Remarkably, a total of 610 320 mutations within the promoter and enhancer regions of 18 000 and 11 790 genes, respectively, can be leveraged for novel research of cis-regulatory elements. A comparison of the distribution of mutations in the large-scale mutant library and sorghum association panel (SAP) provides insights into the influence of selection. EMS-induced mutations appeared to be random across different regions of the genome without significant enrichment in different sections of a gene, including the 5' UTR, gene body, and 3' UTR. In contrast, there was low variation density in the coding and UTR regions in the SAP. Based on the Ka/Ks value, the mutant library (~1) experienced little selection, unlike the SAP (0.40), which has been strongly selected through breeding. All mutation data are publicly searchable through SorbMutDB (https://www.depts.ttu.edu/igcast/sorbmutdb.php) and SorghumBase (https://sorghumbase.org/). This large-scale sequence-indexed sorghum mutant population is a crucial resource that enriches the sorghum gene pool with novel diversity and a highly valuable tool for the Poaceae family that will advance plant biology research and crop breeding.
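
A minimal sketch of how the Ka/Ks values quoted in this abstract are usually read (this is an illustration, not the authors' method): Ka/Ks is the ratio of nonsynonymous to synonymous substitution rates, and a ratio near 1 is conventionally taken as little or no selection, while a ratio well below 1 suggests purifying selection. The function name and the 0.1 tolerance around 1 are assumptions made only for this example.

```python
def interpret_ka_ks(ratio: float, tolerance: float = 0.1) -> str:
    """Classify a Ka/Ks ratio using the conventional reading.

    `tolerance` (an arbitrary choice for this sketch) sets how close to 1
    a ratio must be to count as near-neutral.
    """
    if ratio < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if abs(ratio - 1.0) <= tolerance:
        return "near-neutral (little selection)"
    if ratio < 1.0:
        return "purifying selection"
    return "positive selection"


if __name__ == "__main__":
    # Values reported in the abstract: mutant library (~1) and SAP (0.40).
    for label, value in [("EMS mutant library", 1.0), ("SAP", 0.40)]:
        print(f"{label}: Ka/Ks = {value} -> {interpret_ka_ks(value)}")
```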


Subjects
Sorghum, Sorghum/genetics, Reverse Genetics, Plant Breeding, Mutation, Phenotype, Edible Grain/genetics, Ethyl Methanesulfonate/pharmacology, Plant Genome/genetics
5.
Front Plant Sci ; 14: 1237722, 2023.
Article in English | MEDLINE | ID: mdl-37965006

ABSTRACT

Metal homeostasis has evolved to tightly modulate the availability of metals within the cell, avoiding cytotoxic interactions due to excess and protein inactivity due to deficiency. Even in the presence of homeostatic processes, however, low bioavailability of these essential metal nutrients in soils can negatively impact crop health and yield. While research has largely focused on how plants assimilate metals, acclimation to metal-limited environments requires a suite of strategies that are not necessarily involved in metal transport across membranes. The identification of these mechanisms provides a new opportunity to improve metal-use efficiency and develop plant foodstuffs with increased concentrations of bioavailable metal nutrients. Here, we investigate the function of two distinct subfamilies of the nucleotide-dependent metallochaperones (NMCs), named ZNG1 and ZNG2, that are found in plants, using Arabidopsis thaliana as a reference organism. AtZNG1 (AT1G26520) is an ortholog of human and fungal ZNG1, and like its previously characterized eukaryotic relatives, localizes to the cytosol and physically interacts with methionine aminopeptidase type I (AtMAP1A). Analyses of AtZNG1, AtMAP1A, AtMAP2A, and AtMAP2B transgenic mutants are consistent with the role of Arabidopsis ZNG1 as a Zn transferase for AtMAP1A, as previously described in yeast and zebrafish. Structural modeling reveals a flexible cysteine-rich loop that we hypothesize enables direct transfer of Zn from AtZNG1 to AtMAP1A during GTP hydrolysis. Based on proteomics and transcriptomics, loss of this ancient and conserved mechanism has pleiotropic consequences impacting the expression of hundreds of genes, including those involved in photosynthesis and vesicle transport. Members of the plant-specific family of NMCs, ZNG2A1 (AT1G80480) and ZNG2A2 (AT1G15730), are also required during Zn deficiency, but their target protein(s) remain to be discovered. 
RNA-seq analyses reveal wide-ranging impacts across the cell when the genes encoding these plastid-localized NMCs are disrupted.

6.
Plant Biotechnol J ; 21(12): 2458-2472, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37530518

ABSTRACT

Numerous staple crops exhibit polyploidy and are difficult to genetically modify. However, recent advances in genome sequencing and editing have enabled polyploid genome engineering. The hexaploid black nightshade species Solanum nigrum has immense potential as a beneficial food supplement. We assembled its genome at the scaffold level. After functional annotations, we identified homoeologous gene sets, with similar sequence and expression profiles, based on comparative analyses of orthologous genes with close diploid relatives Solanum americanum and S. lycopersicum. Using CRISPR-Cas9-mediated mutagenesis, we generated various mutation combinations in homoeologous genes. Multiple mutants showed quantitative phenotypic changes based on the genotype, resulting in a broad-spectrum effect on the quantitative traits of hexaploid S. nigrum. Furthermore, we successfully improved the fruit productivity of Boranong, an orphan cultivar of S. nigrum, suggesting that engineering homoeologous genes could be useful for agricultural improvement of polyploid crops.


Subjects
Agricultural Crops, Polyploidy, Base Sequence, Chromosome Mapping/methods, Mutation, Phenotype, Agricultural Crops/genetics, Plant Genome/genetics, Gene Editing
7.
Hortic Res ; 10(5): uhad061, 2023 May.
Article in English | MEDLINE | ID: mdl-37213686

ABSTRACT

Grapevine is one of the most economically important crops worldwide. However, the previous versions of the grapevine reference genome typically consist of thousands of fragments with missing centromeres and telomeres, limiting the accessibility of the repetitive sequences, the centromeric and telomeric regions, and the study of inheritance of important agronomic traits in these regions. Here, we assembled a telomere-to-telomere (T2T) gap-free reference genome for the cultivar PN40024 using PacBio HiFi long reads. The T2T reference genome (PN_T2T) is 69 Mb longer with 9018 more genes identified than the 12X.v0 version. We annotated 67% repetitive sequences, 19 centromeres and 36 telomeres, and incorporated gene annotations of previous versions into the PN_T2T assembly. We detected a total of 377 gene clusters, which showed associations with complex traits, such as aroma and disease resistance. Even though PN40024 derives from nine generations of selfing, we still found nine genomic hotspots of heterozygous sites associated with biological processes, such as the oxidation-reduction process and protein phosphorylation. The fully annotated complete reference genome therefore constitutes an important resource for grapevine genetic studies and breeding programs.

8.
G3 (Bethesda) ; 13(5), 2023 05 02.
Article in English | MEDLINE | ID: mdl-36966465

ABSTRACT

The genome sequence of the diploid and highly homozygous Vitis vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements to the PN40024 genome assembly, its current version PN12X.v2 is quite fragmented and only represents the haploid state of the genome with mixed haplotypes. In fact, being nearly homozygous, this genome contains several heterozygous regions that are yet to be resolved. Taking the opportunity of improvements that long-read sequencing technologies offer to fully discriminate haplotype sequences, an improved version of the reference, called PN40024.v4, was generated. Through incorporating long genomic sequencing reads to the assembly, the continuity of the 12X.v2 scaffolds was highly increased with a total number decreasing from 2,059 to 640 and a reduction in N bases of 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation have also assisted in improving the annotation, while defining the most reliable estimation of 35,230 genes to date. Finally, we demonstrated that PN40024 resulted from 9 selfings of cv. "Helfensteiner" (cross of cv. "Pinot noir" and "Schiava grossa") instead of a single "Pinot noir". These advances will help maintain the PN40024 genome as a gold-standard reference, also contributing toward the eventual elaboration of the grapevine pangenome.


Subjects
Plant Genome, Vitis, Genotype, Chromosome Mapping, Base Sequence, Molecular Sequence Annotation, Vitis/genetics
9.
Nat Commun ; 14(1): 1567, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36944612

ABSTRACT

Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Here, we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (Oryza sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.
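
The fold-change statement in this abstract can be unpacked with simple arithmetic. The sketch below (an illustration only, using just the two figures quoted above) derives the range of earlier estimates implied by "~700 inversions per million years" being "16 to 50 times higher"; the function name is hypothetical, and the result is an inference from the abstract's numbers, not a value reported by the authors.

```python
def implied_baseline(rate: float, fold_low: float, fold_high: float) -> tuple[float, float]:
    """Return the (lowest, highest) earlier rate implied by a fold-change range.

    If `rate` is fold_low to fold_high times an earlier estimate, that
    earlier estimate lies between rate / fold_high and rate / fold_low.
    """
    if fold_low <= 0 or fold_high < fold_low:
        raise ValueError("expected 0 < fold_low <= fold_high")
    return rate / fold_high, rate / fold_low


if __name__ == "__main__":
    low, high = implied_baseline(700, 16, 50)
    print(f"Implied earlier estimates: {low:g} to {high:g} inversions per million years")
```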


Subjects
Oryza, Oryza/genetics, DNA Sequence Analysis, Plant Genome/genetics, Biological Evolution, Phylogeny
10.
Curr Opin Biotechnol ; 79: 102886, 2023 02.
Article in English | MEDLINE | ID: mdl-36640454

ABSTRACT

Whole-genome sequencing and assembly have revolutionized plant genetics and molecular biology over the last two decades. However, significant shortcomings in first- and second-generation technology resulted in imperfect reference genomes: numerous and large gaps of low quality or undeterminable sequence in areas of highly repetitive DNA along with limited chromosomal phasing restricted the ability of researchers to characterize regulatory noncoding elements and genic regions that underwent recent duplication events. Recently, advances in long-read sequencing have resulted in the first gapless, telomere-to-telomere (T2T) assemblies of plant genomes. This leap forward has the potential to increase the speed and confidence of genomics and molecular experimentation while reducing costs for the research community.


Subjects
Genomics, Plant Breeding, DNA Sequence Analysis/methods, Genomics/methods, Plant Genome/genetics, Plants/genetics, High-Throughput Nucleotide Sequencing/methods, Technology
11.
Plant Physiol ; 191(1): 35-46, 2023 01 02.
Article in English | MEDLINE | ID: mdl-36200899

ABSTRACT

We review how a data infrastructure for the Plant Cell Atlas might be built using existing infrastructure and platforms. The Human Cell Atlas has developed an extensive infrastructure for human and mouse single cell data, while the European Bioinformatics Institute has developed a Single Cell Expression Atlas that currently houses several plant data sets. We discuss issues related to appropriate ontologies for describing a plant single cell experiment. We imagine how such an infrastructure will enable biologists and data scientists to glean new insights into plant biology in the coming decades, as long as such data are made accessible to the community in an open manner.


Subjects
Computational Biology, Plant Cells, Animals, Humans, Mice, Plants/genetics
12.
Int J Mol Sci ; 23(15), 2022 Aug 04.
Article in English | MEDLINE | ID: mdl-35955798

ABSTRACT

In plants, vegetative and reproductive development are associated with agronomically important traits that contribute to grain yield and biomass. Zinc finger homeodomain (ZF-HD) transcription factors (TFs) constitute a relatively small gene family that has been studied in several model plants, including Arabidopsis thaliana L. and Oryza sativa L. The ZF-HD family members play important roles in plant growth and development, but their contribution to the regulation of plant architecture remains largely unknown due to their functional redundancy. To understand the gene regulatory network controlled by ZF-HD TFs, we analyzed multiple loss-of-function mutants of ZF-HD TFs in Arabidopsis that exhibited morphological abnormalities in branching and flowering architecture. We found that ZF-HD TFs, especially HB34, negatively regulate the expression of miR157 and positively regulate SQUAMOSA PROMOTER BINDING-LIKE 10 (SPL10), a target of miR157. Genome-wide chromatin immunoprecipitation sequencing (ChIP-Seq) analysis revealed that miR157D and SPL10 are direct targets of HB34, creating a feed-forward loop that constitutes a robust miRNA regulatory module. Network motif analysis contains overrepresented coherent type IV feedforward motifs in the amiR zf-HD and hbq mutant background. This finding indicates that miRNA-mediated ZF-HD feedforward modules modify branching and inflorescence architecture in Arabidopsis. Taken together, these findings reveal a guiding role of ZF-HD TFs in the regulatory network module and demonstrate its role in plant architecture in Arabidopsis.


Subjects
Arabidopsis Proteins, Arabidopsis, MicroRNAs, Arabidopsis/metabolism, Arabidopsis Proteins/genetics, Arabidopsis Proteins/metabolism, Plant Gene Expression Regulation, MicroRNAs/genetics, MicroRNAs/metabolism, Plants/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism, Zinc Fingers
13.
Plant Direct ; 6(5): e393, 2022 May.
Article in English | MEDLINE | ID: mdl-35600998

ABSTRACT

Efficient acquisition and use of available phosphorus from the soil is crucial for plant growth, development, and yield. With an ever-increasing acreage of croplands with suboptimal available soil phosphorus, genetic improvement of sorghum germplasm for enhanced phosphorus acquisition from soil is crucial to increasing agricultural output and reducing inputs, while confronted with a growing world population and uncertain climate. Sorghum bicolor is a globally important commodity for food, fodder, and forage. Known for robust tolerance to heat, drought, and other abiotic stresses, its capacity for optimal phosphorus use efficiency (PUE) is still being investigated for optimized root system architectures (RSA). Whilst a few RSA-influencing genes have been identified in sorghum and other grasses, the epigenetic impact on expression and tissue-specific activation of candidate PUE genes remains elusive. Here, we present transcriptomic, epigenetic, and regulatory network profiling of RSA modulation in the BTx623 sorghum background in response to limiting phosphorus (LP) conditions. We show that during LP, sorghum RSA is remodeled to increase root length and surface area, likely enhancing its ability to acquire P. Global DNA 5-methylcytosine and H3K4 and H3K27 trimethylation levels decrease in response to LP, while H3K4me3 peaks and DNA hypomethylated regions contain recognition motifs of numerous developmental and nutrient responsive transcription factors that display disparate expression patterns between different root tissues (primary root apex, elongation zone, and lateral root apex).

14.
Genome Biol ; 23(1): 101, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35440059

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) aim to correlate phenotypic changes with genotypic variation. Upon transcription, single nucleotide variants (SNVs) may alter mRNA structure, with potential impacts on transcript stability, macromolecular interactions, and translation. However, plant genomes have not been assessed for the presence of these structure-altering polymorphisms or "riboSNitches." RESULTS: We experimentally demonstrate the presence of riboSNitches in transcripts of two Arabidopsis genes, ZINC RIBBON 3 (ZR3) and COTTON GOLGI-RELATED 3 (CGR3), which are associated with continentality and temperature variation in the natural environment. These riboSNitches are also associated with differences in the abundance of their respective transcripts, implying a role in regulating the gene's expression in adaptation to local climate conditions. We then computationally predict riboSNitches transcriptome-wide in mRNAs of 879 naturally inbred Arabidopsis accessions. We characterize correlations between SNPs/riboSNitches in these accessions and 434 climate descriptors of their local environments, suggesting a role of these variants in local adaptation. We integrate this information in CLIMtools V2.0 and provide a new web resource, T-CLIM, that reveals associations between transcript abundance variation and local environmental variation. CONCLUSION: We functionally validate two plant riboSNitches and, for the first time, demonstrate riboSNitch conditionality dependent on temperature, coining the term "conditional riboSNitch." We provide the first pan-genome-wide prediction of riboSNitches in plants. We expand our previous CLIMtools web resource with riboSNitch information and with 1868 additional Arabidopsis genomes and 269 additional climate conditions, which will greatly facilitate in silico studies of natural genetic variation, its phenotypic consequences, and its role in local adaptation.


Subjects
Arabidopsis, Arabidopsis/genetics, Climate, Plant Genome, Genome-Wide Association Study, Single Nucleotide Polymorphism, Messenger RNA
16.
Methods Mol Biol ; 2443: 101-131, 2022.
Article in English | MEDLINE | ID: mdl-35037202

ABSTRACT

Gramene is an integrated bioinformatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for over 90 plant genomes including agronomically important cereals (e.g., maize, sorghum, wheat, teff), fruits and vegetables (e.g., apple, watermelon, clementine, tomato, cassava), specialty crops (e.g., coffee, olive tree, pistachio, almond), and plants of special or emerging interest (e.g., cotton, tobacco, cannabis, or hemp). For some species, the resource includes multiple varieties of the same species, which has paved the road for the creation of species-specific pan-genome browsers. The resource also features plant research models, including Arabidopsis and C4 warm-season grasses and brassicas, as well as other species that fill phylogenetic gaps for plant evolution studies. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end-users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) explore Gramene's search results, (2) explore gene-centric comparative genomics data visualizations in Gramene, and (3) explore genetic variation associated with a gene locus. This is the first publication describing in detail Gramene's integrated search interface-intended to provide a simplified entry portal for the resource's main data categories (genomic location, phylogeny, gene expression, pathways, and external references) to the most complete and up-to-date set of plant genome and pathway annotations.


Subjects
Genetic Databases, Plant Genome, Agricultural Crops/genetics, Genomics/methods, Phylogeny
17.
Methods Mol Biol ; 2443: 197-209, 2022.
Article in English | MEDLINE | ID: mdl-35037207

ABSTRACT

SciApps is an open-source, web-based platform for processing, storing, visualizing, and distributing genomic data and analysis results. Built upon the Tapis (formerly Agave) platform, SciApps brings users TB-scale data storage via CyVerse Data Store and over one million CPUs via the Extreme Science and Engineering Discovery Environment (XSEDE) resources at Texas Advanced Computing Center (TACC). SciApps provides users ways to chain individual jobs into automated and reproducible workflows in a distributed cloud and provides a management system for data, associated metadata, individual analysis jobs, and multi-step workflows. This chapter provides examples of (1) submitting, managing, and constructing workflows, (2) using public workflows for Bulked Segregant Analysis (BSA), and (3) constructing a Data Analysis Center (DAC) and Data Coordination Center (DCC) for the plant ENCODE project.


Subjects
Genomics, Software, Computational Biology, Plant Genome, Genomics/methods, Information Storage and Retrieval, Workflow
18.
Planta ; 255(2): 35, 2022 Jan 11.
Article in English | MEDLINE | ID: mdl-35015132

ABSTRACT

MAIN CONCLUSION: SorghumBase provides a community portal that integrates genetic, genomic, and breeding resources for sorghum germplasm improvement. Public research and development in agriculture rely on proper data and resource sharing within stakeholder communities. For plant breeders, agronomists, molecular biologists, geneticists, and bioinformaticians, centralizing desirable data into a user-friendly hub for crop systems is essential for successful collaborations and breakthroughs in germplasm development. Here, we present the SorghumBase web portal ( https://www.sorghumbase.org ), a resource for the sorghum research community. SorghumBase hosts a wide range of sorghum genomic information in a modular framework, built with open-source software, to provide a sustainable platform. This initial release of SorghumBase includes: (1) five sorghum reference genome assemblies in a pan-genome browser; (2) genetic variant information for natural diversity panels and ethyl methanesulfonate (EMS)-induced mutant populations; (3) search interface and integrated views of various data types; (4) links supporting interconnectivity with other repositories including genebank, QTL, and gene expression databases; and (5) a content management system to support access to community news and training materials. SorghumBase offers sorghum investigators improved data collation and access that will facilitate the growth of a robust research community to support genomics-assisted breeding.


Subjects
Sorghum, Genetic Databases, Edible Grain, Plant Genome/genetics, Genomics, Internet, Plant Breeding, Sorghum/genetics
19.
Nucleic Acids Res ; 50(D1): D129-D140, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850121

ABSTRACT

The EMBL-EBI Expression Atlas is an added value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy to visualise form, after expert curation to accurately represent the intended experimental design, re-analysed via standardised pipelines that rely on open-source community developed tools. Each study's metadata are annotated using ontologies. The data are re-analyzed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and it is available at https://www.ebi.ac.uk/gxa.


Subjects
Genetic Databases, Proteins/genetics, Proteomics, Software, Computational Biology, Gene Expression Profiling, Humans, Proteins/chemistry, RNA-Seq, RNA Sequence Analysis, Single-Cell Analysis
20.
Front Plant Sci ; 13: 1040909, 2022.
Article in English | MEDLINE | ID: mdl-36684744

ABSTRACT

Introduction: Sorghum (Sorghum bicolor (L.) Moench) is an agriculturally and economically important staple crop that has immense potential as a bioenergy feedstock due to its relatively high productivity on marginal lands. To capitalize on and further improve sorghum as a potential source of sustainable biofuel, it is essential to understand the genomic mechanisms underlying complex traits related to yield, composition, and environmental adaptations. Methods: Expanding on a recently developed mapping population, we generated de novo genome assemblies for 10 parental genotypes from this population and identified a comprehensive set of over 24 thousand large structural variants (SVs) and over 10.5 million single nucleotide polymorphisms (SNPs). Results: We show that SVs and nonsynonymous SNPs are enriched in different gene categories, emphasizing the need for long read sequencing in crop species to identify novel variation. Furthermore, we highlight SVs and SNPs occurring in genes and pathways with known associations to critical bioenergy-related phenotypes and characterize the landscape of genetic differences between sweet and cellulosic genotypes. Discussion: These resources can be integrated into both ongoing and future mapping and trait discovery for sorghum and its myriad uses including food, feed, bioenergy, and increasingly as a carbon dioxide removal mechanism.
