1.
BMC Biol ; 22(1): 13, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273258

ABSTRACT

BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most widely used form of molecular genetic variation studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The genome analysis toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but unfortunately, high-performance computing versions of this tool have yet to become widely available and affordable. RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy with comparable results with previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~3000 resequenced rice accessions. The entire process took ~16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq). CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improved the speed at which SNPs can be called. The workflow is widely applicable as demonstrated successfully for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a 25 multi-crop-reference genome data set produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and were found to reside within genes and open chromatin regions that are predicted to have functional consequences. Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture solution with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.
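
The abstract above does not say how HPC-GVCW distributes work across compute nodes. One common way to parallelize variant calling, shown here purely as an illustrative sketch and not as the authors' method, is to split the reference genome into fixed-size intervals that can be processed independently. The function name `chunk_genome`, the toy chromosome lengths, and the chunk size are all hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), 1-based and inclusive
Interval = Tuple[str, int, int]


def chunk_genome(chrom_lengths: Dict[str, int], chunk_size: int) -> List[Interval]:
    """Split each chromosome into consecutive intervals of at most chunk_size bases.

    Hypothetical helper for illustration; not part of HPC-GVCW or GATK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    intervals: List[Interval] = []
    for chrom, length in chrom_lengths.items():
        start = 1
        while start <= length:
            # The last interval on a chromosome may be shorter than chunk_size.
            end = min(start + chunk_size - 1, length)
            intervals.append((chrom, start, end))
            start = end + 1
    return intervals


if __name__ == "__main__":
    # Toy lengths, far smaller than any real crop genome.
    toy_genome = {"chr1": 10, "chr2": 4}
    for chrom, start, end in chunk_genome(toy_genome, 4):
        print(f"{chrom}:{start}-{end}")
```

Each interval could then be handed to a separate job, with the per-interval results merged afterwards. The right chunk size depends on the scheduler and hardware available.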


Subject(s)
Genome, Plant , Polymorphism, Single Nucleotide , Workflow , Plant Breeding , Software , High-Throughput Nucleotide Sequencing/methods
2.
J Proteome Res ; 23(7): 2518-2531, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38810119

ABSTRACT

Phosphorylation is the most studied post-translational modification, and has multiple biological functions. In this study, we have reanalyzed publicly available mass spectrometry proteomics data sets enriched for phosphopeptides from Asian rice (Oryza sativa). In total we identified 15,565 phosphosites on serine, threonine, and tyrosine residues on rice proteins. We identified sequence motifs for phosphosites and linked motifs to enrichment of different biological processes, indicating different downstream regulation likely caused by different kinase groups. We cross-referenced phosphosites against the rice 3,000 genomes to identify single amino acid variations (SAAVs) within or proximal to phosphosites that could cause loss of a site in a given rice variety, and clustered the data to identify groups of sites with similar patterns across rice family groups. The data has been loaded into UniProt Knowledge-Base, enabling researchers to visualize sites alongside other data on rice proteins (e.g., structural models from AlphaFold2), and into PeptideAtlas and the PRIDE database, enabling visualization of source evidence, including scores and supporting mass spectra.
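
As a minimal sketch of the cross-referencing idea described above, the snippet below checks whether a single amino acid variation falls directly on a phosphosite and replaces the serine, threonine, or tyrosine with a residue that cannot be phosphorylated. It covers only the "within" case, not variants proximal to a site, and it is not the authors' pipeline. The function name `saav_removes_site` and the toy sequence are hypothetical.

```python
# Residues that can carry the phosphosites discussed in the abstract.
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def saav_removes_site(protein: str, site_pos: int, var_pos: int, alt_aa: str) -> bool:
    """Return True if a substitution at var_pos destroys the phosphosite at site_pos.

    Positions are 1-based. Hypothetical helper for illustration only.
    """
    if not (1 <= site_pos <= len(protein)) or not (1 <= var_pos <= len(protein)):
        raise ValueError("positions must lie within the protein sequence")
    if protein[site_pos - 1] not in PHOSPHO_RESIDUES:
        raise ValueError("site_pos does not point at S, T, or Y")
    # The site is lost only when the variant hits the site itself and the
    # new residue is not phosphorylatable.
    return var_pos == site_pos and alt_aa not in PHOSPHO_RESIDUES


if __name__ == "__main__":
    toy_protein = "MKSPTAYR"  # made-up sequence: S at 3, T at 5, Y at 7
    print(saav_removes_site(toy_protein, 3, 3, "A"))  # S3A on the site itself
    print(saav_removes_site(toy_protein, 3, 3, "T"))  # S3T keeps a phosphorylatable residue
    print(saav_removes_site(toy_protein, 3, 4, "A"))  # neighbouring position
```

A variant next to the site can still disrupt a kinase recognition motif, so a fuller treatment would also score changes in the flanking residues.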


Subject(s)
Genome, Plant , Oryza , Phosphoproteins , Plant Proteins , Proteomics , Signal Transduction , Oryza/genetics , Oryza/metabolism , Oryza/chemistry , Proteomics/methods , Phosphoproteins/metabolism , Phosphoproteins/genetics , Phosphoproteins/chemistry , Phosphoproteins/analysis , Plant Proteins/genetics , Plant Proteins/metabolism , Phosphorylation , Protein Processing, Post-Translational , Phosphopeptides/metabolism , Phosphopeptides/analysis , Databases, Protein , Amino Acid Motifs , Mass Spectrometry
3.
Plant Physiol ; 193(4): 2381-2397, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37665979

ABSTRACT

Developing drought-resistant rice (Oryza sativa L.) is essential for improving field productivity, especially in rain-fed areas affected by climate change. Wild relatives of rice are potential sources for drought-resistant traits. Therefore, we compared root growth and drought response among 22 wild Oryza species, from which Oryza glumaepatula was selected as a promising source for further exploration. A geographically diverse panel of 69 O. glumaepatula accessions was then screened for drought stress-related traits, and 6 of these accessions showed lower shoot dry weight (SDW) reduction, greater percentage of deep roots, and lower stomatal density (STO) under drought than the drought-tolerant O. sativa variety, Sahbhagi dhan. Based on whole-genome resequencing of all 69 O. glumaepatula accessions and variant calling to a high-quality O. glumaepatula reference genome, we detected multiple genomic loci colocating for SDW, root dry weight at 30 to 45 cm depth, and STO in consecutive drought trials. Geo-referencing indicated that the potential drought donors originated in flood-prone locations, corroborating previous hypotheses about the coexistence of flood and drought tolerance within individual Oryza genomes. These findings present potential donor accessions, traits, and genomic loci from an AA genome wild relative of rice that, together with the recently developed reference genome, may be useful for further introgression of drought tolerance into O. sativa backgrounds.


Subject(s)
Oryza , Oryza/genetics , Drought Resistance , Phenotype , Genome, Plant/genetics , Droughts
4.
Nature ; 557(7703): 43-49, 2018 05.
Article in English | MEDLINE | ID: mdl-29695866

ABSTRACT

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.


Subject(s)
Crops, Agricultural/classification , Crops, Agricultural/genetics , Genetic Variation , Genome, Plant/genetics , Oryza/classification , Oryza/genetics , Asia , Evolution, Molecular , Genes, Plant/genetics , Genetics, Population , Genomics , Haplotypes , INDEL Mutation/genetics , Phylogeny , Plant Breeding , Polymorphism, Single Nucleotide/genetics
5.
Genome Res ; 29(5): 870-880, 2019 05.
Article in English | MEDLINE | ID: mdl-30992303

ABSTRACT

Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.


Subject(s)
Genetic Variation , Genome, Plant , Genomic Structural Variation , Genomics/methods , Oryza/genetics , Alleles , Chromosome Mapping , DNA Transposable Elements , Genome-Wide Association Study/methods , Phenotype , Sequence Analysis, DNA/methods , Stress, Physiological/genetics
6.
Plant Cell Environ ; 45(3): 854-870, 2022 03.
Article in English | MEDLINE | ID: mdl-35099814

ABSTRACT

The aus rice variety group originated in stress-prone regions and is a promising source for the development of new stress-tolerant rice cultivars. In this study, an aus panel (~220 genotypes) was evaluated in field trials under well-watered and drought conditions and in the greenhouse (basket, herbicide and lysimeter studies) to investigate relationships between grain yield and root architecture, and to identify component root traits behind the composite trait of deep root growth. In the field trials, high and stable grain yield was positively related to high and stable deep root growth (r = 0.16), which may indicate response to within-season soil moisture fluctuations (i.e., plasticity). When dissecting component traits related to deep root growth (including angle, elongation and branching), the number of nodal roots classified as 'large-diameter' was positively related to deep root growth (r = 0.24), and showed the highest number of colocated genome-wide association study (GWAS) peaks with grain yield under drought. The role of large-diameter nodal roots in deep root growth may be related to their branching potential. Two candidate loci that colocated for yield and root traits were identified that showed distinct haplotype distributions between contrasting yield/stability groups and could be good candidates to contribute to rice improvement.


Subject(s)
Oryza , Chromosome Mapping , Droughts , Edible Grain , Genome-Wide Association Study , Oryza/physiology
7.
Int J Mol Sci ; 22(9), 2021 Apr 24.
Article in English | MEDLINE | ID: mdl-33923150

ABSTRACT

Tolerance of anaerobic germination (AG) is a key trait in the development of direct seeded rice. Through rapid and sustained coleoptile elongation, AG tolerance enables robust seedling establishment under flooded conditions. Previous attempts to fine map and characterize AG2 (qAG7.1), a major centromere-spanning AG tolerance QTL derived from the indica variety Ma-Zhan Red, have failed. Here, a novel approach of "enriched haplotype" genome-wide association study based on the Ma-Zhan Red haplotype in the AG2 region was successfully used to narrow down AG2 from more than 7 Mb to less than 0.7 Mb. The AG2 peak region contained 27 genes, including the Rc gene, responsible for red pericarp development in pigmented rice. Through comparative variant and transcriptome analysis between AG tolerant donors and susceptible accessions, several candidate genes potentially controlling AG2 were identified, among them several regulatory genes. Genome-wide comparative transcriptome analysis suggested that differential regulation of sugar metabolism, particularly trehalose metabolism, as well as differential regulation of cell wall modification and chloroplast development, is implicated in AG tolerance mechanisms.


Subject(s)
Chromosomes, Plant/genetics , Genome-Wide Association Study , Germination , Oryza/genetics , Plant Proteins/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Anaerobiosis , Chromosome Mapping , Gene Expression Profiling , Oryza/growth & development , Plant Proteins/genetics
8.
Nucleic Acids Res ; 45(D1): D1075-D1081, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899667

ABSTRACT

We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org.


Subject(s)
Databases, Nucleic Acid , Genome, Plant , INDEL Mutation , Oryza/genetics , Polymorphism, Single Nucleotide , Search Engine , Software , Alleles , Computational Biology/methods , Gene Frequency , Genetic Loci , Genomics/methods , Genotype , User-Computer Interface , Web Browser
9.
Nucleic Acids Res ; 43(Database issue): D1023-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25429973

ABSTRACT

We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. The SNPs and allele information are organized into a SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of an Oracle database holding close to 60 billion rows of SNP genotypes (20 M SNPs × 3 K rice lines) and a web interface for convenient querying. The database allows quick retrieval of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots.
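
The row count quoted above follows directly from storing one genotype per SNP per rice line. The short check below reproduces that arithmetic from the rounded figures in the abstract; the variable names are illustrative only.

```python
# Rounded figures taken from the abstract ("20 M SNPs x 3 K rice lines").
n_snps = 20_000_000
n_lines = 3_000

# One row per (SNP, rice line) genotype.
genotype_rows = n_snps * n_lines

print(f"{genotype_rows:,} genotype rows")  # 60,000,000,000 genotype rows
```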


Subject(s)
Databases, Nucleic Acid , Genome, Plant , Oryza/genetics , Polymorphism, Single Nucleotide , Oryza/anatomy & histology
10.
Gigascience ; 13, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38832465

ABSTRACT

BACKGROUND: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to hundreds to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources. RESULTS: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs. CONCLUSIONS: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.


Subject(s)
Data Mining , Genome-Wide Association Study , Oryza , Quantitative Trait Loci , Oryza/genetics , Software , Epigenomics/methods , Computational Biology/methods , Polymorphism, Single Nucleotide , Genomics/methods , Genome, Plant , Chromosome Mapping , Databases, Genetic