ABSTRACT
A growing interest in Cannabis sativa uses for food, fiber, and medicine, and recent changes in regulations have spurred numerous genomic studies of this once-prohibited plant. Cannabis research uses Next Generation Sequencing technologies for genomics and transcriptomics. While other crops have genome portals enabling access and analysis of numerous genotyping data from diverse accessions, leading to the discovery of alleles for important traits, this is absent for cannabis. The CannSeek web portal aims to address this gap. Single nucleotide polymorphism datasets were generated by identifying genome variants from public resequencing data and genome assemblies. Results and accompanying trait data are hosted in the CannSeek web application, built using the Rice SNP-Seek infrastructure with improvements to allow multiple reference genomes and provide a web-service Application Programming Interface. The tools built into the portal allow phylogenetic analyses, varietal grouping and identifications, and favorable haplotype discovery for cannabis accessions using public sequencing data. Availability and implementation: The CannSeek portal is available at https://icgrc.info/cannseek, https://icgrc.info/genotype_viewer.
ABSTRACT
Populations can adapt to stressful environments through changes in gene expression. However, the role of gene regulation in mediating stress response and adaptation remains largely unexplored. Here, we use an integrative field dataset obtained from 780 plants of Oryza sativa ssp. indica (rice) grown in a field experiment under normal or moderate salt stress conditions to examine selection and evolution of gene expression variation under salinity stress conditions. We find that salinity stress induces increased selective pressure on gene expression. Further, we show that trans-eQTLs rather than cis-eQTLs are primarily associated with rice's gene expression under salinity stress, potentially via a few master regulators. Importantly, and contrary to expectations, we find that cis-trans reinforcement is more common than cis-trans compensation, which may reflect rice diversification subsequent to domestication. We further identify genetic fixation as the likely mechanism underlying this compensation/reinforcement. Additionally, we show that cis- and trans-eQTLs are under different selection regimes, giving us insights into the evolutionary dynamics of gene expression variation. By examining genomic, transcriptomic, and phenotypic variation across a rice population, we gain insights into the molecular and genetic landscape underlying adaptive salinity stress responses, which is relevant for other crops and other stresses.
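The distinction between cis-trans reinforcement and compensation in the abstract above can be illustrated with a minimal sketch. It assumes the usual convention that two effects on the same gene are "reinforcing" when they act in the same direction and "compensating" when they act in opposite directions. The function name, the effect-size inputs, and the tolerance are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_cis_trans(cis_effect: float, trans_effect: float, tol: float = 1e-9) -> str:
    """Label a gene by the relative direction of its cis and trans effects.

    Assumption (not from the abstract): effects are signed values such as
    log fold changes, and anything within `tol` of zero counts as no effect.
    """
    has_cis = abs(cis_effect) > tol
    has_trans = abs(trans_effect) > tol
    if not has_cis and not has_trans:
        return "no_effect"
    if not has_trans:
        return "cis_only"
    if not has_cis:
        return "trans_only"
    # Same sign: the effects push expression the same way (reinforcement).
    # Opposite sign: they partly cancel each other (compensation).
    same_direction = (cis_effect > 0) == (trans_effect > 0)
    return "reinforcing" if same_direction else "compensating"


# Hypothetical (cis, trans) effect pairs for four genes.
toy_effects = [(0.8, 0.3), (-0.5, -0.2), (0.6, -0.4), (0.0, 0.7)]
counts = Counter(classify_cis_trans(c, t) for c, t in toy_effects)
```

Counting the labels across genes, as in the last line, is one simple way to compare how often each pattern occurs; the study itself may have used a different statistical procedure.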
ABSTRACT
BACKGROUND: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources. RESULTS: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs. CONCLUSIONS: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.
Subject(s)
Data Mining, Genome-Wide Association Study, Oryza, Quantitative Trait Loci, Oryza/genetics, Software, Epigenomics/methods, Computational Biology/methods, Single Nucleotide Polymorphism, Genomics/methods, Plant Genome, Chromosome Mapping, Genetic Databases
ABSTRACT
Phosphorylation is the most studied post-translational modification, and has multiple biological functions. In this study, we have reanalyzed publicly available mass spectrometry proteomics data sets enriched for phosphopeptides from Asian rice (Oryza sativa). In total we identified 15,565 phosphosites on serine, threonine, and tyrosine residues on rice proteins. We identified sequence motifs for phosphosites, and linked motifs to enrichment of different biological processes, indicating different downstream regulation likely caused by different kinase groups. We cross-referenced phosphosites against the rice 3,000 genomes, to identify single amino acid variations (SAAVs) within or proximal to phosphosites that could cause loss of a site in a given rice variety, and clustered the data to identify groups of sites with similar patterns across rice family groups. The data has been loaded into UniProt Knowledge-Base (enabling researchers to visualize sites alongside other data on rice proteins, e.g., structural models from AlphaFold2), PeptideAtlas, and the PRIDE database (enabling visualization of source evidence, including scores and supporting mass spectra).
Subject(s)
Plant Genome, Oryza, Phosphoproteins, Plant Proteins, Proteomics, Signal Transduction, Oryza/genetics, Oryza/metabolism, Oryza/chemistry, Proteomics/methods, Phosphoproteins/metabolism, Phosphoproteins/genetics, Phosphoproteins/chemistry, Phosphoproteins/analysis, Plant Proteins/genetics, Plant Proteins/metabolism, Phosphorylation, Post-Translational Protein Processing, Phosphopeptides/metabolism, Phosphopeptides/analysis, Protein Databases, Amino Acid Motifs, Mass Spectrometry
ABSTRACT
BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most widely used form of molecular genetic variation in genetic studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The genome analysis toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but unfortunately, high-performance computing versions of this tool have yet to become widely available and affordable. RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy, with results comparable to previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~3000 resequenced rice accessions. The entire process took ~16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq). CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improved the speed at which SNPs can be called. The workflow is widely applicable, as demonstrated successfully for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a 25 multi-crop-reference genome data set produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and were found to reside within genes and open chromatin regions that are predicted to have functional consequences. Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture solution with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.
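The abstract above does not describe how HPC-GVCW distributes work, so the following is only a generic sketch of one common way to parallelize variant calling: cutting each chromosome into fixed-size intervals that can be processed as independent jobs. The function name, the chunk size, and the toy chromosome lengths are hypothetical. The output strings use the `chrom:start-end` form (1-based, inclusive) that GATK accepts for its interval argument.

```python
def make_intervals(chrom_lengths: dict[str, int], chunk_size: int) -> list[str]:
    """Split each chromosome into consecutive 1-based, inclusive intervals.

    Assumption: this is a generic scatter step, not the HPC-GVCW code itself.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    intervals = []
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, chunk_size):
            # The last chunk on a chromosome is shortened to end at its length.
            end = min(start + chunk_size - 1, length)
            intervals.append(f"{chrom}:{start}-{end}")
    return intervals


# Toy example with made-up chromosome lengths.
toy_intervals = make_intervals({"chr1": 250, "chr2": 100}, chunk_size=100)
```

Each interval string could then be handed to a separate variant-calling job, with the per-interval results merged afterwards; choosing the chunk size trades scheduling overhead against load balance.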
Subject(s)
Plant Genome, Single Nucleotide Polymorphism, Workflow, Plant Breeding, Software, High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Phosphorylation is the most studied post-translational modification, and has multiple biological functions. In this study, we have re-analysed publicly available mass spectrometry proteomics datasets enriched for phosphopeptides from Asian rice (Oryza sativa). In total we identified 15,522 phosphosites on serine, threonine and tyrosine residues on rice proteins. We identified sequence motifs for phosphosites, and link motifs to enrichment of different biological processes, indicating different downstream regulation likely caused by different kinase groups. We cross-referenced phosphosites against the rice 3,000 genomes, to identify single amino acid variations (SAAVs) within or proximal to phosphosites that could cause loss of a site in a given rice variety. The data was clustered to identify groups of sites with similar patterns across rice family groups, for example those highly conserved in Japonica, but mostly absent in Aus type rice varieties - known to have different responses to drought. These resources can assist rice researchers to discover alleles with significantly different functional effects across rice varieties. The data has been loaded into UniProt Knowledge-Base - enabling researchers to visualise sites alongside other data on rice proteins e.g. structural models from AlphaFold2, PeptideAtlas and the PRIDE database - enabling visualisation of source evidence, including scores and supporting mass spectra.
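The cross-referencing of SAAVs against phosphosites described above can be sketched as a simple position comparison. The abstract does not define how close a variant must be to count as "proximal", so the five-residue window below is an assumption, as are the class name, the function name, and the toy protein identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Saav:
    """A single amino acid variation (hypothetical record layout)."""
    protein: str
    position: int  # 1-based residue position
    ref: str
    alt: str


def classify_saav(saav: Saav, phosphosites: dict[str, set[int]], window: int = 5) -> str:
    """Return 'at_site', 'proximal', or 'distal' for one variant.

    Assumption: `window` (in residues) is an illustrative cutoff only.
    """
    sites = phosphosites.get(saav.protein, set())
    if saav.position in sites:
        # The phosphorylated residue itself is changed, so the site may be lost.
        return "at_site"
    if any(abs(saav.position - site) <= window for site in sites):
        # The variant falls in the flanking sequence that a kinase motif may span.
        return "proximal"
    return "distal"


# Toy data: one made-up protein with phosphosites at residues 10 and 42.
toy_sites = {"PROT_A": {10, 42}}
toy_variant = Saav(protein="PROT_A", position=44, ref="S", alt="A")
toy_label = classify_saav(toy_variant, toy_sites)
```

Applying the function to every variant in every variety would give a site-by-variety table, which is the kind of input that could then be clustered as the abstract describes.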
ABSTRACT
Traditional rice varieties have been critical for developing improved stress-tolerant rice varieties. Tools to analyze the genome sequences of traditional varieties are accelerating the identification and deployment of genes conferring climate change resilience.
Subject(s)
Oryza, Oryza/genetics, Plant Breeding, Climate Change
ABSTRACT
Breeding staple crops with increased micronutrient concentration is a sustainable approach to address micronutrient malnutrition. We carried out Multi-Cross QTL (MC-QTL) analysis and Inclusive Composite Interval Mapping for 11 agronomic, yield and biofortification traits using four connected RIL populations of rice. Overall, 156 MC-QTLs were detected for agronomic (115) and biofortification (41) traits, which were higher in number but smaller in effect compared to single-population analysis. The MC-QTL analysis was able to detect important QTLs, namely qZn5.2, qFe7.1, qGY10.1, qDF7.1, qPH1.1, qNT4.1, qPT4.1, qPL1.2, qTGW5.1, qGL3.1, and qGW6.1, which can be used in rice genomics-assisted breeding. A major QTL (qZn5.2) for grain Zn concentration was detected on chromosome 5 and accounted for 13% of the phenotypic variance (R²). In all, 26 QTL clusters were identified on different chromosomes. qPH6.1 interacted epistatically with qZn5.1 and qGY6.2. Most of the QTLs were co-located with functionally related candidate genes, indicating the accuracy of QTL mapping. The genomic region of qZn5.2 was co-located with putative genes such as OsZIP5, OsZIP9, and LOC_OS05G40490 that are involved in Zn uptake. These genes included polymorphic functional SNPs, and their promoter regions were enriched with cis-regulatory elements involved in plant growth and development, and biotic and abiotic stress tolerance. The major-effect QTLs identified for biofortification and agronomic traits can be utilized in breeding for Zn-biofortified rice varieties.
ABSTRACT
Bacterial blight resistance gene B5 has received little attention since it was first described in 1950. A near-isogenic line (NIL) of Gossypium hirsutum cotton, AcB5, was generated in an otherwise bacterial-blight-susceptible 'Acala 44' background. The introgressed locus B5 in AcB5 conferred strong and broad-spectrum resistance to bacterial blight. Segregation patterns of test crosses under Oklahoma field conditions indicated that AcB5 is likely homozygous for resistance at two loci with partial dominance gene action. In controlled-environment conditions, two of the four copies of B5 were required for effective resistance. Contrary to expectations of gene-for-gene theory, AcB5 conferred high resistance toward isogenic strains of Xanthomonas citri subsp. malvacearum carrying cloned avirulence genes avrB4, avrb7, avrBIn, avrB101, and avrB102, respectively, and weaker resistance toward the strain carrying cloned avrb6. The hypothesis that each B gene, in the absence of a polygenic complex, triggers sesquiterpenoid phytoalexin production was tested by measurement of cadalene and lacinilene phytoalexins during resistant responses in five NILs carrying different B genes, four other lines carrying multiple resistance genes, as well as susceptible Ac44E. Phytoalexin production was an obvious, but variable, response in all nine resistant lines. AcB5 accumulated an order of magnitude more of all four phytoalexins than any of the other resistant NILs. Its total levels were comparable to those detected in OK1.2, a highly resistant line that possesses several B genes in a polygenic background.
Subject(s)
Sesquiterpenes, Xanthomonas, Gossypium/genetics, Gossypium/microbiology, Phytoalexins, Plant Diseases/microbiology, Xanthomonas/genetics
ABSTRACT
Seed deterioration during storage results in poor germination, reduced vigour, and non-uniform seedling emergence. The aging rate depends on storage conditions and genetic factors. This study aims to identify the genetic factors determining the longevity of rice (Oryza sativa L.) seeds stored under experimental aging conditions mimicking long-term dry storage. Genetic variation for tolerance to aging was studied in 300 Indica rice accessions by storing dry seeds under an elevated partial pressure of oxygen (EPPO) condition. A genome-wide association analysis identified 11 unique genomic regions for all measured germination parameters after aging, differing from those previously identified in rice under humid experimental aging conditions. The significant single nucleotide polymorphism in the most prominent region was located within the Rc gene, encoding a basic helix-loop-helix transcription factor. Storage experiments using near-isogenic rice lines (SD7-1D (Rc) and SD7-1d (rc)) with the same allelic variation confirmed the role of the wildtype Rc gene, which provides stronger tolerance to dry EPPO aging. In the seed pericarp, a functional Rc gene results in accumulation of proanthocyanidins, an important sub-class of flavonoids having strong antioxidant activity, which may explain the variation in tolerance to dry EPPO aging.
Subject(s)
Oryza, Oryza/genetics, Genome-Wide Association Study, Germination/genetics, Seedlings/genetics, Seeds/genetics
ABSTRACT
BACKGROUND: Asian rice Oryza sativa, first domesticated in East Asia, has had considerable success in African fields. When and where this introduction occurred is unclear. Rice varieties of Asian origin may have evolved locally during and after migration to Africa, resulting in unique adaptations, particularly in relation to upland cultivation as frequently practiced in Africa. METHODS: We investigated the genetic differentiation between Asian and African varieties using the 3000 Rice Genomes SNP dataset. African upland cultivars were first characterized using principal component analysis among 292 tropical Japonica accessions from Africa and Asia. The particularities of African accessions were then explored using two inference techniques, PCA-KDE for supervised classification and chromosome painting, and ELAI for individual allelic dosage monitoring. KEY RESULTS: Ambiguities of local differentiation between Japonica and other groups pointed at genomic segments that potentially resulted from genetic exchange. Those specific to West African upland accessions were concentrated on chromosome 6 and featured several cAus introgression signals, including a large one between 17.9 and 21.7 Mb. We found iHS statistics in support of positive selection in this region, and we provide a list of candidate genes enriched in GO terms that have regulatory functions involved in stress responses that could have facilitated adaptation to harsh upland growing conditions.
ABSTRACT
Drought stress in Southeast Asia greatly affects rice production, and the rice root system plays a substantial role in avoiding drought stress. In this study, we examined the phenotypic and genetic correlations among root anatomical, morphological, and agronomic phenotypes over multiple field seasons. A set of >200 rice accessions from Southeast Asia (a subset of the 3000 Rice Genomes Project) was characterized with the aim of identifying root morphological and anatomical phenotypes related to productivity under drought stress. Drought stress resulted in slight increases in the basal metaxylem and stele diameter of nodal roots. Although few direct correlations between root phenotypes and grain yield were identified, biomass was consistently positively correlated with crown root number and negatively correlated with stele diameter. The accessions with the highest grain yield were characterized by higher crown root numbers and median metaxylem diameter and smaller stele diameter. A genome-wide association study (GWAS) revealed 162 and 210 significant SNPs associated with root phenotypes in the two seasons, which resulted in the identification of 59 candidate genes related to root development. The gene OsRSL3 was found in a QTL region for median metaxylem diameter. Four SNPs in OsRSL3 caused amino acid changes and were significantly associated with the root phenotype. Based on the haplotype analysis for median metaxylem diameter, the rice accessions studied were classified into five allele combinations in order to identify the most favorable haplotypes. The candidate genes and favorable haplotypes provide information useful for the genetic improvement of root phenotypes under drought stress.
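The haplotype step described above, grouping accessions by their allele combination at a few SNPs and comparing the phenotype between groups, can be sketched as follows. The accession names, alleles, and phenotype values are made up for illustration, and whether a higher or lower trait value counts as "favorable" is left as a parameter because it depends on the breeding goal.

```python
from collections import defaultdict
from statistics import mean


def summarize_haplotypes(genotypes, phenotypes):
    """Group accessions by allele combination; return {haplotype: (n, mean)}.

    `genotypes` maps accession -> tuple of alleles at the chosen SNPs;
    `phenotypes` maps accession -> trait value. Accessions without a
    phenotype are skipped.
    """
    groups = defaultdict(list)
    for accession, alleles in genotypes.items():
        if accession in phenotypes:
            groups["".join(alleles)].append(phenotypes[accession])
    return {hap: (len(values), mean(values)) for hap, values in groups.items()}


def most_favorable(summary, higher_is_better=True, min_count=1):
    """Pick the haplotype with the best mean among sufficiently large groups."""
    candidates = {hap: avg for hap, (n, avg) in summary.items() if n >= min_count}
    if not candidates:
        raise ValueError("no haplotype has enough accessions")
    pick = max if higher_is_better else min
    return pick(candidates, key=candidates.get)


# Hypothetical alleles at four SNPs and hypothetical trait values.
toy_genotypes = {
    "acc1": ("A", "G", "T", "C"),
    "acc2": ("A", "G", "T", "C"),
    "acc3": ("G", "G", "T", "C"),
    "acc4": ("G", "A", "C", "T"),
}
toy_phenotypes = {"acc1": 10.0, "acc2": 12.0, "acc3": 8.0, "acc4": 15.0}
toy_summary = summarize_haplotypes(toy_genotypes, toy_phenotypes)
toy_best = most_favorable(toy_summary, higher_is_better=True, min_count=2)
```

The `min_count` filter is a design choice that keeps a haplotype carried by a single accession from being selected on the strength of one measurement; a real analysis would normally also test whether the group means differ significantly.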
ABSTRACT
Crop wild relatives represent valuable reservoirs of variation for breeding, but their populations are threatened in natural habitats, are sparsely represented in genebanks, and most are poorly characterized. The focus of this study is the Oryza rufipogon species complex (ORSC), wild progenitor of Asian rice (Oryza sativa L.). The ORSC comprises perennial, annual and intermediate forms which were historically designated as O. rufipogon, O. nivara, and O. sativa f. spontanea (or Oryza spp., an annual form of mixed O. rufipogon/O. nivara and O. sativa ancestry), respectively, based on non-standardized morphological, geographical, and/or ecologically-based species definitions and boundaries. Here, a collection of 240 diverse ORSC accessions, characterized by genotyping-by-sequencing (113,739 SNPs), was phenotyped for 44 traits associated with plant, panicle, and seed morphology in the screenhouse at the International Rice Research Institute, Philippines. These traits included heritable phenotypes often recorded as characterization data by genebanks. Over 100 of these ORSC accessions were also phenotyped in the greenhouse for 18 traits in Stuttgart, Arkansas, and 16 traits in Ithaca, New York, United States. We implemented a Bayesian Gaussian mixture model to infer accession groups from a subset of these phenotypic data and ascertained three phenotype-based group assignments. We used concordance between the genotypic subpopulations and these phenotype-based groups to identify a suite of phenotypic traits that could reliably differentiate the ORSC populations, whether measured in tropical or temperate regions. The traits provide insight into plant morphology, life history (perenniality versus annuality) and mating habit (self- versus cross-pollinated), and are largely consistent with genebank species designations. One phenotypic group contains predominantly O. rufipogon accessions characterized as perennial and largely out-crossing and one contains predominantly O. nivara accessions characterized as annual and largely inbreeding. From these groups, 42 "core" O. rufipogon and 25 "core" O. nivara accessions were identified for domestication studies. The third group, comprising 20% of our collection, has the most accessions identified as Oryza spp. (51.2%) and levels of O. sativa admixture accounting for more than 50% of the genome. This third group is potentially useful as a "pre-breeding" pool for breeders attempting to incorporate novel variation into elite breeding lines.
ABSTRACT
Crop landraces have unique local agroecological and societal functions and offer important genetic resources for plant breeding. Recognition of the value of landrace diversity and concern about its erosion on farms have led to sustained efforts to establish ex situ collections worldwide. The degree to which these efforts have succeeded in conserving landraces has not been comprehensively assessed. Here we modelled the potential distributions of eco-geographically distinguishable groups of landraces of 25 cereal, pulse and starchy root/tuber/fruit crops within their geographic regions of diversity. We then analysed the extent to which these landrace groups are represented in genebank collections, using geographic and ecological coverage metrics as a proxy for genetic diversity. We find that ex situ conservation of landrace groups is currently moderately comprehensive on average, with substantial variation among crops; a mean of 63% ± 12.6% of distributions is currently represented in genebanks. Breadfruit, bananas and plantains, lentils, common beans, chickpeas, barley and bread wheat landrace groups are among the most fully represented, whereas the largest conservation gaps persist for pearl millet, yams, finger millet, groundnut, potatoes and peas. Geographic regions prioritized for further collection of landrace groups for ex situ conservation include South Asia, the Mediterranean and West Asia, Mesoamerica, sub-Saharan Africa, the Andean mountains of South America and Central to East Asia. With further progress to fill these gaps, a high degree of representation of landrace group diversity in genebanks is feasible globally, thus fulfilling international targets for their ex situ conservation.
Subject(s)
Agricultural Crops, Plant Breeding, Agricultural Crops/genetics, Eastern Asia, South America, Triticum/genetics
ABSTRACT
The aus rice variety group originated in stress-prone regions and is a promising source for the development of new stress-tolerant rice cultivars. In this study, an aus panel (~220 genotypes) was evaluated in field trials under well-watered and drought conditions and in the greenhouse (basket, herbicide and lysimeter studies) to investigate relationships between grain yield and root architecture, and to identify component root traits behind the composite trait of deep root growth. In the field trials, high and stable grain yield was positively related to high and stable deep root growth (r = 0.16), which may indicate response to within-season soil moisture fluctuations (i.e., plasticity). When dissecting component traits related to deep root growth (including angle, elongation and branching), the number of nodal roots classified as 'large-diameter' was positively related to deep root growth (r = 0.24), and showed the highest number of colocated genome-wide association study (GWAS) peaks with grain yield under drought. The role of large-diameter nodal roots in deep root growth may be related to their branching potential. Two candidate loci that colocated for yield and root traits were identified that showed distinct haplotype distributions between contrasting yield/stability groups and could be good candidates to contribute to rice improvement.
Subject(s)
Oryza, Chromosome Mapping, Droughts, Edible Grain, Genome-Wide Association Study, Oryza/physiology
ABSTRACT
Pre-harvest sprouting (PHS), induced by unexpected weather events, such as typhoons, at the late seed maturity stage, is becoming a serious threat to rice production, especially in the state of California (USA), Japan, and the Republic of Korea, where japonica varieties (mostly susceptible to PHS) are mainly cultivated. The projected economic loss from severe PHS in these three countries could range between 8 and 10 billion USD per year during the next 10 years. Here, we present promising rice germplasm with strong resistance to PHS that was selected from a diverse rice panel of accessions held in the International Rice Genebank (IRG) at the International Rice Research Institute (IRRI). To induce PHS, three panicle samples per accession were harvested at 20 and 30 days after flowering (DAF), respectively, and incubated at 100% relative humidity (RH) and 30 °C in a growth chamber for 15 days. A genome-wide association (GWA) analysis using a marker set of 4.8 million single nucleotide polymorphisms (SNPs) was performed to identify loci and candidate genes conferring PHS resistance. Interestingly, two tropical japonica and four temperate japonica accessions showed outstanding PHS resistance as compared to tolerant indica accessions. Two major loci on chromosomes 1 and 4 were associated with PHS resistance. Interactions of a priori candidate genes with rice gene networks, which are based on gene ontology (GO), co-expression, and other evidence, suggested that a key resistance mechanism is related to abscisic acid (ABA), gibberellic acid (GA), and auxin-mediated signaling pathways.