ABSTRACT
Crop domestications are long-term selection experiments that have greatly advanced human civilization. The domestication of cultivated rice (Oryza sativa L.) ranks as one of the most important developments in history. However, its origins and domestication processes are controversial and have long been debated. Here we generate genome sequences from 446 geographically diverse accessions of the wild rice species Oryza rufipogon, the immediate ancestral progenitor of cultivated rice, and from 1,083 cultivated indica and japonica varieties to construct a comprehensive map of rice genome variation. In the search for signatures of selection, we identify 55 selective sweeps that have occurred during domestication. In-depth analyses of the domestication sweeps and genome-wide patterns reveal that Oryza sativa japonica rice was first domesticated from a specific population of O. rufipogon around the middle area of the Pearl River in southern China, and that Oryza sativa indica rice was subsequently developed from crosses between japonica rice and local wild rice as the initial cultivars spread into South East and South Asia. The domestication-associated traits are analysed through high-resolution genetic mapping. This study provides an important resource for rice breeding and an effective genomics approach for crop domestication research.
Subjects
Agriculture/history , Agricultural Crops/genetics , Molecular Evolution , Genetic Variation/genetics , Plant Genome/genetics , Geographic Mapping , Oryza/genetics , Breeding/history , Agricultural Crops/classification , Agricultural Crops/growth & development , Genomics , Ancient History , Oryza/classification , Oryza/growth & development , Phylogeny , Single Nucleotide Polymorphism/genetics , Genetic Selection
ABSTRACT
A variety of stable circular RNAs (circRNAs) have recently been identified as abundant noncoding RNAs in Archaea, Caenorhabditis elegans, mice, and humans through high-throughput deep sequencing coupled with analysis of massive transcriptional data. CircRNAs play important roles in miRNA function and transcriptional control by acting as competing endogenous RNAs or as positive regulators of their parent coding genes. However, little is known regarding circRNAs in plants. Here, we report 2354 rice circRNAs that were identified through deep sequencing and computational analysis of ssRNA-seq data. Among them, 1356 are exonic circRNAs. Some circRNAs exhibit tissue-specific expression. Rice circRNAs have a considerable number of isoforms, including alternative backsplicing and alternative splicing circularization patterns. Parental genes with multiple exons are preferentially circularized. Only 484 circRNAs have backsplices derived from known splice sites. In addition, only 92 circRNAs were found to be enriched for miniature inverted-repeat transposable elements (MITEs) in flanking sequences or to be complementary to at least 18-bp flanking intronic sequences, indicating that there are some other production mechanisms in addition to direct backsplicing in rice. Rice circRNAs have no significant enrichment for miRNA target sites. A transgenic study showed that overexpression of a circRNA construct could reduce the expression level of its parental gene in transgenic plants compared with empty-vector control plants. This suggests that a circRNA and its linear form might act as negative regulators of the parental gene. Overall, these analyses reveal the prevalence of circRNAs in rice and provide new biological insights into rice circRNAs.
Subjects
Oryza/genetics , Plant RNA/genetics , Transcriptome , Base Sequence , Molecular Sequence Data , Nucleic Acid Conformation , Oryza/metabolism , Plant RNA/metabolism , RNA Sequence Analysis
ABSTRACT
The functional complexity of the rice transcriptome is not yet fully elucidated, despite many studies having reported the use of DNA microarrays. Next-generation DNA sequencing technologies provide a powerful approach for mapping and quantifying the transcriptome, termed RNA sequencing (RNA-seq). In this study, we applied RNA-seq to globally sample transcripts of the cultivated rice Oryza sativa indica and japonica subspecies for resolving the whole-genome transcription profiles. We identified 15,708 novel transcriptional active regions (nTARs), of which 51.7% have no homolog to public protein data and >63% are putative single-exon transcripts, which are highly different from protein-coding genes (<20%). We found that approximately 48% of rice genes show alternative splicing patterns, a percentage considerably higher than previous estimations. On the basis of the available rice gene models, 83.1% (46,472 genes) of the current rice gene models were validated by RNA-seq, and 6228 genes were identified to be extended at the 5' and/or 3' ends by at least 50 bp. Comparative transcriptome analysis demonstrated that 3464 genes exhibited differential expression patterns. The ratio of SNPs with nonsynonymous/synonymous mutations was nearly 1:1.06. In total, we interrogated and compared transcriptomes of the two rice subspecies to reveal the overall transcriptional landscape at maximal resolution.
Subjects
Gene Expression Profiling , Oryza/genetics , RNA Sequence Analysis , Alternative Splicing , Base Sequence , Plant Genes , Plant Genome , Molecular Sequence Annotation , Oligonucleotide Array Sequence Analysis , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Cis-natural antisense transcripts (cis-NATs) are RNAs transcribed from the antisense strand of a gene locus, and are complementary to the RNA transcribed from the sense strand. Microarray approaches and analysis of transcriptome databases are the main ways to identify cis-NATs globally in various eukaryotic organisms. Genome-wide in silico analysis has identified a large number of cis-NATs that may generate endogenous short interfering RNAs (nat-siRNAs), which participate in important biogenesis mechanisms for transcriptional and post-transcriptional regulation in rice. However, the transcriptomes are yet to be deeply sequenced to comprehensively investigate cis-NATs. RESULTS: We applied high-throughput strand-specific complementary DNA sequencing technology (ssRNA-seq) to deeply sequence mRNA for assessing sense and antisense transcripts that were derived under salt, drought and cold stresses, and normal conditions, in the model plant rice (Oryza sativa). Combined with RAP-DB genome annotation (the Rice Annotation Project Database build-5 data set), 76,013 transcripts corresponding to 45,844 unique gene loci were assembled, in which 4873 gene loci were newly identified. Of 3819 putative rice cis-NATs, 2292 were detected as expressed and as giving rise to small RNAs from their overlapping regions through integrated analysis of ssRNA-seq data and small RNA data. Among them, 503 cis-NATs seemed to be associated with specific conditions. The deep sequence data from isolated epidermal cells of rice seedlings further showed that 54.0% of cis-NATs were expressed simultaneously in a population of homogeneous cells. Nearly 9.7% of rice transcripts were involved in one-to-one or many-to-many cis-NAT formation. Furthermore, only 17.4-34.7% of 223 many-to-many cis-NAT groups were all expressed and generated nat-siRNAs, indicating that only some cis-NAT groups may be involved in complex regulatory networks.
CONCLUSIONS: Our study profiles an abundance of cis-NATs and nat-siRNAs in rice. These data are valuable for gaining insight into the complex function of the rice transcriptome.
Subjects
Oryza/genetics , Small Interfering RNA/genetics , Transcriptome/genetics , Base Sequence , Northern Blotting , DNA Primers/genetics , Gene Expression Profiling , Gene Library , High-Throughput Nucleotide Sequencing/methods , Laser Capture Microdissection , Molecular Sequence Data , Oryza/metabolism , Plant Leaves/genetics , Reverse Transcriptase Polymerase Chain Reaction
ABSTRACT
Non-random gene organization in eukaryotes plays a significant role in genome evolution. Here, we investigate the origin of a biosynthetic gene cluster for production of defence compounds in oat, the avenacin cluster. We elucidate the structure and organisation of this 12-gene cluster, characterise the last two missing pathway steps, and reconstitute the entire pathway in tobacco by transient expression. We show that the cluster has formed de novo since the divergence of oats in a subtelomeric region of the genome that lacks homology with other grasses, and that gene order is approximately colinear with the biosynthetic pathway. We speculate that the positioning of the late pathway genes furthest away from the telomere may mitigate a 'self-poisoning' scenario in which toxic intermediates accumulate as a result of telomeric gene deletions. Our investigations reveal a striking example of adaptive evolution underpinned by remarkable genome plasticity.
Subjects
Avena/genetics , Disease Resistance/genetics , Metabolic Networks and Pathways/genetics , Telomere/genetics , Avena/metabolism , Edible Grain/genetics , Molecular Evolution , High-Throughput Nucleotide Sequencing , Fluorescence In Situ Hybridization , Multigene Family , RNA-Seq , Nucleic Acid Repetitive Sequences , Saponins/biosynthesis , Saponins/chemistry , Saponins/genetics , Synteny/genetics , Nicotiana/metabolism , Whole Genome Sequencing
ABSTRACT
BACKGROUND: With the availability of rice and sorghum genome sequences and ongoing efforts to sequence genomes of other cereal and energy crops, the grass family (Poaceae) has become a model system for comparative genomics and for better understanding the gene and genome evolution that underlies phenotypic and ecological divergence of plants. While genomic resources have accumulated rapidly for almost all major lineages of grasses, bamboo remains the only large subfamily of Poaceae with little genomic information available in databases, which seriously hampers our ability to take full advantage of the wealth of grass genomic data for effective comparative studies. RESULTS: Here we report the cloning and sequencing of 10,608 putative full-length cDNAs (FL-cDNAs) primarily from Moso bamboo, Phyllostachys heterocycla cv. pubescens, a large woody bamboo with the highest ecological and economic value of all bamboos. This represents the third largest FL-cDNA collection to date of all plant species, and provides the first insight into the gene and genome structures of bamboos. We developed a Moso bamboo genomic resource database that so far contains the sequences of 10,608 putative FL-cDNAs and nearly 38,000 expressed sequence tags (ESTs) generated in this study. CONCLUSION: Analysis of FL-cDNA sequences shows that bamboo diverged from its close relatives such as rice, wheat, and barley through an adaptive radiation. A comparative analysis of the lignin biosynthesis pathway between bamboo and rice suggested that genes encoding caffeoyl-CoA O-methyltransferase may serve as targets for genetic manipulation of lignin content to reduce pollutants generated from bamboo pulping.
Subjects
Complementary DNA/genetics , Plant Genome/genetics , Poaceae/genetics , Alternative Splicing/genetics , Base Composition , Base Sequence , DNA Transposable Elements/genetics , Complementary DNA/chemistry , Genetic Databases , Expressed Sequence Tags , Lignin/biosynthesis , Minisatellite Repeats/genetics , Molecular Sequence Data , Phylogeny , Poaceae/classification
ABSTRACT
Although bacterial small noncoding RNAs (sRNAs) are known to play a critical role in various cellular processes, including pathogenesis, the identity and action of such sRNAs are still poorly understood in many organisms. Here we have performed a genome-wide screen and functional analysis of the sRNAs in Xanthomonas campestris pv. campestris (Xcc), an important phytopathogen. The 50-500-nt RNA fragments isolated from the wild-type strain grown in a virulence gene-inducing condition were sequenced and a total of 612 sRNA candidates (SRCs) were identified. The majority (82%) of the SRCs were derived from mRNA, rather than specific sRNA genes. A representative panel of 121 SRCs was analysed by northern blotting; 117 SRCs were detected, supporting the contention that the overwhelming majority of the 612 SRCs identified are indeed sRNAs. Phenotypic analysis of strains overexpressing different candidates showed that a particular sRNA, RsmU, acts as a negative regulator of virulence, the hypersensitive response, and cell motility in Xcc. In vitro electrophoretic mobility shift assay and in vivo coimmunoprecipitation analyses indicated that RsmU interacted with the global posttranscriptional regulator RsmA, although sequence analysis showed that RsmU is not a member of the sRNA families known to antagonize RsmA. Northern blotting analyses demonstrated that RsmU has two isoforms that are processed from the 3'-untranslated region of the mRNA of XC1332, which is predicted to encode ComEA, a periplasmic protein required for DNA uptake in bacteria. This work uncovers an unexpected major sRNA biogenesis strategy in bacteria and a hidden layer of sRNA-mediated virulence regulation in Xcc.
Subjects
Bacterial Proteins/genetics , Bacterial Genome/genetics , Small Untranslated RNA/genetics , Xanthomonas campestris/genetics , Plant Leaves/microbiology , RNA Isoforms/genetics , Messenger RNA/genetics , Virulence/genetics , Xanthomonas campestris/pathogenicity
ABSTRACT
Hybrid rice breeding, which exploits hybrid vigor (heterosis), has greatly increased grain yield. However, the heterosis-related genes associated with rice grain production remain largely unknown, partly because comprehensive mapping of heterosis-related traits is still labor-intensive and time-consuming. Here, we present a quantitative trait locus (QTL) mapping method, GradedPool-Seq, for rapidly mapping QTLs by whole-genome sequencing of graded-pool samples from F2 progeny via bulked-segregant analysis. We use this method together with map-based cloning to dissect the heterotic QTL GW3p6 from the female line. We then generate the near isogenic line NIL-FH676::GW3p6 by introgressing the GW3p6 allele from the female line Guangzhan63-4S into the male inbred line Fuhui676. NIL-FH676::GW3p6 exhibits a markedly higher grain yield than Fuhui676. This study demonstrates that it may be possible to achieve a high level of grain production in inbred rice lines without the need to construct hybrids.
Subjects
Chromosome Mapping/methods , Edible Grain/genetics , Hybrid Vigor/genetics , Oryza/genetics , Plant Breeding/methods , Plant Chromosomes/genetics , Quantitative Trait Loci/genetics
ABSTRACT
BACKGROUND: The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties, Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. RESULTS: The Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. For each cDNA, it also provides sequences, protein domain annotations, similarity search results, SNP and InDel information, and hyperlinks to the gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, the expression atlas in RiceGE, and the variation report in Gramene. CONCLUSION: The online rice indica cDNA database provides a cDNA resource with comprehensive information to researchers for functional analysis of the indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
Subjects
Complementary DNA/genetics , Genetic Databases , Genomics , Oryza/genetics , Plant Chromosomes , Clone Cells , Internet , Sequence Analysis
ABSTRACT
When published, this article did not initially appear open access. This error has been corrected, and the open access status of the paper is noted in all versions of the paper.
ABSTRACT
The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main source of variation in rice breeding. Large-scale resequencing has been undertaken to discover allelic variants in rice, but much of the information on genetic variation is often lost by direct mapping of short sequence reads onto the O. sativa japonica Nipponbare reference genome. Here we constructed a pan-genome dataset of the O. sativa-O. rufipogon species complex through deep sequencing and de novo assembly of 66 divergent accessions. Intergenomic comparisons identified 23 million sequence variants in the rice genome. This catalog of sequence variations includes many known quantitative trait nucleotides and will be helpful in pinpointing new causal variants that underlie complex traits. In particular, we systematically investigated the whole set of coding genes using this pan-genome data, which revealed extensive presence/absence variation among rice accessions. This pan-genome resource will further promote evolutionary and functional studies in rice.
Subjects
Agricultural Crops/genetics , Genetic Variation , Plant Genome , Genomics/methods , Oryza/genetics , Domestication , High-Throughput Nucleotide Sequencing , Oryza/classification , Plant Breeding , DNA Sequence Analysis
ABSTRACT
Oilseed crops are used to produce vegetable oil. Sesame (Sesamum indicum), an oilseed crop grown worldwide, has high oil content and a small diploid genome, but the genetic basis of oil production and quality is unclear. Here we sequence 705 diverse sesame varieties to construct a haplotype map of the sesame genome and de novo assemble two representative varieties to identify sequence variations. We investigate 56 agronomic traits in four environments and identify 549 associated loci. Examination of the major loci identifies 46 candidate causative genes, including genes related to oil content, fatty acid biosynthesis and yield. Several of the candidate genes for oil content encode enzymes involved in oil metabolism. Two major genes associated with lignification and black pigmentation in the seed coat are also associated with large variation in oil content. These findings may inform breeding and improvement strategies for a broad range of oilseed crops.
Subjects
Genome-Wide Association Study , Sesame Oil/biosynthesis , Sesamum/genetics , Amino Acid Sequence , Plant Genes , Molecular Sequence Data , Seeds/metabolism , Sesamum/metabolism
ABSTRACT
Exploitation of heterosis is one of the most important applications of genetics in agriculture. However, the genetic mechanisms of heterosis are only partly understood, and a global view of heterosis from a representative number of hybrid combinations is lacking. Here we develop an integrated genomic approach to construct a genome map for 1,495 elite hybrid rice varieties and their inbred parental lines. We investigate 38 agronomic traits and identify 130 associated loci. In-depth analyses of the effects of heterozygous genotypes reveal that there are only a few loci with strong overdominance effects in hybrids, but a strong correlation is observed between the yield and the number of superior alleles. While most parental inbred lines have only a small number of superior alleles, high-yielding hybrid varieties have several. We conclude that the accumulation of numerous rare superior alleles with positive dominance is an important contributor to the heterotic phenomena.
Subjects
Alleles , Plant Genome , Hybrid Vigor/genetics , Genetic Hybridization , Oryza/genetics , Agriculture , Gene-Environment Interaction , Genome-Wide Association Study , Heterozygote , Phenotype , Heritable Quantitative Trait
ABSTRACT
The grass carp is an important farmed fish, accounting for ~16% of global freshwater aquaculture, and has a vegetarian diet. Here we report a 0.9-Gb draft genome of a gynogenetic female adult and a 1.07-Gb genome of a wild male adult. Genome annotation identified 27,263 protein-coding gene models in the female genome. A total of 114 scaffolds consisting of 573 Mb are anchored on 24 linkage groups. Divergence between grass carp and zebrafish is estimated to have occurred 49-54 million years ago. We identify a chromosome fusion in grass carp relative to zebrafish and report frequent crossovers between the grass carp X and Y chromosomes. We find that transcriptional activation of the mevalonate pathway and steroid biosynthesis in liver is associated with the grass carp's adaptation from a carnivorous to an herbivorous diet. We believe that the grass carp genome could serve as an initial platform for breeding better-quality fish using a genomic approach.
Subjects
Carps/genetics , Biological Adaptation/genetics , Animals , Molecular Evolution , Female , Fish Proteins/genetics , Fish Proteins/metabolism , Genome , Herbivory/genetics , Male , Molecular Sequence Annotation , Molecular Sequence Data , DNA Sequence Analysis , Transcriptome
ABSTRACT
Bamboo represents the only major lineage of grasses that is native to forests and is one of the most important non-timber forest products in the world. However, no species in the Bambusoideae subfamily has been sequenced. Here, we report a high-quality draft genome sequence of moso bamboo (P. heterocycla var. pubescens). The 2.05-Gb assembly covers 95% of the genomic region. Gene prediction modeling identified 31,987 genes, most of which are supported by cDNA and deep RNA sequencing data. Analyses of clustered gene families and gene collinearity show that bamboo underwent whole-genome duplication 7-12 million years ago. Identification of gene families that are key in cell wall biosynthesis suggests that the whole-genome duplication event generated more gene duplicates involved in bamboo shoot development. RNA sequencing analysis of bamboo flowering tissues suggests a potential connection between drought-responsive and flowering genes.
Subjects
Bambusa/genetics , Cell Wall/metabolism , Droughts , Flowers/genetics , Plant Genes , Plant Genome , Trees/genetics , Bambusa/growth & development , Cell Wall/genetics , Plant DNA/genetics , Plant Gene Expression Regulation , Multigene Family , Plant RNA/genetics
ABSTRACT
Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.
Subjects
Genetic Variation , Plant Genome , Genome-Wide Association Study , Haplotypes , Heritable Quantitative Trait , Setaria Plant/genetics , China , Computational Biology , Population Genetics , Genomics , INDEL Mutation , Linkage Disequilibrium , Molecular Sequence Annotation , Phenotype , Phylogeny , Phylogeography , Single Nucleotide Polymorphism
ABSTRACT
A high-density haplotype map recently enabled a genome-wide association study (GWAS) in a population of indica subspecies of Chinese rice landraces. Here we extend this methodology to a larger and more diverse sample of 950 worldwide rice varieties, including the Oryza sativa indica and Oryza sativa japonica subspecies, to perform an additional GWAS. We identified a total of 32 new loci associated with flowering time and with ten grain-related traits, indicating that the larger sample increased the power to detect trait-associated variants using GWAS. To characterize various alleles and complex genetic variation, we developed an analytical framework for haplotype-based de novo assembly of the low-coverage sequencing data in rice. We identified candidate genes for 18 associated loci through detailed annotation. This study shows that the integrated approach of sequence-based GWAS and functional genome annotation has the potential to match complex traits to their causal polymorphisms in rice.
Subjects
Genome-Wide Association Study , Oryza/genetics , Edible Grain/genetics , Flowers/genetics , Gene Expression Profiling , Plant Genes , Population Genetics , Haplotypes , Genetic Polymorphism , DNA Sequence Analysis
ABSTRACT
Uncovering the genetic basis of agronomic traits in crop landraces that have adapted to various agro-climatic conditions is important to world food security. Here we have identified ~3.6 million SNPs by sequencing 517 rice landraces and constructed a high-density haplotype map of the rice genome using a novel data-imputation method. We performed genome-wide association studies (GWAS) for 14 agronomic traits in the population of Oryza sativa indica subspecies. The loci identified through GWAS explained ~36% of the phenotypic variance, on average. The peak signals at six loci were tied closely to previously identified genes. This study provides a fundamental resource for rice genetics research and breeding, and demonstrates that an approach integrating second-generation genome sequencing and GWAS can be used as a powerful complementary strategy to classical biparental cross-mapping for dissecting complex traits in rice.