1.
Genome Res ; 29(5): 870-880, 2019 05.
Article in English | MEDLINE | ID: mdl-30992303

ABSTRACT

Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.


Subjects
Genetic Variation , Plant Genome , Genomic Structural Variation , Genomics/methods , Oryza/genetics , Alleles , Chromosome Mapping , DNA Transposable Elements , Genome-Wide Association Study/methods , Phenotype , DNA Sequence Analysis/methods , Physiological Stress/genetics
2.
Brief Bioinform ; 20(2): 565-571, 2019 03 25.
Article in English | MEDLINE | ID: mdl-29659709

ABSTRACT

Improving productivity of the staple crops wheat and rice is essential to feed the growing global population, particularly in the context of a changing climate. However, current rates of yield gain are insufficient to support the predicted population growth. New approaches are required to accelerate the breeding process, and many of these are driven by the application of large-scale crop data. To leverage the substantial volumes and types of data that can be applied for precision breeding, the wheat and rice research communities are working towards the development of integrated systems to access and standardize the dispersed, heterogeneous available data. Here, we outline the initiatives of the International Wheat Information System (WheatIS) and the International Rice Informatics Consortium (IRIC) to establish Web-based single-access systems and data mining tools to make the available resources more accessible, drive discovery and accelerate the production of new crop varieties. We discuss the progress of WheatIS and IRIC towards unifying specialized wheat and rice databases and building custom software platforms to manage and interrogate these data. Single-access crop information systems will strengthen scientific collaboration, optimize the use of public research funds and help achieve the required yield gains in the two most important global food crops.


Subjects
Agricultural Crops/growth & development , Information Systems , Oryza/growth & development , Triticum/growth & development
3.
Nucleic Acids Res ; 45(D1): D1075-D1081, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899667

ABSTRACT

We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org.


Subjects
Nucleic Acid Databases , Plant Genome , INDEL Mutation , Oryza/genetics , Single Nucleotide Polymorphism , Search Engine , Software , Alleles , Computational Biology/methods , Gene Frequency , Genetic Loci , Genomics/methods , Genotype , User-Computer Interface , Web Browser
4.
Nucleic Acids Res ; 43(Database issue): D1023-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25429973

ABSTRACT

We have identified about 20 million rice SNPs by aligning reads from the 3000 Rice Genomes Project with the Nipponbare genome. The SNPs and allele information are organized into the SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of an Oracle database holding close to 60 billion rows of SNP genotypes (20 M SNPs × 3 K rice lines) and a web interface for convenient querying. The database allows quick retrieval of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots.


Subjects
Nucleic Acid Databases , Plant Genome , Oryza/genetics , Single Nucleotide Polymorphism , Oryza/anatomy & histology
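
The row count quoted in the abstract above (20 M SNPs × 3 K rice lines, close to 60 billion) can be checked with a short calculation. This is only an arithmetic sketch: it assumes one stored row per (SNP, rice line) genotype, and the function name `genotype_rows` is hypothetical, not part of SNP-Seek.

```python
def genotype_rows(n_snps: int, n_lines: int) -> int:
    """Rows needed if every (SNP, rice line) genotype is stored as one row."""
    return n_snps * n_lines


# Round figures taken from the abstract: about 20 million SNPs, about 3,000 lines.
rows = genotype_rows(20_000_000, 3_000)
print(f"{rows:,}")  # 60,000,000,000
```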
6.
HLA ; 102(5): 599-606, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37580306

ABSTRACT

Analysis of publicly available whole-genome sequence data from the Human Pangenome Project and the 1000 Genomes Project has identified a DNA segment of approximately 60 kb in the major histocompatibility complex (MHC) between HLA-W and HLA-J that is present in some MHC haplotypes but not others. This DNA segment is largely repeat element-rich but includes the pseudogene HLA-Y, thus pinpointing the location of this pseudogene, and a new HLA class I sequence we have called HLA-OLI. HLA-OLI clusters phylogenetically with the HLA class I pseudogenes, HLA-P and HLA-W, and appears to have a similar genetic structure. The availability of whole-genome sequence data from diverse populations enables a detailed characterization of the MHC at the population level and will have implications for understanding MHC disease associations and the non-HLA MHC factors that impact unrelated hematopoietic cell transplant outcomes.

7.
BMC Genomics ; 11: 308, 2010 May 16.
Article in English | MEDLINE | ID: mdl-20470436

ABSTRACT

BACKGROUND: The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS: Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS: Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.


Subjects
Codon/chemistry , Codon/genetics , Poaceae/genetics , Base Composition , DNA Methylation , Gene Conversion , Plant Gene Expression Regulation , Plant Genes/genetics , Genetic Variation , Genomics , Introns/genetics , Oryza/genetics , Nucleic Acid Sequence Homology , Sorghum/genetics , TATA Box/genetics , Zea mays/genetics
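
The GC3 measure named in the abstract above, the share of G or C at the third (wobble) codon position, is simple to compute for a single coding sequence. The sketch below is illustrative only: it assumes the input is an in-frame coding sequence, the function name `gc3` is hypothetical, and it is not the authors' pipeline.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = seq[2:usable:3]          # every third base, starting at codon position 3
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG, GCC, AAA have wobble bases G, C, A, so two of three are G or C.
print(round(gc3("ATGGCCAAA"), 3))  # 0.667
```

Computing this value for every gene in a genome and plotting the distribution is one way to look for the two gene classes the abstract describes; no classification threshold is given there, so none is assumed here.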
8.
Rice (N Y) ; 13(1): 72, 2020 Oct 09.
Article in English | MEDLINE | ID: mdl-33034758

ABSTRACT

BACKGROUND: Crop microbial communities are shaped by interactions between the host, microbes and the environment; however, their relative contribution is beginning to be understood. Here, we explore these interactions in the leaf bacterial community across 3024 rice accessions. FINDINGS: By using unmapped DNA sequencing reads as microbial reads, we characterized the structure of the rice bacterial microbiome. We identified central bacterial taxa that emerge as microbial "hubs" and may have an influence on the network of host-microbe interactions. We found regions in the rice genome that might control the assembly of these microbial hubs. To our knowledge, this is one of the first studies that uses raw data from plant genome sequencing projects to characterize the leaf bacterial communities. CONCLUSION: We showed that the structure of the rice leaf microbiome is modulated by multiple interactions among host, microbes, and environment. Our data provide insight into the factors influencing microbial assemblage in the rice leaf and also open the door for future initiatives to modulate rice consortia for crop improvement efforts.

9.
Sci Data ; 7(1): 113, 2020 04 07.
Article in English | MEDLINE | ID: mdl-32265447

ABSTRACT

As the human population grows from 7.8 billion to 10 billion over the next 30 years, breeders must do everything possible to create crops that are highly productive and nutritious, while simultaneously having less of an environmental footprint. Rice will play a critical role in meeting this demand and thus, knowledge of the full repertoire of genetic diversity that exists in germplasm banks across the globe is required. To meet this demand, we describe the generation, validation and preliminary analyses of transposable element and long-range structural variation content of 12 near-gap-free reference genome sequences (RefSeqs) from representatives of 12 of 15 subpopulations of cultivated Asian rice. When combined with 4 existing RefSeqs, that represent the 3 remaining rice subpopulations and the largest admixed population, this collection of 16 Platinum Standard RefSeqs (PSRefSeq) can be used as a template to map resequencing data to detect virtually all standing natural variation that exists in the pan-genome of cultivated Asian rice.


Subjects
Plant Genome , Oryza/genetics , Agricultural Crops/genetics , Genetic Variation , Genomics
10.
Sci Rep ; 9(1): 1536, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30733489

ABSTRACT

Plant disease resistance that is durable and effective against diverse pathogens (broad-spectrum) is essential to stabilize crop production. Such resistance is frequently controlled by Quantitative Trait Loci (QTL), and often involves differential regulation of Defense Response (DR) genes. In this study, we sought to understand how expression of DR genes is orchestrated, with the long-term goal of enabling genome-wide breeding for more effective and durable resistance. We identified short sequence motifs in rice promoters that are shared across Broad-Spectrum DR (BS-DR) genes co-expressed after challenge with three major rice pathogens (Magnaporthe oryzae, Rhizoctonia solani, and Xanthomonas oryzae pv. oryzae) and several chemical elicitors. Specific groupings of these BS-DR-associated motifs, called cis-Regulatory Modules (CRMs), are enriched in DR gene promoters, and the CRMs include cis-elements known to be involved in disease resistance. Polymorphisms in CRMs occur in promoters of genes in resistant relative to susceptible BS-DR haplotypes providing evidence that these CRMs have a predictive role in the contribution of other BS-DR genes to resistance. Therefore, we predict that a CRM signature within BS-DR gene promoters can be used as a marker for future breeding practices to enrich for the most responsive and effective BS-DR genes across the genome.


Subjects
Disease Resistance/genetics , Oryza/genetics , Plant Diseases/genetics , Transcriptional Regulatory Elements/genetics , Base Sequence , Plant Genome , Haplotypes , Karyopherins/chemistry , Karyopherins/genetics , Karyopherins/metabolism , Magnaporthe/pathogenicity , Plant Proteins/genetics , Plant Proteins/metabolism , Genetic Promoter Regions , Quantitative Trait Loci , Small Interfering RNA/metabolism , Cytoplasmic and Nuclear Receptors/chemistry , Cytoplasmic and Nuclear Receptors/genetics , Cytoplasmic and Nuclear Receptors/metabolism , Rhizoctonia/pathogenicity , Exportin 1 Protein
11.
Gigascience ; 8(5)2019 05 01.
Article in English | MEDLINE | ID: mdl-31107941

ABSTRACT

BACKGROUND: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer savvy rice researchers. FINDINGS: The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer includes tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, population diversity, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant to Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. CONCLUSIONS: Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.


Subjects
Genetic Databases , Genomics/methods , Oryza/genetics , Plant Breeding/methods , Software , Seed Bank
12.
Nat Commun ; 9(1): 3519, 2018 08 29.
Article in English | MEDLINE | ID: mdl-30158584

ABSTRACT

As sequencing and genotyping technologies evolve, crop genetics researchers accumulate increasing numbers of genomic data sets from various genotyping platforms on different germplasm panels. Imputation is an effective approach to increase marker density of existing data sets toward the goal of integrating resources for downstream applications. While a number of imputation software packages are available, the limitations to utilization for the rice community include high computational demand and lack of a reference panel. To address these challenges, we develop the Rice Imputation Server, a publicly available web application leveraging genetic information from a globally diverse rice reference panel assembled here. This resource allows researchers to benefit from increased marker density without needing to perform imputation on their own machines. We demonstrate improvements that imputed data provide to rice genome-wide association (GWA) results of grain amylose content and show that the major functional nucleotide polymorphism is tagged only in the imputed data set.


Subjects
Oryza/genetics , Gene Frequency/genetics , Genome-Wide Association Study , Genotype , Single Nucleotide Polymorphism/genetics
14.
Nat Genet ; 50(2): 285-296, 2018 02.
Article in English | MEDLINE | ID: mdl-29358651

ABSTRACT

The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that, despite few large-scale chromosomal rearrangements, rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young 'AA' subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 'Miracle Rice', which relieved famine and drove the Green Revolution in Asia 50 years ago.


Subjects
Agricultural Crops/genetics , Molecular Evolution , Genetic Variation , Oryza/classification , Oryza/genetics , Conserved Sequence , Domestication , Genetic Speciation , Plant Genome , Phylogeny
15.
Sci Rep ; 7(1): 12478, 2017 09 29.
Article in English | MEDLINE | ID: mdl-28963534

ABSTRACT

In this study, we used 2.9 million single nucleotide polymorphisms (SNPs) and 393,429 indels derived from whole-genome sequences of 591 rice landraces to determine the genetic basis of cooked and raw grain length, width and shape using a genome-wide association study (GWAS). We identified a unique fine-mapped genetic region, GWi7.1, significantly associated with cooked and raw grain width. Additionally, GWi7.2, which harbors GL7/GW7, a cloned gene for grain dimension, was found. Novel regions in chromosomes 10 and 11 were also found to be associated with cooked grain shape and raw grain width, respectively. The indel-based GWAS identified fine-mapped genetic regions GL3.1 and GWi5.1 that matched synteny breakpoints between indica and japonica. GL3.1 was positioned a few kilobases away from GS3, a cloned gene for cooked and raw grain lengths in indica. GWi5.1 was found to be significantly associated with cooked and raw grain width. It anchors upstream of the cloned gene GW5, which varied between indica and japonica accessions. GWi11.1 is present inside the 3'-UTR of a functional gene in indica that corresponds to a syntenic break in chromosome 11 of japonica. Our results identified novel allelic structural variants and haplotypes confirmed using single-locus and multilocus SNP and indel-based GWAS.


Subjects
Chromosome Mapping , Plant Chromosomes/chemistry , Edible Grain/genetics , Oryza/genetics , Quantitative Trait Loci , Heritable Quantitative Trait , Alleles , Cooking , Edible Grain/anatomy & histology , Genome-Wide Association Study , Haplotypes , Linkage Disequilibrium , Oryza/anatomy & histology , Phenotype , Plant Proteins , Single Nucleotide Polymorphism , Whole Genome Sequencing
16.
Sci Rep ; 6: 35730, 2016 10 24.
Article in English | MEDLINE | ID: mdl-27774999

ABSTRACT

We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3' UTRs, the area rich with regulatory regions.


Subjects
Intergenic DNA/genetics , Plant Genome/genetics , Nucleotides/genetics , Single Nucleotide Polymorphism/genetics , 3' Untranslated Regions/genetics , Terminator Codon/genetics , Gene Regulatory Networks , Genomics/methods , Oryza/genetics , Genetic Transcription/genetics
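
The abstract above compares nucleotide diversity between genomic regions but does not say which estimator was used. As a hedged illustration, the sketch below uses one common choice: for a biallelic SNP, the expected number of pairwise differences per site is 2pq·n/(n−1), summed over SNPs and divided by the window length. The function names and the toy allele counts are hypothetical.

```python
from typing import Iterable


def site_diversity(alt_count: int, n_alleles: int) -> float:
    """Expected pairwise differences at one biallelic site: 2pq * n / (n - 1)."""
    if n_alleles < 2:
        raise ValueError("need at least two sampled alleles")
    p = alt_count / n_alleles  # frequency of the alternate allele
    return 2 * p * (1 - p) * n_alleles / (n_alleles - 1)


def window_diversity(alt_counts: Iterable[int], n_alleles: int, window_length: int) -> float:
    """Per-base nucleotide diversity of a window; sites without a SNP contribute zero."""
    return sum(site_diversity(c, n_alleles) for c in alt_counts) / window_length


# Toy window: 100 bp, 10 sampled alleles, two SNPs with alternate-allele counts 5 and 1.
print(window_diversity([5, 1], n_alleles=10, window_length=100))
```

Evaluating `window_diversity` in sliding windows around a transcription start site is one way to produce the kind of profile the abstract describes, under the assumptions stated above.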
17.
J Mol Biol ; 339(3): 647-78, 2004 Jun 04.
Article in English | MEDLINE | ID: mdl-15147847

ABSTRACT

The assignment of protein domains from three-dimensional structure is critically important in understanding protein evolution and function, yet little quality assurance has been performed. Here, the differences in the assignment of structural domains are evaluated using six common assignment methods. Three human expert methods (AUTHORS (authors' annotation), CATH and SCOP) and three fully automated methods (DALI, DomainParser and PDP) are investigated by analysis of individual methods against the authors' assignment as well as analysis based on the consensus among groups of methods (only expert, only automatic, combined). The results demonstrate that caution is recommended in using current domain assignments and indicate where additional work is needed. Specifically, the major factors responsible for conflicting domain assignments between methods, both expert and automatic, are: (1) the definition of very small domains; (2) splitting secondary structures between domains; (3) the size and number of discontinuous domains; (4) closely packed or convoluted domain-domain interfaces; (5) structures with large and complex architectures; and (6) the level of significance placed upon structural, functional and evolutionary concepts in considering structural domain definitions. A web-based resource that focuses on the results of benchmarking and the analysis of domain assignments is available at


Subjects
Proteins/chemistry , Algorithms , Molecular Models , Protein Conformation
18.
Rice (N Y) ; 8(1): 34, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26606925

ABSTRACT

Traditional rice varieties harbour a large store of genetic diversity with potential to accelerate rice improvement. For a long time, this diversity maintained in the International Rice Genebank has not been fully used because of a lack of genome information. The publication of the first reference genome of Nipponbare by the International Rice Genome Sequencing Project (IRGSP) marked the beginning of a systematic exploration and use of rice diversity for genetic research and breeding. Since then, the Nipponbare genome has served as the reference for the assembly of many additional genomes. The recently completed 3000 Rice Genomes Project together with the public database (SNP-Seek) provides a new genomic and data resource that enables the identification of useful accessions for breeding. Using disease resistance traits as case studies, we demonstrated the power of allele mining in the 3,000 genomes for extracting accessions from the GeneBank for targeted phenotyping. Although potentially useful landraces can now be identified, their use in breeding is often hindered by unfavourable linkages. Efficient breeding designs are much needed to transfer the useful diversity to breeding. Multi-parent Advanced Generation InterCross (MAGIC) is a breeding design to produce highly recombined populations. The MAGIC approach can be used to generate pre-breeding populations with increased genotypic diversity and reduced linkage drag. Allele mining combined with a multi-parent breeding design can help convert useful diversity into breeding-ready genetic resources.

19.
OMICS ; 8(4): 322-33, 2004.
Article in English | MEDLINE | ID: mdl-15703479

ABSTRACT

Characterizing gene function is one of the major challenging tasks in the post-genomic era. To address this challenge, we have developed GeneFAS (Gene Function Annotation System), a new integrated probabilistic method for cellular function prediction by combining information from protein-protein interactions, protein complexes, microarray gene expression profiles, and annotations of known proteins through an integrative statistical model. Our approach is based on a novel assessment for the relationship between (1) the interaction/correlation of two proteins' high-throughput data and (2) their functional relationship in terms of their Gene Ontology (GO) hierarchy. We have developed a Web server for the predictions. We have applied our method to yeast Saccharomyces cerevisiae and predicted functions for 1548 out of 2472 unannotated proteins.


Subjects
Fungal Gene Expression Regulation , Fungal Genes , Genome , Proteomics/methods , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae/genetics , Automation , Bayes Theorem , Databases as Topic , Protein Databases , Fungal Genome , Internet , Macromolecular Substances , Biological Models , Statistical Models , Oligonucleotide Array Sequence Analysis , Open Reading Frames , Phylogeny , Protein Binding , Protein Interaction Mapping , Tertiary Protein Structure , Sensitivity and Specificity , Software , Two-Hybrid System Techniques