1.
Rice (N Y) ; 8(1): 34, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26606925

ABSTRACT

Traditional rice varieties harbour a large store of genetic diversity with potential to accelerate rice improvement. For a long time, this diversity maintained in the International Rice Genebank has not been fully used because of a lack of genome information. The publication of the first reference genome of Nipponbare by the International Rice Genome Sequencing Project (IRGSP) marked the beginning of a systematic exploration and use of rice diversity for genetic research and breeding. Since then, the Nipponbare genome has served as the reference for the assembly of many additional genomes. The recently completed 3000 Rice Genomes Project together with the public database (SNP-Seek) provides a new genomic and data resource that enables the identification of useful accessions for breeding. Using disease resistance traits as case studies, we demonstrated the power of allele mining in the 3,000 genomes for extracting accessions from the Genebank for targeted phenotyping. Although potentially useful landraces can now be identified, their use in breeding is often hindered by unfavourable linkages. Efficient breeding designs are much needed to transfer the useful diversity to breeding. Multi-parent Advanced Generation InterCross (MAGIC) is a breeding design to produce highly recombined populations. The MAGIC approach can be used to generate pre-breeding populations with increased genotypic diversity and reduced linkage drag. Allele mining combined with a multi-parent breeding design can help convert useful diversity into breeding-ready genetic resources.
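
The allele-mining step mentioned in this abstract can be pictured as a filter over genotype calls. The following is only a minimal sketch under stated assumptions: the accession names, SNP identifiers and the dictionary layout are hypothetical, and it does not use the SNP-Seek interface or reproduce the authors' procedure.

```python
def mine_accessions(genotypes, favourable):
    """Return accessions that carry every favourable allele.

    genotypes:  {accession: {snp_id: allele}}  (hypothetical layout)
    favourable: {snp_id: allele} describing the haplotype of interest
    """
    selected = []
    for accession, calls in genotypes.items():
        # A missing call counts as a mismatch, so the accession is skipped.
        if all(calls.get(snp) == allele for snp, allele in favourable.items()):
            selected.append(accession)
    return sorted(selected)


# Hypothetical genotype calls for three accessions at two SNPs.
example_genotypes = {
    "ACC_1": {"snp_a": "T", "snp_b": "G"},
    "ACC_2": {"snp_a": "C", "snp_b": "G"},
    "ACC_3": {"snp_a": "T"},
}
print(mine_accessions(example_genotypes, {"snp_a": "T", "snp_b": "G"}))
```

Accessions returned by such a filter would be candidates for targeted phenotyping, not confirmed carriers of the trait.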

2.
BMC Genomics ; 11: 308, 2010 May 16.
Article in English | MEDLINE | ID: mdl-20470436

ABSTRACT

BACKGROUND: The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS: Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS: Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.


Subjects
Codon/chemistry, Codon/genetics, Poaceae/genetics, Base Composition, DNA Methylation, Gene Conversion, Plant Gene Expression Regulation, Plant Genes/genetics, Genetic Variation, Genomics, Introns/genetics, Oryza/genetics, Nucleic Acid Sequence Homology, Sorghum/genetics, TATA Box/genetics, Zea mays/genetics
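
The GC3 measure used in this abstract, the share of G or C at the third (wobble) codon position, is simple to compute. A minimal sketch, assuming an in-frame coding sequence given as a plain string; the function name is illustrative and is not taken from the paper.

```python
def gc3_content(cds: str) -> float:
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is in frame, starting at the first base of a codon.
    Trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop an incomplete final codon
    third_bases = seq[2:usable:3]         # bases at codon positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Codons ATG, GCC, AAA have third bases G, C, A.
print(gc3_content("ATGGCCAAA"))
```

Plotting a histogram of this value over all genes of a genome is one way to look for the two gene classes the abstract describes; any cut-off between the classes would have to be chosen from that distribution.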
3.
OMICS ; 13(2): 139-51, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19231992

ABSTRACT

The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allow for a more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. Strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences.


Subjects
Gene Expression, Genome-Wide Association Study, Genetic Promoter Regions, Algorithms, Animals, Arabidopsis/genetics, Mice
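
Components (2) and (3) of the approach in this abstract, position relative to the transcription start site and statistical significance, can be illustrated with a position-restricted enrichment test. This is a sketch under assumptions, not the authors' method: it assumes each promoter string ends at the TSS, that the co-expressed set is a subset of the background set, and it uses a one-sided hypergeometric test as a stand-in for whatever statistic the paper applied.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N items are drawn from M, of which n are successes."""
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / comb(M, N)


def motif_in_window(promoter, motif, window):
    """True if `motif` lies fully inside the window.

    `window` holds negative offsets from the TSS, e.g. (-35, -20);
    the promoter string is assumed to end at the TSS.
    """
    start, end = window
    lo = max(0, len(promoter) + start)
    hi = max(0, len(promoter) + end)
    return motif in promoter[lo:hi]


def motif_enrichment(target, background, motif, window):
    """p-value for over-representation of the motif in `target` promoters."""
    hits_bg = sum(motif_in_window(p, motif, window) for p in background)
    hits_target = sum(motif_in_window(p, motif, window) for p in target)
    return hypergeom_sf(hits_target, len(background), hits_bg, len(target))


# Toy data: 3 of 10 background promoters carry TATAAA about 30 bases upstream.
with_box = "G" * 20 + "TATAAA" + "G" * 24
without_box = "G" * 50
background = [with_box] * 3 + [without_box] * 7
target = [with_box] * 3
print(motif_enrichment(target, background, "TATAAA", (-35, -20)))
```

When many motifs and windows are scanned, the resulting p-values would need a multiple-testing correction before any motif is called significant.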
4.
Plant Mol Biol ; 69(1-2): 179-94, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18937034

ABSTRACT

We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).


Subjects
Complementary DNA/genetics, Plant Genes, Zea mays/genetics, Alternative Splicing, Base Sequence, DNA Primers, Expressed Sequence Tags, Genetic Promoter Regions, Genetic Transcription
5.
Plant Mol Biol ; 60(1): 69-85, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16463100

ABSTRACT

Arabidopsis is currently the reference genome for higher plants. A new, more detailed statistical analysis of Arabidopsis gene structure is presented including intron and exon lengths, intergenic distances, features of promoters, and variant 5'-ends of mRNAs transcribed from the same transcription unit. We also provide a statistical characterization of Arabidopsis transcripts in terms of their size, UTR lengths, 3'-end cleavage sites, splicing variants, and coding potential. These analyses were facilitated by scrutiny of our collection of sequenced full-length cDNAs and much larger collection of 5'-ESTs, together with another set of full-length cDNAs from Salk/Stanford/Plant Gene Expression Center/RIKEN. Examples of alternative splicing are observed for transcripts from 7% of the genes and many of these genes display multiple spliced isoforms. Most splicing variants lie in non-coding regions of the transcripts. Non-canonical splice sites constitute less than 1% of all splice sites. Genes with fewer than four introns display reduced average mRNA levels. Putative alternative transcription start sites were observed in 30% of highly expressed genes and in more than 50% of the genes with low expression. Transcription start sites correlate remarkably well with a CG skew peak in the DNA sequences. The intergenic distances vary considerably, those where genes are transcribed towards one another being significantly shorter. New transcripts, missing in the current TIGR genome annotation and ESTs that are non-coding, including those antisense to known genes, are derived and cataloged in the Supplementary Material. They identify 148 new loci in the Arabidopsis genome. The conclusions drawn provide a better understanding of the Arabidopsis genome and how the gene transcripts are processed. 
The results also allow better predictions to be made for, as yet, poorly defined genes and provide a reference for comparisons with other plant genomes whose complete sequences are currently being determined. Some comparisons with rice are included in this paper.


Subjects
Arabidopsis/genetics, Complementary DNA/genetics, Plant Genes/genetics, Plant Genome, Alternative Splicing, Base Sequence, Intergenic DNA, Plant DNA/genetics, Exons/genetics, Gene Expression Profiling, Plant Gene Expression Regulation, Introns/genetics, Transcription Initiation Site
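
The CG skew signal that this abstract relates to transcription start sites can be computed as a sliding-window profile. A minimal sketch, assuming the common definition (C - G) / (C + G); the window and step sizes are arbitrary illustrative defaults, and the abstract does not state which parameters the authors used.

```python
def cg_skew(seq, window=50, step=10):
    """Sliding-window CG skew, (C - G) / (C + G), along a DNA sequence.

    Returns a list of (window_centre, skew) pairs. Windows with no C or G
    are reported with a skew of 0.0.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        c, g = chunk.count("C"), chunk.count("G")
        skew = (c - g) / (c + g) if c + g else 0.0
        profile.append((start + window // 2, skew))
    return profile


# Toy sequence: a C-rich half followed by a G-rich half.
print(cg_skew("CCCCGGGG", window=4, step=4))
```

Averaging such profiles over many genes aligned at their annotated start sites is one way a skew peak near the TSS could be made visible.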
6.
Int J Bioinform Res Appl ; 1(3): 335-50, 2005.
Article in English | MEDLINE | ID: mdl-18048140

ABSTRACT

Determination of protein function and biological pathway is one of the most challenging problems in the post-genomic era. To address this challenge, we have developed a new integrated probabilistic method for cellular function prediction using microarray gene expression profiles, in conjunction with predicted protein-protein interactions and annotations of known proteins. Our approach is based on a novel assessment for the relationship between correlation of two genes' expression profiles and their functional relationship in terms of the Gene Ontology (GO) hierarchy. We applied the method for function prediction of hypothetical genes in Arabidopsis. We have also extended our method using Dijkstra's algorithm to identify the components and topology of signaling pathway of phosphatidic acid as a second messenger in Arabidopsis.


Subjects
Arabidopsis, Proteins, Algorithms, Arabidopsis/genetics, Gene Expression Profiling, Genome, Genomics, Proteins/metabolism
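
The use of Dijkstra's algorithm mentioned in this abstract can be illustrated on a small confidence-weighted interaction network. This is a sketch under assumptions: the edge cost -log(confidence), which makes the shortest path the most probable chain of interactions, is an illustrative choice and not necessarily the weighting used in the paper, and the node names are hypothetical.

```python
import heapq
from math import log


def most_reliable_path(edges, source, target):
    """Dijkstra's algorithm on an undirected network of scored interactions.

    edges: iterable of (node_a, node_b, confidence) with 0 < confidence <= 1.
    Returns the node list of the highest-confidence path, or None.
    """
    graph = {}
    for a, b, conf in edges:
        cost = -log(conf)                 # high confidence means low cost
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue                      # stale queue entry
        for neighbour, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Hypothetical network: the two-step route (0.9 * 0.9) beats the direct edge (0.5).
toy_edges = [("A", "B", 0.9), ("B", "C", 0.9), ("A", "C", 0.5)]
print(most_reliable_path(toy_edges, "A", "C"))
```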
7.
J Mol Biol ; 339(3): 647-78, 2004 Jun 04.
Article in English | MEDLINE | ID: mdl-15147847

ABSTRACT

The assignment of protein domains from three-dimensional structure is critically important in understanding protein evolution and function, yet little quality assurance has been performed. Here, the differences in the assignment of structural domains are evaluated using six common assignment methods. Three human expert methods (AUTHORS (authors' annotation), CATH and SCOP) and three fully automated methods (DALI, DomainParser and PDP) are investigated by analysis of individual methods against the authors' assignment as well as analysis based on the consensus among groups of methods (only expert, only automatic, combined). The results demonstrate that caution is recommended in using current domain assignments and indicate where additional work is needed. Specifically, the major factors responsible for conflicting domain assignments between methods, both expert and automatic, are: (1) the definition of very small domains; (2) splitting secondary structures between domains; (3) the size and number of discontinuous domains; (4) closely packed or convoluted domain-domain interfaces; (5) structures with large and complex architectures; and (6) the level of significance placed upon structural, functional and evolutionary concepts in considering structural domain definitions. A web-based resource that focuses on the results of benchmarking and the analysis of domain assignments is available at


Subjects
Proteins/chemistry, Algorithms, Molecular Models, Protein Conformation
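
The consensus analysis among groups of methods described in this abstract can be reduced, in its simplest form, to a majority vote on the number of domains per protein chain. The sketch below is a deliberate simplification: the benchmark also concerns domain boundaries, which are not modelled here, and the example counts are invented for illustration.

```python
from collections import Counter


def consensus_domain_count(assignments):
    """Majority vote on the number of domains assigned to one chain.

    assignments: {method_name: number_of_domains}
    Returns the count supported by more than half of the methods, or None
    when no strict majority exists.
    """
    if not assignments:
        return None
    votes = Counter(assignments.values())
    count, support = votes.most_common(1)[0]
    return count if support > len(assignments) / 2 else None


# Invented example: two expert methods agree, one automatic method differs.
print(consensus_domain_count({"CATH": 2, "SCOP": 2, "DALI": 3}))
```

Chains for which the function returns None are the conflicting cases, the kind of assignment the abstract says deserves caution.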
8.
Genome Biol ; 4(8): R51, 2003.
Article in English | MEDLINE | ID: mdl-12914659

ABSTRACT

Using an integrative genome annotation pipeline (iGAP) for proteome-wide protein structure and functional domain assignment, we analyzed all the proteins of Arabidopsis thaliana. Three-dimensional structures at the level of the domain are assigned by fold recognition and threading based on a novel fold library that extends common domain classifications. iGAP is being applied to proteins from all available proteomes as part of a comparative proteomics resource. The database is accessible from the web.


Subjects
Arabidopsis Proteins/genetics, Arabidopsis/genetics, Plant Genome, Proteomics/methods, Arabidopsis Proteins/classification, Proteome/genetics, Proteomics/classification, Software