ABSTRACT
BACKGROUND: Investigation of tomato genetic resources is crucial for advancing evolutionary and genetic studies as well as tomato breeding strategies. The traditional Vesuviano and San Marzano varieties grown in the Campania region (Southern Italy) are famous for their remarkable fruit quality. Owing to their economic and social importance, it is crucial to understand the genetic basis of their unique traits. RESULTS: Here, we present the draft genome sequences of the tomato varieties Vesuviano and San Marzano. A 40x genome coverage was obtained from a hybrid assembly of Illumina paired-end reads that combines de novo assembly with iterative mapping to the reference S. lycopersicum genome (SL2.40). Insertions, deletions and SNP variants were carefully measured. When assessed on the basis of the reference annotation, 30% of protein-coding genes are predicted to have variants in both varieties. Gene copy number and gene location were assessed by mapping mRNA transcripts, showing a closer relationship of San Marzano with the reference genome. Distinctive variations in key genes and transcription/regulation factors related to fruit quality have been revealed for both cultivars. CONCLUSIONS: This effort highlighted relationships between the varieties and important variants in key fruit processes, useful for dissecting the path from sequence variant to phenotype.
Subjects
Plant Genome , Solanum lycopersicum/genetics , Chromosome Mapping , Fruit/metabolism , Gene Deletion , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Single Nucleotide Polymorphism , DNA Sequence Analysis , Species Specificity
ABSTRACT
BACKGROUND: Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). RESULTS: We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. CONCLUSIONS: This study aimed at the characterization of the GST gene family in C. sinensis. 
Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.
Subjects
Citrus sinensis/enzymology , Citrus sinensis/genetics , Expressed Sequence Tags , Glutathione Transferase/genetics , Plant Gene Expression Regulation , Plant Proteins/genetics , Plant Proteins/metabolism
ABSTRACT
BACKGROUND: Wild potato Solanum bulbocastanum is a rich source of genetic resistance against a variety of pathogens. It belongs to a taxonomic group of wild potato species sexually isolated from cultivated potato. Consistent with genetic isolation, previous studies suggested that the genome of S. bulbocastanum (B genome) is structurally distinct from that of cultivated potato (A genome). However, the genome architecture of the species remains largely uncharacterized. The current study employed Diversity Arrays Technology (DArT) to generate a linkage map for S. bulbocastanum and compare its genome architecture with those of potato and tomato. RESULTS: Two S. bulbocastanum parental linkage maps comprising 458 and 138 DArT markers were constructed. The integrated map comprises 401 non-redundant markers distributed across 12 linkage groups for a total length of 645 cM. Sequencing and alignment of DArT clones to reference physical maps from tomato and cultivated potato allowed direct comparison of marker orders between species. A total of nine genomic segments informative in comparative genomic studies were identified. Seven genome rearrangements correspond to previously-reported structural changes that have occurred since the speciation of tomato and potato. We also identified two S. bulbocastanum genomic regions that differ from cultivated potato, suggesting possible chromosome divergence between Solanum A and B genomes. CONCLUSIONS: The linkage map developed here is the first medium density map of S. bulbocastanum and will assist mapping of agronomical genes and QTLs. The structural comparison with potato and tomato physical maps is the first genome wide comparison between Solanum A and B genomes and establishes a foundation for further investigation of B genome-specific structural chromosome rearrangements.
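The comparison of marker orders between a linkage map and a reference physical map, as described in this abstract, can be illustrated with a minimal sketch. It assumes each marker on one chromosome has a genetic position (cM) and a physical position (bp), and it flags marker pairs whose relative order differs between the two maps. The marker names and coordinates are hypothetical, and this is not the procedure used in the study.

```python
from itertools import combinations


def discordant_pairs(markers):
    """Return marker pairs whose order differs between the two maps.

    markers: list of (name, genetic_cM, physical_bp) tuples that all
    belong to the same linkage group / chromosome (hypothetical data).
    """
    discordant = []
    for (n1, g1, p1), (n2, g2, p2) in combinations(markers, 2):
        # A negative product means the pair is ordered one way on the
        # genetic map and the opposite way on the physical map.
        if (g1 - g2) * (p1 - p2) < 0:
            discordant.append((n1, n2))
    return discordant


# Hypothetical markers: m2 and m3 are swapped on the physical map.
example_markers = [
    ("m1", 0.0, 100),
    ("m2", 5.0, 900),
    ("m3", 10.0, 500),
    ("m4", 20.0, 1500),
]
swapped = discordant_pairs(example_markers)
```

A cluster of discordant pairs among neighbouring markers would be a candidate rearrangement to inspect further; isolated discordances are more likely mapping noise.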
Subjects
Plant Chromosomes/genetics , Solanum/genetics , Chromosome Mapping , Genetic Linkage , Genetic Markers , Plant Genome , Quantitative Trait Loci , DNA Sequence Analysis
ABSTRACT
Cross-species comparative genomics approaches have been employed to map and clone many important disease resistance (R) genes from Solanum species, especially wild relatives of potato and tomato. These efforts will increase with the recent release of the potato genome sequence and the impending release of the tomato genome sequence. Most R genes belong to the prominent nucleotide binding site-leucine rich repeat (NBS-LRR) class, and conserved NBS-LRR protein motifs enable a survey of the R gene space of a plant genome by generation of resistance gene analogs (RGA), polymerase chain reaction fragments derived from R genes. We generated a collection of 97 RGA from the disease-resistant wild potato S. bulbocastanum, complementing smaller collections from other Solanum species. To further comparative genomics approaches, we combined all known Solanum RGA and cloned solanaceous NBS-LRR gene sequences, nearly 800 sequences in total, into a single meta-analysis. We defined R gene diversity bins that reflect both evolutionary relationships and DNA cross-hybridization results. The resulting framework is amendable and expandable, providing the research community with a common vocabulary for present and future study of R gene lineages. Through a series of sequence and hybridization experiments, we demonstrate that all tested R gene lineages are of ancient origin, are shared between Solanum species, and can be successfully accessed via comparative genomics approaches.
Subjects
Comparative Genomic Hybridization , Disease Resistance/genetics , Plant Genome/genetics , Plant Diseases/immunology , Solanum/genetics , Amino Acid Motifs , Base Sequence , Molecular Evolution , Plant Genes/genetics , Genomics , Leucine-Rich Repeat Proteins , Molecular Sequence Data , Phylogeny , Plant Diseases/microbiology , Plant Proteins/genetics , Proteins/genetics , DNA Sequence Analysis , Solanum/immunology , Solanum tuberosum/genetics , Solanum tuberosum/immunology , Species Specificity
ABSTRACT
Luminal-like breast tumor cells express estrogen receptor alpha (ERalpha), a member of the nuclear receptor family of ligand-activated transcription factors that controls their proliferation, survival, and functional status. To identify the molecular determinants of this hormone-responsive tumor phenotype, a comprehensive genome-wide analysis was performed in estrogen stimulated MCF-7 and ZR-75.1 cells by integrating time-course mRNA expression profiling with global mapping of genomic ERalpha binding sites by chromatin immunoprecipitation coupled to massively parallel sequencing, microRNA expression profiling, and in silico analysis of transcription units and receptor binding regions identified. All 1270 genes that were found to respond to 17beta-estradiol in both cell lines cluster in 33 highly concordant groups, each of which showed defined kinetics of RNA changes. This hormone-responsive gene set includes several direct targets of ERalpha and is organized in a gene regulation cascade, stemming from ligand-activated receptor and reaching a large number of downstream targets via AP-2gamma, B-cell activating transcription factor, E2F1 and 2, E74-like factor 3, GTF2IRD1, hairy and enhancer of split homologue-1, MYB, SMAD3, RARalpha, and RXRalpha transcription factors. MicroRNAs are also integral components of this gene regulation network because miR-107, miR-424, miR-570, miR-618, and miR-760 are regulated by 17beta-estradiol along with other microRNAs that can target a significant number of transcripts belonging to one or more estrogen-responsive gene clusters.
Subjects
Breast Neoplasms/metabolism , Estrogen Receptor alpha/physiology , Gene Expression Profiling , Neoplastic Gene Expression Regulation , MicroRNAs/genetics , Transcription Factors/metabolism , Binding Sites , Tumor Cell Line , Chromatin Immunoprecipitation , Estradiol/metabolism , Estrogen Receptor alpha/metabolism , Humans , Kinetics , MicroRNAs/metabolism , Biological Models , Oligonucleotide Array Sequence Analysis , RNA/metabolism
ABSTRACT
BACKGROUND: Since no genome sequences of solanaceous plants have yet been completed, expressed sequence tag (EST) collections represent a reliable tool for broad sampling of Solanaceae transcriptomes, an attractive route for understanding Solanaceae genome functionality and a powerful reference for the structural annotation of emerging Solanaceae genome sequences. DESCRIPTION: We describe the SolEST database http://biosrv.cab.unina.it/solestdb which integrates different EST datasets from both cultivated and wild Solanaceae species and from two species of the genus Coffea. Background as well as processed data contained in the database, extensively linked to external related resources, represent an invaluable source of information for these plant families. Two novel features differentiate SolEST from other resources: i) the option of accessing and then visualizing Solanaceae EST/TC alignments along the emerging tomato and potato genome sequences; ii) the opportunity to compare different Solanaceae assemblies generated by diverse research groups in the attempt to address a common complaint in the SOL community. CONCLUSION: Different databases have been established worldwide for collecting Solanaceae ESTs and are related in concept, content and utility to the one presented herein. However, the SolEST database has several distinguishing features that make it appealing for the research community and facilitates a "one-stop shop" for the study of Solanaceae transcriptomes.
Subjects
Genetic Databases , Expressed Sequence Tags , Gene Expression Profiling , Solanaceae/genetics , Chromosome Mapping , Plant DNA/genetics , Plant Genome , Microsatellite Repeats , Sequence Alignment , DNA Sequence Analysis , User-Computer Interface
ABSTRACT
BACKGROUND: Present-day '-omics' technologies produce overwhelming amounts of data which include genome sequences, information on gene expression (transcripts and proteins) and on cell metabolic status. These data represent multiple aspects of a biological system and need to be investigated as a whole to shed light on the mechanisms which underpin the system functionality. The gathering and convergence of data generated by high-throughput technologies, the effective integration of different data-sources and the analysis of the information content based on comparative approaches are key methods for meaningful biological interpretations. In the frame of the International Solanaceae Genome Project, we propose here ISOLA, an Italian SOLAnaceae genomics resource. RESULTS: ISOLA (available at http://biosrv.cab.unina.it/isola) represents a trial platform and it is conceived as a multi-level computational environment. ISOLA currently consists of two main levels: the genome and the expression level. The cornerstone of the genome level is represented by the Solanum lycopersicum genome draft sequences generated by the International Tomato Genome Sequencing Consortium. Instead, the basic element of the expression level is the transcriptome information from different Solanaceae species, mainly in the form of species-specific comprehensive collections of Expressed Sequence Tags (ESTs). The cross-talk between the genome and the expression levels is based on data source sharing and on tools that enhance data quality, that extract information content from the levels' under parts and produce value-added biological knowledge. CONCLUSIONS: ISOLA is the result of a bioinformatics effort that addresses the challenges of the post-genomics era. It is designed to exploit '-omics' data based on effective integration to acquire biological knowledge and to approach a systems biology view.
Beyond providing experimental biologists with a preliminary annotation of the tomato genome, this effort aims to produce a trial computational environment where different aspects and details are maintained as they are relevant for the analysis of the organization, the functionality and the evolution of the Solanaceae family.
Subjects
Database Management Systems , Genetic Databases , Plant Genome/genetics , Genomics/methods , Plant Proteins/physiology , Solanum lycopersicum/physiology , Transcription Factors/metabolism , User-Computer Interface , Information Storage and Retrieval/methods , Internet , Italy , Transcription Factors/genetics
ABSTRACT
BACKGROUND: The structure annotation of a genome is based either on ab initio methodologies or on similarity searches versus molecules that have been already annotated. Ab initio gene predictions in a genome are based on a priori knowledge of species-specific features of genes. The training of ab initio gene finders is based on the definition of a data-set of gene models. To accomplish this task the common approach is to align species-specific full length cDNA and EST sequences along the genomic sequences in order to define exon/intron structure of mRNA coding genes. RESULTS: GeneModelEST is the software proposed here for defining a data-set of candidate gene models using exclusively evidence derived from cDNA/EST sequences. GeneModelEST requires the genome coordinates of the spliced-alignments of ESTs and of contigs (tentative consensus sequences) generated by an EST clustering/assembling procedure to be formatted in a General Feature Format (GFF) standard file. Moreover, the alignments of the contigs versus a protein database are required as an NCBI BLAST formatted report file. The GeneModelEST analysis aims to i) evaluate each exon as defined from contig spliced alignments onto the genome sequence; ii) classify the contigs according to quality levels in order to select candidate gene models; iii) assign to the candidate gene models preliminary functional annotations. We discuss the application of the proposed methodology to build a data-set of gene models of Solanum lycopersicum, whose genome sequencing is an ongoing effort by the International Tomato Genome Sequencing Consortium. CONCLUSION: The contig classification procedure used by GeneModelEST supports the detection of candidate gene models and the identification of potential alternative transcripts, and is useful for filtering out ambiguous information.
An automated procedure, such as the one proposed here, is fundamental to support large scale analysis in order to provide species-specific gene models, that could be useful as a training data-set for ab initio gene finders and/or as a reference gene list for a human curated annotation.
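The first step described above, reading contig spliced alignments from a GFF file and evaluating their exons, can be sketched as follows. This is a minimal illustration and not the GeneModelEST implementation: it assumes nine tab-separated GFF columns, assumes that exon records carry the contig identifier in a `Parent=` (or `ID=`) attribute, and uses hypothetical contig names (`TC0001`, `TC0002`). The quality levels used by the actual software are not given in the abstract, so only simple per-contig exon statistics are computed.

```python
from collections import defaultdict


def parse_gff_exons(lines):
    """Group exon intervals by the contig ID found in GFF column 9.

    Assumption: exon rows name their contig via Parent=... or ID=...
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF directives/comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # keep only well-formed exon records
        attrs = dict(
            field.strip().split("=", 1)
            for field in cols[8].split(";")
            if "=" in field
        )
        contig_id = attrs.get("Parent") or attrs.get("ID")
        if contig_id is None:
            continue
        exons[contig_id].append((int(cols[3]), int(cols[4])))
    return {cid: sorted(spans) for cid, spans in exons.items()}


def summarize(exons_by_contig):
    """Per-contig exon count, exonic length, and whether it is spliced."""
    return {
        cid: {
            "n_exons": len(spans),
            # GFF coordinates are 1-based and inclusive.
            "exonic_bp": sum(end - start + 1 for start, end in spans),
            "spliced": len(spans) > 1,
        }
        for cid, spans in exons_by_contig.items()
    }


# Hypothetical input records, built column by column for readability.
example_gff = [
    "##gff-version 3",
    "\t".join(["chr1", "example", "exon", "100", "250", ".", "+", ".", "Parent=TC0001"]),
    "\t".join(["chr1", "example", "exon", "400", "520", ".", "+", ".", "Parent=TC0001"]),
    "\t".join(["chr2", "example", "exon", "50", "300", ".", "-", ".", "Parent=TC0002"]),
]
summary = summarize(parse_gff_exons(example_gff))
```

A classification step could then be layered on top of `summary`, for example by combining exon statistics with protein-database hits, but any such rule would be an assumption beyond what the abstract states.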
Subjects
Algorithms , Chromosome Mapping/methods , Plant Genome/genetics , Genetic Models , Sequence Alignment/methods , DNA Sequence Analysis/methods , Solanum lycopersicum/genetics , Expressed Sequence Tags
ABSTRACT
BACKGROUND: Sparganosis is an infection with a larval Diphyllobothriidea tapeworm. From a rare cerebral case presented at a clinic in the UK, DNA was recovered from a biopsy sample and used to determine the causative species as Spirometra erinaceieuropaei through sequencing of the cox1 gene. From the same DNA, we have produced a draft genome, the first of its kind for this species, and used it to perform a comparative genomics analysis and to investigate known and potential tapeworm drug targets in this tapeworm. RESULTS: The 1.26 Gb draft genome of S. erinaceieuropaei is currently the largest reported for any flatworm. Through investigation of β-tubulin genes, we predict that S. erinaceieuropaei larvae are insensitive to the tapeworm drug albendazole. We find that many putative tapeworm drug targets are also present in S. erinaceieuropaei, allowing possible cross application of new drugs. In comparison to other sequenced tapeworm species we observe expansion of protease classes, and of Kunitz-type protease inhibitors. Expanded gene families in this tapeworm also include those that are involved in processes that add post-translational diversity to the protein landscape, intracellular transport, transcriptional regulation and detoxification. CONCLUSIONS: The S. erinaceieuropaei genome begins to give us insight into an order of tapeworms previously uncharacterized at the genome-wide level. From a single clinical case we have begun to sketch a picture of the characteristics of these organisms. Finally, our work represents a significant technological achievement as we present a draft genome sequence of a rare tapeworm, and from a small amount of starting material.
Subjects
Diphyllobothrium/genetics , Genome , Sparganosis/genetics , Spirometra/genetics , Animals , Base Sequence , Biopsy , Brain/parasitology , Brain/pathology , High-Throughput Nucleotide Sequencing , Humans , Sparganosis/parasitology , Spirometra/parasitology , United Kingdom
ABSTRACT
Excessive soil salinity is a major ecological and agronomical problem, the adverse effects of which are becoming a serious issue in regions where saline water is used for irrigation. Plants can employ regulatory strategies, such as DNA methylation, to enable relatively rapid adaptation to new conditions. In this regard, cytosine methylation might play an integral role in the regulation of gene expression at both the transcriptional and post-transcriptional levels. Rapeseed, which is the most important oilseed crop in Europe, is classified as being tolerant of salinity, although cultivars can vary substantially in their levels of tolerance. In this study, the Methylation Sensitive Amplified Polymorphism (MSAP) approach was used to assess the extent of cytosine methylation under salinity stress in salinity-tolerant (Exagone) and salinity-sensitive (Toccata) rapeseed cultivars. Our data show that salinity affected the level of DNA methylation. In particular, methylation decreased in Exagone and increased in Toccata. Nineteen DNA fragments showing polymorphisms related to differences in methylation were sequenced. Two of these were highly similar to genes involved in stress responses (Lacerata and trehalose-6-phosphatase synthase S4) and were chosen for further characterization. Bisulfite sequencing and quantitative RT-PCR analysis of selected MSAP loci showed that both cytosine methylation and gene expression changed under salinity. In particular, our data show that salinity stress influences the expression of the two stress-related genes. Moreover, we quantified the level of trehalose in Exagone shoots and found that it was correlated to TPS4 expression and, therefore, to DNA methylation.
In conclusion, we found that salinity could induce genome-wide changes in DNA methylation status, and that these changes, when averaged across different genotypes and developmental stages, accounted for 16.8% of the total site-specific methylation differences in the rapeseed genome, as detected by MSAP analysis.
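How MSAP band patterns translate into a methylation level can be shown with a small sketch. It assumes the commonly used scoring convention for the HpaII/MspI isoschizomer pair (band in both digests: unmethylated; HpaII only: hemimethylated; MspI only: internal cytosine methylated) and treats loci with no band in either digest as uninformative. The abstract does not state which scoring scheme the study applied, and the band data below are hypothetical.

```python
from collections import Counter

# Assumed scoring convention: (HpaII band, MspI band) -> methylation state.
PATTERNS = {
    (1, 1): "unmethylated",
    (1, 0): "hemimethylated",
    (0, 1): "internal_cytosine_methylated",
    (0, 0): "uninformative",
}


def methylation_percent(bands):
    """Percentage of informative loci scored as methylated.

    bands: list of (hpaii, mspi) presence/absence pairs, one per locus.
    """
    counts = Counter(PATTERNS[(h, m)] for h, m in bands)
    informative = len(bands) - counts["uninformative"]
    if informative == 0:
        return 0.0
    methylated = (
        counts["hemimethylated"] + counts["internal_cytosine_methylated"]
    )
    return 100.0 * methylated / informative


# Hypothetical loci scored in control and salt-treated plants.
control = [(1, 1), (1, 0), (0, 1), (1, 1)]
stressed = [(1, 1), (1, 1), (0, 1), (1, 1)]
control_pct = methylation_percent(control)
stressed_pct = methylation_percent(stressed)
```

Some studies instead count the no-band pattern as hypermethylation; switching conventions changes the percentages, so the choice should be stated alongside any reported value.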
Subjects
Brassica napus/genetics , Brassica rapa/genetics , DNA Methylation/genetics , Genetic Polymorphism/genetics , Salts/metabolism , Physiological Stress/genetics , Plant Gene Expression Regulation/genetics , Plant Genome/genetics , Genotype , Plant Leaves/genetics , Plant Roots/genetics , Salinity , Salt Tolerance/genetics , Sodium Chloride/metabolism
ABSTRACT
Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.
ABSTRACT
The consortium responsible for the sequencing of the tomato (Solanum lycopersicum) genome initially focused on the sequencing of the euchromatic regions using a BAC-by-BAC strategy. We analyzed the compositional features of the whole collection of publicly available BAC sequences. This analysis highlights specific peculiarities of heterochromatic and euchromatic BACs. In particular, the whole BAC collection shows i) a large variability in repeat and gene content, ii) a positive and significant correlation of LTR retrotransposons of the Gypsy class with the repeat content and iii) the preferential location of the SINEs (short interspersed nuclear elements) in BAC sequences showing a low repeat content. Our results point out a typical design of the tomato chromosomes and pave the way for further investigations on the relationship between DNA primary structure and chromatin organization in Solanaceae genomes.
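The reported correlation between Gypsy-class LTR retrotransposon content and overall repeat content across BACs can be illustrated with a minimal Pearson correlation sketch. The per-BAC fractions below are hypothetical and chosen only to show the calculation; the abstract does not state which statistic or significance test the authors used, and no significance test is included here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-BAC fractions of sequence annotated as repeats and
# as Gypsy-class LTR retrotransposons.
repeat_fraction = [0.10, 0.25, 0.40, 0.55, 0.70]
gypsy_fraction = [0.02, 0.08, 0.15, 0.24, 0.33]
r = pearson(repeat_fraction, gypsy_fraction)
```

A value of `r` close to 1 on real data would match the positive trend described in the abstract; a rank-based statistic could be substituted if the relationship is monotonic but not linear.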