Two major chromosome evolution events with unrivaled conserved gene content in pomegranate.

Akparov, Zeynal; Hajiyeva, Sabina; Abbasov, Mehraj; Kaur, Sukhjiwan; Hamwieh, Aladdin; Alsamman, Alsamman M; Hajiyev, Elchin; Babayeva, Sevda; Izzatullayeva, Vusala; Mustafayeva, Ziyafat; Mehdiyeva, Sabina; Mustafayev, Orkhan; Shahmuradov, Ilham; Kosarev, Peter; Solovyev, Victor; Salamov, Asaf; Jighly, Abdulqader.

Front Plant Sci ; 14: 1039211, 2023.

Article in English | MEDLINE | ID: mdl-36993855

ABSTRACT

Pomegranate has a unique evolutionary history given that different cultivars have eight or nine bivalent chromosomes with possible crossability between the two classes. Therefore, it is important to study chromosome evolution in pomegranate to understand the dynamics of its population. Here, we de novo assembled the Azerbaijani cultivar "Azerbaijan guloyshasi" (AG2017; 2n = 16) and re-sequenced six cultivars to track the evolution of pomegranate and to compare it with previously published de novo assembled and re-sequenced cultivars. High synteny was observed between AG2017, Bhagawa (2n = 16), Tunisia (2n = 16), and Dabenzi (2n = 18), but these four cultivars diverged from the cultivar Taishanhong (2n = 18) with several rearrangements indicating the presence of two major chromosome evolution events. Major presence/absence variations were not observed as >99% of the five genomes aligned across the cultivars, while >99% of the pan-genic content was represented by Tunisia and Taishanhong only. We also revisited the divergence between soft- and hard-seeded cultivars with less structured population genomic data, compared to previous studies, to refine the selected genomic regions and detect global migration routes for pomegranate. We reported a unique admixture between soft- and hard-seeded cultivars that can be exploited to improve the diversity, quality, and adaptability of local pomegranate varieties around the world. Our study adds to the body of knowledge on the evolution of the pomegranate genome and its implications for the population structure of global pomegranate diversity, as well as for planning breeding programs aiming to develop improved cultivars.

Finding the missing honey bee genes: lessons learned from a genome upgrade.

Elsik, Christine G; Worley, Kim C; Bennett, Anna K; Beye, Martin; Camara, Francisco; Childers, Christopher P; de Graaf, Dirk C; Debyser, Griet; Deng, Jixin; Devreese, Bart; Elhaik, Eran; Evans, Jay D; Foster, Leonard J; Graur, Dan; Guigo, Roderic; Hoff, Katharina Jasmin; Holder, Michael E; Hudson, Matthew E; Hunt, Greg J; Jiang, Huaiyang; Joshi, Vandita; Khetani, Radhika S; Kosarev, Peter; Kovar, Christie L; Ma, Jian; Maleszka, Ryszard; Moritz, Robin F A; Munoz-Torres, Monica C; Murphy, Terence D; Muzny, Donna M; Newsham, Irene F; Reese, Justin T; Robertson, Hugh M; Robinson, Gene E; Rueppell, Olav; Solovyev, Victor; Stanke, Mario; Stolle, Eckart; Tsuruda, Jennifer M; Vaerenbergh, Matthias Van; Waterhouse, Robert M; Weaver, Daniel B; Whitfield, Charles W; Wu, Yuanqing; Zdobnov, Evgeny M; Zhang, Lan; Zhu, Dianhui; Gibbs, Richard A.

BMC Genomics ; 15: 86, 2014 Jan 30.

Article in English | MEDLINE | ID: mdl-24479613

ABSTRACT

BACKGROUND: The first generation of genome sequence assemblies and annotations has had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, and the study of populations within and across species, and has informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. RESULTS: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remainder were inferred based on new RNAseq and protein data. CONCLUSIONS: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.

Subject(s)

Bees/genetics , Genes, Insect , Animals , Base Composition , Databases, Genetic , Interspersed Repetitive Sequences/genetics , Molecular Sequence Annotation , Open Reading Frames/genetics , Peptides/analysis , Sequence Analysis, RNA , Sequence Homology, Amino Acid

Large-scale cis-element detection by analysis of correlated expression and sequence conservation between Arabidopsis and Brassica oleracea.

Haberer, Georg; Mader, Michael T; Kosarev, Peter; Spannagl, Manuel; Yang, Li; Mayer, Klaus F X.

Plant Physiol ; 142(4): 1589-602, 2006 Dec.

Article in English | MEDLINE | ID: mdl-17028152

ABSTRACT

The rapidly increasing amount of plant genomic sequences allows for the detection of cis-elements through comparative methods. In addition, large-scale gene expression data for Arabidopsis (Arabidopsis thaliana) have recently become available. Coexpression and evolutionarily conserved sequences are criteria widely used to identify shared cis-regulatory elements. In our study, we employ an integrated approach to combine two sources of information, coexpression and sequence conservation. Best-candidate orthologous promoter sequences were identified by a bidirectional best blast hit strategy in genome survey sequences from Brassica oleracea. The analysis of 779 microarrays from 81 different experiments provided detailed expression information for Arabidopsis genes coexpressed in multiple tissues and under various conditions and developmental stages. We discovered candidate transcription factor binding sites in 64% of the Arabidopsis genes analyzed. Among them, we detected experimentally verified binding sites and showed strong enrichment of shared cis-elements within functionally related genes. This study demonstrates the value of partially shotgun sequenced genomes and their combinatorial use with functional genomics data to address complex questions in comparative genomics.

Subject(s)

Arabidopsis/genetics , Brassica/genetics , Plant Proteins/genetics , Promoter Regions, Genetic , Arabidopsis/metabolism , Base Sequence , Binding Sites , Brassica/metabolism , Computational Biology , Conserved Sequence , Genomics , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Plant Proteins/chemistry , Plant Proteins/metabolism
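
The bidirectional best hit strategy mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the similarity search has already been run in both directions and reduced to (query, subject, score) tuples; the function names, gene identifiers, and scores are hypothetical and are not taken from the study.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward, reverse):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical toy scores: Arabidopsis-like queries against Brassica-like
# subjects (forward), and the same search run in the opposite direction.
forward = [("AT1", "BO1", 200.0), ("AT1", "BO2", 150.0), ("AT2", "BO2", 90.0)]
reverse = [("BO1", "AT1", 198.0), ("BO2", "AT1", 140.0), ("BO2", "AT2", 95.0)]

pairs = bidirectional_best_hits(forward, reverse)
print(pairs)
```

In this toy input, AT2 is not paired with BO2 because BO2 scores higher against AT1 in the reverse search; only mutually best pairs are kept as candidate orthologs.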

Automatic annotation of eukaryotic genes, pseudogenes and promoters.

Solovyev, Victor; Kosarev, Peter; Seledsov, Igor; Vorobyev, Denis.

Genome Biol ; 7 Suppl 1: S10.1-12, 2006.

Article in English | MEDLINE | ID: mdl-16925832

ABSTRACT

BACKGROUND: The ENCODE gene prediction workshop (EGASP) has been organized to evaluate how well state-of-the-art automatic gene finding methods are able to reproduce the manual and experimental gene annotation of the human genome. We have used Softberry gene finding software to predict genes, pseudogenes and promoters in 44 selected ENCODE sequences representing approximately 1% (30 Mb) of the human genome. Predictions of gene finding programs were evaluated in terms of their ability to reproduce the ENCODE-HAVANA annotation. RESULTS: The Fgenesh++ gene prediction pipeline can identify 91% of coding nucleotides with a specificity of 90%. Our automatic pseudogene finder (PSF program) found 90% of the manually annotated pseudogenes and some new ones. The Fprom promoter prediction program identifies 80% of TATA promoter sequences with one false positive prediction per 2,000 base-pairs (bp) and 50% of TATA-less promoters with one false positive prediction per 650 bp. It can be used to identify transcription start sites upstream of annotated coding parts of genes found by gene prediction software. CONCLUSION: We review our software and underlying methods for identifying these three important structural and functional genome components and discuss the accuracy of predictions, recent advances and open problems in annotating genomic sequences. We have demonstrated that our methods can be effectively used for initial automatic annotation of the eukaryotic genome.

Subject(s)

Computational Biology/methods , Genes , Genomics/methods , Promoter Regions, Genetic , Pseudogenes , Animals , Base Sequence , Chromosome Mapping , Humans , Molecular Sequence Data , Sequence Analysis, DNA , Sequence Analysis, Protein , Sequence Analysis, RNA , Software
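
The nucleotide-level figures quoted in the abstract above (coding nucleotides identified, and specificity) can be made concrete with a small sketch. It assumes the convention commonly used in gene-finder evaluation, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the abstract does not spell out its formula, so this is an assumption. The function name and coordinates are hypothetical and do not come from EGASP data.

```python
def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) over sets of coding positions.

    Assumed convention: sensitivity = TP / (TP + FN),
    specificity = TP / (TP + FP).
    """
    tp = len(annotated & predicted)   # coding in both annotation and prediction
    fn = len(annotated - predicted)   # annotated coding positions that were missed
    fp = len(predicted - annotated)   # predicted coding positions not annotated
    sensitivity = tp / (tp + fn) if annotated else 0.0
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical toy exon: annotated at positions 100-199, predicted at 110-209.
annotated = set(range(100, 200))
predicted = set(range(110, 210))

sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Sets of positions keep the sketch short; for sequences on the scale of megabases, interval arithmetic would be the more practical choice.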

Evaluation and classification of RING-finger domains encoded by the Arabidopsis genome.

Kosarev, Peter; Mayer, Klaus F X; Hardtke, Christian S.

Genome Biol ; 3(4): RESEARCH0016, 2002.

Article in English | MEDLINE | ID: mdl-11983057

ABSTRACT

BACKGROUND: In computational analysis, the RING-finger domain is one of the most frequently detected domains in the Arabidopsis proteome. In fact, it is more abundant in Arabidopsis than in other eukaryotic genomes. However, computational analysis might classify ambiguous domains of the closely related PHD and LIM motifs as RING domains by mistake. Thus, we set out to define an ordered set of Arabidopsis RING domains by evaluating predicted domains on the basis of recent structural data. RESULTS: Inspection of the proteome with a current InterPro release predicts 446 RING domains. We evaluated each detected domain and as a result eliminated 59 false positives. The remaining 387 domains were grouped by cluster analysis and according to their metal-ligand arrangement. We further defined novel patterns for additional computational analyses of the proteome. They were based on recent structural data that enable discrimination between the related RING, PHD and LIM domains. These patterns allow us to predict with different degrees of certainty whether a particular domain is indeed likely to form a RING finger. CONCLUSIONS: In summary, 387 domains have a significant potential to form a RING-type cross-brace structure. Many of these RING domains overlap with predicted PHD domains; however, the RING domain signature mostly prevails. Thus, the abundance of PHD domains in Arabidopsis has been significantly overestimated. Cluster analysis of the RING domains defines groups of proteins, which frequently show significant similarity outside the RING domain. These groups document a common evolutionary origin of their members and potentially represent genes of overlapping functionality.

Subject(s)

Arabidopsis Proteins/chemistry , Amino Acid Motifs , Arabidopsis/genetics , Arabidopsis Proteins/classification , Computational Biology , Genome, Plant , Molecular Sequence Data , Protein Structure, Tertiary , Proteome/analysis , Sequence Analysis, Protein
