Abstract
SUMMARY: Large-scale comparative genomic studies have provided important insights into species evolution and diversity, but their results are challenging to visualize. Quickly capturing and presenting the key information hidden in vast amounts of genomic data and in the relationships among multiple genomes requires an efficient visualization tool. However, current tools for such visualization remain inflexible in layout and/or require advanced computational skills, especially for visualization of genome-based synteny. Here, we developed an easy-to-use and flexible layout tool, NGenomeSyn [multiple (N) Genome Synteny], for publication-ready, highly customizable visualization of syntenic relationships of the whole genome or local regions, and of genomic features (e.g. repeats, structural variations, genes), across multiple genomes. NGenomeSyn provides an easy way for its users to visualize a large amount of data with a rich layout by simply adjusting options for moving, scaling, and rotation of target genomes. Moreover, NGenomeSyn could be applied to the visualization of relationships in non-genomic data with similar input formats. AVAILABILITY AND IMPLEMENTATION: NGenomeSyn is freely available at GitHub (https://github.com/hewm2008/NGenomeSyn) and Zenodo (https://doi.org/10.5281/zenodo.7645148).
Subjects
Genome, Software, Synteny, Genomics
Abstract
Large genomic data sets are becoming the new normal in phylogenetic research, but the identification of true orthologous genes and the exclusion of problematic paralogs is still challenging when applying commonly used sequencing methods such as target enrichment. Here, we compared conventional ortholog detection using OrthoFinder with ortholog detection through genomic synteny in a data set of 11 representative diploid Brassicaceae whole-genome sequences spanning the entire phylogenetic space. Then, we evaluated the resulting gene sets regarding gene number, functional annotation, and gene and species tree resolution. Finally, we used the syntenic gene sets for comparative genomics and ancestral genome analysis. The use of synteny resulted in considerably more orthologs and also allowed us to reliably identify paralogs. Surprisingly, we did not detect notable differences between species trees reconstructed from syntenic orthologs when compared with other gene sets, including the Angiosperms353 set and a Brassicaceae-specific target enrichment gene set. However, the synteny data set comprised a multitude of gene functions, strongly suggesting that this method of marker selection for phylogenomics is suitable for studies that value downstream gene function analysis, gene interaction, and network studies. Finally, we present the first ancestral genome reconstruction for the Core Brassicaceae, predating the Brassicaceae lineage diversification ~25 million years ago.
Subjects
Brassicaceae, Brassicaceae/genetics, Synteny, Phylogeny, Genomics/methods, Genome
Abstract
Brassica oleracea displays remarkable morphological variation, which has prompted researchers to study the underlying cause of the enormous diversification of this organism. However, the genomic variations underlying complex heading traits in B. oleracea are poorly understood. Herein, we performed a comparative population genomics analysis to explore structural variations (SVs) responsible for heading trait formation in B. oleracea. Synteny analysis showed that chromosomes C1 and C2 of B. oleracea (CC) shared strong collinearity with A01 and A02 of B. rapa (AA), respectively. Two historical events, whole genome triplication (WGT) of Brassica species and differentiation time between AA and CC genomes, were observed clearly by phylogenetic and Ks analysis. By comparing heading and non-heading populations of B. oleracea genomes, we found extensive SVs during the diversification of the B. oleracea genome. We identified 1205 SVs that have an impact on 545 genes and might be associated with the heading trait of cabbage. Overlapping the genes affected by SVs and the differentially expressed genes identified by RNA-seq analysis, we identified six vital candidate genes that may be related to heading trait formation in cabbage. Further, qRT-PCR experiments also verified that the six genes were differentially expressed between heading leaves and non-heading leaves. Collectively, we used available genomes to conduct a comparative population genomic analysis and identify candidate genes for the heading trait of cabbage, which provides insight into the underlying reason for heading trait formation in B. oleracea.
Subjects
Brassica, Plant Genome, Phylogeny, Brassica/genetics, Synteny
Abstract
The MYB gene family widely exists in the plant kingdom and participates in the regulation of plant development and stress response. Pearl millet (Pennisetum glaucum (L.) R. Br.), as one of the most important cereals, is not only considered a good source of protein and nutrients but also has excellent tolerances to various abiotic stresses (e.g., salinity, water deficit, etc.). Although the genome sequence of pearl millet was recently published, bioinformatics and expression pattern analysis of the MYB gene family are limited. Here, we identified 208 PgMYB genes in the pearl millet genome and employed 193 high-confidence candidates for downstream analysis. Phylogenetic and structural analysis classified these PgMYBs into four subgroups. Eighteen pairs of segmental duplications of the PgMYB gene were found using synteny analysis. Collinear analysis revealed pearl millet had the closest evolutionary relationship with foxtail millet. Nucleotide substitution analysis (Ka/Ks) revealed PgMYB genes were under purifying positive selection pressure. Reverse transcription-quantitative PCR analysis of eleven R2R3-type PgMYB genes revealed they were preferentially expressed in shoots and seeds and actively responded to various environmental stimuli. Current results provide insightful information regarding the molecular features of the MYB family in pearl millet to support further functional characterizations.
Subjects
Pennisetum, Pennisetum/genetics, myb Genes, Phylogeny, Synteny, Physiological Stress, Plant Gene Expression Regulation
Abstract
MOTIVATION: The phylogenetic signal of structural variation informs a more comprehensive understanding of evolution. As (near-)complete genome assembly becomes more commonplace, the next methodological challenge for inferring genome rearrangement trees is the identification of syntenic blocks of orthologous sequences. In this article, we studied 94 reference quality genomes of primarily Mycobacterium tuberculosis (Mtb) isolates as a benchmark to evaluate these methods. The clonal nature of Mtb evolution, the manageable genome sizes, along with substantial levels of structural variation make this an ideal benchmarking dataset. RESULTS: We tested several methods for detecting homology and obtaining syntenic blocks and two methods for inferring phylogenies from them, then compared the resulting trees to the standard method's tree, inferred from nucleotide substitutions. We found that, not only the choice of methods, but also their parameters can impact results, and that the tree inference method had less impact than the block determination method. Interestingly, a rearrangement tree based on blocks from the Cactus whole-genome aligner was fully compatible with the highly supported branches of the substitution-based tree, enabling the combination of the two into a high-resolution supertree. Overall, our results indicate that accurate trees can be inferred using genome rearrangements, but the choice of the methods for inferring homology requires care. AVAILABILITY AND IMPLEMENTATION: Analysis scripts and code written for this study are available at https://gitlab.com/LPCDRP/rearrangement-homology.pub and https://gitlab.com/LPCDRP/syntement. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Mycobacterium tuberculosis, Phylogeny, Mycobacterium tuberculosis/genetics, Genome, Synteny
Abstract
The CoGe software suite at genomevolution.org hosts a number of tools that facilitate genomic research on plant and animal whole-genome multiplication-polyploidy. SynMap permits analysis and visualization of two-way syntenic dotplot alignments of genomes, includes many options and data/graphics download possibilities, and even permits three-genome synteny maps and interactive views. FractBias is a tool that operates within SynMap that permits calculation and graphic display of genome fragments (such as chromosomes) of one species mapped to another, displaying both blockwise homology depths and the extent of syntenic gene (syntelog) loss following polyploidy events. SynMap macrosynteny results can segue into the microsynteny tool GEvo, which provides genome-browser-like views of homologous genome blocks. CoGe FeatView allows call-up of given gene features already stored in the CoGe resource, and CoGeBlast permits searches for additional features that can be analyzed or downloaded further. Links from these tools can be fed into SynFind, which can find syntenic blocks surrounding a feature across multiple specified genomes while also simultaneously providing overall genome-wide syntenic depth calculations that can be interpreted to reflect polyploidy levels. Here, we describe basic use of these tools on the CoGe software suite.
Subjects
Genomics, Polyploidy, Animals, Software, Synteny
Abstract
Basic helix-loop-helix (bHLH) proteins are dimeric transcription factors (TFs) involved in various plant physiological and biological processes. Despite this, little is known about the molecular properties and roles of bHLH TFs in pitaya betalain biosynthesis. Here we report the identification of 165 HubHLH genes in the H. undatus genome, along with their chromosomal distribution, physicochemical characteristics, conserved motifs, gene structure, phylogeny and synteny. Based on phylogenetic relationship analysis, the 165 HubHLHs were divided into 26 subfamilies and unequally distributed on the 11 chromosomes of pitaya. Based on the pitaya transcriptome data, a candidate gene HubHLH159 was obtained, and the real-time quantitative PCR analysis confirmed that HubHLH159 showed a high expression level in 'Guanhuahong' pitaya (red-pulp) at mature stage, indicating its role in betalain biosynthesis. HubHLH159 is a Group II protein and contains a bHLH domain. It is a nuclear protein with transcriptional activation activity. Dual luciferase reporter assays and virus-induced gene silencing (VIGS) experiments showed that HubHLH159 promotes betalain biosynthesis by activating the expression of HuADH1, HuCYP76AD1-1, and HuDODA1. The results of the present study provide a new theoretical reference for the regulation of pitaya betalain biosynthesis and an essential basis for future analysis of the functions of the HubHLH gene family.
Subjects
Betalains, Transcriptome, Phylogeny, Betalains/metabolism, Synteny, Basic Helix-Loop-Helix Transcription Factors/genetics, Basic Helix-Loop-Helix Transcription Factors/metabolism, Plant Gene Expression Regulation, Plant Proteins/genetics, Plant Proteins/metabolism
Abstract
Karyotypes are generally conserved between closely related species and large chromosome rearrangements typically have negative fitness consequences in heterozygotes, potentially driving speciation. In the order Lepidoptera, most investigated species have the ancestral karyotype and gene synteny is often conserved across deep divergence, although examples of extensive genome reshuffling have recently been demonstrated. The genus Leptidea has an unusual level of chromosome variation and rearranged sex chromosomes, but the extent of restructuring across the rest of the genome is so far unknown. To explore the genomes of the wood white (Leptidea) species complex, we generated eight genome assemblies using a combination of 10X linked reads and HiC data, and improved them using linkage maps for two populations of the common wood white (L. sinapis) with distinct karyotypes. Synteny analysis revealed an extensive amount of rearrangements, both compared to the ancestral karyotype and between the Leptidea species, where only one of the three Z chromosomes was conserved across all comparisons. Most restructuring was explained by fissions and fusions, while translocations appear relatively rare. We further detected several examples of segregating rearrangement polymorphisms supporting a highly dynamic genome evolution in this clade. Fusion breakpoints were enriched for LINEs and LTR elements, which suggests that ectopic recombination might be an important driver in the formation of new chromosomes. Our results show that chromosome count alone may conceal the extent of genome restructuring and we propose that the amount of genome evolution in Lepidoptera might still be underestimated due to lack of taxonomic sampling.
Subjects
Butterflies, Animals, Butterflies/genetics, Wood, Chromosome Mapping, Genome, Synteny, Sex Chromosomes, Molecular Evolution
Abstract
Poa annua L. is a globally distributed grass with economic and horticultural significance as a weed and as a turfgrass. This dual significance, and its phenotypic plasticity and ecological adaptation, have made P. annua an intriguing plant for genetic and evolutionary studies. Because of the lack of genomic resources and its allotetraploid (2n = 4x = 28) nature, a reference genome sequence would be a valuable asset to better understand the significance and polyploid origin of P. annua. Here we report a genome assembly with scaffolds representing the 14 haploid chromosomes that are 1.78 Gb in length with an N50 of 112 Mb and 96.7% of BUSCO orthologs. Seventy percent of the genome was identified as repetitive elements, 91.0% of which were Copia- or Gypsy-like long-terminal repeats. The genome was annotated with 76,420 genes spanning 13.3% of the 14 chromosomes. The two subgenomes originating from Poa infirma (Knuth) and Poa supina (Schrad) were sufficiently divergent to be distinguishable but syntenic in sequence and annotation with repetitive elements contributing to the expansion of the P. infirma subgenome.
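The N50 quoted in this abstract is a standard contiguity statistic: the scaffold length L such that scaffolds of length L or longer together cover at least half of the assembly. The following is a minimal sketch of that definition, not code from the study; the function name `n50` and the toy lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths in Mb (not the P. annua scaffolds).
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the sorted length distribution, it says nothing about whether the scaffolds are correctly ordered, which is why it is usually reported alongside a completeness measure such as BUSCO.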
Subjects
Poa, Poa/genetics, Nucleic Acid Repetitive Sequences, Synteny, Plant Genome, Chromosomes, Molecular Sequence Annotation
Abstract
SUMMARY: Interpreting and visualizing synteny relationships across several genomes is a challenging task. We previously proposed a network-based approach for better visualization and interpretation of large-scale microsynteny analyses. Here, we present syntenet, an R package to infer and analyze synteny networks from whole-genome protein sequence data. The package offers a simple and complete framework, including data preprocessing, synteny detection and network inference, network clustering and phylogenomic profiling, and microsynteny-based phylogeny inference. Graphical functions are also available to create publication-ready plots. Synteny networks inferred with syntenet can highlight taxon-specific gene clusters that likely contributed to the evolution of important traits, and microsynteny-based phylogenies can help resolve phylogenetic relationships under debate. AVAILABILITY AND IMPLEMENTATION: syntenet is available on Bioconductor (https://bioconductor.org/packages/syntenet), and the source code is available on a GitHub repository (https://github.com/almeidasilvaf/syntenet). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome, Software, Synteny, Phylogeny
Abstract
Containing the largest number of species, the orchid family provides not only materials for studying plant evolution and environmental adaptation, but also economically and culturally important ornamental plants for human society. Previously, we collected genome and transcriptome information of Dendrobium catenatum, Phalaenopsis equestris, and Apostasia shenzhenica, which belong to two different subfamilies of Orchidaceae, and developed user-friendly tools to explore the orchid genetic sequences in OrchidBase 4.0. OrchidBase 4.0 offers the opportunity for the plant science community to compare orchid genomes and transcriptomes and retrieve orchid sequences for further study. In the year 2022, two whole-genome sequences of Orchidoideae species, Platanthera zijinensis and Platanthera guangdongensis, were de novo sequenced, assembled and analyzed. In addition, systemic transcriptomes from these two species were also established. Therefore, we included these datasets to develop the new version, OrchidBase 5.0. In addition, three new functions including synteny, gene order, and miRNA information were also developed for orchid genome comparisons and miRNA characterization. OrchidBase 5.0 extended the genetic information to three orchid subfamilies (including five orchid species) and provided new tools for orchid researchers to analyze orchid genomes and transcriptomes. The online resources can be accessed at https://cosbi.ee.ncku.edu.tw/orchidbase5/.
Subjects
MicroRNAs, Orchidaceae, Gene Order, Knowledge Bases, MicroRNAs/genetics, Orchidaceae/genetics, Synteny
Abstract
We have sequenced the chloroplast genome of red spruce (Picea rubens) for the first time using the single-end, short-reads (44 bp) Illumina sequences, assembled and functionally annotated it, and identified simple sequence repeats (SSRs). The contigs were assembled using SOAPdenovo2 following the retrieval of chloroplast genome sequences using the black spruce (Picea mariana) chloroplast genome as the reference. The assembled genome length was 122,115 bp (gaps included). Comparatively, the P. rubens chloroplast genome reported here may be considered a near-complete draft. Global genome alignment and phylogenetic analysis based on the whole chloroplast genome sequences of Picea rubens and 10 other Picea species revealed high sequence synteny and conservation among 11 Picea species and phylogenetic relationships consistent with their known classical interrelationships and published molecular phylogeny. The P. rubens chloroplast genome sequence showed the highest similarity with that of P. mariana and the lowest with that of P. sitchensis. We have annotated 107 genes including 69 protein-coding genes, 28 tRNAs, 4 rRNAs, few pseudogenes, identified 42 SSRs, and successfully designed primers for 26 SSRs. Mononucleotide A/T repeats were the most common followed by dinucleotide AT repeats. A similar pattern of microsatellite repeats occurrence was found in the chloroplast genomes of 11 Picea species.
Subjects
Chloroplast Genome, Picea, Picea/genetics, Phylogeny, Microsatellite Repeats/genetics, Synteny, Molecular Sequence Annotation
Abstract
The veiled chameleon (Chamaeleo calyptratus) is a typical member of the family Chamaeleonidae and a promising object for comparative cytogenetics and genomics. The karyotype of C. calyptratus differs from the putative ancestral chameleon karyotype (2n = 36) due to a smaller chromosome number (2n = 24) resulting from multiple chromosome fusions. The homomorphic sex chromosomes of an XX/XY system were described recently using male-specific RADseq markers. However, the chromosomal pair carrying these markers was not identified. Here we obtained chromosome-specific DNA libraries of C. calyptratus by chromosome flow sorting that were assigned by FISH and sequenced. Sequence comparison with three squamate reptiles reference genomes revealed the ancestral syntenic regions in the C. calyptratus chromosomes. We demonstrated that reducing the chromosome number in the C. calyptratus karyotype occurred through two fusions between microchromosomes and four fusions between micro-and macrochromosomes. PCR-assisted mapping of a previously described Y-specific marker indicates that chromosome 5 may be the sex chromosome pair. One of the chromosome 5 conserved synteny blocks shares homology with the ancestral pleurodont X chromosome, assuming parallelism in the evolution of sex chromosomes from two basal Iguania clades (pleurodonts and acrodonts). The comparative chromosome map produced here can serve as the foundation for future genome assembly of chameleons and vertebrate-wide comparative genomic studies.
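The chromosome numbers in this abstract are mutually consistent under a simple model, sketched below as a worked check. It assumes that each fusion merges two chromosome pairs into one and that no fissions or other number-changing events occurred; the function name is hypothetical and not from the study.

```python
def diploid_after_fusions(ancestral_2n, fusions):
    """Diploid number after a given count of chromosome fusions.

    Assumes each fusion joins two chromosome pairs into one,
    lowering the haploid number by exactly one.
    """
    haploid = ancestral_2n // 2 - fusions
    if haploid < 1:
        raise ValueError("more fusions than available chromosome pairs")
    return 2 * haploid


# Two micro-micro fusions plus four micro-macro fusions, as reported.
print(diploid_after_fusions(36, 2 + 4))  # 24
```

Starting from the putative ancestral 2n = 36 (n = 18), six fusions give n = 12, which matches the 2n = 24 karyotype described for C. calyptratus.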
Subjects
Lizards, Animals, Male, Synteny/genetics, Lizards/genetics, Sex Chromosomes/genetics, Chromosomes, Genome, Karyotype, Molecular Evolution
Abstract
As a first step in innate immunity, pattern recognition receptors (PRRs) recognize the distinct pathogen and herbivore-associated molecular patterns and mediate activation of immune responses, but specific steps in the evolution of new PRR sensing functions are not well understood. We employed comparative genomic and functional analyses to define evolutionary events leading to the sensing of the herbivore-associated peptide inceptin (In11) by the PRR inceptin receptor (INR) in legume plant species. Existing and de novo genome assemblies revealed that the presence of a functional INR gene corresponded with ability to respond to In11 across ~53 million years (my) of evolution. In11 recognition is unique to the clade of Phaseoloid legumes, and only a single clade of INR homologs from Phaseoloids was functional in a heterologous model. The syntenic loci of several non-Phaseoloid outgroup species nonetheless contain non-functional INR-like homologs, suggesting that an ancestral gene insertion event and diversification preceded the evolution of a specific INR receptor function ~28 my ago. Chimeric and ancestrally reconstructed receptors indicated that 16 amino acid differences in the C1 leucine-rich repeat domain and C2 intervening motif mediate gain of In11 recognition. Thus, high PRR diversity was likely followed by a small number of mutations to expand innate immune recognition to a novel peptide elicitor. Analysis of INR evolution provides a model for functional diversification of other germline-encoded PRRs.
The health status of a plant depends on the immune system it inherits from its parents. Plants have many receptor proteins that can recognize distinct molecules from insects and microbes, and trigger an immune response. Inheriting the right set of receptors allows plants to detect certain threats and to cope with diseases and pests. Soybeans, chickpeas and other closely-related crop plants belong to a family of plants known as the legumes. Previous studies have found that, unlike other plants, some legumes are able to respond to oral secretions from caterpillars. These plants have a receptor known as INR that binds to a molecule called inceptin in the secretions. However, it remained unclear how or when INR evolved. To address this gap, Snoeck et al. tested immune responses to inceptin in the leaves of 22 species of legume. The experiments revealed that only members of a subgroup of legumes called the Phaseoloids were able to recognize the molecule. Analyzing the genomes of several legume species revealed that the gene encoding INR first emerged around 28 million years ago. Among the descendants of the legumes that first evolved this receptor, only the crop plant soybean and a few other species were unable to respond to inceptin. The genomic data indicated that these species had in fact lost the gene encoding INR over evolutionary time. Snoeck et al. then combined data from genes encoding modern-day receptors to reconstruct the sequence of building blocks that make up the 28-million-year-old version of INR. This ancestral receptor was able to respond to inceptin in the caterpillar secretion, whereas an older version of the protein, which had a slightly different set of building blocks, could not. This suggests that INR evolved the ability to respond to inceptin as a result of small mutations in the gene encoding a more ancient receptor. The work of Snoeck et al. reveals how the Phaseoloids evolved to respond to caterpillars, and how this ability has been lost in soybeans and other members of the subgroup. In the future, these findings may aid plant breeding or genetic engineering approaches for enhancing the resistance of soybeans and other crops to caterpillar pests.
Subjects
Innate Immunity, Pattern Recognition Receptors, Pattern Recognition Receptors/genetics, Pattern Recognition Receptors/metabolism, Plants/genetics, Plants/metabolism, Synteny
Abstract
MOTIVATION: Whole-genome duplication events have long been discovered throughout the evolution of eukaryotes, contributing to genome complexity and biodiversity and leaving traces in the descending organisms. Therefore, an accurate and rapid phylogenomic method is needed to identify the retained duplicated genes on various lineages across the target taxonomy. RESULTS: Here, we present Tree2GD, an integrated method to identify large-scale gene duplication events by automatically performing multiple procedures, including sequence alignment, homolog recognition, gene tree/species tree reconciliation, Ks distribution of gene duplicates and synteny analyses. Application of Tree2GD to 2 datasets, 12 metazoan genomes and 68 angiosperms, successfully identifies all reported whole-genome duplication events exhibited by these species, showing the effectiveness and efficiency of Tree2GD in phylogenomic analyses of large-scale gene duplications. AVAILABILITY AND IMPLEMENTATION: Tree2GD is written in Python and C++ and is available at https://github.com/Dee-chen/Tree2gd. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Eukaryota, Gene Duplication, Animals, Phylogeny, Synteny, Sequence Alignment
Abstract
Decrypting the rearrangements that drive mammalian chromosome evolution is critical to understanding the molecular bases of speciation, adaptation, and disease susceptibility. Using 8 scaffolded and 26 chromosome-scale genome assemblies representing 23/26 mammal orders, we computationally reconstructed ancestral karyotypes and syntenic relationships at 16 nodes along the mammalian phylogeny. Three different reference genomes (human, sloth, and cattle) representing phylogenetically distinct mammalian superorders were used to assess reference bias in the reconstructed ancestral karyotypes and to expand the number of clades with reconstructed genomes. The mammalian ancestor likely had 19 pairs of autosomes, with nine of the smallest chromosomes shared with the common ancestor of all amniotes (three still conserved in extant mammals), demonstrating a striking conservation of synteny for ~320 My of vertebrate evolution. The numbers and types of chromosome rearrangements were classified for transitions between the ancestral mammalian karyotype, descendent ancestors, and extant species. For example, 94 inversions, 16 fissions, and 14 fusions that occurred over 53 My differentiated the therian from the descendent eutherian ancestor. The highest breakpoint rate was observed between the mammalian and therian ancestors (3.9 breakpoints/My). Reconstructed mammalian ancestor chromosomes were found to have distinct evolutionary histories reflected in their rates and types of rearrangements. The distributions of genes, repetitive elements, topologically associating domains, and actively transcribed regions in multispecies homologous synteny blocks and evolutionary breakpoint regions indicate that purifying selection acted over millions of years of vertebrate evolution to maintain syntenic relationships of developmentally important genes and regulatory landscapes of gene-dense chromosomes.
Subjects
Molecular Evolution, Karyotype, Mammals, Synteny, Animals, Cattle/genetics, Mammalian Chromosomes/genetics, Eutheria/genetics, Humans, Mammals/genetics, Phylogeny, Sloths/genetics, Synteny/genetics
Abstract
The development of multiple chromosome-scale reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple genomes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation. Here, we present GENESPACE, which addresses these challenges by integrating conserved gene order and orthology to define the expected physical position of all genes across multiple genomes. We demonstrate this utility by dissecting presence-absence, copy-number, and structural variation at three levels of biological organization: spanning 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic orthology in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred, and other complex genomes.
The genome is the complete DNA sequence of an individual. It is a crucial foundation for many studies in medicine, agriculture, and conservation biology. Advances in genetics have made it possible to rapidly sequence, or read out, the genome of many organisms. For closely related species, scientists can then do detailed comparisons, revealing similar genes with a shared past or a common role, but comparing more distantly related organisms remains difficult. One major challenge is that genes are often lost or duplicated over evolutionary time. One way to be more confident is to look at 'synteny', or how genes are organized or ordered within the genome. In some groups of species, synteny persists across millions of years of evolution. Combining sequence similarity with gene order could make comparisons between distantly related species more robust. To do this, Lovell et al. developed GENESPACE, a software that links similarities between DNA sequences to the order of genes in a genome. This allows researchers to visualize and explore related DNA sequences and determine whether genes have been lost or duplicated. To demonstrate the value of GENESPACE, Lovell et al. explored evolution in vertebrates and flowering plants. The software was able to highlight the shared sequences between unique sex chromosomes in birds and mammals, and it was able to track the positions of genes important in the evolution of grass crops including maize, wheat, and rice. Exploring the genetic code in this way could lead to a better understanding of the evolution of important sections of the genome. It might also allow scientists to find target genes for applications like crop improvement. Lovell et al. have designed the GENESPACE software to be easy for other scientists to use, allowing them to make graphics and perform analyses with few programming skills.
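The idea of combining orthology with conserved gene order can be illustrated with a toy example. The sketch below is not GENESPACE's algorithm and does not use its API; it is a deliberately simple greedy chaining, written under the assumption that orthologs are already known and that each gene is represented by its rank along a chromosome. The function name, the `max_gap` and `min_size` parameters, and the anchor data are all hypothetical, and only forward (non-inverted) blocks are handled.

```python
def collinear_blocks(anchors, max_gap=2, min_size=3):
    """Group ortholog anchors into forward collinear blocks.

    anchors : iterable of (rank_in_genome_a, rank_in_genome_b) pairs.
    max_gap : largest allowed step in gene rank between neighbouring
              anchors of one block, in either genome.
    min_size: smallest number of anchors for a run to count as a block.
    """
    blocks, current = [], []
    for a, b in sorted(anchors):
        if current:
            prev_a, prev_b = current[-1]
            step_a, step_b = a - prev_a, b - prev_b
            # Extend the run only if both genomes advance by a small step.
            if 0 < step_a <= max_gap and 0 < step_b <= max_gap:
                current.append((a, b))
                continue
            # Otherwise close the run and keep it if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
        current = [(a, b)]
    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Hypothetical anchors: two collinear runs and one isolated hit.
toy_anchors = [(0, 10), (1, 11), (2, 12), (3, 40), (4, 41), (5, 43), (9, 3)]
for block in collinear_blocks(toy_anchors):
    print(block)
```

In this toy input the isolated anchor `(9, 3)` is discarded because it has no collinear neighbours, which mirrors how gene order can help separate likely orthologs from scattered paralogous hits. Production tools additionally handle inversions, tandem duplicates, and polyploid copy number, none of which this sketch attempts.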
Subjects
DNA Copy Number Variations, Molecular Evolution, Gene Dosage, Plant Genome, Quantitative Trait Loci, Synteny
Abstract
Corticotropin-releasing hormone (CRH) was discovered for its role as a brain neurohormone controlling the corticotropic axis in vertebrates. An additional crh gene, crh2, paralog of crh (crh1), and likely resulting from the second round (2R) of vertebrate whole genome duplication (WGD), was identified in a holocephalan chondrichthyan, in basal mammals, various sauropsids and a non-teleost actinopterygian holostean. It was suggested that crh2 has been recurrently lost in some vertebrate groups including teleosts. We further investigated the fate of crh1 and crh2 in vertebrates with a special focus on teleosts. Phylogenetic and synteny analyses showed the presence of duplicated crh1 paralogs, crh1a and crh1b, in most teleosts, resulting from the teleost-specific WGD (3R). Crh1b is conserved in all teleosts studied, while crh1a has been lost independently in some species. Additional crh1 paralogs are present in carps and salmonids, resulting from specific WGD in these lineages. We identified the crh2 gene in additional vertebrate groups such as chondrichthyan elasmobranchs, sarcopterygians including dipnoans and amphibians, and basal actinopterygians, Polypteridae and Chondrostei. We also revealed the presence of crh2 in teleosts, including elopomorphs, osteoglossomorphs, clupeiforms, and ostariophysians, while it would have been lost in Euteleostei along with some other groups. To gain insight into the functional evolution of the crh paralogs, we compared their primary and 3D structure, and by qPCR their tissue distribution, in two representative species, the European eel, which possesses three crh paralogs (crh1a, crh1b, crh2), and the Atlantic salmon, which possesses four crh paralogs of the crh1-type. All peptides conserved the structural characteristics of human CRH. Eel crh1b and both salmon crh1b genes were mainly expressed in the brain, supporting the major role of crh1b paralogs in controlling the corticotropic axis in teleosts. In contrast, crh1a paralogs were mainly expressed in peripheral tissues such as muscle and heart, in eel and salmon, reflecting a striking subfunctionalization between crh1a and b paralogs. Eel crh2 was weakly expressed in the brain and peripheral tissues. These results revisit the repertoire of crh in teleosts and highlight functional divergences that may have contributed to the differential conservation of various crh paralogs in teleosts.
Subjects
Corticotropin-Releasing Hormone , Salmo salar , Animals , Brain , Corticotropin-Releasing Hormone/genetics , Humans , Mammals , Phylogeny , Synteny
ABSTRACT
Djulis (Chenopodium formosanum Koidz.) is a crop grown since antiquity in Taiwan. It is a BCD-genome hexaploid (2n = 6x = 54) domesticated form of lambsquarters (C. album L.) and a relative of the allotetraploid (AABB) C. quinoa. As with quinoa, djulis seed contains a complete protein profile and many nutritionally important vitamins and minerals. While still sold locally in Taiwanese markets, its traditional culinary uses are being lost as diets of younger generations change. Moreover, indigenous Taiwanese peoples who have long safeguarded djulis are losing their traditional farmlands. We used PacBio sequencing and Hi-C-based scaffolding to produce a chromosome-scale, reference-quality assembly of djulis. The final genome assembly spans 1.63 Gb in 798 scaffolds, with 97.8% of the sequence contained in 27 scaffolds representing the nine haploid chromosomes of each sub-genome of the species. Benchmarking of universal, single-copy orthologs indicated that 98.5% of the conserved orthologous genes for Viridiplantae are complete within the assembled genome, with 92.9% duplicated, as expected for a polyploid. A total of 67.8% of the assembly is repetitive, with the most common repeat being Gypsy long terminal repeat retrotransposons, which had significantly expanded in the B sub-genome. Gene annotation using Iso-Seq data from multiple tissues identified 75,056 putative gene models. Comparisons to quinoa showed strong patterns of synteny which allowed for the identification of homoeologous chromosomes, and sub-genome-specific sequences were used to assign homoeologs to each sub-genome. These results represent the first hexaploid genome assembly and the first assemblies of the C and D genomes of the Chenopodioideae subfamily.
Subjects
Chenopodium , Chenopodium/genetics , Plant Chromosomes/genetics , Plant Genome , Polyploidy , Synteny
ABSTRACT
MOTIVATION: Synteny analysis is a widely used framework in comparative genomics that provides valuable information on chromosome collinearity both within and between species. Most analysis pipelines, however, are command line-based, making it challenging for biologists to run the algorithms and visualize the results. Existing visualization tools either provide only static plots or can only be run on web-based servers, and they lack efficient methods for associating macro-synteny blocks with individual gene pairs in a micro-synteny region. RESULTS: We developed ShinySyn, a Shiny/R application built on the MCscan framework that provides an easy-to-use graphical interface for synteny analyses without requiring any programming skills, reducing technical barriers. ShinySyn not only provides interactive visualization for macro-synteny, micro-synteny, and genome-level dot views, but also creates an intuitive representation with a dynamic zooming feature from macro-synteny down to individual homologous genes. AVAILABILITY AND IMPLEMENTATION: The source code and installation instructions for ShinySyn can be accessed via https://github.com/obenno/ShinySyn. A pre-built Docker image is also available at https://hub.docker.com/r/obenno/shinysyn. The application can be used locally or seamlessly integrated into any Shiny application server.