ABSTRACT
Weevils, classified in the family Curculionidae (true weevils), constitute a group of phytophagous insects of which many species are considered significant pests of crops. Within this family, the red palm weevil (RPW), Rhynchophorus ferrugineus, is highly destructive to crops and has invaded all countries of the Middle East and many in North Africa, Southern Europe, Southeast Asia, Oceania, and the Caribbean Islands. Simple sequence repeats (SSRs), also termed microsatellites, have become the DNA marker technology most applied to study population structure, evolution, and genetic diversity. Although these markers have been widely examined in many mammalian and plant species, and draft genome assemblies are available for many species of true weevils, very little is yet known about SSRs in weevil genomes. Here we carried out a comparative analysis of the relative abundance, relative density, and GC content of SSRs in previously sequenced draft genomes of nine true weevils, with an emphasis on R. ferrugineus. We also used Illumina paired-end sequencing to generate a draft sequence for an adult female RPW and characterized it in terms of perfect SSRs with 1-6 bp nucleotide motifs. Among weevil genomes, mono- to trinucleotide SSRs were the most frequent, and mono-, di-, and hexanucleotide SSRs exhibited the highest GC content. In these draft genomes, SSR number and genome size were significantly correlated. This work will aid our understanding of the genome architecture and evolution of Curculionidae weevils and facilitate exploring SSR molecular marker development in these species.
Subject(s)
Coleoptera , Weevils , Animals , Base Composition , Coleoptera/genetics , Forests , Humans , Mammals/genetics , Microsatellite Repeats/genetics , Weevils/genetics
ABSTRACT
BACKGROUND: Transposable elements (TEs) are common features in eukaryotic genomes that are known to affect genome evolution critically and to play roles in gene regulation. Vertebrate genomes are dominated by TEs, which can reach copy numbers in the hundreds of thousands. To date, details regarding the presence and characteristics of TEs in camelid genomes have not been made available. RESULTS: We conducted a genome-wide comparative analysis of camelid TEs, focusing on the identification of TEs and elucidation of transposition histories in four species: Camelus dromedarius, C. bactrianus, C. ferus, and Vicugna pacos. Our TE library was created using both de novo structure-based and homology-based searching strategies ( https://github.com/kacst-bioinfo-lab/TE_ideintification_pipeline ). Annotation results indicated that TEs make up a similar proportion of each genome (35-36%). Class I LTR retrotransposons comprised 16-20% of genomes, and mostly consisted of the endogenous retrovirus (ERV) groups ERVL, ERVL-MaLR, ERV_classI, and ERV_classII. Non-LTR elements comprised about 12% of genomes and consisted of SINEs (MIRs) and the LINE superfamilies LINE1, LINE2, L3/CR1, and RTE clades. Least represented were the Class II DNA transposons, consisting of hAT-Charlie, TcMar-Tigger, and Helitron elements and comprising about 1-2% of each genome. CONCLUSIONS: The findings of the present study revealed that the distribution of transposable elements is broadly similar across camelid genomes. This investigation characterizes TE content in four camelid species and contributes to a better understanding of camelid genome architecture and evolution.
Subject(s)
Camelus , DNA Transposable Elements , Animals , DNA Transposable Elements/genetics , Evolution, Molecular , Retroelements/genetics , Short Interspersed Nucleotide Elements
ABSTRACT
Intrinsically disordered proteins/regions (IDPs/IDRs) fail to fold completely into 3D structures, but have major roles in determining protein function. While natively disordered proteins/regions have been found to fulfill a wide variety of primary cellular roles, the functions of many disordered proteins in numerous species remain to be uncovered. Here, we perform the first large-scale study of IDPs/IDRs in the genus Camelus, one of the most important mammalian genera in Asia and North Africa, in order to explore the biological roles of these proteins. The study includes the prediction of disordered proteins/regions in Camelus species and in humans using multiple state-of-the-art prediction tools. Additionally, we provide a comparative analysis of Camelus and Homo sapiens IDPs/IDRs to highlight the distinctive use of disorder in each genus. Our findings indicate that the human proteome is more disordered than the Camelus proteome. Gene Ontology analysis also revealed that Camelus IDPs are enriched in glutathione catabolism and lactose biosynthesis.
Subject(s)
Camelus , Genomics , Intrinsically Disordered Proteins/genetics , Animals , Computational Biology , Genome , Humans , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Proteome/metabolism , Proteomics , Species Specificity
ABSTRACT
The Camelidae family, ranging from southwest Asia to north Africa, South America, and Australia, includes key domesticated species adapted to diverse environments. Among these, the Arabian camel (Camelus dromedarius) is vital to the cultural and economic landscape of the Arabian Peninsula. This review explores the mitochondrial DNA of the dromedary camel, focusing on the D-loop region to understand its genetic diversity, maternal inheritance, and evolutionary history. We aim to investigate the unique characteristics of Arabian camel mtDNA, analyze the D-loop for genetic diversity and maternal lineage patterns, and explore the implications of mitochondrial genomic studies for camel domestication and adaptation. Key findings on mtDNA structure and variation highlight significant genetic differences and adaptive traits. The D-loop, essential for mtDNA replication and transcription, reveals extensive polymorphisms and haplotypes, providing insights into dromedary camel domestication and breeding history. Comparative analyses with other camelid species reveal unique genetic signatures in the Arabian camel, reflecting its evolutionary and adaptive pathways. Finally, this review integrates recent advancements in mitochondrial genomics, demonstrating camel genetic diversity and potential applications in conservation and breeding programs. Through comprehensive mitochondrial genome analysis, we aim to enhance the understanding of Camelidae genetics and contribute to the preservation and improvement of these vital animals.
ABSTRACT
The red palm weevil (RPW), Rhynchophorus ferrugineus (Coleoptera: Curculionidae), is the most devastating pest of palm trees worldwide. Mitigation of the economic and biodiversity impact it causes is an international priority that could be greatly aided by a better understanding of its biology and genetics. Despite its relevance, the biology of the RPW remains poorly understood, and research on management strategies often focuses on outdated empirical methods that produce sub-optimal results. With the development of omics approaches in genetic research, new avenues for pest control are becoming increasingly feasible. For example, genetic engineering approaches become available once a species's target genes are well characterized in terms of their sequence, but also population variability, epistatic interactions, and more. In the last few years alone, there have been major advances in omics studies of the RPW. Multiple draft genomes are currently available, along with short and long-read transcriptomes, and metagenomes, which have facilitated the identification of genes of interest to the RPW scientific community. This review describes omics approaches previously applied to RPW research, highlights findings that could be impactful for pest management, and emphasizes future opportunities and challenges in this area of research.
ABSTRACT
Middle East respiratory syndrome is a severe respiratory illness caused by an infectious coronavirus. This virus is associated with a high mortality rate, but there is as of yet no effective vaccine or antibody available for human immunity/treatment. Drug design relies on understanding the 3D structures of viral proteins; however, arriving at such understanding is difficult for intrinsically disordered proteins, whose disorder-dependent functions are key to the virus's biology. Disorder is suggested to provide viral proteins with highly flexible structures and diverse functions that are utilized when invading host organisms and adjusting to new habitats. To date, the functional roles of intrinsically disordered proteins in the mechanisms of MERS-CoV pathogenesis, transmission, and treatment remain unclear. In this study, we performed structural analysis to evaluate the abundance of intrinsic disorder in the MERS-CoV proteome and in individual proteins derived from the MERS-CoV genome. Moreover, we detected disordered protein binding regions, namely, molecular recognition features and short linear motifs. Studying disordered proteins/regions in MERS-CoV could contribute to unlocking the complex riddles of viral infection, exploitation strategies, and drug development approaches in the near future by making it possible to target these important (yet challenging) unstructured regions.
Subject(s)
Coronavirus Infections/virology , Intrinsically Disordered Proteins/chemistry , Middle East Respiratory Syndrome Coronavirus/immunology , Viral Nonstructural Proteins/chemistry , Databases, Protein , Humans , Protein Domains
ABSTRACT
Background: Transposable elements (TEs) are the largest component of the genetic material of most eukaryotes and can play roles in shaping genome architecture and regulating phenotypic variation; thus, understanding genome evolution is only possible if we comprehend the contributions of TEs. However, the quantitative and qualitative contributions of TEs can vary, even between closely related lineages. For palm species, in particular, the dynamics of the process through which TEs have differently shaped their genomes remain poorly understood because of a lack of comparative studies. Materials and methods: We conducted a genome-wide comparative analysis of palm TEs, focusing on identifying and classifying TEs using the draft assemblies of four palm species: Phoenix dactylifera, Cocos nucifera, Calamus simplicifolius, and Elaeis oleifera. Our TE library was generated using both de novo structure-based and homology-based methodologies. Results: The generated libraries revealed the TE component of each assembly, which varied from 41% to 81%. Class I retrotransposons covered 36-75% of these species' draft genome sequences and primarily consisted of LTR retroelements, while non-LTR elements covered about 0.56-2.31% of each assembly, mainly as LINEs. The least represented were Class II DNA transposons, comprising 1.87-3.37%. Conclusion: The current study contributes to a detailed identification and characterization of transposable elements in Palmae draft genome assemblies.
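The percentages reported above are genome-coverage figures. As a minimal sketch of how such a figure can be derived from TE annotations, the Python function below merges overlapping annotation intervals and divides the covered length by the assembly length. The function name, the half-open (start, end) interval convention, and the toy numbers are assumptions made for illustration; they are not taken from the study's pipeline.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a genome covered by the union of half-open (start, end) intervals.

    Overlapping or nested annotations are merged first so that no base is
    counted twice. Hypothetical helper, not part of the published pipeline.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent annotation: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Toy example with made-up coordinates on a 1,000 bp sequence.
toy_annotations = [(0, 100), (50, 150), (300, 400)]
print(f"{100 * covered_fraction(toy_annotations, 1000):.1f}% of the toy sequence is annotated")
```

Merging before summing matters because repeat annotations frequently overlap; summing raw interval lengths would overstate coverage.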
Subject(s)
DNA Transposable Elements , Evolution, Molecular , DNA Transposable Elements/genetics , Retroelements/genetics
ABSTRACT
The red palm weevil Rhynchophorus ferrugineus (Coleoptera: Curculionidae) is an economically-important invasive species that attacks multiple species of palm trees around the world. A better understanding of gene content and function in R. ferrugineus has the potential to inform pest control strategies and thereby mitigate economic and biodiversity losses caused by this species. Using 10x Genomics linked-read sequencing, we produced a haplotype-resolved diploid genome assembly for R. ferrugineus from a single heterozygous individual with modest sequencing coverage ([Formula: see text] 62x). Benchmarking against conserved single-copy Arthropod orthologs suggests both pseudo-haplotypes in our R. ferrugineus genome assembly are highly complete with respect to gene content, and do not suffer from haplotype-induced duplication artifacts present in a recently published hybrid assembly for this species. Annotation of the larger pseudo-haplotype in our assembly provides evidence for 23,413 protein-coding loci in R. ferrugineus, including over 13,000 predicted proteins annotated with Gene Ontology terms and over 6000 loci independently supported by high-quality Iso-Seq transcriptomic data. Our assembly also includes 95% of R. ferrugineus chemosensory, detoxification and neuropeptide-related transcripts identified previously using RNA-seq transcriptomic data, and provides a platform for the molecular analysis of these and other functionally-relevant genes that can help guide management of this widespread insect pest.
Subject(s)
Genome, Insect , Weevils/genetics , Animals , Female , Genetic Association Studies , Haplotypes , Male
ABSTRACT
The 15,619 bp mitochondrial genome of Jebusaea hammerschmidtii was assembled from short reads, annotated, and compared to the genomes of other longhorn beetles (Cerambycidae). Gene content was typical of animal mitochondrial genomes and contained 13 protein-coding, 22 tRNA, and 2 rRNA genes. Gene organization was identical to that of other longhorn beetles. Phylogenetic analysis placed J. hammerschmidtii within the subfamily Cerambycinae, and strongly supported the monophyly of the Cerambycinae, Lamiinae, and Prioninae subfamilies.
ABSTRACT
BACKGROUND: Microsatellites or simple sequence repeats (SSRs) have become the most significant DNA marker technology used in genetic research. The availability of complete draft genomes for a number of Palmae species has made it possible to perform genome-wide analysis of SSRs in these species. Palm trees are tropical and subtropical plants with agricultural and economic importance due to the nutritional value of their fruit cultivars. OBJECTIVE: This is the first comprehensive study examining and comparing microsatellites in completely-sequenced draft genomes of Palmae species. METHODS: We identified and compared perfect SSRs with 1-6 bp nucleotide motifs to characterize microsatellites in Palmae species using PERF v0.2.5. We analyzed their relative abundance, relative density, and GC content in five palm species: Phoenix dactylifera, Cocos nucifera, Calamus simplicifolius, Elaeis oleifera, and Elaeis guineensis. RESULTS: A total of 118241, 328189, 450753, 176608, and 70694 SSRs were identified, respectively. The six repeat types were not evenly distributed across the five genomes. Mono- and dinucleotide SSRs were the most abundant, and GC content was highest in tri- and hexanucleotide SSRs. CONCLUSION: We envisage that this analysis would further substantiate more in-depth computational, biochemical, and molecular studies on the roles SSRs may play in the genome organization of the palm species. The current study contributes a detailed characterization of simple sequence repeats in palm genomes.
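The three metrics named in the methods above can be illustrated with a short Python sketch. It is not PERF and does not reproduce its algorithm: it uses regular expressions to find perfect repeats of 1-6 bp motifs, and the 12 bp minimum length is an assumed cutoff chosen for illustration. Relative abundance is taken here as SSRs per Mb of sequence and relative density as SSR base pairs per Mb, which are common definitions but are assumptions with respect to this study. Function names are hypothetical.

```python
import re
from math import ceil


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_perfect_ssrs(seq, min_len=12, max_motif=6):
    """Return sorted (start, end, motif) tuples for perfect SSRs; min_len is an assumed cutoff."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        min_units = ceil(min_len / k)
        # A k-mer followed by at least (min_units - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_units - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. 'AA' as a dinucleotide: it is already reported as mono-'A'.
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif))
    return sorted(hits)


def ssr_metrics(seq):
    """Count, relative abundance (SSRs/Mb), relative density (bp/Mb) and GC% of SSRs."""
    hits = find_perfect_ssrs(seq)
    megabases = len(seq) / 1e6
    ssr_bases = "".join(seq[s:e] for s, e, _ in hits).upper()
    total_bp = len(ssr_bases)
    gc = 100 * (ssr_bases.count("G") + ssr_bases.count("C")) / total_bp if total_bp else 0.0
    return {
        "count": len(hits),
        "relative_abundance": len(hits) / megabases,
        "relative_density": total_bp / megabases,
        "gc_percent": gc,
    }


# Toy sequence containing one mono-, one di- and one trinucleotide repeat.
demo = "CGT" + "A" * 12 + "CGG" + "AT" * 6 + "CGA" + "GCC" * 4 + "TTA"
print(find_perfect_ssrs(demo))
print(ssr_metrics(demo))
```

Because the regex scan is non-overlapping within each motif length, this sketch can miss or shift repeats in edge cases where two repeat tracts abut; a dedicated tool is the better choice for real genomes.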
Subject(s)
Arecaceae/genetics , Microsatellite Repeats , Arecaceae/classification , Genome, Plant , Phylogeny
ABSTRACT
Middle East respiratory syndrome coronavirus (MERS-CoV) causes severe respiratory illness in humans; the second-largest and most deadly outbreak to date occurred in Saudi Arabia. The dromedary camel is considered a possible host and reservoir of the virus, transmitting it to humans. Here, we studied evolutionary relationships for 31 complete genomes of betacoronaviruses, including eight newly sequenced MERS-CoV genomes isolated from dromedary camels in Saudi Arabia. Using bioinformatics tools, we also used available sequences and the 3D structure of the MERS-CoV spike glycoprotein to predict MERS-CoV epitopes and assess antibody binding affinity. Phylogenetic analysis showed that the eight new sequences are closely related to existing strains detected in camels and humans in Arabian Gulf countries. The 2019-nCoV strain appears to have higher homology to both bat coronavirus and SARS-CoV than to MERS-CoV strains. The spike protein tree exhibited clustering of MERS-CoV sequences similar to the complete genome tree, except for one sequence from Qatar (KF961222). B cell epitope analysis determined that the MERS-CoV spike protein has 24 total discontinuous regions, from which just six epitopes were selected with score values of >80%. Our results suggest that the virus circulates by way of camels crossing the borders of Arabian Gulf countries. This study contributes to finding more effective vaccines in order to provide long-term protection against MERS-CoV and to identifying neutralizing antibodies.
Subject(s)
Camelus/virology , Coronavirus Infections/virology , Middle East Respiratory Syndrome Coronavirus/genetics , Spike Glycoprotein, Coronavirus/genetics , Amino Acid Sequence , Animals , Betacoronavirus/classification , Betacoronavirus/genetics , Betacoronavirus/isolation & purification , Biological Evolution , DNA, Complementary/chemistry , DNA, Viral/chemistry , Epitopes/analysis , Epitopes/chemistry , Epitopes/genetics , Gene Library , Humans , Middle East Respiratory Syndrome Coronavirus/classification , Middle East Respiratory Syndrome Coronavirus/isolation & purification , Phylogeny , RNA, Viral/analysis , RNA, Viral/chemistry , RNA, Viral/isolation & purification , Saudi Arabia
ABSTRACT
Camelus dromedarius has played a pivotal role in both culture and way of life in the Arabian Peninsula, particularly in arid regions where other domestic animals cannot be easily domesticated. Although the mitochondrial genomes of several camelid species have recently been sequenced, wider phylogenetic studies are yet to be performed. The features of conserved gene elements, rapid evolutionary rate, and rare recombination make the mitochondrial genome a useful molecular marker for phylogenetic studies of closely related species. Here we carried out a comparative analysis of previously sequenced mitochondrial genomes of camelids with an emphasis on C. dromedarius, revealing a number of notable findings. First, the arrangement of mitochondrial genes in C. dromedarius is similar to that of the other camelids. Second, multiple sequence alignment of intergenic regions shows up to 90% similarity across different kinds of camels, reaching 99% among dromedary camels. Third, we successfully identified the three domains (termination-associated sequence, conserved domain, and conserved sequence block) of the control region structure. The phylogenetic tree analysis showed that C. dromedarius mitogenomes were significantly clustered in the same clade with the Lama pacos mitogenome. These findings will enhance our understanding of the nucleotide composition and molecular evolution of the mitogenomes of the genus Camelus, and provide more data for comparative mitogenomics in the family Camelidae.
Subject(s)
Camelus/genetics , Genome, Mitochondrial , Animals , DNA, Intergenic/genetics , Evolution, Molecular , Genes, rRNA/genetics , Mitochondrial Proteins/genetics , Molecular Sequence Annotation , Phylogeny , RNA, Transfer/genetics , Sequence Alignment
ABSTRACT
Highly conserved noncoding elements (CNEs) constitute a significant proportion of the genomes of multicellular eukaryotes. The function of most CNEs remains elusive, but growing evidence indicates they are under some form of purifying selection. Noncoding regions in many species also harbor large numbers of transposable element (TE) insertions, which are typically lineage specific and depleted in exons because of their deleterious effects on gene function or expression. However, it is currently unknown whether the landscape of TE insertions in noncoding regions is random or influenced by purifying selection on CNEs. Here, we combine comparative and population genomic data in Drosophila melanogaster to show that the abundance of TE insertions in intronic and intergenic CNEs is reduced relative to random expectation, supporting the idea that selective constraints on CNEs eliminate a proportion of TE insertions in noncoding regions. However, we find no evidence for differences in the allele frequency spectra for polymorphic TE insertions in CNEs versus those in unconstrained spacer regions, suggesting that the distribution of fitness effects acting on observable TE insertions is similar across different functional compartments in noncoding DNA. Our results provide evidence that selective constraints on CNEs contribute to shaping the landscape of TE insertion in eukaryotic genomes, and provide further evidence that CNEs are indeed functionally constrained and not simply mutational cold spots.
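The comparison of observed TE insertions in CNEs against random expectation can be illustrated with a simple permutation sketch in Python. This is not the authors' method, which the abstract does not describe: the uniform random placement of insertions along a single region, the function names, and the toy coordinates are all assumptions made for illustration. A real analysis would need a null model that accounts for chromosome structure and insertion-site biases.

```python
import random
from bisect import bisect_right


def count_in_intervals(positions, intervals):
    """Count positions inside sorted, non-overlapping half-open (start, end) intervals."""
    starts = [s for s, _ in intervals]
    total = 0
    for p in positions:
        i = bisect_right(starts, p) - 1
        if i >= 0 and p < intervals[i][1]:
            total += 1
    return total


def depletion_test(positions, intervals, region_length, n_perm=1000, seed=0):
    """One-sided permutation test for depletion of insertions inside the intervals.

    Returns (observed_count, p_value). The null model places each insertion
    uniformly at random in [0, region_length), a strong simplifying assumption.
    """
    rng = random.Random(seed)
    intervals = sorted(intervals)
    observed = count_in_intervals(positions, intervals)
    as_low = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(region_length) for _ in positions]
        if count_in_intervals(shuffled, intervals) <= observed:
            as_low += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (as_low + 1) / (n_perm + 1)


# Toy data: conserved elements cover the first half of a 1,000 bp region,
# and all 20 made-up insertions fall in the second half.
toy_cnes = [(0, 500)]
toy_insertions = list(range(500, 1000, 25))
print(depletion_test(toy_insertions, toy_cnes, 1000))
```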
Subject(s)
Conserved Sequence/genetics , DNA Transposable Elements/genetics , Drosophila melanogaster/genetics , Drosophila/genetics , Genome, Insect/genetics , RNA, Untranslated/genetics , Animals , Eukaryota/genetics , Exons/genetics , Gene Frequency/genetics , Mutation/genetics , Selection, Genetic/genetics
ABSTRACT
Small heat shock protein beta-1 (HSPB-1) plays an essential role in the protection of cells against environmental stress. Elucidation of its molecular, structural, and biological characteristics in a naturally wild-type model is essential. Although the sequence information of the HSPB-1 gene is available for many mammalian species, the HSPB-1 gene of Arabian camel (Arabian camel HSPB-1) has not yet been structurally characterized. We cloned and functionally characterized a full-length Arabian camel HSPB-1 cDNA. It is 791 bp long, with a 5'-untranslated region (UTR) of 34 bp, a 3'-UTR of 151 bp with a poly(A) tail, and an open reading frame (ORF) of 606 bp encoding a protein of 201 amino acids (accession number: MF278354). The tissue-specific expression of Arabian camel HSPB-1 mRNA was examined using quantitative real-time PCR (qRT-PCR), which suggested that Arabian camel HSPB-1 mRNA was constitutively expressed in all examined tissues of Arabian camel, with the predominant level in the esophagus tissue. Peptide mass fingerprint-mass spectrometry (PMF-MS) analysis of the purified Arabian camel HSPB-1 protein confirmed the identity of this protein. Phylogenetic analysis showed that the HSPB-1 protein of Arabian camel is grouped together with those of Bactrian camel and alpaca. Comparing the modelled 3D structure of the Arabian camel HSPB-1 protein with the available 3D structure of human HSPB-1 confirmed the presence of the α-crystallin domain, and high similarities were noted between the two structures by using super secondary structure prediction.
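The lengths quoted above are internally consistent, which a few lines of Python can confirm. The sketch assumes that the 606 bp ORF figure includes the stop codon, which is what makes 606 bp correspond to 201 amino acids; the constant and function names are hypothetical.

```python
# Lengths in bp as quoted in the abstract above.
UTR5_BP, ORF_BP, UTR3_BP, CDNA_BP = 34, 606, 151, 791


def protein_length(orf_bp):
    """Amino acids encoded by an ORF whose length includes the stop codon (assumption)."""
    if orf_bp % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon and encodes no residue


# 34 + 606 + 151 = 791 bp, matching the reported cDNA length.
print(UTR5_BP + ORF_BP + UTR3_BP == CDNA_BP)
# 606 / 3 = 202 codons, one of which is the stop codon.
print(protein_length(ORF_BP))
```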
Subject(s)
Camelus/genetics , Computational Biology , Heat-Shock Proteins/genetics , 5' Untranslated Regions , Amino Acid Sequence , Animals , Chromatography, Liquid , Cloning, Molecular , DNA, Complementary/genetics , Gene Expression , Heat-Shock Proteins/chemistry , Models, Molecular , Phylogeny , Protein Structure, Secondary , RNA, Messenger/genetics , Real-Time Polymerase Chain Reaction , Sequence Homology, Amino Acid , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
ABSTRACT
Bactrian camel (Camelus bactrianus), dromedary (Camelus dromedarius) and alpaca (Vicugna pacos) are economically important livestock. Although the Bactrian camel and dromedary are large, typically arid-desert-adapted mammals, alpacas are adapted to plateaus. Here we present high-quality genome sequences of these three species. Our analysis reveals the demographic history of these species since the Tortonian Stage of the Miocene and uncovers a striking correlation between large fluctuations in population size and geological time boundaries. Comparative genomic analysis reveals complex features related to desert adaptations, including fat and water metabolism, stress responses to heat, aridity, intense ultraviolet radiation and choking dust. Transcriptomic analysis of Bactrian camels further reveals unique osmoregulation, osmoprotection and compensatory mechanisms for water reservation underpinned by high blood glucose levels. We hypothesize that these physiological mechanisms represent kidney evolutionary adaptations to the desert environment. This study advances our understanding of camelid evolution and the adaptation of camels to arid-desert environments.
Subject(s)
Adaptation, Physiological/genetics , Biological Evolution , Camelus/genetics , Genome , Transcriptome , Adipose Tissue/metabolism , Animals , Blood Glucose/chemistry , Desert Climate , Environment , Female , Gene Expression Profiling , Humans , Male , Molecular Sequence Data , Osmoregulation , Phylogeny , Sodium/metabolism , Species Specificity , Transcription, Genetic , Ultraviolet Rays , Water/chemistry
ABSTRACT
Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project to date for Camelus dromedarius. With the goal of sequencing the complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera checking, repeat masking, clustering, and assembly, we obtained 23,602 putative gene sequences, of which over 4,500 potentially novel or fast-evolving gene sequences do not show any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases was obtained using Gene Ontology classification. Comparison to available full-length cDNA sequences and open reading frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF >300 bp and approximately 40% of hits extending to the start codons of full-length cDNAs, suggesting successful characterization of camel genes. Similarity analyses were performed separately for different organisms, including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools, with the possibility of adding sequences from the public domain. We anticipate that our results will provide a home base for genomic studies of camel and other comparative studies, enabling a starting point for whole-genome sequencing of the organism.
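The ORF length criterion mentioned above can be sketched in Python. The abstract does not state how ORFs were defined, so this example assumes an ORF runs from an ATG to the first in-frame stop codon, is measured with the stop codon included, and is searched for in all six reading frames. Function names and the toy contigs are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_orf(seq):
    """Length in bp of the longest ATG-to-stop ORF over all six reading frames."""
    seq = seq.upper()
    best = 0
    # Forward strand, then reverse complement.
    for strand in (seq, seq.translate(COMPLEMENT)[::-1]):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best


def fraction_with_orf(contigs, min_bp=300):
    """Fraction of contigs whose longest ORF exceeds min_bp."""
    if not contigs:
        return 0.0
    return sum(longest_orf(c) > min_bp for c in contigs) / len(contigs)


# Toy contigs: one with a 306 bp ORF, one with a 9 bp ORF.
toy_contigs = ["CC" + "ATG" + "GCT" * 100 + "TAA" + "GG", "ATGAAATAA"]
print([longest_orf(c) for c in toy_contigs])
print(fraction_with_orf(toy_contigs))
```

EST contigs are often truncated, so a production analysis might also accept ORFs that run off the end of a contig; this sketch deliberately does not.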