ABSTRACT
Bacteriophages, or bacteria-infecting viruses, are genetically diverse. With the emergence of antimicrobial-resistant bacteria, lytic bacteriophages are gaining considerable attention for treating superbug infections. Klebsiella pneumoniae is one of the eight most significant nosocomial pathogens and is designated a critical priority pathogen by the WHO, requiring alternative treatment options. We report two highly lytic bacteriophages, Klebsiella phage Kpn BM7 and the novel Klebsiella phage Kpn BU9, isolated from hospital wastewater and exhibiting lytic activity against different clinical isolates. Whole-genome analysis revealed that phages BM7 and BU9 belong to the class Caudoviricetes. Phage BM7, with a genome length of 170,558 bp, is a member of the genus Marfavirus and the species Marfavirus F48, while phage BU9, with a genome length of 60,450 bp, remains unclassified. Neither phage harbors any lysogenic, toxin, or antimicrobial resistance genes. Both phages remain stable at temperatures up to 40°C and at pH 5-7. The optimal multiplicity of infection (MOI) was 0.1 for BM7 and 1 for BU9, with short latent periods of 10 and 25 min and burst sizes of 85 PFU/cell and 12 PFU/cell, respectively. This is the first report from Bangladesh of lytic phages targeting carbapenem-resistant K. pneumoniae (CRKP). This study suggests that BM7 and BU9 are potential candidates for targeting CRKP.
ABSTRACT
Salmonella Typhimurium is an invasive gastrointestinal pathogen of both humans and animals. To investigate the genetic framework and diversity of S. Typhimurium, a total of 194 isolates were collected from patients in a tertiary hospital between 2020 and 2021. Antimicrobial susceptibility testing was used to confirm the resistance phenotype. Whole-genome sequencing and bioinformatics analysis were performed to determine the sequence type, phylogenetic relationships, resistance gene profiles, Salmonella pathogenicity islands (SPIs), and the diversity of the core and pan-genome. The results showed that 57.22% of S. Typhimurium isolates were multidrug resistant and that 60.82% of all isolates were resistant to the first-line drug ciprofloxacin. The population structure of S. Typhimurium was categorized into three lineages: ST19 (20.10%, 39/194), ST34-1 (47.42%, 92/194), and ST34-2 (32.47%, 63/194), with the population size exhibiting increasing trends. All lineages harbored a variety of fimbrial operons, prophages, SPIs, and effectors that contribute to the virulence and long-term infections of S. Typhimurium. Importantly, the ST34-1 lineage might be more invasive because it possesses the SPI1 effector gene sopE, which is essential for the proliferation, internalization, and intracellular presence of S. Typhimurium in hosts. Multiple antimicrobial resistance genes were characteristically distributed across the three lineages; notably, carbapenem resistance genes were detected only in the ST34-1 and ST34-2 lineages. Distinct functional categories of the pan-genome among the three lineages were observed in metabolism, signaling, and genetic information processing. This study provides a theoretical foundation for understanding the evolved adaptation and genetic diversity of S. Typhimurium ST19 and ST34; the ST34 lineages, with their multidrug resistance and potential hypervirulence, warrant closer epidemiological surveillance.
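The lineage shares can be recomputed directly from the isolate counts given in the abstract (39, 92, and 63 of 194). The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and do not come from the study.

```python
# Lineage counts as reported in the abstract (194 isolates in total).
lineage_counts = {"ST19": 39, "ST34-1": 92, "ST34-2": 63}


def lineage_percentages(counts):
    """Return each lineage's share of all isolates, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


percentages = lineage_percentages(lineage_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

Because the three counts sum to the full set of 194 isolates, the three shares should add up to 100% within rounding error, which is a quick consistency check on any reported breakdown of this kind.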
ABSTRACT
Investigating plant genomes offers crucial foundational resources for exploring various aspects of plant biology and its applications, such as functional genomics and breeding practices. With advances in sequencing and assembly technology, several Nicotiana tabacum genomes have been published. In this paper, we review the progress in N. tabacum genome assembly and quality, from the initial draft genomes to the recent high-quality chromosome-level assemblies. The application of long-read sequencing, optical mapping, and Hi-C technologies has significantly improved the contiguity and completeness of N. tabacum genome assemblies, with the latest assemblies having a contig N50 of over 50 Mb. Despite these advances, further improvements are still required and possible, particularly in the development of pan-genomes and telomere-to-telomere (T2T) genomes. These new genomes will capture the genomic diversity and variations among different N. tabacum cultivars and species and provide a comprehensive view of the N. tabacum genome structure and gene content, so as to deepen our understanding of the N. tabacum genome and facilitate precise breeding and functional genomics.
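Contig N50, the contiguity metric quoted above, is the contig length at which the contigs of that length or longer together cover at least half of the assembly. A minimal sketch of the calculation follows; the contig sizes are hypothetical and are not taken from any N. tabacum assembly.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig sizes in Mb, for illustration only.
toy_contigs_mb = [80, 70, 50, 30, 20, 10]
toy_n50 = n50(toy_contigs_mb)
print(toy_n50)
```

In this toy assembly the total is 260 Mb, so the cumulative sum first reaches the 130 Mb halfway point at the second-largest contig. A higher N50 therefore means that fewer, longer contigs make up most of the assembly.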
ABSTRACT
Bacillus subtilis is an important industrial and environmental microorganism known to occupy many niches and produce many compounds of interest. Although it is one of the best-studied organisms, much of this attention, including the reconstruction of genome-scale metabolic models, has focused on a few key laboratory strains. Here, we substantially expand these prior models to pan-genome scale, representing 481 genomes of B. subtilis with 2,315 orthologous gene clusters, 1,874 metabolites, and 2,239 reactions. Furthermore, we incorporate data from carbon utilization experiments for eight strains to refine and validate the model's metabolic predictions. This comprehensive pan-genome model enables the assessment of strain-to-strain differences related to nutrient utilization, fermentation outputs, robustness, and other metabolic aspects. Using the model and phenotypic predictions, we divide B. subtilis strains into five groups with distinct patterns of behavior that correlate across these features. The pan-genome model offers deep insights into B. subtilis metabolism as it varies across environments and provides an understanding of how different strains have adapted to dynamic habitats. IMPORTANCE: As the volume of genomic data and computational power have increased, so has the number of genome-scale metabolic models. These models encapsulate the totality of metabolic functions for a given organism. Bacillus subtilis strain 168 is one of the first bacteria for which a metabolic network was reconstructed. Since then, several updated reconstructions have been generated for this model microorganism. Here, we expand the metabolic model for a single strain into a pan-genome-scale model, which consists of individual models for 481 B. subtilis strains. By evaluating differences between these strains, we identified five distinct groups of strains, allowing for the rapid classification of any particular strain.
Furthermore, this classification into five groups aids the rapid identification of suitable strains for any application.
ABSTRACT
Antibiotic resistance in bacteria leads to high mortality rates and healthcare costs and is a significant concern for public health. A colonizer of the human respiratory system, Stenotrophomonas maltophilia is frequently associated with hospital-acquired infections in individuals with cystic fibrosis, cancer, and other chronic illnesses. This study addresses the critical demand for effective preventive strategies against this pathogen, particularly among susceptible groups such as patients with cystic fibrosis and those undergoing cancer treatment. We engineered a multi-epitope vaccine targeting S. maltophilia through genomic analysis, reverse vaccinology, and immunoinformatic techniques by examining a total of 81 complete genomes of S. maltophilia strains. Our investigation revealed 1945 core protein-coding genes alongside their corresponding proteomic sequences, with 191 of these genes predicted to exhibit virulence characteristics. Of the filtered proteins, the three best antigenic proteins were selected for epitope prediction, and seven epitopes each for CTL, HTL, and B cells were chosen for vaccine development. The vaccine was refined and validated, showing high antigenicity and desirable physicochemical features. Molecular docking assessments revealed stable binding with TLR-4. Molecular dynamics simulation demonstrated stable dynamics with minor alterations. The originality of this investigation lies in the thorough techniques used to design a vaccine that directly targets S. maltophilia, a microorganism of considerable clinical relevance for which no vaccine is currently available. This study not only responds to a pressing public health crisis but also lays the groundwork for subsequent research focused on the prevention of S. maltophilia outbreaks. Further evidence from studies in mouse models is needed to confirm immune protection against S. maltophilia.
ABSTRACT
Purpose: Staphylococcus warneri is an opportunistic pathogen responsible for hospital-acquired infections (HAIs). The aim of this study was to describe an outbreak of S. warneri infection in a neonatal intensive care unit (NICU) and to provide investigation, prevention, and control strategies for such outbreaks. Methods: We conducted an epidemiological investigation of the NICU S. warneri outbreak, involving seven neonates, staff, and environmental screening, to identify the source of infection. Whole-genome sequencing (WGS) analyses were performed on S. warneri isolates, including species identification, core genome single-nucleotide polymorphism (cgSNP) analysis, pan-genome analysis, and genetic characterization of the prevalence of specific antibiotic resistance and virulence genes. Results: Eight S. warneri strains were isolated during this outbreak, seven from neonates and one from the environment. Six clinical cases occurring within three days in 2021 were linked to one strain isolated from environmental samples; isolates differed by 0-69 SNPs and were confirmed by WGS to belong to an outbreak. Multiple infection prevention measures were implemented, including comprehensive environmental disinfection and stringent protocols, and all affected neonates were transferred to isolation wards. Following these interventions, no further cases of S. warneri infection were observed. Furthermore, pan-genome analysis suggested that human-associated S. warneri may exhibit host specificity. Conclusion: The WGS-based investigation revealed that the outbreak was linked to the milk preparation workbench. We recommend a stronger focus on environmental disinfection management in order to raise awareness of, and improve the identification and prevention of, healthcare-associated infections linked to the hospital environment.
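The SNP differences reported above come from comparing isolates position by position across a core-genome alignment. The sketch below illustrates that idea on toy data. It is not the pipeline used in the study; the sequences and isolate names are hypothetical, and skipping ambiguous bases such as N is an assumption of this example.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, skipping ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def pairwise_distances(alignment):
    """Return {(name_x, name_y): SNP distance} for every pair, names sorted."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical 10-bp core-genome alignment, for illustration only.
toy_alignment = {
    "neonate_1": "ACGTACGTAC",
    "neonate_2": "ACGTACGTAC",
    "environment": "ACGTACGAAC",
    "unrelated": "TCGAACGANC",
}

distances = pairwise_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, d)
```

In practice, small pairwise distances between patient and environmental isolates support a common source, while the threshold used to call an outbreak depends on the organism and the study design.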
ABSTRACT
Streptomyces clavuligerus is a species used worldwide for the industrial production of clavulanic acid (CA), a molecule that enhances antibiotic effectiveness against β-lactamase-producing bacterial strains. Although the species has low inherent CA production, hyper-producing strains have been developed. However, genomic analyses specific to S. clavuligerus and CA biosynthesis are limited. Genomic variations that may influence CA yield were explored using S. clavuligerus strain genomes from diverse sources. Despite the slight differences obtained by similarity index calculation, pan-genome estimation revealed that only half of the genes identified were present in all strains. As expected, core genes were associated with primary metabolism, while the remaining genes were linked to secondary metabolism. Differences at the sequence level were more likely to be found in regions close to the tips of the linear chromosome. Wild-type strains preserved larger chromosomal and plasmid regions than industrial and/or hyper-producing strains; this grouping pattern was also found through refined phylogenetic analyses. These results provide essential insights for the development of hyper-producing S. clavuligerus strains, addressing the critical demand for this antibiotic enhancer and contributing to future strategies for optimizing CA production.
Subjects
Clavulanic Acid, Bacterial Genome, Phylogeny, Streptomyces, Streptomyces/genetics, Streptomyces/metabolism, Clavulanic Acid/biosynthesis, Genetic Variation, Genomics/methods, Plasmids/genetics
ABSTRACT
Crop wild relatives of perennial fruit crops hold a wealth of untapped genetic diversity that can be utilized for cultivar development. However, barriers such as linkage drag, long juvenility, and high heterozygosity have hindered their utilization. Advances in genome sequencing technologies and assembly methods, combined with the integration of chromosome conformation capture, have made it possible to construct high-quality reference genomes. These genome assemblies can be combined into pan-genomes, capturing inter- and intraspecific variations across coding and non-coding regions. Pan-genomes of perennial fruit crops are being developed to identify the genetic basis of traits. This will help overcome breeding challenges, enabling faster and more targeted development of new cultivars with novel traits through breeding and biotechnology.
ABSTRACT
Genomic regions that play a role in parasite defense are often found to be highly variable, with the major histocompatibility complex serving as an iconic example. Single nucleotide polymorphisms may represent only a small portion of this variability, with indel polymorphisms and copy number variation contributing further. In extreme cases, haplotypes may no longer be recognized as orthologous. Understanding the evolution of such highly divergent regions is challenging because the most extreme variation is not visible using reference-assisted genomic approaches. Here we analyze the case of the Pasteuria Resistance Complex in the crustacean Daphnia magna, a defense complex in the host against the common and virulent bacterium Pasteuria ramosa. Two haplotypes of this region have been previously described, with parts of it being nonhomologous, and the region has been shown to be under balancing selection. We used pan-genome analysis and tree reconciliation methods to explore the evolution of the Pasteuria Resistance Complex and its characteristics within and between species of Daphnia and other Cladoceran species. Our analysis revealed a remarkable diversity in this region even among host species, with many nonhomologous hyper-divergent haplotypes. The Pasteuria Resistance Complex is characterized by extensive duplication and loss of Fucosyltransferase (FuT) and Galactosyltransferase (GalT) genes that are believed to play a role in parasite defense. The Pasteuria Resistance Complex region can be traced back to common ancestors over 250 million years ago. The unique combination of an ancient resistance complex and a dynamic, hyper-divergent genomic environment presents a fascinating opportunity to investigate the role of such regions in the evolution and long-term maintenance of resistance polymorphisms.
Our findings offer valuable insights into the evolutionary forces shaping disease resistance and adaptation, not only in the genus Daphnia, but potentially across the entire Cladocera class.
Subjects
Daphnia, Molecular Evolution, Pasteuria, Animals, Daphnia/genetics, Daphnia/microbiology, Pasteuria/genetics, Pasteuria/pathogenicity, Haplotypes, Disease Resistance/genetics, Genetic Variation
ABSTRACT
BACKGROUND: Unveiling genetic diversity features and understanding the genetic mechanisms of diverse goat phenotypes are pivotal in facilitating the preservation and utilization of these genetic resources. However, the total genetic diversity within a species cannot be captured by the reference genome of a single individual. The pan-genome is a collection of all the DNA sequences that occur in a species, and it is expected to capture the total genomic diversity of that species. RESULTS: We constructed a goat pan-genome using the map-to-pan assembly strategy based on 813 individuals, including 723 domestic goats and 90 samples from their wild relatives, which provided broad regional and global representation. In total, 146 Mb of sequence and 974 genes were identified as absent from the reference genome (ARS1.2; GCF_001704415.2). We identified 3,190 novel single nucleotide polymorphisms (SNPs) using the pan-genome analysis. These novel SNPs properly revealed the population structure of domestic goats and their wild relatives. Presence/absence variation (PAV) analysis revealed gene loss and intense negative selection during domestication and improvement. CONCLUSIONS: Our research highlights the importance of the goat pan-genome in capturing missing genetic variation. It reveals changes in genomic architecture during goat domestication and improvement, such as gene loss. This improves our understanding of the evolutionary and breeding history of goats.
ABSTRACT
Ash trees (Fraxinus) exhibit rich genetic diversity and wide adaptation to various ecological environments, and several species are highly salt-tolerant. Dissecting the genomic basis underlying ash tree salt adaptation is vital for resistance breeding. Here, we present eleven high-quality chromosome-level genome assemblies for Fraxinus species, revealing two unequal sub-genome compositions and two more recent whole-genome triplication events in their evolutionary history. A Fraxinus structural variation-based pan-genome was constructed and revealed that presence-absence variations (PAVs) of transmembrane transport genes likely contribute to Fraxinus salt adaptation. Through whole-genome resequencing of an inter-species cross F1 population of F. velutina 'Lula 3' (salt-tolerant) × F. pennsylvanica 'Lula 5' (salt-sensitive), we performed PAV-based quantitative trait loci (QTL) mapping for salt tolerance and pinpointed two PAV-QTLs and candidate genes associated with Fraxinus salt tolerance. Mechanistically, FvbHLH85 enhanced salt tolerance by mediating reactive oxygen species and Na+/K+ homeostasis, while FvSWEET5 did so by mediating osmotic homeostasis. Collectively, these findings provide valuable genomic resources for Fraxinus salt resistance breeding and for the research community.
ABSTRACT
BACKGROUND: The genus Fusarium poses significant threats to food security and safety worldwide because numerous species of the fungus cause destructive diseases and/or mycotoxin contamination in crops. The adverse effects of climate change are exacerbating some existing threats and causing new problems. These challenges highlight the need for innovative solutions, including the development of advanced tools to identify targets for control strategies. DESCRIPTION: In response to these challenges, we developed the Fusarium Protein Toolkit (FPT), a web-based tool that allows users to interrogate the structural and variant landscape within the Fusarium pan-genome. The tool displays both AlphaFold and ESMFold-generated protein structure models from six Fusarium species. The structures are accessible through a user-friendly web portal and facilitate comparative analysis, functional annotation inference, and identification of related protein structures. Using a protein language model, FPT predicts the impact of over 270 million coding variants in two of the most agriculturally important species, Fusarium graminearum and F. verticillioides. To facilitate the assessment of naturally occurring genetic variation, FPT provides variant effect scores for proteins in a Fusarium pan-genome based on 22 diverse species. The scores indicate potential functional consequences of amino acid substitutions and are displayed as intuitive heatmaps using the PanEffect framework. CONCLUSION: FPT fills a knowledge gap by providing previously unavailable tools to assess structural and missense variation in proteins produced by Fusarium. FPT has the potential to deepen our understanding of pathogenic mechanisms in Fusarium, and aid the identification of genetic targets for control strategies that reduce crop diseases and mycotoxin contamination. Such targets are vital to solving the agricultural problems incited by Fusarium, particularly evolving threats resulting from climate change.
Thus, FPT has the potential to contribute to improving food security and safety worldwide.
Subjects
Fungal Proteins, Fusarium, Internet, Fusarium/genetics, Fusarium/metabolism, Fusarium/classification, Fungal Proteins/genetics, Fungal Proteins/chemistry, Fungal Proteins/metabolism, Fungal Genome/genetics, Genetic Variation, Molecular Models, Software, Protein Conformation
ABSTRACT
BACKGROUND: Bacillus anthracis is a highly pathogenic bacterium that can cause lethal infection in animals and humans, making it a significant concern as a pathogen and biological agent. Consequently, accurate diagnosis of B. anthracis is critically important for public health. However, the identification of specific marker genes encoded in the B. anthracis chromosome is challenging because of its genetic similarity to B. cereus and B. thuringiensis. METHODS: The complete genomes of B. anthracis, B. cereus, B. thuringiensis, and B. weihenstephanensis were de novo annotated with Prokka, and these annotations were used by Roary to produce the pan-genome. Genes exclusive to B. anthracis were identified with a Perl script, and their specificity was examined by nucleotide BLAST search. A local BLAST alignment was performed to confirm the presence of the identified genes across various B. anthracis strains. Multiplex polymerase chain reaction (PCR) assays were established based on the identified genes. RESULT: The distribution of genes among 151 whole-genome sequences exhibited three distinct major patterns, depending on the bacterial species and strains. Further comparative analysis between the three groups uncovered thirty chromosome-encoded genes exclusively present in B. anthracis strains. Of these, twenty were found in known lambda prophage regions, and ten were in a previously undefined region of the chromosome. We established three distinct multiplex PCRs for the specific detection of B. anthracis by utilizing three of the identified genes, BA1698, BA5354, and BA5361. CONCLUSION: The study identified thirty chromosome-encoded genes specific to B. anthracis, encompassing previously described genes in known lambda prophage regions and, to the best of our knowledge, nine newly discovered genes from an undefined gene region. The three multiplex PCR assays offer an accurate and reliable alternative method for detecting B. anthracis.
Furthermore, these genetic markers have value in anthrax vaccine development, and understanding the pathogenicity of B. anthracis.
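The exclusive-gene step described in the methods amounts to filtering a gene presence/absence table for genes found in every genome of the target species and in no genome of any other species. The authors used a Perl script on Roary output; the sketch below substitutes Python and a small in-memory table, and all genome and gene names in it are hypothetical.

```python
def exclusive_genes(presence, species_of, target):
    """Genes present in every target-species genome and absent from all others.

    presence:   {gene name: set of genome names carrying that gene}
    species_of: {genome name: species label}
    """
    target_genomes = {g for g, s in species_of.items() if s == target}
    if not target_genomes:
        raise ValueError(f"no genomes labeled {target!r}")
    other_genomes = set(species_of) - target_genomes
    return sorted(
        gene
        for gene, genomes in presence.items()
        if target_genomes <= genomes and not (genomes & other_genomes)
    )


# Hypothetical genomes and gene clusters, for illustration only.
toy_species = {
    "anthracis_1": "B. anthracis",
    "anthracis_2": "B. anthracis",
    "cereus_1": "B. cereus",
    "thuringiensis_1": "B. thuringiensis",
}
toy_presence = {
    "geneA": {"anthracis_1", "anthracis_2"},
    "geneB": {"anthracis_1", "anthracis_2", "cereus_1"},
    "geneC": {"anthracis_1"},
    "geneD": {"anthracis_1", "anthracis_2", "cereus_1", "thuringiensis_1"},
}

markers = exclusive_genes(toy_presence, toy_species, "B. anthracis")
print(markers)
```

Candidates that pass such a filter still need the follow-up the abstract describes, namely a BLAST search against wider databases, since the table only covers the genomes included in the pan-genome.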
Subjects
Bacillus anthracis, Bacterial Chromosomes, Bacterial Genome, Multiplex Polymerase Chain Reaction, Bacillus anthracis/genetics, Bacillus anthracis/isolation & purification, Multiplex Polymerase Chain Reaction/methods, Bacterial Chromosomes/genetics, Genetic Markers, Anthrax/microbiology, Anthrax/diagnosis, Humans, Whole Genome Sequencing/methods
ABSTRACT
Background: Next-generation sequencing of Mycobacterium tuberculosis, the infectious agent causing tuberculosis, is improving the understanding of the genomic diversity of circulating lineages and strain types, and informing knowledge of drug resistance mutations. An increasingly popular approach to characterizing M. tuberculosis genomes (size: 4.4 Mbp) and variants (e.g., single nucleotide polymorphisms (SNPs)) involves the de novo assembly of sequence data. Methods: We compared the performance of genome assembly tools (Unicycler, RagOut, and RagTag) on sequence data from nine drug-resistant M. tuberculosis isolates (multi-drug resistant (MDR) n = 1; pre-extensively drug resistant (pre-XDR) n = 8) generated using Illumina HiSeq, Oxford Nanopore Technology (ONT) PromethION, and PacBio platforms. Results: Our investigation found that Unicycler-based assemblies had significantly higher genome completeness (~98.7%; p = 0.01) than those from the other tools (RagOut = 98.6% and RagTag = 98.6%). Genome assembly sizes across isolates and sequencers were significantly larger (p < 0.001) for RagOut (4,418,574 ± 8,824 bp) than for Unicycler (4,377,642 ± 55,257 bp) and RagTag (4,380,711 ± 51,164 bp). RagOut-based assemblies had the fewest contigs (~32) and the largest genome size (4,418,574 bp; vs. H37Rv reference size 4,411,532 bp) and therefore were chosen for downstream analysis. Pan-genome analysis of Illumina and PacBio hybrid assemblies revealed the greatest number of detected genes (4,639 genes; the H37Rv reference contains 3,976 genes), while Illumina and ONT hybrid assemblies produced the highest number of SNPs. The number of genes from hybrid assemblies with ONT and PacBio long reads (mean: 4,620 genes) was greater than that from short-read assembly alone (4,478 genes). All nine RagOut hybrid genome assemblies detected known mutations in genes associated with MDR-TB and pre-XDR-TB.
Conclusions: Unicycler software performed the best in terms of achieving contiguous genomes, whereas RagOut improved the quality of Unicycler's genome assemblies by providing a larger genome size. Overall, our approach has demonstrated that short-read and long-read hybrid assembly can provide a more complete genome assembly than short-read assembly alone, detecting pan-genomes and more genes, including IS6110, as well as more SNPs.
Subjects
Bacterial Genome, High-Throughput Nucleotide Sequencing, Mycobacterium tuberculosis, Mycobacterium tuberculosis/genetics, Bacterial Genome/genetics, High-Throughput Nucleotide Sequencing/methods, Humans, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis/methods
ABSTRACT
Transposable elements (TEs) significantly contribute to the evolution and diversity of plant genomes. In this study, we explored the roles of TEs in the genomes of Citrus and Citrus-related genera by constructing a pan-genome TE library from 20 published genomes of Citrus and Citrus-related accessions. Our results revealed an increase in TE content and the number of TE types compared to the original annotations, as well as a decrease in the content of unclassified TEs. The average length of TEs per assembly was approximately 194.23 Mb, representing 41.76% (Murraya paniculata) to 64.76% (Citrus gilletiana) of the genomes, with a mean value of 56.95%. A significant positive correlation was found between genome size and both the number of TE types and TE content. Consistent with the difference in mean whole-genome size (39.83 Mb) between Citrus and Citrus-related genera, Citrus genomes contained an average of 34.36 Mb more TE sequences than Citrus-related genomes. Analysis of the estimated insertion time and half-life of long terminal repeat retrotransposons (LTR-RTs) suggested that TE removal was not the primary factor contributing to the differences among genomes. These findings collectively indicate that TEs are the primary determinants of genome size and play a major role in shaping genome structures. Principal coordinate analysis (PCoA) of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) identifiers revealed that the fragmented TEs were predominantly derived from ancestral genomes, while intact TEs were crucial in the recent evolutionary diversification of Citrus. Moreover, the presence or absence of intact TEs near the AdhE superfamily was closely associated with the bitterness trait in the Citrus species. Overall, this study enhances TE annotation in Citrus and Citrus-related genomes and provides valuable data for future genetic breeding and agronomic trait research in Citrus.
ABSTRACT
BACKGROUND: Kiwifruit, belonging to the genus Actinidia, is a unique fruit crop whose modern cultivars are genetically diverse and exhibit remarkable variation in morphological traits and adaptability to harsh environments. However, the genetic mechanisms underlying such morphological diversity remain largely elusive. RESULTS: We report the high-quality genomes of five Actinidia species: Actinidia longicarpa, A. macrosperma, A. polygama, A. reticulata, and A. rufa. Through comparative genomics analyses, we identified three whole-genome duplication events shared by the Actinidia genus and uncovered rapidly evolving gene families implicated in the development of characteristic kiwifruit traits, including vitamin C (VC) content and fruit hairiness. A range of structural variations were identified, potentially contributing to the phenotypic diversity in kiwifruit. Notably, phylogenomic analyses revealed 76 cis-regulatory elements within the Actinidia genus, predominantly associated with stress responses, metabolic processes, and development. Among these, five motifs did not exhibit similarity to known plant motifs, suggesting the presence of possible novel cis-regulatory elements in kiwifruit. Construction of a pan-genome encompassing the nine Actinidia species facilitated the identification of gene DTZ79_23g14810, which is specific to species exhibiting extraordinarily high VC content. Expression of DTZ79_23g14810 is significantly correlated with the dynamics of VC concentration, and its overexpression in the transgenic roots of kiwifruit plants resulted in increased VC content. CONCLUSIONS: Collectively, the genomes and pan-genome of diverse Actinidia species not only enhance our understanding of fruit development but also provide a valuable genomic resource for facilitating the genome-based breeding of kiwifruit.
Subjects
Actinidia, Plant Genome, Phylogeny, Actinidia/genetics, Actinidia/growth & development, Fruit/genetics, Fruit/growth & development, Plant Genes
ABSTRACT
Jumbo phages are characterized by their remarkably large genomes and unique life cycles. Jumbo phages belonging to the family Chimalliviridae protect the replicating phage DNA from host immune systems, such as CRISPR-Cas and restriction-modification systems, through a phage nucleus structure. Several recent studies have provided new insights into jumbo phage infection biology, but the pan-genome diversity of jumbo phages and their relationship with CRISPR-Cas targeting beyond Chimalliviridae are not well understood. In this study, we used pan-genome analysis to identify orthologous gene families shared among 331 jumbo phages with complete genomes. We show that jumbo phages lack a universally conserved set of core genes, but we identified seven "soft-core genes" conserved in over 50% of these phages. These genes primarily govern DNA-related activities, such as replication, repair, or nucleotide synthesis. Jumbo phages exhibit a wide array of accessory and unique genes, underscoring their genetic diversity. Phylogenetic analyses of the soft-core genes revealed frequent horizontal gene transfer events between jumbo phages, non-jumbo phages, and occasionally even giant eukaryotic viruses, indicating a polyphyletic evolutionary nature. We categorized jumbo phages into 11 major viral clusters (VCs) spanning 130 sub-clusters, with the majority being multi-genus jumbo phage clusters. Moreover, through the analysis of hallmark genes related to CRISPR-Cas targeting, we predict that many jumbo phages can evade host immune systems using both known and yet-to-be-identified mechanisms. In summary, our study enhances our understanding of jumbo phages, shedding light on their pan-genome diversity and remarkable genome protection capabilities. IMPORTANCE: Jumbo phages are large bacterial viruses that have been known for more than 50 years. However, only in recent years has a significant number of complete genome sequences of jumbo phages become available.
In this study, we employed comparative genomic approaches to investigate the genomic diversity and genome protection capabilities of the 331 jumbo phages. Our findings revealed that jumbo phages exhibit high genetic diversity, with only a few genes being relatively conserved across jumbo phages. Interestingly, our data suggest that jumbo phages employ yet-to-be-identified strategies to protect their DNA from the host immune system, such as CRISPR-Cas.
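The "soft-core" category above is defined by a prevalence threshold: a gene family counts as soft-core when it occurs in more than 50% of genomes without being universal. A minimal sketch of such a classification follows. Only the 50% threshold comes from the abstract; the labels for accessory (two or more genomes) and unique (one genome) families, and all family and genome names, are assumptions of this example.

```python
def classify_families(family_to_genomes, n_genomes, soft_core_fraction=0.5):
    """Label each gene family by the fraction of genomes that carry it."""
    labels = {}
    for family, genomes in family_to_genomes.items():
        count = len(genomes)
        if count == n_genomes:
            labels[family] = "core"
        elif count / n_genomes > soft_core_fraction:
            labels[family] = "soft-core"
        elif count == 1:
            labels[family] = "unique"
        else:
            labels[family] = "accessory"
    return labels


# Hypothetical gene families across four hypothetical phage genomes.
toy_families = {
    "famA": {"phage1", "phage2", "phage3", "phage4"},
    "famB": {"phage1", "phage2", "phage3"},
    "famC": {"phage1", "phage2"},
    "famD": {"phage4"},
}

labels = classify_families(toy_families, n_genomes=4)
for family, label in labels.items():
    print(family, label)
```

With a strict "more than 50%" rule, a family found in exactly half of the genomes (famC here) falls into the accessory class, so the choice between a strict and an inclusive threshold can change the soft-core count in small datasets.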
Subjects
Bacteriophages, CRISPR-Cas Systems, Genetic Variation, Viral Genome, Genomics, Phylogeny, CRISPR-Cas Systems/genetics, Viral Genome/genetics, Bacteriophages/genetics, Bacteriophages/physiology, Horizontal Gene Transfer/genetics
ABSTRACT
MAIN CONCLUSION: Leveraging advanced breeding and multi-omics resources is vital to position millet as an essential "nutricereal resource," aligning with IYoM goals, alleviating strain on global cereal production, boosting resilience to climate change, and advancing sustainable crop improvement and biodiversity.
The global challenges of food security, nutrition, climate change, and agrarian sustainability demand the adoption of climate-resilient, nutrient-rich crops to support a growing population amidst shifting environmental conditions. Millets, also referred to as "Shree Anna," emerge as a promising solution to these issues by bolstering food production, improving nutrient security, and fostering biodiversity conservation. Their resilience to harsh environments, nutritional density, cultural significance, and potential to improve the dietary quality index make them valuable assets in global agriculture. Recognizing their pivotal role, the United Nations designated 2023 as the "International Year of Millets (IYoM 2023)," emphasizing their contribution to climate-resilient agriculture and nutritional enhancement. Scientific progress has invigorated efforts to enhance millet production through genetic and genomic interventions, yielding a wealth of advanced molecular breeding technologies and multi-omics resources. These advancements offer opportunities to tackle prevailing challenges in millet, such as anti-nutritional factors, sensory acceptability issues, toxin contamination, and ancillary crop improvements. This review provides a comprehensive overview of molecular breeding and multi-omics resources for nine major millet species, focusing on their potential impact within the framework of IYoM. These resources include whole and pan-genomes elucidating adaptive responses to abiotic stressors, organelle-based studies revealing evolutionary resilience, markers linked to desirable traits for efficient breeding, QTL analysis facilitating trait selection, functional gene discovery for biotechnological interventions, regulatory ncRNAs for trait modulation, web-based platforms for stakeholder communication, tissue culture techniques for genetic modification, and integrated omics approaches enabled by precise application of CRISPR/Cas9 technology. Aligning these resources with the seven thematic areas outlined by IYoM catalyzes transformative changes in millet production and utilization, thereby contributing to global food security, sustainable agriculture, and improved nutritional outcomes.
Subjects
Climate Change; Crops, Agricultural; Genomics; Millets; Plant Breeding; Millets/genetics; Plant Breeding/methods; Crops, Agricultural/genetics; Biodiversity; Food Security; Agriculture/methods; Multiomics
ABSTRACT
A sequence graph represents the genetic variation of a pan-genome more concisely and space-efficiently than multiple linear reference genomes. To accelerate the alignment of reads to the graph, an index of the graph-based reference genome is used to obtain candidate locations. However, the potential combinatorial explosion of nodes in the sequence graph considerably increases the index size and the peak memory usage of the alignment process, especially for large-scale datasets. To address this, existing methods typically prune complex regions or extend the seed length, which sacrifices alignment recall while reducing space usage only slightly. We present the Sparse-index of Graph (SIG) and the alignment algorithm SIG-Aligner, which index and align at a lower memory cost. SIG builds a non-overlapping minimizer index inside the nodes of the sequence graph, and SIG-Aligner filters out most false-positive matches with a method based on the pigeonhole principle. In computational experiments on human pan-genome graphs, SIG reduces index memory by 50% to 75% compared with Giraffe, while preserving superior or comparable alignment accuracy and a faster alignment time.
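The two ideas named in this abstract, a sparse minimizer index and a pigeonhole-based filter, can be sketched in Python under strong simplifying assumptions. The sketch works on a single linear string rather than a sequence graph, picks the lexicographically smallest k-mer per window rather than a hashed one, and counts substitutions only. All function names and parameters are hypothetical; this is not the SIG or SIG-Aligner implementation. The filter rests on a standard fact: if a read matches a location with at most e substitutions and is cut into e + 1 non-overlapping pieces, at least one piece must match exactly.

```python
def sparse_index(seq: str, k: int, w: int) -> dict[str, list[int]]:
    """Store one minimizer (smallest k-mer) per non-overlapping window of length w."""
    index: dict[str, list[int]] = {}
    for start in range(0, len(seq) - k + 1, w):
        window = seq[start:start + w]
        # Offset of the lexicographically smallest k-mer; ties go to the leftmost.
        off = min(range(len(window) - k + 1), key=lambda i: (window[i:i + k], i))
        index.setdefault(window[off:off + k], []).append(start + off)
    return index


def split_pieces(read: str, max_errors: int) -> list[tuple[int, str]]:
    """Cut the read into max_errors + 1 non-overlapping (offset, piece) pairs."""
    n_pieces = max_errors + 1
    base, extra = divmod(len(read), n_pieces)
    pieces, pos = [], 0
    for i in range(n_pieces):
        size = base + (1 if i < extra else 0)
        pieces.append((pos, read[pos:pos + size]))
        pos += size
    return pieces


def passes_pigeonhole(read: str, ref: str, start: int, max_errors: int) -> bool:
    """Keep a candidate start only if at least one piece matches ref exactly there."""
    if start < 0 or start + len(read) > len(ref):
        return False
    return any(ref[start + off:start + off + len(piece)] == piece
               for off, piece in split_pieces(read, max_errors))


ref = "GATTACAGATTTCCA"
read = "TCCAGATT"  # ref[3:11] == "TACAGATT" with one substitution
index = sparse_index(ref, k=3, w=5)
keep_true_site = passes_pigeonhole(read, ref, start=3, max_errors=1)
keep_false_site = passes_pigeonhole(read, ref, start=0, max_errors=1)
```

A candidate that fails the filter cannot be a match within the error budget, so it can be discarded before any costly alignment. A candidate that passes still needs full verification.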
Subjects
Algorithms; Sequence Alignment; Sequence Analysis, DNA; Humans; Sequence Alignment/methods; Sequence Alignment/statistics & numerical data; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/statistics & numerical data; Genome, Human; Software; Genomics/methods; Genomics/statistics & numerical data; Genome
ABSTRACT
Conventional breeding approaches have so far met food demand remarkably well. However, the increasing population, yield plateaus in certain crops, and limited recombination necessitate the use of genomic resources in genomics-assisted crop improvement programs. As a result of advances in next-generation sequencing technology, genomics-assisted breeding (GAB) approaches have developed dramatically to characterize allelic variants and facilitate their rapid and efficient incorporation into crop improvement programs. Over the past decade, GAB has played an important role in harnessing the potential of modern genomic tools, exploiting allelic variation from genetic resources, and developing cultivars. The availability of pangenomes for major crops has been a significant development, albeit with varying degrees of completeness. Although the adoption of these technologies is largely determined by economic considerations and the availability of cost-effective assays, they create a wealth of information that can be used to exploit the latent potential of crops. GAB has been instrumental in harnessing the potential of modern genomic resources and exploiting allelic variation for genetic enhancement and cultivar development. GAB strategies will be indispensable for designing future crops and are expected to play a crucial role in breeding climate-smart crop cultivars with higher nutritional value.