ABSTRACT
Structural variants contribute substantially to genetic diversity and are important evolutionarily and medically, but they are still understudied. Here we present a comprehensive analysis of structural variation in the Human Genome Diversity panel, a high-coverage dataset of 911 samples from 54 diverse worldwide populations. We identify, in total, 126,018 variants, 78% of which were not identified in previous global sequencing projects. Some reach high frequency and are private to continental groups or even individual populations, including regionally restricted runaway duplications and putatively introgressed variants from archaic hominins. By de novo assembly of 25 genomes using linked-read sequencing, we discover 1,643 breakpoint-resolved unique insertions, in aggregate accounting for 1.9 Mb of sequence absent from the GRCh38 reference. Our results illustrate the limitation of a single human reference and the need for high-quality genomes from diverse populations to fully discover and understand human genetic variation.
Subject(s)
Genetics, Population , Genomic Structural Variation , Alleles , Databases, Genetic , Gene Dosage , Gene Duplication , Gene Frequency/genetics , Genetic Variation , Genome, Human , Humans
ABSTRACT
Japanese green tea, an essential beverage in Japanese culture, is characterized by the initial steaming of freshly harvested leaves during production. This process efficiently inactivates endogenous enzymes such as polyphenol oxidases, resulting in the production of sencha, gyokuro and matcha that preserves the vibrant green color of young leaves. Although genome sequences of several tea cultivars and germplasms have been published, no reference genome sequences are available for Japanese green tea cultivars. Here, we constructed a reference genome sequence of the cultivar 'Seimei', which is used to produce high-quality Japanese green tea. Using the PacBio HiFi and Hi-C technologies for chromosome-scale genome assembly, we obtained 15 chromosome sequences with a total genome size of 3.1 Gb and an N50 of 214.9 Mb. By analyzing the genomic diversity of 23 Japanese tea cultivars and lines, including the leading green tea cultivars 'Yabukita' and 'Saemidori', it was revealed that several candidate genes could be related to the characteristics of Japanese green tea. The reference genome of 'Seimei' and information on genomic diversity of Japanese green tea cultivars should provide crucial information for effective breeding of such cultivars in the future.
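For reference, the N50 quoted above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal Python sketch of that definition follows; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the 'Seimei' assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort the lengths from longest to
    shortest and return the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy lengths (arbitrary units), not real assembly data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

With real data, the input would be the lengths of the assembled sequences (for example, read from a FASTA index), which is outside the scope of this sketch.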
Subject(s)
Camellia sinensis , Chromosomes, Plant , Genome, Plant , Camellia sinensis/genetics , Chromosomes, Plant/genetics , Tea/genetics , Japan , Plant Leaves/genetics
ABSTRACT
BACKGROUND: Mycoplasma pneumoniae (M. pneumoniae) is an important pathogen of community-acquired pneumonia in children. The factors contributing to the severity of illness caused by M. pneumoniae infection are still under investigation. We aimed to evaluate the sensitivity of common M. pneumoniae detection methods, as well as to analyze the clinical manifestations, genotypes, macrolide resistance, respiratory microenvironment, and their relationship with the severity of illness in children with M. pneumoniae pneumonia in Wuhan. RESULTS: Among 1,259 clinical samples, 461 samples were positive for M. pneumoniae via quantitative polymerase chain reaction (qPCR). Furthermore, we found that although serological testing is not highly sensitive in detecting M. pneumoniae infection, it may serve as an indicator for predicting severe cases. We successfully identified the adhesin P1 (P1) genotypes of 127 samples based on metagenomic and Sanger sequencing, with P1-type 1 (113/127, 88.98%) being the dominant genotype. No significant difference in pathogenicity was observed among different genotypes. The macrolide resistance rate of M. pneumoniae isolates was 96% (48/50) and all mutations were A2063G in domain V of the 23S rRNA gene. There was no significant difference between the upper respiratory microbiome of patients with mild and severe symptoms. CONCLUSIONS: During the period of this study, the main circulating M. pneumoniae was P1-type 1, with a resistance rate of 96%. Key findings include the efficacy of qPCR in detecting M. pneumoniae, the potential of IgM titers exceeding 1:160 as indicators for illness severity, and the lack of a direct correlation between disease severity and genotypic characteristics or respiratory microenvironment. This study is the first to characterize the epidemic and genomic features of M. pneumoniae in Wuhan after the COVID-19 outbreak in 2020, which provides a scientific data basis for monitoring and infection prevention and control of M. pneumoniae in the post-pandemic era.
Subject(s)
Mycoplasma pneumoniae , Pneumonia, Mycoplasma , Child , Humans , Mycoplasma pneumoniae/genetics , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Molecular Epidemiology , Macrolides/pharmacology , Drug Resistance, Bacterial/genetics , Pneumonia, Mycoplasma/diagnosis , Pneumonia, Mycoplasma/epidemiology , Pneumonia, Mycoplasma/drug therapy , RNA, Ribosomal, 23S/genetics , Pandemics
ABSTRACT
Kashmir cattle, which were kept by local pastoralists for centuries, are exceptionally resilient and adaptive to harsh environments. Despite its significance, the genomic characteristics of this cattle breed remain elusive. This study utilized whole genome sequences of Kashmir cattle (n = 20; newly sequenced) alongside published whole genomes of 32 distinct breeds and seven core cattle populations (n = 135). The analysis identified ~25.87 million biallelic single nucleotide polymorphisms in Kashmir cattle, predominantly in intergenic and intron regions. Population structure analyses revealed distinct clustering patterns of Kashmir cattle with proximity to the South Asian, African and Chinese indicine cattle populations. Genetic diversity analysis of Kashmir cattle demonstrated lower inbreeding and greater nucleotide diversity than analyzed global breeds. Homozygosity runs indicated less consanguineous mating in Kashmir cattle compared with European taurine breeds. Furthermore, six selection sweep detection methods were used within Kashmir cattle and other cattle populations to identify genes associated with vital traits, including immunity (BOLA-DQA5, BOLA-DQB, TNFAIP8L, FCRL4, AOAH, HIF1AN, FBXL3, MPEG1, CDC40, etc.), reproduction (GOLGA4, BRWD1, OSBP2, LEO1, ADCY5, etc.), growth (ADPRHL1, NRG2, TCF12, TMOD4, GBP4, IGF2, RSPO3, SCD, etc.), milk composition (MRPS30 and CSF1) and high-altitude adaptation (EDNRA, ITPR2, AGBL4 and SCG3). These findings provide essential genetic insights into the characteristics of the Kashmir cattle breed and establish the foundation for its scientific conservation and utilization.
Subject(s)
Phylogeny , Polymorphism, Single Nucleotide , Animals , Cattle/genetics , Whole Genome Sequencing/veterinary , Genetic Variation , Breeding , India
ABSTRACT
Infectious bursal disease virus (IBDV) is an immunosuppressive pathogen causing enormous economic losses to the poultry industry across the globe. As a double-stranded RNA virus, IBDV undergoes genetic mutation or recombination during replication as it circulates among flocks, leading to the generation and spread of variant or recombinant strains. In particular, the recently emerged variant IBDV causes severe immunosuppression in chickens, affecting the efficacy of other vaccines. Genetic mutation of IBDV in the face of the host response appears to be an effective survival strategy for the virus. A comprehensive understanding of viral genome diversity will therefore help to develop effective measures for the prevention and control of infectious bursal disease (IBD). In recent years, considerable progress has been made in understanding the relation of genetic mutation and genomic recombination of IBDV to its pathogenesis using the reverse genetic technique. This review focuses on current insights into the genetic typing and genomic variation of IBDV.
Subject(s)
Birnaviridae Infections , Infectious bursal disease virus , Poultry Diseases , Viral Vaccines , Animals , Chickens , Infectious bursal disease virus/genetics , Viral Vaccines/genetics , Genomics , Birnaviridae Infections/prevention & control , Poultry Diseases/genetics , Poultry Diseases/prevention & control
ABSTRACT
Genomic changes in Mycoplasma pneumoniae caused by adaptation to environmental or ecologic pressures are poorly understood. We collected M. pneumoniae from children who had confirmed pneumonia in Taiwan during 2017-2020. We used whole-genome sequencing to compare these isolates with a worldwide collection of current and historical clinical strains for characterizing population structures. A phylogenetic tree for 284 strains showed that all sequenced strains consisted of 5 clades: T1-1 (sequence type [ST]1), T1-2 (mainly ST3), T1-3 (ST17), T2-1 (mainly ST2), and T2-2 (mainly ST14). We identified a putative recombination block containing 6 genes (MPN366–371). Macrolide resistance involving 23S rRNA mutations was detected for each clade. Clonal expansion of macrolide resistance occurred mostly within subtype 1 strains, of which clade T1-2 showed the highest recombination rate and genome diversity. Functional characterization of recombined regions provided clarification of the biologic role of these recombination events in the evolution of M. pneumoniae.
Subject(s)
Mycoplasma pneumoniae , Pneumonia, Mycoplasma , Anti-Bacterial Agents/pharmacology , Child , Drug Resistance, Bacterial/genetics , Humans , Macrolides , Mycoplasma pneumoniae/genetics , Phylogeny , Pneumonia, Mycoplasma/epidemiology , RNA, Ribosomal, 23S , Recombination, Genetic
ABSTRACT
All vertebrate genomes have been colonized by retroviruses along their evolutionary trajectory. Although endogenous retroviruses (ERVs) can contribute important physiological functions to contemporary hosts, such benefits are attributed to long-term coevolution of ERV and host because germline infections are rare and expansion is slow, and because the host effectively silences them. The genomes of several outbred species including mule deer (Odocoileus hemionus) are currently being colonized by ERVs, which provides an opportunity to study ERV dynamics at a time when few are fixed. We previously established the locus-specific distribution of cervid ERV (CrERV) in populations of mule deer. In this study, we determine the molecular evolutionary processes acting on CrERV at each locus in the context of phylogenetic origin, genome location, and population prevalence. A mule deer genome was de novo assembled from short- and long-insert mate pair reads and CrERV sequence generated at each locus. We report that CrERV composition and diversity have recently measurably increased by horizontal acquisition of a new retrovirus lineage. This new lineage has further expanded CrERV burden and CrERV genomic diversity by activating and recombining with existing CrERV. Resulting interlineage recombinants then endogenize and subsequently expand. CrERV loci are significantly closer to genes than expected if integration were random and gene proximity might explain the recent expansion of one recombinant CrERV lineage. Thus, in mule deer, retroviral colonization is a dynamic period in the molecular evolution of CrERV that also provides a burst of genomic diversity to the host population.
Subject(s)
Deer , Endogenous Retroviruses , Animals , Biological Evolution , Deer/genetics , Endogenous Retroviruses/genetics , Evolution, Molecular , Phylogeny , Recombination, Genetic
ABSTRACT
Thermophilic Campylobacter, in particular Campylobacter jejuni, C. coli and C. lari are the main relevant Campylobacter species for human infections. Due to their high capacity of genetic exchange by horizontal gene transfer (HGT), rapid adaptation to changing environmental and host conditions contribute to successful spreading and persistence of these foodborne pathogens. However, extensive HGT can exert dangerous side effects for the bacterium, such as the incorporation of gene fragments leading to disturbed gene functions. Here we discuss mechanisms of HGT, notably natural transformation, conjugation and bacteriophage transduction and limiting regulatory strategies of gene transfer. In particular, we summarize the current knowledge on how the DNA macromolecule is exchanged between single cells. Mechanisms to stimulate and to limit HGT obviously coevolved and maintained an optimal balance. Chromosomal rearrangements and incorporation of harmful mutations are risk factors for survival and can result in drastic loss of fitness. In Campylobacter, the restricted recognition and preferential uptake of free DNA from relatives are mediated by a short methylated DNA pattern and not by a classical DNA uptake sequence as found in other bacteria. A class two CRISPR-Cas system is present but also other DNases and restriction-modification systems appear to be important for Campylobacter genome integrity. Several lytic and integrated bacteriophages have been identified, which contribute to genome diversity. Furthermore, we focus on the impact of gene transfer on the spread of antibiotic resistance genes (resistome) and persistence factors. We discuss remaining open questions in the HGT field, supposed to be answered in the future by current technologies like whole-genome sequencing and single-cell approaches.
Subject(s)
Bacteriophages , Campylobacter jejuni , Campylobacter , Bacteriophages/genetics , Campylobacter/genetics , Campylobacter jejuni/genetics , Drug Resistance, Microbial , Gene Transfer, Horizontal , Humans
ABSTRACT
BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which originated in Wuhan, China, in early December 2019, has rapidly spread worldwide, becoming one of the major global public health issues of recent centuries. Key Messages: Over the course of the pandemic, advanced whole-genome sequencing technologies have generated an unprecedented number of genomes, providing invaluable insights into the ongoing evolution and epidemiology of the virus. This large amount of data has therefore played an important role in SARS-CoV-2 mitigation and control strategies. The active monitoring and characterization of the SARS-CoV-2 lineages circulating worldwide is useful for a more specific diagnosis, better care, and timely treatment. In this review, a concise characterization of all the lineages and sub-lineages circulating and co-circulating across the world is presented in order to determine the magnitude of the SARS-CoV-2 threat and to better understand the genetic diversity of the virus and its dispersion dynamics.
Subject(s)
COVID-19 , Communicable Disease Control/methods , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , COVID-19 Nucleic Acid Testing/methods , Epidemiological Monitoring , Genome, Viral , Global Health , Humans , SARS-CoV-2/classification , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
ABSTRACT
Protozoan Plasmodium parasites are the causative agents of malaria, a deadly disease that continues to afflict hundreds of millions of people every year. Infections with malaria parasites can be asymptomatic, with mild or severe symptoms, or fatal, depending on many factors such as parasite virulence and host immune status. Malaria can be treated with various drugs, with artemisinin-based combination therapies (ACTs) being the first-line choice. Recent advances in genetics and genomics of malaria parasites have contributed greatly to our understanding of parasite population dynamics, transmission, drug responses, and pathogenesis. However, knowledge gaps in parasite biology and host-parasite interactions still remain. Parasites resistant to multiple antimalarial drugs have emerged, while advanced clinical trials have shown partial efficacy for one available vaccine. Here we discuss genetic and genomic studies of Plasmodium biology, host-parasite interactions, population structures, mosquito infectivity, antigenic variation, and targets for treatment and immunization. Knowledge from these studies will advance our understanding of malaria pathogenesis, epidemiology, and evolution and will support work to discover and develop new medicines and vaccines.
Subject(s)
Antimalarials/pharmacology , Drug Resistance/genetics , Evolution, Molecular , Genome, Protozoan/genetics , Malaria/epidemiology , Malaria/parasitology , Plasmodium/drug effects , Plasmodium/genetics , Humans , Plasmodium/classification , Plasmodium/pathogenicity
ABSTRACT
BACKGROUND: Despite the importance of characterizing genetic variation among coral individuals for understanding phenotypic variation, the correlation between coral genomic diversity and phenotypic expression is still poorly understood. RESULTS: In this study, we detected a high frequency of presence-absence polymorphisms (PAPs) among single-copy genes in Acropora digitifera. Among 10,455 single-copy genes, 516 (5%) exhibited PAPs, including 32 transposable element (TE)-related genes. These 516 genes exhibited a homozygous absence in one (102) or more than one (414) individuals (n = 33), indicating that most of the absent alleles were not rare variants. Among genes showing PAPs (PAP genes), roughly half were expressed in adults and/or larvae, and the PAP status was associated with differential expression among individuals. Although 85% of PAP genes were uncharacterized or had ambiguous annotations, 70% of these genes were specifically distributed in cnidarian lineages in eumetazoa, suggesting that these genes have functional roles in traits specific to cnidarians, the family Acroporidae, or the genus Acropora. Indeed, four of these genes encoded toxins that are usually components of venom in cnidarian-specific cnidocytes. At least 17% of A. digitifera PAP genes were also PAPs in A. tenuis, the basal lineage in the genus Acropora, indicating that PAPs were shared among species in Acropora. CONCLUSIONS: Expression differences caused by a high frequency of PAP genes may be a novel genomic feature in the genus Acropora; these findings will help improve our understanding of the correlation between genetic and phenotypic variation in corals.
Subject(s)
Anthozoa/genetics , Gene Dosage , Genome , Polymorphism, Genetic , Animals , Cloning, Molecular , Computational Biology/methods , Evolution, Molecular , Genomics/methods , Reproducibility of Results , Sequence Analysis, DNA
ABSTRACT
Bacterial populations differentiate over time and space to form distinct genetic units. The mechanisms governing this diversification are presumed to result from the ecological context of living units to adapt to specific niches. Recently, a model assuming the acquisition of advantageous genes among populations rather than whole genome sweeps has emerged to explain population differentiation. However, the characteristics of these exchanged, or flexible, genes and whether their evolution is driven by adaptive or neutral processes remain controversial. By analysing the flexible genome of single-amplified genomes of co-occurring populations of the marine Prochlorococcus HLII ecotype, we highlight that genomic compartments - rather than population units - are characterized by different evolutionary trajectories. The dynamics of gene fluxes vary across genomic compartments and therefore the effectiveness of selection depends on the fluctuation of the effective population size along the genome. Taken together, these results support the drift-barrier model of bacterial evolution.
Subject(s)
Genome, Bacterial , Prochlorococcus , Bacteria/genetics , Evolution, Molecular , Genomics , Prochlorococcus/genetics
ABSTRACT
BACKGROUND: Miniature inverted-repeat transposable elements (MITEs) and long terminal repeat (LTR) retrotransposons are ubiquitous in plant genomes and highly important in their evolution and diversity. However, their mechanisms of insertion/amplification and their roles in the evolution and diversity of Citrus genomes are still poorly understood. RESULTS: To address this knowledge gap, we developed different computational pipelines to analyze, annotate and classify MITEs and LTR retrotransposons in six different sequenced Citrus species. We identified 62,010 full-length MITEs from 110 distinct families. We observed that MITEs tend to insert in gene-related regions and are enriched in promoters. We found that DTM63 is possibly a Mutator-like MITE family that was active in the traceable past and may still be active in Citrus. The insertion of MITEs resulted in massive polymorphisms and played an important role in Citrus genome diversity and gene structure variations. In addition, 6,630 complete LTR retrotransposons and 13,371 solo-LTRs were identified. Among them, 12 LTR lineages separated before the differentiation of mono- and dicotyledonous plants. We observed that insertion and deletion of LTR retrotransposons were in dynamic balance, and their half-life in Citrus was ~1.8 million years. CONCLUSIONS: These findings provide insights into MITEs and LTR retrotransposons and their roles in genome diversity in different Citrus genomes.
Subject(s)
Citrus/genetics , DNA Transposable Elements/genetics , Genome, Plant/genetics , Inverted Repeat Sequences/genetics , Retroelements/genetics , Terminal Repeat Sequences/genetics , Genetic Variation
ABSTRACT
The recent global decline in Western honeybee (Apis mellifera) populations is of great concern for pollination and honey production worldwide. Declining honeybee populations are frequently infected by the microsporidian pathogen Nosema ceranae. This species was originally described in the Asiatic honeybee (Apis cerana), and its identification in global A. mellifera hives could result from a recent host transfer. Recent genome studies have found that global populations of this parasite are polyploid and that humans may have fueled their global expansion. To better understand N. ceranae biology, we investigated its genetic diversity within part of its native range (Thailand) and among different hosts (A. mellifera, A. cerana) using both PCR and genome-based methods. We found that Thai N. ceranae populations share many SNPs with other global populations and appear to be clonal. However, in stark contrast with previous studies, we found that these populations also carry many SNPs not found elsewhere, indicating that these populations have evolved in their current geographic location for some time. Our genome analyses also indicate the potential presence of diploidy within Thai populations of N. ceranae.
Subject(s)
Bees/microbiology , Genome, Fungal , Nosema/genetics , Polymorphism, Single Nucleotide , Animals , Genomics , Polymerase Chain Reaction , Thailand
ABSTRACT
Endogenous retroviruses (ERVs) have contributed to more than 8% of the human genome. The majority of these elements lack function due to accumulated mutations or internal recombination resulting in a solitary (solo) LTR, although members of one group of human ERVs (HERVs), HERV-K, were recently active with members that remain nearly intact, a subset of which is present as insertionally polymorphic loci that include approximately full-length (2-LTR) and solo-LTR alleles in addition to the unoccupied site. Several 2-LTR insertions have intact reading frames in some or all genes that are expressed as functional proteins. These properties reflect the activity of HERV-K and suggest the existence of additional unique loci within humans. We sought to determine the extent to which other polymorphic insertions are present in humans, using sequenced genomes from the 1000 Genomes Project and a subset of the Human Genome Diversity Project panel. We report analysis of a total of 36 nonreference polymorphic HERV-K proviruses, including 19 newly reported loci, with insertion frequencies ranging from <0.0005 to >0.75 that varied by population. Targeted screening of individual loci identified three new unfixed 2-LTR proviruses within our set, including an intact provirus present at Xq21.33 in some individuals, with the potential for retained infectivity.
Subject(s)
Alleles , Endogenous Retroviruses/genetics , Genetic Loci , Mutagenesis, Insertional , Polymorphism, Genetic , Terminal Repeat Sequences , Female , Humans , Male
ABSTRACT
BACKGROUND: Synthetic hexaploid wheat (SHW) is a reconstitution of hexaploid wheat from its progenitors (Triticum turgidum ssp. durum L.; AABB x Aegilops tauschii Coss.; DD) and has novel sources of genetic diversity for broadening the genetic base of elite bread wheat (BW) germplasm (T. aestivum L). Understanding the diversity and population structure of SHWs will facilitate their use in wheat breeding programs. Our objectives were to understand the genetic diversity and population structure of SHWs, to compare the genetic diversity of SHWs with that of elite BW cultivars, and to demonstrate the potential of SHWs to broaden the genetic base of modern wheat germplasm. RESULTS: The genotyping-by-sequencing of SHW provided 35,939 high-quality single nucleotide polymorphisms (SNPs) that were distributed across the A (33%), B (36%), and D (31%) genomes. The percentage of SNPs on the D genome was nearly the same as on the other two genomes, unlike in BW cultivars, where the D genome polymorphism is generally much lower than that of the A and B genomes. This indicates the presence of high variation in the D genome in the SHWs. The D genome gene diversity of SHWs was 88.2% higher than that found in a sample of elite BW cultivars. Population structure analysis revealed that SHWs could be separated into two subgroups, mainly differentiated by geographical location of durum parents and growth habit of the crop (spring and winter type). Further population structure analysis of durum and Ae. tauschii parents separately identified two subgroups, mainly based on type of parents used. Although Ae. tauschii parents were divided into two sub-species, Ae. tauschii ssp. tauschii and ssp. strangulata, they were not clearly distinguished in the diversity analysis outcome. Population differentiation between SHW (Spring_SHW and Winter_SHW) samples using analysis of molecular variance indicated 17.43% of genetic variance between populations and the remainder within populations. CONCLUSIONS: SHWs were diverse and had a clearly distinguished population structure identified through GBS-derived SNPs. The results of this study will provide valuable information for wheat genetic improvement through inclusion of novel genetic variation and are a prerequisite for association mapping and genomic selection to unravel economically important marker-trait associations and for cultivar development.
Subject(s)
Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Triticum/genetics , Chromosome Mapping , Genetics, Population , Plant Breeding , Polyploidy
ABSTRACT
The 944 individuals of the CEPH human genome diversity panel (HGDP-CEPH), a standard sample set of 51 globally distributed populations, were sequenced using the Illumina ForenSeq™ DNA Signature Prep Kit. The ForenSeq™ system is a single multiplex for the MiSeq/FGx™ massively parallel sequencing instrument, comprising: amelogenin, 27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 SNPforID+Kiddlab autosomal ID-SNPs (plus optionally detected ancestry and phenotyping SNP sets). We report in detail the patterns of sequence variation observed in the repeat regions of the 58 forensic STR loci typed by the ForenSeq™ system. Sequence alleles were characterized and repeat region structures annotated by aligning the ForenSeq™ sequence output to the latest GRCh38 human reference sequence, necessitating the reversal and re-alignment of STR allele sequences reported by the ForenSeq™ system in 20 of 58 STRs (plus the reverse alleles in two Y-STRs with duplicated-inverted repeat regions). Individual population sample sizes of the HGDP-CEPH panel do not allow reliable inferences to be made about levels of genetic variability in low frequency STR alleles, where particular sequence variants are found in only a few individuals. However, we assessed the occurrence of both population-specific sequence variants and singleton observations, finding each of these in a sizeable proportion of HGDP-CEPH samples. This has consequences for planning the co-ordinated compilation of sequence variation on a much larger scale than was previously required by forensic laboratories now adopting massively parallel sequencing.
Subject(s)
DNA Fingerprinting/methods , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats , Female , Forensic Genetics/methods , Genome, Human , Genotype , Genotyping Techniques/methods , Humans , Male , Multigene Family
ABSTRACT
Background: Advances in next-generation sequencing (NGS) technologies allow comprehensive studies of genetic diversity over the entire genome of human cytomegalovirus (HCMV), a significant pathogen for immunocompromised individuals. Methods: Next-generation sequencing was performed on target enriched sequence libraries prepared directly from a variety of clinical specimens (blood, urine, breast milk, respiratory samples, biopsies, and vitreous humor) obtained longitudinally or from different anatomical compartments from 20 HCMV-infected patients (renal transplant recipients, stem cell transplant recipients, and congenitally infected children). Results: De novo-assembled HCMV genome sequences were obtained for 57 of 68 sequenced samples. Analysis of longitudinal or compartmental HCMV diversity revealed various patterns: no major differences were detected among longitudinal, intraindividual blood samples from 9 of 15 patients and in most of the patients with compartmental samples, whereas a switch of the major HCMV population was observed in 6 individuals with sequential blood samples and upon compartmental analysis of 1 patient with HCMV retinitis. Variant analysis revealed additional aspects of minor virus population dynamics and antiviral-resistance mutations. Conclusions: In immunosuppressed patients, HCMV can remain relatively stable or undergo drastic genomic changes that are suggestive of the emergence of minor resident strains or de novo infection.
Subject(s)
Cytomegalovirus Infections/virology , Cytomegalovirus/genetics , Genome, Viral/genetics , Immunocompromised Host , Adult , Aged , Cohort Studies , Cytomegalovirus/classification , Cytomegalovirus Infections/immunology , DNA, Viral/analysis , DNA, Viral/genetics , Drug Resistance, Viral/genetics , Female , Genetic Variation/genetics , Genomics , High-Throughput Nucleotide Sequencing , Humans , Infant , Infant, Newborn , Male , Middle Aged , Transplant Recipients
ABSTRACT
BACKGROUND: Akkermansia muciniphila is one of the most dominant bacteria residing in the mucus layer of the intestinal tract and plays a key role in human health; however, little is known about its genomic content. RESULTS: Herein, we characterized for the first time the genomic architecture of A. muciniphila based on whole-genome sequencing, assembly, and annotation of 39 isolates derived from human and mouse feces. We revealed a flexible open pangenome of A. muciniphila currently consisting of 5644 unique proteins. Phylogenetic analysis identified three species-level A. muciniphila phylogroups exhibiting distinct metabolic and functional features. Based on the comprehensive genome catalogue, we reconstructed 106 new A. muciniphila metagenome-assembled genomes (MAGs) from available metagenomic datasets of human, mouse and pig gut microbiomes, revealing a transcontinental distribution of A. muciniphila phylogroups across mammalian gut microbiotas. Accurate quantitative analysis of A. muciniphila phylogroups in human subjects further demonstrated a strong correlation with body mass index and anti-diabetic drug usage. Furthermore, we found that, during its evolutionary history in the mammalian gut, A. muciniphila acquired extra genes, especially antibiotic resistance genes, from symbiotic microbes via recent lateral gene transfer. CONCLUSIONS: The genome repertoire of A. muciniphila provided insights into the population structure and the evolutionary and functional specificity of this significant bacterium.
Subject(s)
Gastrointestinal Microbiome/genetics , Mammals/microbiology , Verrucomicrobia/genetics , Verrucomicrobia/physiology , Whole Genome Sequencing , Animals , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Evolution, Molecular , Humans , Mice , Molecular Sequence Annotation , Verrucomicrobia/drug effects
ABSTRACT
Farmed Atlantic salmon (Salmo salar) is a globally important production species, including in Australia where breeding and selection have been in progress since the 1960s. The recent development of SNP genotyping platforms means genome-wide association and genomic prediction can now be implemented to speed genetic gain. As a precursor, this study collected genotypes at 218 132 SNPs in 777 fish from a Tasmanian breeding population to assess levels of genetic diversity, the strength of linkage disequilibrium (LD) and imputation accuracy. Genetic diversity in Tasmanian Atlantic salmon was lower than observed within European populations when compared using four diversity metrics. The distribution of allele frequencies also showed a clear difference, with the Tasmanian animals carrying an excess of low minor allele frequency variants. The strength of observed LD was high at short distances (<25 kb) and remained above background for marker pairs separated by large chromosomal distances (hundreds of kb), in sharp contrast to the European Atlantic salmon tested. Genotypes were used to evaluate the accuracy of imputation from low density (0.5 to 5 K) up to increased density SNP sets (78 K). This revealed high imputation accuracies (0.89-0.97), suggesting that the use of low density SNP sets will be a successful approach for genomic prediction in this population. The long-range LD, comparatively low genetic diversity and high imputation accuracy in Tasmanian salmon are consistent with known aspects of their population history, which involved a small founding population and an absence of subsequent introgression. The findings of this study represent an important first step towards the design of methods to apply genomics in this economically important population.
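As an illustration of how LD strength between two SNP markers can be quantified from unphased genotypes, the sketch below computes r² as the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of one allele). The abstract does not state which LD estimator was used, so this choice, the function name `genotype_r2` and the toy genotypes are all assumptions.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors.

    Hypothetical helper for illustration: g1 and g2 hold one dosage
    (0, 1 or 2) per individual for two biallelic SNPs.
    """
    if len(g1) != len(g2) or len(g1) < 2:
        raise ValueError("need two equal-length vectors with at least two samples")
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    # Sums of squares and cross-products around the means.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        raise ValueError("monomorphic marker: r2 is undefined")
    return (cov * cov) / (var1 * var2)


# Toy genotypes for four individuals, not data from the study.
snp_a = [0, 1, 2, 1]
snp_b = [0, 1, 2, 1]
print(genotype_r2(snp_a, snp_b))
```

Averaging such pairwise values within bins of physical distance is one common way to summarize LD decay, although the binning used in the study is not described here.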