ABSTRACT
We present 109 near full-length HIV genomes amplified from blood serum samples obtained during early 1986 from across Uganda, which to our knowledge is the earliest and largest population sample from the initial phase of the HIV epidemic in Africa. Consensus sequences were made from paired-end Illumina reads with a target-capture approach to amplify HIV material following poor success with standard approaches. In comparisons with a smaller 'intermediate' genome dataset from 1998 to 1999 and a 'modern' genome dataset from 2007 to 2016, the proportion of subtype D was significantly higher initially, dropping from 67% (73/109) to 57% (26/46) to 17% (82/465), respectively (p < 0.0001). Subtype D has previously been shown to have a faster rate of disease progression than other subtypes in East African population studies, and to have a higher propensity to use the CXCR4 co-receptor ("X4 tropism"), which is associated with a decrease in time to AIDS. Here we find significant differences in predicted tropism between A1 and D subtypes in all three sample periods considered, which is particularly striking in the 1986 sample: 66% (53/80) of subtype D env sequences were predicted to be X4 tropic compared with none of the 24 subtype A1 sequences. We also analysed the frequency of each subtype in the envelope region of inter-subtype recombinants, and found that subtype A1 is over-represented in env, suggesting recombination and selection have acted to remove subtype D env from circulation. The reduction of subtype D frequency over three decades therefore appears to be a result of selective pressure against X4 tropism and its higher virulence. Lastly, we find a subtype D-specific codon deletion at position 24 of the V3 loop, which may explain the higher propensity for subtype D to utilise X4 tropism.
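The decline in subtype D frequency quoted above can be checked with a short calculation. The sketch below uses only the counts given in the abstract and assumes a Pearson chi-square test of equal proportions across the three periods; the abstract does not say which test produced the reported p-value, so this is an illustration rather than a reproduction of the authors' analysis.

```python
import math

# Subtype D counts and sample sizes quoted in the abstract
# (1986, 1998 to 1999, 2007 to 2016).
subtype_d = [73, 26, 82]
totals = [109, 46, 465]


def chi_square_homogeneity(successes, sizes):
    """Pearson chi-square statistic for equal proportions across groups."""
    pooled = sum(successes) / sum(sizes)
    stat = 0.0
    for k, n in zip(successes, sizes):
        expected_yes = n * pooled
        expected_no = n * (1.0 - pooled)
        stat += (k - expected_yes) ** 2 / expected_yes
        stat += ((n - k) - expected_no) ** 2 / expected_no
    return stat


def chi_square_p_value_df2(stat):
    """Upper-tail p-value for a chi-square variable with exactly 2 degrees of freedom."""
    # For df = 2 the survival function has the closed form exp(-x / 2).
    return math.exp(-stat / 2.0)


stat = chi_square_homogeneity(subtype_d, totals)
p_value = chi_square_p_value_df2(stat)
print(f"chi-square = {stat:.1f}, p = {p_value:.2e}")
```

Three groups and two outcomes give 2 degrees of freedom, which is why the closed-form tail probability applies and no statistics library is needed.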
Subject(s)
HIV Infections , HIV-1 , Receptors, CXCR4 , Viral Tropism , Humans , African People , HIV Infections/epidemiology , HIV Infections/virology , HIV-1/genetics , Receptors, CXCR4/genetics , Uganda
ABSTRACT
BACKGROUND: Host population structure is a key determinant of pathogen and infectious disease transmission patterns. Pathogen phylogenetic trees are useful tools to reveal the population structure underlying an epidemic. Determining whether a population is structured or not is useful in informing the type of phylogenetic methods to be used in a given study. We employ tree statistics derived from phylogenetic trees and machine learning classification techniques to reveal an underlying population structure. RESULTS: In this paper, we simulate phylogenetic trees from both structured and non-structured host populations. We compute eight statistics for the simulated trees: the number of cherries; the Sackin, Colless and total cophenetic indices; ladder length; maximum depth; maximum width; and width-to-depth ratio. Based on the estimated tree statistics, we classify the simulated trees as coming from either a non-structured or a structured population using decision tree (DT), K-nearest neighbor (KNN) and support vector machine (SVM) classifiers. We incorporate the basic reproductive number ([Formula: see text]) in our tree simulation procedure. Sensitivity analysis is done to investigate whether the classifiers are robust to different choices of model parameters and to tree size. Cross-validated results for area under the curve (AUC) for receiver operating characteristic (ROC) curves yield mean values of over 0.9 for most of the classification models. CONCLUSIONS: Our classification procedure distinguishes well between trees from structured and non-structured populations using the classifiers, the two-sample Kolmogorov-Smirnov, Cucconi and Podgor-Gastwirth tests and the box plots. SVM models were more robust to changes in model parameters and tree size compared to KNN and DT classifiers. Our classification procedure was applied to real-world data and the structured population was revealed with high accuracy of [Formula: see text] using the SVM-polynomial classifier.
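Several of the tree statistics named above are simple to compute on a rooted binary tree. The following is a minimal sketch, assuming the textbook definitions (Sackin index as the sum of leaf depths, Colless index as the sum over internal nodes of the absolute difference in descendant leaf counts). The nested-tuple tree representation and the function name `tree_stats` are hypothetical and are not taken from the paper, which may also apply normalizations not shown here.

```python
def tree_stats(tree):
    """Return (n_leaves, cherries, sackin, colless, max_depth) for a rooted binary tree.

    A leaf is any non-tuple value; an internal node is a 2-tuple (left, right).
    """
    def walk(node, depth):
        if not isinstance(node, tuple):
            # Leaf: contributes its own depth to the Sackin index.
            return 1, 0, depth, 0, depth
        left, right = node
        nl, cl, sl, il, dl = walk(left, depth + 1)
        nr, cr, sr, ir, dr = walk(right, depth + 1)
        # A cherry is an internal node whose two children are both leaves.
        is_cherry = 1 if nl == 1 and nr == 1 else 0
        return (
            nl + nr,
            cl + cr + is_cherry,
            sl + sr,
            il + ir + abs(nl - nr),
            max(dl, dr),
        )

    return walk(tree, 0)


balanced = (("A", "B"), ("C", "D"))
caterpillar = ((("A", "B"), "C"), "D")
print(tree_stats(balanced))     # (4, 2, 8, 0, 2)
print(tree_stats(caterpillar))  # (4, 1, 9, 3, 3)
```

The two toy trees show why such statistics can carry signal: the perfectly balanced tree has a Colless index of 0, while the ladder-like tree with the same four leaves scores 3.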
Subject(s)
Machine Learning , Support Vector Machine , Algorithms , Phylogeny , ROC Curve
ABSTRACT
In modern applications of molecular epidemiology, genetic sequence data are routinely used to identify clusters of transmission in rapidly evolving pathogens, most notably HIV-1. Traditional 'shoe-leather' epidemiology infers transmission clusters by tracing chains of partners sharing epidemiological connections (e.g., sexual contact). Here, we present a computational tool for identifying a molecular transmission analog of such clusters: HIV-TRACE (TRAnsmission Cluster Engine). HIV-TRACE implements an approach inspired by traditional epidemiology, by identifying chains of partners whose viral genetic relatedness implies direct or indirect epidemiological connections. Molecular transmission clusters are constructed using codon-aware pairwise alignment to a reference sequence followed by pairwise genetic distance estimation among all sequences. This approach is computationally tractable and is capable of identifying HIV-1 transmission clusters in large surveillance databases comprising tens or hundreds of thousands of sequences in near real time, that is, on the order of minutes to hours. HIV-TRACE is available at www.hivtrace.org and from www.github.com/veg/hivtrace, along with the accompanying result visualization module from www.github.com/veg/hivtrace-viz. Importantly, the approach underlying HIV-TRACE is not limited to the study of HIV-1 and can be applied to study outbreaks and epidemics of other rapidly evolving pathogens.
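The idea of linking sequences through direct or indirect genetic relatedness can be illustrated with a toy example. This is a sketch under stated assumptions, not HIV-TRACE itself: the sequences are taken to be already aligned, a plain proportion of mismatched sites stands in for the genetic distance, and the threshold value, sequence names and function names are hypothetical. Clusters are the connected components of the graph whose edges join pairs within the threshold.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_clusters(seqs, threshold):
    """Group sequence names into connected components of the distance graph."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair once and merge the pair's components if close enough.
    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",  # 1 mismatch to s1
    "s3": "TCGTACGTACGTACGTACGA",  # 1 mismatch to s2, 2 mismatches to s1
    "s4": "TTTTTTTTTTTTTTTTTTTT",  # far from everything
}
print(threshold_clusters(toy, threshold=0.06))
# [['s1', 's2', 's3'], ['s4']]
```

Here s1 and s3 differ at 2 of 20 sites (0.10), above the toy threshold, yet they end up in the same cluster because both are linked to s2. That is the indirect connection the abstract refers to.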
Subject(s)
HIV Infections/transmission , HIV-1/genetics , Molecular Epidemiology/methods , Computational Biology , HIV Infections/epidemiology , Humans , Software
ABSTRACT
Background: Harm reduction has dramatically reduced HIV incidence among people who inject drugs (PWID). In Glasgow, Scotland, <10 infections/year have been diagnosed among PWID since the mid-1990s. However, in 2015 a sharp rise in diagnoses was noted among PWID; many were subtype C with 2 identical drug-resistant mutations and some displayed low avidity, suggesting the infections were linked and recent. Methods: We collected Scottish pol sequences and identified closely related sequences from public databases. Genetic linkage was ascertained among 228 Scottish, 1820 UK, and 524 global sequences. The outbreak cluster was extracted to estimate epidemic parameters. Results: All 104 outbreak sequences originated from Scotland and contained E138A and V179E. Mean genetic distance was <1% and mean time between transmissions was 6.7 months. The average number of onward transmissions consistently exceeded 1, indicating that spread was ongoing. Conclusions: In contrast to other recent HIV outbreaks among PWID, harm reduction services were not clearly reduced in Scotland. Nonetheless, the high proportion of individuals with a history of homelessness (45%) suggests that services were inadequate for those in precarious living situations. The high prevalence of hepatitis C (>90%) is indicative of sharing of injecting equipment. Monitoring the epidemic phylogenetically in real time may accelerate public health action.
Subject(s)
HIV Infections/epidemiology , HIV Infections/transmission , HIV/pathogenicity , Substance Abuse, Intravenous/complications , Substance Abuse, Intravenous/virology , Adult , Disease Outbreaks , Epidemics , Female , Genetic Linkage/genetics , HIV Infections/genetics , HIV Infections/virology , Hepatitis C/epidemiology , Humans , Incidence , Male , Phylogeny , Prevalence , Scotland/epidemiology
ABSTRACT
Viral phylogenetic methods contribute to understanding how HIV spreads in populations, and thereby help guide the design of prevention interventions. So far, most analyses have been applied to well-sampled concentrated HIV-1 epidemics in wealthy countries. To direct the use of phylogenetic tools to where the impact of HIV-1 is greatest, the Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium generates full-genome viral sequences from across sub-Saharan Africa. Analyzing these data presents new challenges, since epidemics are principally driven by heterosexual transmission and a smaller fraction of cases is sampled. Here, we show that viral phylogenetic tools can be adapted and used to estimate epidemiological quantities of central importance to HIV-1 prevention in sub-Saharan Africa. We used a community-wide methods comparison exercise on simulated data, where participants were blinded to the true dynamics they were inferring. Two distinct simulations captured generalized HIV-1 epidemics, before and after a large community-level intervention that reduced infection levels. Five research groups participated. Structured coalescent modeling approaches were most successful: phylogenetic estimates of HIV-1 incidence, incidence reductions, and the proportion of transmissions from individuals in their first 3 months of infection correlated with the true values (Pearson correlation > 90%), with small bias. However, on some simulations, true values were markedly outside reported confidence or credibility intervals. The blinded comparison revealed current limits and strengths in using HIV phylogenetics in challenging settings, provided benchmarks for future methods' development, and supports using the latest generation of phylogenetic tools to advance HIV surveillance and prevention.
Subject(s)
HIV Infections/epidemiology , HIV Infections/virology , HIV-1/genetics , Africa South of the Sahara/epidemiology , Computer Simulation , Epidemics , Female , HIV Infections/prevention & control , HIV Infections/transmission , Humans , Incidence , Male , Phylogeny
ABSTRACT
BACKGROUND: Avian influenza virus (AIV) causes both severe outbreaks and endemic disease among poultry and has caused sporadic human infections in Asia. Furthermore, the routes of transmission in avian species between geographic regions can be numerous and complex. Using nucleotide sequences from the internal protein coding segments of AIV, we performed a Bayesian phylogeographic study to uncover regional routes of transmission and factors predictive of the rate of viral diffusion within China. RESULTS: We found that the Central area and Pan-Pearl River Delta were the two main sources of AIV diffusion, while the East Coast areas, especially the Yangtze River delta, were the major targets of viral invasion. Next, we investigated the extent to which economic, agricultural, environmental and climatic regional data were predictive of viral diffusion by fitting phylogeographic discrete trait models using generalised linear models. CONCLUSIONS: Our results highlighted that the economic-agricultural predictors, especially the poultry population density and the number of farm product markets, are the key determinants of spatial diffusion of AIV in China; high human density and freight transportation are also important predictors of high rates of viral transmission; climate features (e.g. temperature) were correlated with viral invasion in the destination to some degree, while little or no impact was found from natural environment factors (such as surface water coverage). This study uncovers the risk factors and enhances our understanding of the spatial dynamics of AIV in bird populations.
Subject(s)
Influenza A virus , Influenza in Birds/virology , Animals , Bayes Theorem , China , Humans , Phylogeography , Poultry
ABSTRACT
BACKGROUND: The United Kingdom human immunodeficiency virus (HIV) epidemic was historically dominated by HIV subtype B transmission among men who have sex with men (MSM). Now 50% of diagnoses and prevalent infections are among heterosexual individuals and mainly involve non-B subtypes. Between 2002 and 2010, the prevalence of non-B diagnoses among MSM increased from 5.4% to 17%, and this study focused on the drivers of this change. METHODS: Growth between 2007 and 2009 in transmission clusters among 14 000 subtype A1, C, D, and G sequences from the United Kingdom HIV Drug Resistance Database was analysed by risk group. RESULTS: Of 1148 clusters containing at least 2 sequences in 2007, >75% were pairs and >90% were heterosexual. Most clusters (71.4%) did not grow during the study period. Growth was significantly lower for small clusters and higher for clusters of ≥7 sequences, with the highest growth observed for clusters comprising sequences from MSM and people who inject drugs (PWID). Risk group (P< .0001), cluster size (P< .0001), and subtype (P< .01) were predictive of growth in a generalized linear model. DISCUSSION: Despite the increase in non-B subtypes associated with heterosexual transmission, MSM and PWID are at risk for non-B infections. Crossover of subtype C from heterosexuals to MSM has led to the expansion of this subtype within the United Kingdom.
Subject(s)
HIV Infections/transmission , HIV Infections/virology , HIV-1/genetics , Homosexuality, Male/statistics & numerical data , Cluster Analysis , HIV Infections/epidemiology , Humans , Incidence , Male , Molecular Epidemiology , Phylogeny , United Kingdom/epidemiology
ABSTRACT
Disease progression in HIV-infected individuals varies greatly, and while the environmental and host factors influencing this variation have been widely investigated, the viral contribution to variation in set-point viral load, a predictor of disease progression, is less clear. Previous studies, using transmission-pairs and analysis of phylogenetic signal in small numbers of individuals, have produced a wide range of viral genetic effect estimates. Here we present a novel application of a population-scale method based in quantitative genetics to estimate the viral genetic effect on set-point viral load in the UK subtype B HIV-1 epidemic, based on a very large data set. Analyzing the initial viral load and associated pol sequence, both taken before anti-retroviral therapy, of 8,483 patients, we estimate the proportion of variance in viral load explained by viral genetic effects to be 5.7% (CI 2.8-8.6%). We also estimated the change in viral load over time due to selection on the virus and environmental effects to be a decline of 0.05 log10 copies/mL/year, in contrast to recent studies which suggested a reported small increase in viral load over the last 20 years might be due to evolutionary changes in the virus. Our results suggest that in the UK epidemic, subtype B has a small but significant viral genetic effect on viral load. By allowing the analysis of large sample sizes, we expect our approach to be applicable to the estimation of the genetic contribution to traits in many organisms.
Subject(s)
HIV Infections/virology , HIV-1/genetics , Viral Load/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Female , Genotype , HIV Infections/blood , HIV Infections/epidemiology , HIV-1/classification , Humans , Male , Middle Aged , Molecular Sequence Data , Phylogeny , Young Adult
ABSTRACT
UNLABELLED: In recent years, genotype I (GI) of Japanese encephalitis virus (JEV) has displaced genotype III (GIII) as the dominant virus genotype throughout Asia. In this study, the largest collection of GIII and GI envelope gene-derived viral sequences assembled to date was used to reconstruct the spatiotemporal chronology of genotype displacement throughout Asia and to determine the evolutionary and epidemiological dynamics underlying this significant event. GI consists of two clades, GI-a and GI-b, with the latter being associated with displacement of GIII as the dominant JEV genotype throughout Asia in the 1990s. Phylogeographic analysis indicated that GI-a diverged in Thailand or Cambodia and has remained confined to tropical Asia, whereas GI-b diverged in Vietnam and then dispersed northwards to China, where it was subsequently dispersed to Japan, Korea, and Taiwan. Molecular adaptation was detected by more than one method at one site (residue 15), and coevolution was detected at two pairs of sites (residues 89 to 360 and 129 to 141) within the GI E gene protein alignment. Viral multiplication and temperature sensitivity analyses in avian and mosquito cells revealed that the GI-b isolate JE-91 had significantly higher infectivity titers in mosquito cells from 24 to 48 h postinfection than did the GI-a and GIII isolates. If the JE-91 isolate is indeed representative of GI-b, an increased multiplicative ability of GI-b viruses compared to that of GIII viruses early in mosquito infection may have resulted in a shortened extrinsic incubation period that led to an increased number of GI enzootic transmission cycles and the subsequent displacement of GIII. IMPORTANCE: Japanese encephalitis virus (JEV), a mosquito-borne flavivirus, represents the most significant etiology of childhood viral neurological infection in Asia. 
Despite the existence of effective vaccines, JEV is responsible for an estimated 68,000 human cases and a reported 10,000 to 15,000 deaths annually. Phylogenetic studies divided JEV into five geographically and epidemiologically distinct genotypes (GI to GV). GIII has been the source of numerous JEV epidemics throughout history and was the most frequently isolated genotype throughout most of Asia from 1935 until the 1990s. In recent years, GI has displaced GIII as the most frequently isolated virus genotype. To date, the mechanism of this genotype replacement has remained unknown. In this study, we have identified genetic determinants underlying the genotype displacement as it unfolded across Asia. JEV provides a paradigm for other flaviviruses, including West Nile, yellow fever, and dengue viruses, and the critical role of the selective advantages in the mosquito vector.
Subject(s)
Encephalitis Virus, Japanese/isolation & purification , Encephalitis, Japanese/virology , Asia/epidemiology , Encephalitis Virus, Japanese/classification , Encephalitis Virus, Japanese/genetics , Encephalitis, Japanese/epidemiology , Evolution, Molecular , Genotype , Humans , Phylogeny , Phylogeography
ABSTRACT
Human immunodeficiency virus type 1 (HIV-1) is pandemic, but its contemporary global transmission network has not been characterized. A better understanding of the properties and dynamics of this network is essential for surveillance, prevention, and eventual eradication of HIV. Here, we apply a simple and computationally efficient network-based approach to all publicly available HIV polymerase sequences in the global database, revealing a contemporary picture of the spread of HIV-1 within and between countries. This approach automatically recovered well-characterized transmission clusters and extended other clusters thought to be contained within a single country across international borders. In addition, previously undescribed transmission clusters were discovered. Together, these clusters represent all known modes of HIV transmission. The extent of international linkage revealed by our comprehensive approach demonstrates the need to consider the global diversity of HIV, even when describing local epidemics. Finally, the speed of this method allows for near-real-time surveillance of the pandemic's progression.
Subject(s)
Disease Transmission, Infectious , Genetic Variation , HIV Infections/epidemiology , HIV Infections/transmission , HIV-1/classification , HIV-1/genetics , Pandemics , Cluster Analysis , Computational Biology/methods , Databases, Genetic , Global Health , HIV-1/isolation & purification , Humans , Molecular Epidemiology
ABSTRACT
BACKGROUND: The segmented RNA genome of avian Influenza viruses (AIV) allows genetic reassortment between co-infecting viruses, providing an evolutionary pathway to generate genetic innovation. The genetic diversity (16 haemagglutinin and 9 neuraminidase subtypes) of AIV indicates an extensive reservoir of influenza viruses exists in bird populations, but how frequently subtypes reassort with each other is still unknown. Here we quantify the reassortment patterns among subtypes in the Eurasian avian viral pool by reconstructing the ancestral states of the subtypes as discrete states on time-scaled phylogenies with respect to the internal protein coding segments. We further analyzed how host species, the inferred evolutionary rates and the dN/dS ratio varied among segments and between discrete subtypes, and whether these factors may be associated with inter-subtype reassortment rate. RESULTS: The general patterns of reassortment are similar among five internal segments with the exception of segment 8, encoding the Non-Structural genes, which has a more divergent phylogeny. However, significant variation in rates between subtypes was observed. In particular, hemagglutinin-encoding segments of subtypes H5 to H9 reassort at a lower rate compared to those of H1 to H4, and Neuraminidase-encoding segments of subtypes N1 and N2 reassort less frequently than N3 to N9. Both host species and dN/dS ratio were significantly associated with reassortment rate, while evolutionary rate was not associated. The dN/dS ratio was negatively correlated with reassortment rate, as was the number of negatively selected sites for all segments. CONCLUSIONS: These results indicate that overall selective constraint and host species are both associated with reassortment rate. These results together identify the wild bird population as the major source of new reassortants, rather than domestic poultry. 
The lower reassortment rates observed for H5N1 and H9N2 may be explained by the large proportion of strains derived from domestic poultry populations. In contrast, the higher rates observed in the H1N1, H3N8 and H4N6 subtypes could be due to their primary origin as infections of wild birds with multiple low pathogenicity strains in the large avian reservoir.
Subject(s)
Hemagglutinin Glycoproteins, Influenza Virus/genetics , Influenza A virus/genetics , Influenza in Birds/virology , Neuraminidase/genetics , Recombination, Genetic , Viral Proteins/genetics , Animals , Birds , Evolution, Molecular , Genetic Variation , Host Specificity , Influenza A virus/classification , Influenza A virus/enzymology , Influenza A virus/physiology , Molecular Sequence Data , Phylogeny
ABSTRACT
West Central Africa has been implicated as the epicenter of the HIV-1 epidemic, and almost all group M subtypes can be found there. Previous analysis of early HIV-1 group M sequences from Kinshasa in the Democratic Republic of Congo, formerly Zaire, revealed that isolates from a number of individuals fall in different positions in phylogenetic trees constructed from sequences from opposite ends of the genome as a result of recombination between viruses of different subtypes. Here, we use discrete ancestral trait mapping to develop a procedure for quantifying HIV-1 group M intersubtype recombination across phylogenies, using individuals' gag (p17) and env (gp41) subtypes. The method was applied to previously described HIV-1 group M sequences from samples obtained in Kinshasa early in the global radiation of HIV. Nine different p17 and gp41 intersubtype recombinant combinations were present in the data set. The mean number of excess ancestral subtype transitions (NEST) required to map individuals' p17 subtypes onto the gp41 phylogeny samples, compared to the number required to map them onto the p17 phylogenies, and vice versa, indicated that excess subtype transitions occurred at a rate of approximately 7 × 10^-3 to 8 × 10^-3 per lineage per year as a result of intersubtype recombination. Our results imply that intersubtype recombination may have occurred in approximately 20% of lineages evolving over a period of 30 years and confirm intersubtype recombination as a substantial force in generating HIV-1 group M diversity.
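The step from a per-lineage rate to the quoted fraction of lineages can be made explicit. The sketch below assumes that recombination events arise as a constant-rate Poisson process along each lineage, so the probability of at least one event in time t is 1 - exp(-rate × t). The abstract does not state how its 20% figure was obtained, so this is only a consistency check, and the function name is hypothetical.

```python
import math


def fraction_with_event(rate_per_lineage_year, years):
    """Probability of at least one event on a lineage under a constant-rate Poisson process."""
    return 1.0 - math.exp(-rate_per_lineage_year * years)


# Rate range and time span quoted in the abstract.
for rate in (7e-3, 8e-3):
    share = fraction_with_event(rate, 30)
    print(f"rate {rate:.0e} per lineage per year over 30 years: {share:.1%}")
```

Under this assumption both ends of the rate range give values close to one lineage in five, in line with the approximate 20% stated above.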
Subject(s)
HIV Infections/epidemiology , HIV Infections/virology , HIV-1/classification , HIV-1/genetics , Recombination, Genetic , Cluster Analysis , Democratic Republic of the Congo/epidemiology , Genotype , HIV Antigens/genetics , HIV Envelope Protein gp41/genetics , HIV-1/isolation & purification , Humans , Molecular Epidemiology , Phylogeny , Sequence Analysis, DNA , gag Gene Products, Human Immunodeficiency Virus/genetics
ABSTRACT
Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
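The two quantities the heuristic works with, the number of clusters and the ratio of the largest to the second-largest cluster, can be sketched as follows. The abstract does not give the scoring formula, so the selection rule here (keep thresholds whose size ratio stays under a cap, then prefer the one with the most clusters) is a hypothetical stand-in, as are the function names, the `max_ratio` parameter and the toy cluster sizes.

```python
def summarize(cluster_sizes):
    """Return (number of clusters with >= 2 members, largest/second-largest size ratio)."""
    multi = sorted((s for s in cluster_sizes if s >= 2), reverse=True)
    if len(multi) < 2:
        # With fewer than two clusters the ratio is undefined; treat it as unbounded.
        return len(multi), float("inf")
    return len(multi), multi[0] / multi[1]


def pick_threshold(sizes_by_threshold, max_ratio=3.0):
    """Hypothetical rule: among thresholds without a giant cluster, maximize cluster count."""
    candidates = []
    for threshold, sizes in sizes_by_threshold.items():
        n_clusters, ratio = summarize(sizes)
        if ratio <= max_ratio:
            # Ties on cluster count are broken in favour of the smaller threshold.
            candidates.append((n_clusters, -threshold, threshold))
    if not candidates:
        return None
    return max(candidates)[2]


# Illustrative cluster sizes only (not data from the paper), keyed by distance threshold.
toy_sizes = {
    0.005: [3, 2, 1, 1, 1, 1, 1, 1],
    0.010: [4, 3, 2, 2, 1],
    0.015: [5, 4, 3],
    0.020: [11, 1],
}
print(pick_threshold(toy_sizes))  # 0.01
```

In the toy input the largest threshold merges almost everything into a single cluster and is rejected, while the smallest leaves most sequences unclustered; the rule settles on the intermediate value with the most clusters.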
ABSTRACT
BACKGROUND: Reassortment between the RNA segments encoding haemagglutinin (HA) and neuraminidase (NA), the major antigenic influenza proteins, produces viruses with novel HA and NA subtype combinations and has preceded the emergence of pandemic strains. It has been suggested that productive viral infection requires a balance in the level of functional activity of HA and NA, arising from their closely interacting roles in the viral life cycle, and that this functional balance could be mediated by genetic changes in the HA and NA. Here, we investigate how the selective pressure varies for H7 avian influenza HA on different NA subtype backgrounds. RESULTS: By extending Bayesian stochastic mutational mapping methods to calculate the ratio of the rate of non-synonymous change to the rate of synonymous change (d(N)/d(S)), we found the average d(N)/d(S) across the avian influenza H7 HA1 region to be significantly greater on an N2 NA subtype background than on an N1, N3 or N7 background. Observed differences in evolutionary rates of H7 HA on different NA subtype backgrounds could not be attributed to underlying differences between avian host species or virus pathogenicity. Examination of d(N)/d(S) values for each subtype on a site-by-site basis indicated that the elevated d(N)/d(S) on the N2 NA background was a result of increased selection, rather than a relaxation of selective constraint. CONCLUSIONS: Our results are consistent with the hypothesis that reassortment exposes influenza HA to significant changes in selective pressure through genetic interactions with NA. Such epistatic effects might be explicitly accounted for in future models of influenza evolution.
Subject(s)
Biological Evolution , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Influenza A virus/classification , Influenza A virus/genetics , Neuraminidase/genetics , Reassortant Viruses/classification , Reassortant Viruses/genetics , Animals , Bayes Theorem , Birds , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Influenza A virus/metabolism , Influenza A virus/pathogenicity , Influenza in Birds/virology , Neuraminidase/metabolism , Phylogeny , Reassortant Viruses/metabolism , Reassortant Viruses/pathogenicity , Stochastic Processes
ABSTRACT
HIV incidence in Kazakhstan increased by 73% between 2010 and 2020, with an estimated 35,000 people living with HIV (PLHIV) in 2020. The development of antiretroviral drug resistance is a major threat to effective antiretroviral therapy (ART), yet studies on the prevalence of drug resistance in Kazakhstan are sparse. In this study on the molecular epidemiology of HIV in Kazakhstan, we analyzed 968 partial HIV-1 pol sequences that were collected between 2017 and 2020 from PLHIV across all regions of Kazakhstan, covering almost 3% of PLHIV in 2020. Sequences predominantly represented subtypes A6 (57%) and CRF02_AG (41%), with 32% of sequences exhibiting high-level drug resistance. We further identified distinct drug-resistant mutations (DRMs) in the two subtypes: subtype A6 showed a propensity for DRMs A62V, G190S, K101E, and D67N, while CRF02_AG showed a propensity for K103N and V179E. Codon usage analysis revealed that different mutational pathways for the two subtypes may explain the difference in G190S and V179E frequencies. Phylogenetic analysis highlighted differences in the timing and geographic spread of both subtypes within the country, with A62V-harboring subtype A6 sequences clustering on the phylogeny, indicative of sustained transmission of the mutation. Our findings suggest an HIV epidemic characterized by high levels of drug resistance and differential DRM frequencies between subtypes. This emphasizes the importance of drug resistance monitoring within Kazakhstan, together with DRM and subtype screening at diagnosis, to tailor drug regimens and provide effective, virally suppressive ART.
Subject(s)
Anti-HIV Agents , HIV Infections , HIV-1 , Humans , Kazakhstan/epidemiology , Phylogeny , Drug Resistance, Viral/genetics , Mutation , HIV Infections/drug therapy , HIV Infections/epidemiology , Anti-HIV Agents/pharmacology , Anti-HIV Agents/therapeutic use , Genotype
ABSTRACT
The spread of influenza has usually been described by a 'density' model, where the largest centres of population drive the epidemic within a country. An alternative model emphasizing the role of air travel has recently been developed. We have examined the relative importance of the two in the context of the 2009 H1N1 influenza epidemic in Scotland. We obtained genome sequences of 70 strains representative of the geographical and temporal distribution of H1N1 influenza during the summer and winter phases of the pandemic in 2009. We analysed these strains, together with another 128 from the rest of the UK and 292 globally distributed strains, using maximum-likelihood phylogenetic and Bayesian phylogeographical methods. This revealed strikingly different epidemic patterns within Scotland in the early and late parts of 2009. The summer epidemic in Scotland was characterized by multiple independent introductions from both international and other UK sources, followed by major local expansion of a single clade that probably originated in Birmingham. The winter phase, in contrast, was more diverse genetically, with several clades of similar size in different locations, some of which had no particularly close phylogenetic affinity to strains sampled from either Scotland or England. Overall there was evidence to support both models, with significant links demonstrated between North American sequences and those from England, and between England and East Asia, indicating that major air-travel routes played an important role in the pattern of spread of the pandemic, both within the UK and globally.
Subject(s)
Influenza A Virus, H1N1 Subtype/isolation & purification , Influenza, Human/epidemiology , Influenza, Human/virology , Epidemics , Humans , Influenza A Virus, H1N1 Subtype/classification , Influenza A Virus, H1N1 Subtype/genetics , Influenza, Human/transmission , Molecular Sequence Data , Phylogeny , Scotland/epidemiology , Seasons , Travel , United Kingdom/epidemiology
ABSTRACT
BACKGROUND: Many studies of sexual behavior have shown that individuals vary greatly in their number of sexual partners over time, but it has proved difficult to obtain parameter estimates relating to the dynamics of human immunodeficiency virus (HIV) transmission except in small-scale contact tracing studies. Recent developments in molecular phylodynamics have provided new routes to obtain these parameter estimates, and current clinical practice provides suitable data for entire infected populations. METHODS: A phylodynamic analysis was performed on partial pol gene sequences obtained for routine clinical care from 14,560 individuals, representing approximately 60% of the HIV-positive men who have sex with men (MSM) under care in the United Kingdom. RESULTS: Among individuals linked to others in the data set, 29% are linked to only 1 individual, 41% are linked to 2-10 individuals, and 29% are linked to ≥10 individuals. The right-skewed degree distribution can be approximated by a power law, but the data are best fitted by a Waring distribution for all time depths. For time depths of 5-7 years, the distribution parameter ρ lies within the range that indicates infinite variance. CONCLUSIONS: The transmission network among UK MSM is characterized by preferential association such that a randomly distributed intervention would not be expected to stop the epidemic.
Subject(s)
Epidemics , HIV Infections/epidemiology , HIV Infections/transmission , HIV-1/genetics , Cluster Analysis , Contact Tracing , HIV-1/classification , HIV-1/isolation & purification , Humans , Male , Molecular Epidemiology , Molecular Sequence Data , Molecular Typing , Phylogeny , Sequence Analysis, DNA , United Kingdom/epidemiology
ABSTRACT
The Sustainable East Africa Research in Community Health (SEARCH) trial was a universal test-and-treat (UTT) trial in rural Uganda and Kenya, aiming to lower regional HIV-1 incidence. Here, we quantify breakthrough HIV-1 transmissions occurring during the trial from population-based, dried blood spot samples. Between 2013 and 2017, we obtained 549 gag and 488 pol HIV-1 consensus sequences from 745 participants: 469 participants infected prior to trial commencement and 276 SEARCH-incident infections. Putative transmission clusters, with a 1.5% pairwise genetic distance threshold, were inferred from maximum likelihood phylogenies; clusters arising after the start of SEARCH were identified with Bayesian time-calibrated phylogenies. Our phylodynamic approach identified nine clusters arising after the SEARCH start date: eight pairs and one triplet, representing mostly opposite-gender linked (6/9), within-community transmissions (7/9). Two clusters contained individuals with non-nucleoside reverse transcriptase inhibitor (NNRTI) resistance, both linked to intervention communities. The identification of SEARCH-incident, within-community transmissions reveals the role of unsuppressed individuals in sustaining the epidemic in both arms of a UTT trial setting. The presence of transmitted NNRTI resistance, implying treatment failure to the efavirenz-based antiretroviral therapy (ART) used during SEARCH, highlights the need to improve delivery and adherence to up-to-date ART recommendations, to halt HIV-1 transmission.
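The distance-threshold step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract infers clusters from maximum likelihood phylogenies, whereas this example assumes already-aligned sequences, uses a raw proportion of differing sites (p-distance) in place of a model-based genetic distance, and groups sequences by single linkage. The function names `p_distance` and `threshold_clusters` are hypothetical, and only the 1.5% threshold is taken from the text.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    are skipped (an assumption of this sketch).
    """
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def threshold_clusters(seqs, threshold=0.015):
    """Single-linkage clusters of sequences within `threshold` of each other.

    `seqs` maps a sample name to its aligned sequence. Returns a sorted
    list of clusters (each a sorted list of names); singletons are dropped.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Under single linkage, a sequence joins a cluster if it lies within the threshold of any one member, so larger clusters can contain pairs that are further apart than 1.5%. A phylogeny-based definition, as used in the trial analysis, would additionally require the members to group together on the tree.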
Subject(s)
Anti-HIV Agents , HIV Infections , HIV Seropositivity , HIV-1 , Anti-HIV Agents/therapeutic use , Bayes Theorem , HIV Infections/diagnosis , HIV Infections/drug therapy , HIV Infections/epidemiology , HIV-1/genetics , Humans , Reverse Transcriptase Inhibitors/therapeutic use , Uganda/epidemiology
ABSTRACT
The heterosexual risk group has become the largest HIV-infected group in the United Kingdom during the last 10 years, but little is known of the network structure and dynamics of viral transmission in this group. The overwhelming majority of UK heterosexual infections are of non-B HIV subtypes, indicating viruses originating among immigrants from sub-Saharan Africa. The high rate of HIV evolution, combined with the availability of a very high density sample of viral sequences from routine clinical care, has allowed the phylodynamics of the epidemic to be investigated for the first time. Sequences of the viral protease and partial reverse transcriptase coding regions from 11,071 patients infected with HIV of non-B subtypes were studied. Of these, 2774 were closely linked to at least one other sequence by nucleotide distance. Including the closest sequences from the global HIV database identified 296 individuals who were in UK-based groups of 3 or more individuals. There were a total of 8 UK-based clusters of 10 or more, comprising 143/2774 (5%) individuals, much lower than the figure of 25% obtained earlier for men who have sex with men (MSM). Sample dates were incorporated into relaxed clock phylogenetic analyses to estimate the dates of internal nodes. From the resulting time-resolved phylogenies, the internode lengths, used as estimates of maximum transmission intervals, had a median of 27 months overall, almost twice as long as obtained for MSM (14 months), with only 2% of transmissions occurring in the first 6 months after infection. This phylodynamic analysis of non-B subtype HIV sequences representing over 40% of the estimated UK HIV-infected heterosexual population has revealed that heterosexual HIV transmission in the UK is clustered, but on average in smaller groups and with slower dynamics than among MSM. More effective intervention to restrict the epidemic may therefore be feasible, given effective diagnosis programmes.