Results 1 - 19 of 19
1.
Science ; 381(6655): 336-343, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37471538

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern (VOCs) now arise in the context of heterogeneous human connectivity and population immunity. Through a large-scale phylodynamic analysis of 115,622 Omicron BA.1 genomes, we identified >6,000 introductions of the antigenically distinct VOC into England and analyzed their local transmission and dispersal history. We find that six of the eight largest English Omicron lineages were already transmitting when Omicron was first reported in southern Africa (22 November 2021). Multiple datasets show that importation of Omicron continued despite subsequent restrictions on travel from southern Africa as a result of export from well-connected secondary locations. Initiation and dispersal of Omicron transmission lineages in England was a two-stage process that can be explained by models of the country's human geography and hierarchical travel network. Our results enable a comparison of the processes that drive the invasion of Omicron and other VOCs across multiple spatial scales.


Subjects
COVID-19 , SARS-CoV-2 , Humans , Southern Africa , COVID-19/transmission , COVID-19/virology , Genomics , SARS-CoV-2/classification , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Phylogeny
2.
Syst Biol ; 72(5): 1136-1153, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37458991

ABSTRACT

Divergence time estimation is crucial to provide temporal signals for dating biologically important events from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on highly correlated internal node heights that often become computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original $N-1$ internal node heights into a space of one height parameter and $N-2$ ratio parameters. To make the analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times in 4 pathogenic viruses (West Nile virus, rabies virus, Lassa virus, and Ebola virus) and the coralline red algae. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples as well as for the algae example. Our method now also makes it computationally feasible to incorporate mixed-effects molecular clock models for the Ebola virus example, confirms the findings from the original study, and reveals clearer multimodal distributions of the divergence times of some clades of interest.
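
The ratio transformation mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest case, in which all tips are sampled at height 0, so that each non-root ratio is a node's height divided by its parent's height; the published parameterization (for example with serially sampled tips) may differ. The tree, the node names and the function names are hypothetical and chosen only for illustration.

```python
def heights_to_ratios(parent, heights, root):
    """Return (root_height, ratios) for a set of internal node heights.

    parent  : dict mapping each non-root internal node to its parent node
    heights : dict mapping every internal node to its height above the tips
    Assumption: all tips sit at height 0, so ratio = height / parent height.
    """
    ratios = {node: heights[node] / heights[parent[node]]
              for node in heights if node != root}
    return heights[root], ratios


def ratios_to_heights(parent, root_height, ratios, root):
    """Invert the transform by resolving each node from its parent's height."""
    heights = {root: root_height}

    def resolve(node):
        if node not in heights:
            heights[node] = ratios[node] * resolve(parent[node])
        return heights[node]

    for node in ratios:
        resolve(node)
    return heights


# Hypothetical 4-tip ladder tree: N - 1 = 3 internal nodes,
# mapped to 1 root height and N - 2 = 2 ratios.
parent = {"a": "root", "b": "a"}
heights = {"root": 10.0, "a": 4.0, "b": 2.0}

root_height, ratios = heights_to_ratios(parent, heights, "root")
recovered = ratios_to_heights(parent, root_height, ratios, "root")
```

Under this assumption every ratio lies in (0, 1) independently of the others, which is the property that makes the transformed space convenient for a gradient-based sampler.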


Subjects
Algorithms , Phylogeny , Bayes Theorem , Time Factors , Monte Carlo Method
3.
Philos Trans R Soc Lond B Biol Sci ; 377(1861): 20210242, 2022 10 10.
Article in English | MEDLINE | ID: mdl-35989603

ABSTRACT

Recent advances in Bayesian phylogenetics offer substantial computational savings to accommodate increased genomic sampling that challenges traditional inference methods. In this review, we begin with a brief summary of the Bayesian phylogenetic framework, and then conceptualize a variety of methods to improve posterior approximations via Markov chain Monte Carlo (MCMC) sampling. Specifically, we discuss methods to improve the speed of likelihood calculations, reduce MCMC burn-in, and generate better MCMC proposals. We apply several of these techniques to study the evolution of HIV virulence along a 1536-tip phylogeny and estimate the internal node heights of a 1000-tip SARS-CoV-2 phylogenetic tree in order to illustrate the speed-up of such analyses using current state-of-the-art approaches. We conclude our review with a discussion of promising alternatives to MCMC that approximate the phylogenetic posterior. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.


Subjects
COVID-19 , Software , Algorithms , Bayes Theorem , Humans , Markov Chains , Monte Carlo Method , Phylogeny , SARS-CoV-2/genetics
4.
medRxiv ; 2021 Nov 02.
Article in English | MEDLINE | ID: mdl-34751273

ABSTRACT

The SARS-CoV-2 Gamma variant spread rapidly across Brazil, causing substantial infection and death waves. We use individual-level patient records following hospitalisation with suspected or confirmed COVID-19 to document the extensive shocks in hospital fatality rates that followed Gamma's spread across 14 state capitals, and in which more than half of hospitalised patients died over sustained time periods. We show that extensive fluctuations in COVID-19 in-hospital fatality rates also existed prior to Gamma's detection, and were largely transient after Gamma's detection, subsiding with hospital demand. Using a Bayesian fatality rate model, we find that the geographic and temporal fluctuations in Brazil's COVID-19 in-hospital fatality rates are primarily associated with geographic inequities and shortages in healthcare capacity. We project that approximately half of Brazil's COVID-19 deaths in hospitals could have been avoided without pre-pandemic geographic inequities and without pandemic healthcare pressure. Our results suggest that investments in healthcare resources, healthcare optimization, and pandemic preparedness are critical to minimize population-wide mortality and morbidity caused by highly transmissible and deadly pathogens such as SARS-CoV-2, especially in low- and middle-income countries. NOTE: The following manuscript has appeared as 'Report 46 - Factors driving extensive spatial and temporal fluctuations in COVID-19 fatality rates in Brazilian hospitals' at https://spiral.imperial.ac.uk:8443/handle/10044/1/91875 . ONE SENTENCE SUMMARY: COVID-19 in-hospital fatality rates fluctuate dramatically in Brazil, and these fluctuations are primarily associated with geographic inequities and shortages in healthcare capacity.

5.
Syst Biol ; 70(1): 181-189, 2021 01 01.
Article in English | MEDLINE | ID: mdl-32415977

ABSTRACT

Markov models of character substitution on phylogenies form the foundation of phylogenetic inference frameworks. Early models made the simplifying assumption that the substitution process is homogeneous over time and across sites in the molecular sequence alignment. While standard practice adopts extensions that accommodate heterogeneity of substitution rates across sites, heterogeneity in the process over time in a site-specific manner remains frequently overlooked. This is problematic, as evolutionary processes that act at the molecular level are highly variable, subjecting different sites to different selective constraints over time, impacting their substitution behavior. We propose incorporating time variability through Markov-modulated models (MMMs), which extend covarion-like models and allow the substitution process (including relative character exchange rates as well as the overall substitution rate) at individual sites to vary across lineages. We implement a general MMM framework in BEAST, a popular Bayesian phylogenetic inference software package, allowing researchers to compose a wide range of MMMs through flexible XML specification. Using examples from bacterial, viral, and plastid genome evolution, we show that MMMs impact phylogenetic tree estimation and can substantially improve model fit compared to standard substitution models. Through simulations, we show that marginal likelihood estimation accurately identifies the generative model and does not systematically prefer the more parameter-rich MMMs. To mitigate the increased computational demands associated with MMMs, our implementation exploits recent developments in BEAGLE, a high-performance computational library for phylogenetic inference. [Bayesian inference; BEAGLE; BEAST; covarion, heterotachy; Markov-modulated models; phylogenetics.].


Subjects
Molecular Evolution , Genetic Models , Bayes Theorem , Computer Simulation , Markov Chains , Phylogeny , Sequence Alignment
6.
Syst Biol ; 70(2): 258-267, 2021 02 10.
Article in English | MEDLINE | ID: mdl-32687171

ABSTRACT

Relaxed random walk (RRW) models of trait evolution introduce branch-specific rate multipliers to modulate the variance of a standard Brownian diffusion process along a phylogeny and more accurately model overdispersed biological data. Increased taxonomic sampling challenges inference under RRWs as the number of unknown parameters grows with the number of taxa. To solve this problem, we present a scalable method to efficiently fit RRWs and infer this branch-specific variation in a Bayesian framework. We develop a Hamiltonian Monte Carlo (HMC) sampler to approximate the high-dimensional, correlated posterior that exploits a closed-form evaluation of the gradient of the trait data log-likelihood with respect to all branch-rate multipliers simultaneously. Our gradient calculation achieves computational complexity that scales only linearly with the number of taxa under study. We compare the efficiency of our HMC sampler to the previously standard univariable Metropolis-Hastings approach while studying the spatial emergence of the West Nile virus in North America in the early 2000s. Our method achieves at least a 6-fold speed increase over the univariable approach. Additionally, we demonstrate the scalability of our method by applying the RRW to study the correlation between five mammalian life history traits in a phylogenetic tree with 3650 tips. [Bayesian inference; BEAST; Hamiltonian Monte Carlo; life history; phylodynamics, relaxed random walk.].
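
To make concrete how an HMC sampler uses a gradient, the following is a minimal sketch of the standard leapfrog integrator that sits at the core of HMC. It uses a toy standard-normal target and does not implement the RRW posterior or its linear-time gradient; the function names and the numerical values are hypothetical.

```python
def leapfrog(position, momentum, grad_log_density, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog scheme.

    grad_log_density(q) must return the gradient of the log target density at q.
    Unit mass matrix is assumed.
    """
    q, p = list(position), list(momentum)
    g = grad_log_density(q)
    for _ in range(n_steps):
        # Half step for momentum, full step for position, half step for momentum.
        p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, g)]
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        g = grad_log_density(q)
        p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, g)]
    return q, p


def std_normal_grad(q):
    """Gradient of the log density of a standard normal (toy target)."""
    return [-qi for qi in q]


def hamiltonian(q, p):
    """Potential plus kinetic energy for the toy target."""
    return 0.5 * sum(x * x for x in q) + 0.5 * sum(x * x for x in p)


q0, p0 = [1.0, -0.5], [0.3, 0.8]
q1, p1 = leapfrog(q0, p0, std_normal_grad, step_size=0.1, n_steps=20)

# The energy error drives the Metropolis acceptance probability in HMC.
energy_error = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))

# Flipping the momentum and integrating again retraces the trajectory.
q_back, p_back = leapfrog(q1, [-x for x in p1], std_normal_grad, 0.1, 20)
```

Each leapfrog step costs one gradient evaluation, which is why a gradient that scales linearly with the number of taxa matters for large trees.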


Subjects
Algorithms , Animals , Bayes Theorem , Monte Carlo Method , Phenotype , Phylogeny
8.
Wellcome Open Res ; 5: 53, 2020.
Article in English | MEDLINE | ID: mdl-32923688

ABSTRACT

Nonparametric coalescent-based models are often employed to infer past population dynamics over time. Several of these models, such as the skyride and skygrid models, are equipped with a block-updating Markov chain Monte Carlo sampling scheme to efficiently estimate model parameters. The advent of powerful computational hardware along with the use of high-performance libraries for statistical phylogenetics has, however, made the development of alternative estimation methods feasible. We here present the implementation and performance assessment of a Hamiltonian Monte Carlo gradient-based sampler to infer the parameters of the skygrid model. The skygrid is a popular and flexible coalescent-based model for estimating population dynamics over time and is available in BEAST 1.10.5, a widely used software package for Bayesian phylogenetic and phylodynamic analysis. Taking into account the increased computational cost of gradient evaluation, we report substantial increases in effective sample size per time unit compared to the established block-updating sampler. We expect gradient-based samplers to assume an increasingly important role for different classes of parameters typically estimated in Bayesian phylogenetic and phylodynamic analyses.

9.
J Antimicrob Chemother ; 75(5): 1311-1320, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32053203

ABSTRACT

BACKGROUND: Validated biomarkers to evaluate HIV-1 cure strategies are currently lacking, therefore requiring analytical treatment interruption (ATI) in study participants. Little is known about the safety of ATI and its long-term impact on patient health. OBJECTIVES: ATI safety was assessed and potential biomarkers predicting viral rebound were evaluated. METHODS: PBMCs, plasma and CSF were collected from 11 HIV-1-positive individuals at four different timepoints during ATI (NCT02641756). Total and integrated HIV-1 DNA, cell-associated (CA) HIV-1 RNA transcripts and restriction factor (RF) expression were measured by PCR-based assays. Markers of neuroinflammation and neuronal injury [neurofilament light chain (NFL) and YKL-40 protein] were measured in CSF. Additionally, neopterin, tryptophan and kynurenine were measured, both in plasma and CSF, as markers of immune activation. RESULTS: Total HIV-1 DNA, integrated HIV-1 DNA and CA viral RNA transcripts did not differ pre- and post-ATI. Similarly, no significant NFL or YKL-40 increases in CSF were observed between baseline and viral rebound. Furthermore, markers of immune activation did not increase during ATI. Interestingly, the RFs SLFN11 and APOBEC3G increased after ATI before viral rebound. Similarly, Tat-Rev transcripts were increased preceding viral rebound after interruption. CONCLUSIONS: ATI did not increase viral reservoir size and it did not reveal signs of increased neuronal injury or inflammation, suggesting that these well-monitored ATIs are safe. Elevation of Tat-Rev transcription and induced expression of the RFs SLFN11 and APOBEC3G after ATI, prior to viral rebound, indicates that these factors could be used as potential biomarkers predicting viral rebound.


Subjects
HIV Infections , HIV-1 , APOBEC-3G Deaminase , Biomarkers , HIV Infections/drug therapy , HIV-1/genetics , Humans , Nuclear Proteins , Viral RNA , Viral Load
10.
Nat Commun ; 10(1): 5310, 2019 11 22.
Article in English | MEDLINE | ID: mdl-31757953

ABSTRACT

The role of Africa in the dynamics of the global spread of a zoonotic and economically-important virus, such as the highly pathogenic avian influenza (HPAI) H5Nx of the Gs/GD lineage, remains unexplored. Here we characterise the spatiotemporal patterns of virus diffusion during three HPAI H5Nx intercontinental epidemic waves and demonstrate that Africa mainly acted as an ecological sink of the HPAI H5Nx viruses. A joint analysis of host dynamics and continuous spatial diffusion indicates that poultry trade as well as wild bird migrations have contributed to the virus spreading into Africa, with West Africa acting as a crucial hotspot for virus introduction and dissemination into the continent. We demonstrate varying paths of avian influenza incursions into Africa as well as virus spread within Africa over time, which reveal that virus expansion is a complex phenomenon, shaped by an intricate interplay between avian host ecology, virus characteristics and environmental variables.


Subjects
Avian Influenza/transmission , Human Influenza/transmission , Poultry Diseases/transmission , Africa , Western Africa , Animals , Humans , Influenza A Virus Subtype H5N1/genetics , Influenza A Virus Subtype H5N8/genetics , Influenza A Virus/genetics , Avian Influenza/economics , Avian Influenza/epidemiology , Avian Influenza/virology , Human Influenza/economics , Human Influenza/epidemiology , Human Influenza/virology , Phylogeny , Poultry , Poultry Diseases/economics , Poultry Diseases/epidemiology , Poultry Diseases/virology
11.
Methods Mol Biol ; 1910: 691-722, 2019.
Article in English | MEDLINE | ID: mdl-31278682

ABSTRACT

In this chapter, we focus on the computational challenges associated with statistical phylogenomics and how use of the broad-platform evolutionary analysis general likelihood evaluator (BEAGLE), a high-performance library for likelihood computation, can help to substantially reduce computation time in phylogenomic and phylodynamic analyses. We discuss computational improvements brought about by the BEAGLE library on a variety of state-of-the-art multicore hardware, and for a range of commonly used evolutionary models. For data sets of varying dimensions, we specifically focus on comparing performance in the Bayesian evolutionary analysis by sampling trees (BEAST) software between multicore central processing units (CPUs) and a wide range of graphics processing units (GPUs). We put special emphasis on computational benchmarks from the field of phylodynamics, which combines the challenges of phylogenomics with those of modelling trait data associated with the observed sequence data. In conclusion, we show that for increasingly large molecular sequence data sets, GPUs can offer tremendous computational advancements through the use of the BEAGLE library, which is available for software packages for both Bayesian inference and maximum-likelihood frameworks.


Subjects
Bayes Theorem , Computational Biology , Genomics , Phylogeny , Software , Animals , Computational Biology/methods , Molecular Evolution , Genomics/methods , Humans , Markov Chains , Statistical Models , Monte Carlo Method , Reproducibility of Results
13.
Nat Commun ; 9(1): 2222, 2018 06 08.
Article in English | MEDLINE | ID: mdl-29884821

ABSTRACT

Genetic analyses have provided important insights into Ebola virus spread during the recent West African outbreak, but their implications for specific intervention scenarios remain unclear. Here, we address this issue using a collection of phylodynamic approaches. We show that long-distance dispersal events were not crucial for epidemic expansion and that preventing viral lineage movement to any given administrative area would, in most cases, have had little impact. However, major urban areas were critical in attracting and disseminating the virus: preventing viral lineage movement to all three capitals simultaneously would have contained epidemic size to one-third. We also show that announcements of border closures were followed by a significant but transient effect on international virus dispersal. By quantifying the hypothetical impact of different intervention strategies, as well as the impact of barriers on dispersal frequency, our study illustrates how phylodynamic analyses can help to address specific epidemiological and outbreak control questions.


Subjects
Disease Outbreaks/prevention & control , Ebolavirus/physiology , Ebola Virus Disease/epidemiology , Ebola Virus Disease/virology , Western Africa/epidemiology , Ebolavirus/classification , Ebolavirus/genetics , Viral Genome/genetics , Geography , Ebola Virus Disease/transmission , Humans , Phylogeny , Time Factors
14.
Bioinformatics ; 33(12): 1798-1805, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28200071

ABSTRACT

MOTIVATION: Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, both in the central processing unit (CPU) and graphics processing unit (GPU) markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. RESULTS: We here propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves by >14-fold. AVAILABILITY AND IMPLEMENTATION: Our implementation is part of the BEAST code base, a widely used open source software package to perform Bayesian phylogenetic inference. CONTACT: guy.baele@kuleuven.be. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Molecular Evolution , Phylogeny , DNA Sequence Analysis/methods , Software , Bayes Theorem , Computational Biology/methods , Markov Chains , Monte Carlo Method
15.
BMC Bioinformatics ; 15: 133, 2014 May 07.
Article in English | MEDLINE | ID: mdl-24885610

ABSTRACT

BACKGROUND: Simulated nucleotide or amino acid sequences are frequently used to assess the performance of phylogenetic reconstruction methods. BEAST, a Bayesian statistical framework that focuses on reconstructing time-calibrated molecular evolutionary processes, supports a wide array of evolutionary models, but lacked matching machinery for simulation of character evolution along phylogenies. RESULTS: We present a flexible Monte Carlo simulation tool, called πBUSS, that employs the BEAGLE high performance library for phylogenetic computations to rapidly generate large sequence alignments under complex evolutionary models. πBUSS sports a user-friendly graphical user interface (GUI) that allows combining a rich array of models across an arbitrary number of partitions. A command-line interface mirrors the options available through the GUI and facilitates scripting in large-scale simulation studies. πBUSS may serve as an easy-to-use, standard sequence simulation tool, but the available models and data types are particularly useful to assess the performance of complex BEAST inferences. The connection with BEAST is further strengthened through the use of a common extensible markup language (XML), which also allows more advanced evolutionary models to be specified. To support simulation under the latter, as well as to support simulation and analysis in a single run, we also add the πBUSS core simulation routine to the list of BEAST XML parsers. CONCLUSIONS: πBUSS offers a unique combination of flexibility and ease-of-use for sequence simulation under realistic evolutionary scenarios. Through different interfaces, πBUSS supports simulation studies ranging from modest endeavors for illustrative purposes to complex and large-scale assessments of evolutionary inference procedures. Applications are not restricted to the BEAST framework, or even time-measured evolutionary histories, and πBUSS can be connected to various other programs using standard input and output formats.


Subjects
Molecular Evolution , Sequence Analysis/methods , Software , Bayes Theorem , Computer Simulation , Monte Carlo Method , Phylogeny , Sequence Alignment
16.
Mol Biol Evol ; 30(3): 713-24, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23180580

ABSTRACT

Effective population size is fundamental in population genetics and characterizes genetic diversity. To infer past population dynamics from molecular sequence data, coalescent-based models have been developed for Bayesian nonparametric estimation of effective population size over time. Among the most successful is a Gaussian Markov random field (GMRF) model for a single gene locus. Here, we present a generalization of the GMRF model that allows for the analysis of multilocus sequence data. Using simulated data, we demonstrate the improved performance of our method to recover true population trajectories and the time to the most recent common ancestor (TMRCA). We analyze a multilocus alignment of HIV-1 CRF02_AG gene sequences sampled from Cameroon. Our results are consistent with HIV prevalence data and uncover some aspects of the population history that go undetected in Bayesian parametric estimation. Finally, we recover an older and more reconcilable TMRCA for a classic ancient DNA data set.
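
As a rough illustration of what a Gaussian Markov random field prior on a population-size trajectory looks like, the sketch below evaluates an unnormalized first-order random-walk smoothing prior on log effective population sizes, together with its gradient. This form is an assumption made for illustration; the abstract does not give the exact prior, and the function name, argument names and example values are hypothetical.

```python
import math


def gmrf_log_prior(log_sizes, precision):
    """Unnormalized log density of a first-order random-walk GMRF and its gradient.

    log_sizes : log effective population sizes on consecutive grid intervals
    precision : smoothing precision; larger values penalize jumps more strongly
    """
    diffs = [b - a for a, b in zip(log_sizes, log_sizes[1:])]
    m = len(diffs)
    log_density = (0.5 * m * math.log(precision)
                   - 0.5 * precision * sum(d * d for d in diffs))

    # Each difference d_j = x[j+1] - x[j] pulls its two neighbours together.
    grad = [0.0] * len(log_sizes)
    for j, d in enumerate(diffs):
        grad[j] += precision * d
        grad[j + 1] -= precision * d
    return log_density, grad


# Hypothetical trajectory on three grid intervals.
example_log_density, example_grad = gmrf_log_prior([1.0, 2.0, 4.0], precision=2.0)
```

The gradient entries sum to zero because the prior depends only on differences between neighbouring intervals, so shifting the whole trajectory up or down leaves the density unchanged.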


Subjects
Genetic Loci , Genetic Models , Algorithms , Bayes Theorem , Computer Simulation , Molecular Evolution , Viral Genes , Genetic Speciation , HIV-1/genetics , Humans , Markov Chains , Monte Carlo Method , Mutation , Population Density , Population Dynamics , Nonparametric Statistics
17.
J Virol ; 83(24): 12917-24, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19793809

ABSTRACT

Human immunodeficiency virus type 1 (HIV-1) genetic diversity, due to its high evolutionary rate, has long been identified as a main cause of problems in the development of an efficient HIV-1 vaccine. However, little is known about differences in evolutionary rate between different subtypes. In this study, we collected representative samples of the main epidemic subtypes and circulating recombinant forms (CRFs), namely, sub-subtype A1, subtypes B, C, D, and G, and CRFs 01_AE and 02_AG. We analyzed separate data sets for pol and env. We performed a Bayesian Markov chain Monte Carlo relaxed-clock phylogenetic analysis and applied a codon model to the resulting phylogenetic trees to estimate nonsynonymous (dN) and synonymous (dS) rates along each and every branch. We found important differences in the evolutionary rates of the different subtypes. These are due to differences not only in the dN rate but also in the dS rate, varying in roughly similar ways, indicating that these differences are caused by both different selective pressures (for dN rate) and the replication dynamics (for dS rate) (i.e., mutation rate or generation time) of the strains. CRF02_AG and subtype G had higher rates, while subtype D had lower dN and dS rates than the other subtypes. The dN/dS ratio estimates were also different, especially for the env gene, with subtype G showing the lowest dN/dS ratio of all subtypes.


Subjects
Molecular Evolution , HIV-1/classification , HIV-1/genetics , Monte Carlo Method , Phylogeny
18.
PLoS Comput Biol ; 5(9): e1000520, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19779555

ABSTRACT

As a key factor in endemic and epidemic dynamics, the geographical distribution of viruses has been frequently interpreted in the light of their genetic histories. Unfortunately, inference of historical dispersal or migration patterns of viruses has mainly been restricted to model-free heuristic approaches that provide little insight into the temporal setting of the spatial dynamics. The introduction of probabilistic models of evolution, however, offers unique opportunities to engage in this statistical endeavor. Here we introduce a Bayesian framework for inference, visualization and hypothesis testing of phylogeographic history. By implementing character mapping in Bayesian software that samples time-scaled phylogenies, we enable the reconstruction of timed viral dispersal patterns while accommodating phylogenetic uncertainty. Standard Markov model inference is extended with a stochastic search variable selection procedure that identifies the parsimonious descriptions of the diffusion process. In addition, we propose priors that can incorporate geographical sampling distributions or characterize alternative hypotheses about the spatial dynamics. To visualize the spatial and temporal information, we summarize inferences using virtual globe software. We describe how Bayesian phylogeography compares with previous parsimony analysis in the investigation of the influenza A H5N1 origin and H5N1 epidemiological linkage among sampling localities. Analysis of rabies in West African dog populations reveals how virus diffusion may enable endemic maintenance through continuous epidemic cycles. From these analyses, we conclude that our phylogeographic framework will make an important asset in molecular epidemiology that can be easily generalized to infer biogeography from genetic data for many organisms.


Subjects
Bayes Theorem , Geography , Biological Models , Molecular Epidemiology/methods , Phylogeny , Animals , Computational Biology/methods , Dogs , Humans , Influenza A Virus Subtype H5N1/genetics , Human Influenza/epidemiology , Markov Chains , Rabies/epidemiology , Rabies Virus/genetics , Stochastic Processes
19.
J Comput Biol ; 14(8): 1105-14, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17985990

ABSTRACT

The human immunodeficiency virus (HIV) has a genome that is rich in adenine, and its rapid evolution shows an observed bias of guanine (G) to adenine (A) mutations. Two mechanisms have been proposed to explain these properties: (1) an imbalance in dNTP pool concentrations which drives the misincorporation process during reverse transcription, and (2) cytidine deamination by the APOBEC3G/3F restriction factor, causing G to A mutations most notably in specific dinucleotide contexts. Although crucial to understanding HIV evolution, current estimates on misincorporation bias during the replication cycle are based on scarce in vitro measurements. In this work, HIV partial pol sequences obtained for drug resistance testing purposes are analyzed using likelihood methods to estimate various models of HIV misincorporation bias in vivo. The technique is robust to selection on the amino acid sequence and selection against CpG dinucleotides. A model where misincorporations are explained only by an imbalance in dNTP pool concentrations, together with a preference for transitions versus transversions, explained 98% (95% confidence interval [C.I.] 93-100) of the observed variation in freely estimated misincorporation rates. Although dinucleotide context was responsible for variation in misincorporation probabilities, this variation was not specific for G to A mutations implying that the footprint of APOBEC3G/3F editing could not be detected. These results indicate that an imbalance in dNTP pool concentrations explains most of the bias in HIV nucleotide misincorporations, while the effect of editing by APOBEC3G/3F on HIV evolution, based on its dinucleotide specificity, could not be observed in this study.


Subjects
Cytidine Deaminase/metabolism , Cytosine Deaminase/metabolism , Deoxyribonucleotides/metabolism , Molecular Evolution , HIV/genetics , HIV/metabolism , APOBEC-3G Deaminase , Algorithms , Computational Biology , Viral DNA/genetics , Viral DNA/metabolism , HIV Infections/metabolism , HIV Infections/virology , Humans , Likelihood Functions , Markov Chains , Biological Models , Monte Carlo Method , Mutation , RNA Editing