1.
bioRxiv ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39005464

ABSTRACT

Infectious disease dynamics are driven by the complex interplay of epidemiological, ecological, and evolutionary processes. Accurately modeling these interactions is crucial for understanding pathogen spread and informing public health strategies. However, existing simulators often fail to capture the dynamic interplay between these processes, resulting in oversimplified models that do not fully reflect real-world complexities in which the pathogen's genetic evolution dynamically influences disease transmission. We introduce the epidemiological-ecological-evolutionary simulator (e3SIM), an open-source framework that concurrently models the transmission dynamics and molecular evolution of pathogens within a host population while integrating environmental factors. Using an agent-based, discrete-generation, forward-in-time approach, e3SIM incorporates compartmental models, host-population contact networks, and quantitative-trait models for pathogens. This integration allows for realistic simulations of disease spread and pathogen evolution. Key features include a modular and scalable design, flexibility in modeling various epidemiological and population-genetic complexities, incorporation of time-varying environmental factors, and a user-friendly graphical interface. We demonstrate e3SIM's capabilities through simulations of realistic outbreak scenarios with SARS-CoV-2 and Mycobacterium tuberculosis, illustrating its flexibility for studying the genomic epidemiology of diverse pathogen types.

3.
Nat Commun ; 15(1): 2962, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580642

ABSTRACT

The projected trajectory of multidrug resistant tuberculosis (MDR-TB) epidemics depends on the reproductive fitness of circulating strains of MDR M. tuberculosis (Mtb). Previous efforts to characterize the fitness of MDR Mtb have found that Mtb strains of the Beijing sublineage (Lineage 2.2.1) may be more prone to develop resistance and retain fitness in the presence of resistance-conferring mutations than other lineages. Using Mtb genome sequences from all culture-positive cases collected over two years in Moldova, we estimate the fitness of Ural (Lineage 4.2) and Beijing strains, the two lineages in which MDR is concentrated in the country. We estimate that the fitness of MDR Ural strains substantially exceeds that of other susceptible and MDR strains, and we identify several mutations specific to these MDR Ural strains. Our findings suggest that MDR Ural Mtb has been transmitting efficiently in Moldova and poses a substantial risk of spreading further in the region.


Subject(s)
Mycobacterium tuberculosis , Multidrug-Resistant Tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Moldova/epidemiology , Genotype , Multidrug-Resistant Tuberculosis/drug therapy , Multidrug-Resistant Tuberculosis/epidemiology , Multidrug-Resistant Tuberculosis/microbiology , Multiple Bacterial Drug Resistance/genetics
4.
Evolution ; 78(6): 1092-1108, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38459852

ABSTRACT

COVID-19 has become endemic, with dynamics that reflect the waning of immunity and re-exposure, by contrast to the epidemic phase driven by exposure in immunologically naïve populations. Endemic does not, however, mean constant. Further evolution of SARS-CoV-2, as well as changes in behavior and public health policy, continue to play a major role in the endemic load of disease and mortality. In this article, we analyze evolutionary models to explore the impact that a newly arising variant can have on the short-term and longer-term endemic load, characterizing how these impacts depend on the transmission and immunological properties of the variants. We describe how evolutionary changes in the virus will increase the endemic load most for a persistently immune-escape variant, by an intermediate amount for a more transmissible variant, and least for a transiently immune-escape variant. Balancing the tendency for evolution to favor variants that increase the endemic load, we explore the impact of vaccination strategies and non-pharmaceutical interventions that can counter these increases in the impact of disease. We end with some open questions about the future of COVID-19 as an endemic disease.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/virology , COVID-19/immunology , SARS-CoV-2/immunology , SARS-CoV-2/genetics , Humans , Endemic Diseases , Molecular Evolution
5.
BMC Public Health ; 24(1): 472, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38355444

ABSTRACT

BACKGROUND: Vaccine homophily describes non-heterogeneous vaccine uptake within contact networks. This study was performed to determine observable patterns of vaccine homophily, as well as the impact of vaccine homophily on disease transmission within and between vaccination groups under conditions of high and low vaccine efficacy. METHODS: Residents of British Columbia, Canada, aged ≥ 16 years, were recruited via online advertisements between February and March 2022, and provided information about vaccination status, perceived vaccination status of household and non-household contacts, compliance with COVID-19 prevention guidelines, and history of COVID-19. A deterministic mathematical model was used to assess transmission dynamics between vaccine status groups under conditions of high and low vaccine efficacy. RESULTS: Vaccine homophily was observed among those with 0, 2, or 3 doses of the vaccine. Greater homophily was observed among those who had more doses of the vaccine (p < 0.0001). Those with fewer vaccine doses had larger contact networks (p < 0.0001), were more likely to report prior COVID-19 (p < 0.0001), and reported lower compliance with COVID-19 prevention guidelines (p < 0.0001). Mathematical modelling showed that vaccine homophily plays a considerable role in epidemic growth under conditions of high and low vaccine efficacy. Furthermore, vaccine homophily contributes to a high force of infection among unvaccinated individuals under conditions of high vaccine efficacy, as well as to an elevated force of infection from unvaccinated to suboptimally vaccinated individuals under conditions of low vaccine efficacy. INTERPRETATION: The uneven uptake of COVID-19 vaccines and the nature of the contact network in the population play important roles in shaping COVID-19 transmission dynamics.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines , Cross-Sectional Studies , Pandemics/prevention & control , Vaccination , British Columbia/epidemiology
6.
J Theor Biol ; 578: 111689, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38061489

ABSTRACT

We investigated the implications of employing a circular approximation of split systems in the calculation of maximum diversity subsets of a set of taxa in a conservation biology context where diversity is measured using Split System Diversity (SSD). We conducted a comparative analysis between the maximum SSD score and the maximum SSD set(s) of size k, efficiently determined using a circular approximation, and the true results obtained through brute-force search based on the original data. Through experimentation on simulated datasets and SNP data across 50 Atlantic Salmon populations, our findings demonstrate that employing a circular approximation can lead to the generation of an incorrect max-SSD set(s). We built a graph-based split system whose circular approximation led to a max-SSD set of size k=4 that was less than the true max-SSD set by 17.6%. This discrepancy increased to 25% for k=11 when we used a hypergraph-based split system. The same comparison on the Atlantic salmon dataset revealed a mere 1% difference. However, noteworthy disparities emerged in the population composition between the two sets. These findings underscore the importance of assessing the suitability of circular approximations in conservation biology systems. Caution is advised when relying solely on circular approximations to determine sets of maximum diversity, and careful consideration of the data characteristics is crucial for accurate results in conservation biology applications.


Subject(s)
Biodiversity , Conservation of Natural Resources
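The brute-force baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming that the Split System Diversity of a subset is the summed weight of every split with at least one chosen taxon on each side; the function names, the toy taxa, and the weights are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def split_system_diversity(subset, splits):
    """Sum the weights of splits that separate at least two taxa of `subset`."""
    chosen = set(subset)
    total = 0.0
    for side_a, side_b, weight in splits:
        # A split counts only if the subset touches both of its sides.
        if chosen & side_a and chosen & side_b:
            total += weight
    return total


def max_ssd_brute_force(taxa, splits, k):
    """Return (best score, all size-k subsets reaching it) by exhaustive search."""
    best_score, best_sets = float("-inf"), []
    for subset in combinations(sorted(taxa), k):
        score = split_system_diversity(subset, splits)
        if score > best_score:
            best_score, best_sets = score, [subset]
        elif score == best_score:
            best_sets.append(subset)
    return best_score, best_sets


# Hypothetical toy split system on four taxa; the weights are made up.
taxa = {"A", "B", "C", "D"}
splits = [
    ({"A"}, {"B", "C", "D"}, 1.0),
    ({"B"}, {"A", "C", "D"}, 1.0),
    ({"C"}, {"A", "B", "D"}, 1.0),
    ({"D"}, {"A", "B", "C"}, 4.0),
    ({"A", "B"}, {"C", "D"}, 3.0),
]
score, sets = max_ssd_brute_force(taxa, splits, 2)
```

Exhaustive search visits every size-k subset, so it is practical only for small taxon sets; that cost is the motivation for approximations such as the circular one the abstract evaluates.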
7.
Epidemics ; 45: 100733, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38056165

ABSTRACT

The serial interval of an infectious disease is an important variable in epidemiology. It is defined as the period of time between the symptom onset times of the infector and infectee in a direct transmission pair. Under partially sampled data, purported infector-infectee pairs may actually be separated by one or more unsampled cases in between. Misunderstanding such pairs as direct transmissions will result in overestimating the length of serial intervals. On the other hand, two cases that are infected by an unseen third case (known as coprimary transmission) may be classified as a direct transmission pair, leading to an underestimation of the serial interval. Here, we introduce a method to jointly estimate the distribution of serial intervals factoring in these two sources of error. We simultaneously estimate the distribution of the number of unsampled intermediate cases between purported infector-infectee pairs, as well as the fraction of such pairs that are coprimary. We also extend our method to situations where each infectee has multiple possible infectors, and show how to factor this additional source of uncertainty into our estimates. We assess our method's performance on simulated data sets and find that our method provides consistent and robust estimates. We also apply our method to data from real-life outbreaks of four infectious diseases and compare our results with published results. With similar accuracy, our method of estimating serial interval distribution provides unique advantages, allowing its application in settings of low sampling rates and large population sizes, such as widespread community transmission tracked by routine public health surveillance.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , Disease Outbreaks , Time Factors
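The two sources of error described in the abstract above can be made concrete with a small worked example. This is only an arithmetic sketch with invented onset dates, not the joint estimation method of the paper; `serial_interval_days` is a hypothetical helper.

```python
from datetime import date


def serial_interval_days(infector_onset, infectee_onset):
    """Days between symptom onset in the infector and in the infectee."""
    return (infectee_onset - infector_onset).days


# Hypothetical chain: case 0 infects case 1, who infects case 2.
onsets = [date(2020, 3, 1), date(2020, 3, 6), date(2020, 3, 10)]

# True serial intervals for the two direct transmission pairs.
direct = [serial_interval_days(a, b) for a, b in zip(onsets, onsets[1:])]

# If case 1 is unsampled, cases 0 and 2 look like a direct pair and the
# apparent interval is the sum of the two true intervals (overestimate).
apparent = serial_interval_days(onsets[0], onsets[2])

# Coprimary pair: two cases infected by the same unseen source, with onsets
# one day apart. Treating them as a direct pair yields a short interval
# (underestimate).
coprimary = serial_interval_days(date(2020, 3, 6), date(2020, 3, 7))
```

Here the direct intervals are 5 and 4 days, the apparent interval across the missing case is 9 days, and the coprimary pair gives 1 day.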
8.
PLoS Comput Biol ; 19(12): e1011755, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38153948

ABSTRACT

The mechanisms behind vaccine-induced strain replacement in the pneumococcus remain poorly understood. There is emerging evidence that distinct pneumococcal lineages can co-colonise for significant time periods, and that novel recombinants can readily emerge during natural colonisation. Despite this, patterns of post-vaccine replacement are indicative of competition between specific lineages. Here, we develop a multiscale transmission model to investigate explicitly how within host dynamics shape observed ecological patterns, both pre- and post-vaccination. Our model framework explores competition between and within strains defined by distinct antigenic, metabolic and resistance profiles. We allow for strains to freely co-colonise and recombine within hosts, and consider how each of these types may contribute to a strain's overall fitness. Our results suggest that antigenic and resistance profiles are key drivers of post-vaccine success.


Subject(s)
Pneumococcal Infections , Streptococcus pneumoniae , Humans , Pneumococcal Infections/prevention & control , Pneumococcal Infections/epidemiology , Pneumococcal Vaccines , Population Dynamics , Vaccination
9.
Sci Adv ; 9(44): eabp9185, 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37922357

ABSTRACT

The seasonal influenza (flu) vaccine is designed to protect against those influenza viruses predicted to circulate during the upcoming flu season, but identifying which viruses are likely to circulate is challenging. We use features from phylogenetic trees reconstructed from hemagglutinin (HA) and neuraminidase (NA) sequences, together with a support vector machine, to predict future circulation. We obtain accuracies of 0.75 to 0.89 (AUC 0.83 to 0.91) over 2016-2020. We explore ways to select potential candidates for a seasonal vaccine and find that the machine learning model has a moderate ability to select strains that are close to future populations. However, consensus sequences among the most recent 3 years also do well at this task. We identify similar candidate strains to those proposed by the World Health Organization, suggesting that this approach can help inform vaccine strain selection.


Subject(s)
Influenza Vaccines , Human Influenza , Orthomyxoviridae , Humans , Phylogeny , Seasons , Human Influenza/prevention & control , Influenza Vaccines/genetics
10.
J Math Biol ; 87(6): 80, 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-37926744

ABSTRACT

Almost all models used in analysis of infectious disease outbreaks contain some notion of population size, usually taken as the census population size of the community in question. In many settings, however, the census population is not equivalent to the population likely to be exposed, for example if there are population structures, outbreak controls or other heterogeneities. Although these factors may be taken into account in the model: adding compartments to a compartmental model, variable mixing rates and so on, this makes fitting more challenging, especially if the population complexities are not fully known. In this work we consider the concept of effective population size in outbreak modelling, which we define as the size of the population involved in an outbreak, as an alternative to use of more complex models. Effective population size is an important quantity in genetics for estimation of genetic diversity loss in populations, but it has not been widely applied in epidemiology. Through simulation studies and application to data from outbreaks of COVID-19 in China, we find that simple SIR models with effective population size can provide a good fit to data which are not themselves simple or SIR.


Subject(s)
COVID-19 , Communicable Diseases , Humans , Population Density , Communicable Diseases/epidemiology , Disease Outbreaks , Computer Simulation , COVID-19/epidemiology
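The idea in the abstract above, replacing the census population in a simple SIR model with an effective population size, can be sketched as below. This is a minimal deterministic illustration using forward-Euler steps; the function name, the parameter values, and the choice of integrator are assumptions made here, and the paper's fitting procedure is not reproduced.

```python
def simulate_sir(beta, gamma, n_eff, i0, days, dt=0.1):
    """Deterministic SIR with forward-Euler steps; returns daily (S, I, R).

    n_eff is the effective population size, i.e. the number of people
    actually involved in the outbreak, used in place of the census size.
    """
    s, i, r = float(n_eff - i0), float(i0), 0.0
    trajectory = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n_eff * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Illustrative values only: R0 = beta / gamma = 2.5 in a community whose
# effective population (10,000) may be far below its census population.
trajectory = simulate_sir(beta=0.5, gamma=0.2, n_eff=10_000, i0=10, days=200)
final_s, final_i, final_r = trajectory[-1]
```

Because cumulative infections can never exceed `n_eff`, treating it as a free parameter lets the same simple model match outbreaks that stop well short of the census population.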
11.
Vaccine ; 41(43): 6411-6418, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37718186

ABSTRACT

BACKGROUND: It is evident that COVID-19 will remain a public health concern in the coming years, largely driven by variants of concern (VOC). It is critical to continuously monitor vaccine effectiveness as new variants emerge and new vaccines and/or boosters are developed. Systematic surveillance of the scientific evidence base is necessary to inform public health action and identify key uncertainties. Evidence syntheses may also be used to populate models to fill in research gaps and help to prepare for future public health crises. This protocol outlines the rationale and methods for a living evidence synthesis of the effectiveness of COVID-19 vaccines in reducing the morbidity and mortality associated with, and transmission of, VOC of SARS-CoV-2. METHODS: Living evidence syntheses of vaccine effectiveness will be carried out over one year for (1) a range of potential outcomes in the index individual associated with VOC (pathogenesis); and (2) transmission of VOC. The literature search will be conducted up to May 2023. Observational and database-linkage primary studies will be included, as well as RCTs. Information sources include electronic databases (MEDLINE; Embase; Cochrane, L*OVE; the CNKI and Wanfang platforms), pre-print servers (medRxiv, bioRxiv), and online repositories of grey literature. Title and abstract and full-text screening will be performed by two reviewers using a liberal accelerated method. Data extraction and risk of bias assessment will be completed by one reviewer with verification of the assessment by a second reviewer. Results from included studies will be pooled via random effects meta-analysis when appropriate, or otherwise summarized narratively. DISCUSSION: Evidence generated from our living evidence synthesis will be used to inform policy making, modelling, and prioritization of future research on the effectiveness of COVID-19 vaccines against VOC.


Subject(s)
COVID-19 , Humans , COVID-19/prevention & control , COVID-19 Vaccines , SARS-CoV-2 , Vaccine Efficacy , Bias , Meta-Analysis as Topic
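The random effects pooling step named in the protocol above could look like the sketch below. The protocol does not say which estimator it will use; the DerSimonian-Laird estimator is assumed here as one common choice, and the function name and inputs are hypothetical. Effects would typically be on a log scale (for example log risk ratios), each with its within-study variance.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random effects pooling.

    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-weight each study by its total (within + between) variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Two hypothetical studies that disagree strongly (made-up numbers).
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.1, 0.1])
```

When the studies agree, `tau2` is truncated to zero and the result coincides with the fixed-effect estimate; heterogeneity widens the standard error rather than being ignored.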
12.
Infect Genet Evol ; 113: 105484, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37531976

ABSTRACT

OBJECTIVES: Clustering pathogen sequence data is a common practice in epidemiology to gain insights into the genetic diversity and evolutionary relationships among pathogens. It can reveal groups of cases with a shared transmission history and common origin, and it can identify transmission hotspots. Motivated by the experience of clustering SARS-CoV-2 cases using whole genome sequence data during the COVID-19 pandemic to aid with public health investigation, we investigated how differences in epidemiology and sampling can influence the composition of clusters that are identified. METHODS: We performed genomic clustering on simulated SARS-CoV-2 outbreaks produced with different transmission rates and levels of genomic diversity, along with varying the proportion of cases sampled. RESULTS: In single outbreaks with a low transmission rate, decreasing the sampling fraction resulted in multiple, separate clusters being identified where intermediate cases in transmission chains are missed. Outbreaks simulated with a high transmission rate were more robust to changes in the sampling fraction and largely resulted in a single cluster that included all sampled outbreak cases. When considering multiple outbreaks in a sampled jurisdiction seeded by different introductions, low genomic diversity between introduced cases caused outbreaks to be merged into large clusters. If the transmission rate, the sampling fraction, and the diversity between introductions were all low, a combination of the spurious break-up of outbreaks and the linking of closely related cases in different outbreaks resulted in clusters that may appear informative, but these did not reflect the true underlying population structure. Conversely, genomic clusters matched the true population structure when there was relatively high diversity between introductions and a high transmission rate.
CONCLUSION: Differences in epidemiology and sampling can impact our ability to identify genomic clusters that describe the underlying population structure. These findings can help to guide recommendations for the use of pathogen clustering in public health investigations.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Pandemics , Disease Outbreaks , Genomics , Cluster Analysis
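The break-up of a single outbreak under sparse sampling, reported in the abstract above, can be reproduced on a toy example. The abstract does not state which clustering algorithm was used; the sketch assumes single-linkage clustering with a SNP-distance threshold, and the sequences, case names, and threshold are invented for illustration.

```python
def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def threshold_clusters(seqs, max_dist):
    """Single-linkage clusters: link any two cases within max_dist SNPs."""
    names = list(seqs)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            if hamming(seqs[a], seqs[b]) <= max_dist:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


# Hypothetical transmission chain c1 -> c2 -> c3, one new SNP per step.
full = {"c1": "AAAAAA", "c2": "AAAAAT", "c3": "AAAATT"}
sampled = {name: seq for name, seq in full.items() if name != "c2"}

all_cases = threshold_clusters(full, max_dist=1)
missing_middle = threshold_clusters(sampled, max_dist=1)
```

With every case sampled, the chain forms one cluster; once the intermediate case `c2` is dropped, `c1` and `c3` are 2 SNPs apart and fall into separate clusters.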
14.
Nat Commun ; 14(1): 4830, 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37563113

ABSTRACT

Serial intervals - the time between symptom onset in infector and infectee - are a fundamental quantity in infectious disease control. However, their estimation requires knowledge of individuals' exposures, typically obtained through resource-intensive contact tracing efforts. We introduce an alternate framework using virus sequences to inform who infected whom and thereby estimate serial intervals. We apply our technique to SARS-CoV-2 sequences from case clusters in the first two COVID-19 waves in Victoria, Australia. We find that our approach offers high resolution, cluster-specific serial interval estimates that are comparable with those obtained from contact data, despite requiring no knowledge of who infected whom and relying on incompletely-sampled data. Compared to a published serial interval, cluster-specific serial intervals can vary estimates of the effective reproduction number by a factor of 2-3. We find that serial interval estimates in settings such as schools and meat processing/packing plants are shorter than those in healthcare facilities.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , SARS-CoV-2/genetics , Genomics , Contact Tracing , Victoria
15.
Genome Res ; 33(7): 1053-1060, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37217252

ABSTRACT

The reconstruction of phylogenetic networks is an important but challenging problem in phylogenetics and genome evolution, as the space of phylogenetic networks is vast and cannot be sampled well. One approach to the problem is to solve the minimum phylogenetic network problem, in which phylogenetic trees are first inferred, and then the smallest phylogenetic network that displays all the trees is computed. The approach takes advantage of the fact that the theory of phylogenetic trees is mature, and there are excellent tools available for inferring phylogenetic trees from a large number of biomolecular sequences. A tree-child network is a phylogenetic network satisfying the condition that every nonleaf node has at least one child that is of indegree one. Here, we develop a new method that infers the minimum tree-child network by aligning lineage taxon strings in the phylogenetic trees. This algorithmic innovation enables us to get around the limitations of the existing programs for phylogenetic network inference. Our new program, named ALTS, is fast enough to infer a tree-child network with a large number of reticulations for a set of up to 50 phylogenetic trees with 50 taxa that have only trivial common clusters in about a quarter of an hour on average.


Subject(s)
Algorithms , Genome , Humans , Phylogeny
16.
Microb Genom ; 9(3), 2023 Mar.
Article in English | MEDLINE | ID: mdl-36867086

ABSTRACT

In the management of infectious disease outbreaks, grouping cases into clusters and understanding their underlying epidemiology are fundamental tasks. In genomic epidemiology, clusters are typically identified either using pathogen sequences alone or with sequences in combination with epidemiological data such as location and time of collection. However, it may not be feasible to culture and sequence all pathogen isolates, so sequence data may not be available for all cases. This presents challenges for identifying clusters and understanding epidemiology, because these cases may be important for transmission. Demographic, clinical and location data are likely to be available for unsequenced cases, and comprise partial information about their clustering. Here, we use statistical modelling to assign unsequenced cases to clusters already identified by genomic methods, assuming that a more direct method of linking individuals, such as contact tracing, is not available. We build our model on pairwise similarity between cases to predict whether cases cluster together, in contrast to using individual case data to predict the cases' clusters. We then develop methods that allow us to determine whether a pair of unsequenced cases are likely to cluster together, to group them into their most probable clusters, to identify which are most likely to be members of a specific (known) cluster, and to estimate the true size of a known cluster given a set of unsequenced cases. We apply our method to tuberculosis data from Valencia, Spain. Among other applications, we find that clustering can be predicted successfully using spatial distance between cases and whether nationality is the same. We can identify the correct cluster for an unsequenced case, among 38 possible clusters, with an accuracy of approximately 35 %, higher than both direct multinomial regression (17 %) and random selection (< 5 %).


Subject(s)
Disease Outbreaks , Genomics , Humans , Cluster Analysis , Logistic Models
17.
Biometrics ; 79(4): 3650-3663, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36745619

ABSTRACT

Understanding factors that contribute to the increased likelihood of pathogen transmission between two individuals is important for infection control. However, analyzing measures of pathogen relatedness to estimate these associations is complicated due to correlation arising from the presence of the same individual across multiple dyadic outcomes, potential spatial correlation caused by unmeasured transmission dynamics, and the distinctive distributional characteristics of some of the outcomes. We develop two novel hierarchical Bayesian spatial methods for analyzing dyadic pathogen genetic relatedness data, in the form of patristic distances and transmission probabilities, that simultaneously address each of these complications. Using individual-level spatially correlated random effect parameters, we account for multiple sources of correlation between the outcomes as well as other important features of their distribution. Through simulation, we show the limitations of existing approaches in terms of estimating key associations of interest, and the ability of the new methodology to correct for these issues across datasets with different levels of correlation. All methods are applied to Mycobacterium tuberculosis data from the Republic of Moldova, where we identify previously unknown factors associated with disease transmission and, through analysis of the random effect parameters, key individuals, and areas with increased transmission activity. Model comparisons show the importance of the new methodology in this setting. The methods are implemented in the R package GenePair.


Subject(s)
Mycobacterium tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Bayes Theorem , Computer Simulation
18.
J Theor Biol ; 559: 111368, 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36436733

ABSTRACT

COVID-19 remains a major public health concern, with large resurgences even where there has been widespread uptake of vaccines. Waning immunity and the emergence of new variants will shape the long-term burden and dynamics of COVID-19. We explore the transition to the endemic state, and the endemic incidence in British Columbia (BC), Canada and South Africa (SA), to compare low and high vaccination coverage settings with differing public health policies, using a combination of modelling approaches. We compare reopening (relaxation of public health measures) gradually and rapidly as well as at different vaccination levels. We examine how the eventual endemic state depends on the duration of immunity, the rate of importations, the efficacy of vaccines and the transmissibility. These depend on the evolution of the virus, which continues to undergo selection. Slower reopening leads to a lower peak level of incidence and fewer overall infections in the wave following reopening: as much as a 60% lower peak and a 10% lower total in some illustrative simulations; under realistic parameters, reopening when 70% of the population is vaccinated leads to a large resurgence in cases. The long-term endemic behaviour may stabilize as late as January 2023, with further waves of high incidence occurring depending on the transmissibility of the prevalent variant, duration of immunity, and antigenic drift. We find that long term endemic levels are not necessarily lower than current pandemic levels: in a population of 100,000 with representative parameter settings (Reproduction number 5, 1-year duration of immunity, vaccine efficacy at 80% and importations at 3 cases per 100K per day) there are over 100 daily incident cases in the model. Predicted prevalence at endemicity has increased more than twofold after the emergence and spread of Omicron. The consequent burden on health care systems depends on the severity of infection in immunized or previously infected individuals.


Subject(s)
COVID-19 , Pandemics , Humans , Pandemics/prevention & control , COVID-19/epidemiology , COVID-19/prevention & control , Vaccination , Biological Transport , Public Health
19.
J Comput Biol ; 30(2): 189-203, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36374242

ABSTRACT

Genome-wide association studies (GWASs) are often confounded by population stratification and structure. Linear mixed models (LMMs) are a powerful class of methods for uncovering genetic effects, while controlling for such confounding. LMMs include random effects for a genetic similarity matrix, and they assume that a true genetic similarity matrix is known. However, uncertainty about the phylogenetic structure of a study population may degrade the quality of LMM results. This may happen in bacterial studies in which the number of samples or loci is small, or in studies with low-quality genotyping. In this study, we develop methods for linear mixed models in which the genetic similarity matrix is unknown and is derived from Markov chain Monte Carlo estimates of the phylogeny. We apply our model to a GWAS of multidrug resistance in tuberculosis, and illustrate our methods on simulated data.


Subject(s)
Genome-Wide Association Study , Genetic Models , Humans , Genome-Wide Association Study/methods , Phylogeny , Uncertainty , Linear Models , Single Nucleotide Polymorphism
20.
medRxiv ; 2023 Dec 29.
Article in English | MEDLINE | ID: mdl-38234741

ABSTRACT

Background: Because M. tuberculosis evolves slowly, transmission clusters often contain multiple individuals with identical consensus genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared M. tuberculosis variation could help overcome this problem. Previous studies have reported M. tuberculosis diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear. Methods: To evaluate the transmission information present in within-host M. tuberculosis variation, we re-analyzed publicly available sequence data from three household transmission studies, using household membership as a proxy for transmission linkage between donor-recipient pairs. Findings: We found moderate levels of minority variation present in M. tuberculosis sequence data from cultured isolates that varied significantly across studies (mean: 6, 7, and 170 minority variants above a 1% minor allele frequency threshold, outside of PE/PPE genes). Isolates from household members shared more minority variants than did isolates from unlinked individuals in the three studies (mean 98 shared minority variants vs. 10; 0.8 vs. 0.2, and 0.7 vs. 0.2, respectively). Shared within-host variation was significantly associated with household membership (OR: 1.51 [1.30,1.71], for one standard deviation increase in shared minority variants). Models that included shared within-host variation improved the accuracy of predicting household membership in all three studies as compared to models without within-host variation (AUC: 0.95 versus 0.92, 0.99 versus 0.95, and 0.93 versus 0.91). Interpretation: Within-host M. tuberculosis variation persists through culture and could enhance the resolution of transmission inferences. 
The substantial differences in minority variation recovered across studies highlight the need to optimize approaches to recover and incorporate within-host variation into automated phylogenetic and transmission inference. Funding: NIAID: 5K01AI173385.
