1.
mSystems; 8(4): e0005823, 2023 Aug 31.
Article in English | MEDLINE | ID: mdl-37314210

ABSTRACT

Having the ability to predict the protein-encoding gene content of an incomplete genome or metagenome-assembled genome is important for a variety of bioinformatic tasks. In this study, as a proof of concept, we built machine learning classifiers for predicting variable gene content in Escherichia coli genomes using only the nucleotide k-mers from a set of 100 conserved genes as features. Protein families were used to define orthologs, and a single classifier was built for predicting the presence or absence of each protein family occurring in 10%-90% of all E. coli genomes. The resulting set of 3,259 extreme gradient boosting classifiers had a per-genome average macro F1 score of 0.944 [0.943-0.945, 95% CI]. We show that the F1 scores are stable across multi-locus sequence types and that the trend can be recapitulated by sampling a smaller number of core genes or diverse input genomes. Surprisingly, the presence or absence of poorly annotated proteins, including "hypothetical proteins," was accurately predicted (F1 = 0.902 [0.898-0.906, 95% CI]). Models for proteins with horizontal gene transfer-related functions had slightly lower F1 scores but were still accurate (F1s = 0.895, 0.872, 0.824, and 0.841 for transposon, phage, plasmid, and antimicrobial resistance-related functions, respectively). Finally, using a holdout set of 419 diverse E. coli genomes that were isolated from freshwater environmental sources, we observed an average per-genome F1 score of 0.880 [0.876-0.883, 95% CI], demonstrating the extensibility of the models. Overall, this study provides a framework for predicting variable gene content using a limited amount of input sequence data.

IMPORTANCE Having the ability to predict the protein-encoding gene content of a genome is important for assessing genome quality, binning genomes from shotgun metagenomic assemblies, and assessing risk due to the presence of antimicrobial resistance and other virulence genes. In this study, we built a set of binary classifiers for predicting the presence or absence of variable genes occurring in 10%-90% of all publicly available E. coli genomes. Overall, the results show that a large portion of the E. coli variable gene content can be predicted with high accuracy, including genes with functions relating to horizontal gene transfer. This study offers a strategy for predicting gene content using limited input sequence data.
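
The two building blocks named in this abstract, nucleotide k-mer features and a per-genome macro F1 score, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the k-mer length, so `k` is a free parameter here, the function names are hypothetical, and "macro F1" is assumed to mean the unweighted mean of the F1 scores for the "present" and "absent" classes across one genome's protein-family predictions. The gradient boosting classifiers themselves are omitted.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers in a nucleotide sequence.

    k-mers containing characters other than A, C, G, T are skipped
    (an assumption; the abstract does not say how ambiguity is handled).
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def f1(y_true, y_pred, positive) -> float:
    """F1 score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of the F1 scores for classes 1 (present) and 0 (absent)."""
    return (f1(y_true, y_pred, 1) + f1(y_true, y_pred, 0)) / 2


# Toy example: one genome, four protein families (1 = present, 0 = absent).
features = kmer_counts("ACGTAC", k=2)
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Averaging `macro_f1` over all genomes would give a per-genome average of the kind the abstract reports, under the interpretation stated above.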


Subjects
Anti-Infective Agents, Escherichia coli Proteins, Escherichia coli/genetics, Bacterial Genome/genetics, Plasmids, Escherichia coli Proteins/genetics
2.
Trop Med Health; 49(1): 1, 2021 Jan 04.
Article in English | MEDLINE | ID: mdl-33397511

ABSTRACT

BACKGROUND: Lack of sustainable access to clean drinking water continues to be an issue of paramount global importance, leading to millions of preventable deaths annually. Best practices for providing sustainable access to clean drinking water, however, remain unclear. Widespread installation of low-cost, in-home, point-of-use water filtration systems is a promising strategy.

METHODS: We conducted a prospective, randomized, controlled trial in which 16 villages were selected and randomly assigned to one of four treatment arms based on the installation location of Sawyer® PointONE™ filters (filter in both home and school; filter in home only; filter in school only; control group). Water samples and self-reported information on diarrhea were collected at multiple times throughout the study.

RESULTS: Self-reported household prevalence of diarrhea decreased from 25.6% to 9.76% from installation to follow-up (at least 7 days, and up to 200 days post-filter installation). These declines were also observed in diarrhea with economic or educational consequences (diarrhea which led to medical treatment and/or missing school or work), with baseline prevalence of 9.64% declining to 1.57%. Decreases in diarrhea prevalence were observed across age groups. There was no evidence of a loss of efficacy of filters up to 200 days post-filter installation. Installation of filters in schools was not associated with decreases in diarrhea prevalence in school-aged children or family members. Unfiltered water samples both at schools and homes contained potential waterborne bacterial pathogens, dissolved heavy metals and metals associated with particulates. All dissolved metals were detected at levels below World Health Organization action guidelines.

CONCLUSIONS: This controlled trial provides strong evidence of the effectiveness of point-of-use, hollow fiber membrane filters at reducing diarrhea from bacterial sources up to 200 days post-installation when installed in homes. No statistically significant reduction in diarrhea was found when filters were installed in schools. Further research is needed to explore filter efficacy and utilization after 200 days post-installation.

TRIAL REGISTRATION: ClinicalTrials.gov, NCT03972618. Registered 3 June 2019 (retrospectively registered).
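
As a reading aid for the prevalence figures in this abstract, the following Python sketch computes the relative reduction implied by each baseline and follow-up pair. The relative reduction is a derived quantity that the abstract itself does not report, and the function name is hypothetical; the four input percentages are taken directly from the RESULTS text.

```python
def relative_reduction(baseline_pct: float, followup_pct: float) -> float:
    """Fractional decline from baseline to follow-up prevalence."""
    if baseline_pct <= 0:
        raise ValueError("baseline prevalence must be positive")
    return (baseline_pct - followup_pct) / baseline_pct


# Self-reported household prevalence of any diarrhea (percent).
any_diarrhea = relative_reduction(25.6, 9.76)

# Diarrhea with economic or educational consequences (percent).
consequential = relative_reduction(9.64, 1.57)

print(f"any diarrhea: {any_diarrhea:.3f}")
print(f"consequential diarrhea: {consequential:.3f}")
```

These are simple before-and-after ratios and say nothing about statistical significance or differences between trial arms.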
