Search | BVS España Search Portal

A genomic data resource for predicting antimicrobial resistance from laboratory-derived antimicrobial susceptibility phenotypes.

VanOeffelen, Margo; Nguyen, Marcus; Aytan-Aktug, Derya; Brettin, Thomas; Dietrich, Emily M; Kenyon, Ronald W; Machi, Dustin; Mao, Chunhong; Olson, Robert; Pusch, Gordon D; Shukla, Maulik; Stevens, Rick; Vonstein, Veronika; Warren, Andrew S; Wattam, Alice R; Yoo, Hyunseung; Davis, James J.

Brief Bioinform ; 22(6)2021 11 05.

Article in English | MEDLINE | ID: mdl-34379107

ABSTRACT

Antimicrobial resistance (AMR) is a major global health threat that affects millions of people each year. Funding agencies worldwide and the global research community have expended considerable capital and effort tracking the evolution and spread of AMR by isolating and sequencing bacterial strains and performing antimicrobial susceptibility testing (AST). For the last several years, we have been capturing these efforts by curating data from the literature and data resources and building a set of assembled bacterial genome sequences that are paired with laboratory-derived AST data. This collection currently contains AST data for over 67 000 genomes encompassing approximately 40 genera and over 100 species. In this paper, we describe the characteristics of this collection, highlighting areas where sampling is comparatively deep or shallow, and showing areas where attention is needed from the research community to improve sampling and tracking efforts. Beyond its use in tracking the evolution and spread of AMR, the collection also serves as a useful starting point for building machine learning models for predicting AMR phenotypes. We demonstrate this by describing two machine learning models that are built from the entire dataset to show where the predictive power is comparatively high or low. This AMR metadata collection is freely available and maintained on the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) FTP site: ftp://ftp.bvbrc.org/RELEASE_NOTES/PATRIC_genomes_AMR.txt.

Subject(s)

Computational Biology/methods; Databases, Genetic; Drug Resistance, Microbial; Genomics/methods; Microbial Sensitivity Tests; Artificial Intelligence; Bacteria/drug effects; Bacteria/genetics; Genome, Bacterial; Humans; Laboratories; Machine Learning; Phenotype

Can Machine Learning Models Predict Asparaginase-associated Pancreatitis in Childhood Acute Lymphoblastic Leukemia?

Nielsen, Rikke L; Wolthers, Benjamin O; Helenius, Marianne; Albertsen, Birgitte K; Clemmensen, Line; Nielsen, Kasper; Kanerva, Jukka; Niinimäki, Riitta; Frandsen, Thomas L; Attarbaschi, Andishe; Barzilai, Shlomit; Colombini, Antonella; Escherich, Gabriele; Aytan-Aktug, Derya; Liu, Hsi-Che; Möricke, Anja; Samarasinghe, Sujith; van der Sluis, Inge M; Stanulla, Martin; Tulstrup, Morten; Yadav, Rachita; Zapotocka, Ester; Schmiegelow, Kjeld; Gupta, Ramneek.

J Pediatr Hematol Oncol ; 44(3): e628-e636, 2022 04 01.

Article in English | MEDLINE | ID: mdl-35226426

ABSTRACT

Asparaginase-associated pancreatitis (AAP) frequently affects children treated for acute lymphoblastic leukemia (ALL), causing severe acute and persisting complications. Known risk factors such as asparaginase dosing, older age and single nucleotide polymorphisms (SNPs) have insufficient odds ratios to allow personalized asparaginase therapy. In this study, we explored machine learning strategies for prediction of individual AAP risk. We integrated information on age, sex, and SNPs based on Illumina Omni2.5exome-8 arrays of patients with childhood ALL (N=1564, of whom 244 had AAP; age 1.0 to 17.9 years) from 10 international ALL consortia into machine learning models including regression, random forest, AdaBoost and artificial neural networks. A model with only age and sex had an area under the receiver operating characteristic curve (ROC-AUC) of 0.62. Inclusion of 6 pancreatitis candidate gene SNPs or 4 validated pancreatitis SNPs raised the ROC-AUC modestly (0.67), while 30 SNPs, identified through our AAP genome-wide association study cohort, raised it further (0.80). The most predictive features included rs10273639 (PRSS1-PRSS2), rs10436957 (CTRC), rs13228878 (PRSS1/PRSS2), rs1505495 (GALNTL6), rs4655107 (EPHB2) and age (1 to 7 y). Second AAP following asparaginase re-exposure was predicted with a ROC-AUC of 0.65. The machine learning models assist individual-level risk assessment of AAP for future prevention trials, and may legitimize asparaginase re-exposure when AAP risk is predicted to be low.

Subject(s)

Antineoplastic Agents; Asparaginase; Pancreatitis; Precursor Cell Lymphoblastic Leukemia-Lymphoma; Antineoplastic Agents/adverse effects; Asparaginase/adverse effects; Child; Genome-Wide Association Study; Humans; Machine Learning; Pancreatitis/chemically induced; Pancreatitis/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
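
The ROC-AUC values quoted in the abstract above (0.62, 0.67, 0.80) can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that rank-based calculation, not the study's own code; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the pairwise (Mann-Whitney) identity.

    labels: iterable of 0/1 (1 = case, 0 = control)
    scores: iterable of predicted risk scores, same length as labels
    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not taken from the study).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 case/control pairs ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported figures should be interpreted.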

ResFinder - an open online resource for identification of antimicrobial resistance genes in next-generation sequencing data and prediction of phenotypes from genotypes.

Florensa, Alfred Ferrer; Kaas, Rolf Sommer; Clausen, Philip Thomas Lanken Conradsen; Aytan-Aktug, Derya; Aarestrup, Frank M.

Microb Genom ; 8(1)2022 01.

Article in English | MEDLINE | ID: mdl-35072601

ABSTRACT

Antimicrobial resistance (AMR) is one of the most important health threats globally. The ability to accurately identify resistant bacterial isolates and the individual antimicrobial resistance genes (ARGs) is essential for understanding the evolution and emergence of AMR and for providing appropriate treatment. The rapid developments in next-generation sequencing technologies have made this technology available to researchers and microbiologists at routine laboratories around the world. However, tools available for those with limited experience with bioinformatics are lacking, especially to enable researchers and microbiologists in low- and middle-income countries (LMICs) to perform their own studies. The CGE tools (Center for Genomic Epidemiology), including ResFinder (https://cge.cbs.dtu.dk/services/ResFinder/), were developed to provide freely available, easy-to-use online bioinformatic tools allowing inexperienced researchers and microbiologists to perform simple bioinformatic analyses. The main purpose was and is to provide these solutions for people involved in frontline diagnosis, especially in LMICs. Since its original publication in 2012, ResFinder has undergone a number of improvements including improvement of the code and databases, inclusion of point mutations for selected bacterial species and predictions of phenotypes also for selected species. As of 28 September 2021, 820 803 analyses have been performed using ResFinder from 61 776 IP addresses in 171 countries. ResFinder clearly fulfills a need for several people around the globe and we hope to be able to continue to provide this service free of charge in the future. We also hope and expect to provide further improvements including phenotypic predictions for additional bacterial species.

Subject(s)

Bacteria/genetics; Bacterial Proteins/genetics; Computational Biology/methods; Bacteria/drug effects; Databases, Genetic; Drug Resistance, Bacterial; Genome, Bacterial; High-Throughput Nucleotide Sequencing; Internet; Microbial Sensitivity Tests; Mutation; Phenotype; Sequence Analysis, DNA; Software

PlasmidHostFinder: Prediction of Plasmid Hosts Using Random Forest.

Aytan-Aktug, Derya; Clausen, Philip T L C; Szarvas, Judit; Munk, Patrick; Otani, Saria; Nguyen, Marcus; Davis, James J; Lund, Ole; Aarestrup, Frank M.

mSystems ; 7(2): e0118021, 2022 04 26.

Article in English | MEDLINE | ID: mdl-35382558

ABSTRACT

Plasmids play a major role in facilitating the spread of antimicrobial resistance between bacteria. Understanding the host range and dissemination trajectories of plasmids is critical for surveillance and prevention of antimicrobial resistance. Identification of plasmid host ranges could be improved using automated pattern detection methods compared to homology-based methods due to the diversity and genetic plasticity of plasmids. In this study, we developed a method for predicting the host range of plasmids using machine learning, specifically random forests. We trained the models with 8,519 plasmids from 359 different bacterial species per taxonomic level; the models achieved Matthews correlation coefficients of 0.662 and 0.867 at the species and order levels, respectively. Our results suggest that despite the diverse nature and genetic plasticity of plasmids, our random forest model can accurately distinguish between plasmid hosts. This tool is available online through the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/PlasmidHostFinder/). IMPORTANCE Antimicrobial resistance is a global health threat to humans and animals, causing high mortality and morbidity while effectively ending decades of success in fighting against bacterial infections. Plasmids confer extra genetic capabilities to the host organisms through accessory genes that can encode antimicrobial resistance and virulence. In addition to vertical inheritance, plasmids can be transferred horizontally between bacterial taxa. Therefore, detection of the host range of plasmids is crucial for understanding and predicting the dissemination trajectories of extrachromosomal genes and bacterial evolution as well as taking effective countermeasures against antimicrobial resistance.

Subject(s)

Anti-Infective Agents; Random Forest; Animals; Humans; Plasmids; Bacteria/genetics; Genomics
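
The Matthews correlation coefficients reported in the PlasmidHostFinder abstract above (0.662 and 0.867) summarize a confusion matrix in a single number between -1 and 1. The sketch below shows the standard binary form of the coefficient for a single host label; how the authors aggregated it across many host taxa is not stated in the abstract, so this is only an illustration. The function name and the counts are hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one host label (not taken from the study).
example_mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
```

Unlike plain accuracy, the coefficient stays near zero for a classifier that always predicts the majority class, which is why it suits unbalanced host labels.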

SourceFinder: a Machine-Learning-Based Tool for Identification of Chromosomal, Plasmid, and Bacteriophage Sequences from Assemblies.

Aytan-Aktug, Derya; Grigorjev, Vladislav; Szarvas, Judit; Clausen, Philip T L C; Munk, Patrick; Nguyen, Marcus; Davis, James J; Aarestrup, Frank M; Lund, Ole.

Microbiol Spectr ; 10(6): e0264122, 2022 12 21.

Article in English | MEDLINE | ID: mdl-36377945

ABSTRACT

High-throughput genome sequencing technologies enable the investigation of complex genetic interactions, including the horizontal gene transfer of plasmids and bacteriophages. However, identifying these elements from assembled reads remains challenging due to genome sequence plasticity and the difficulty in assembling complete sequences. In this study, we developed a classifier, using random forest, to identify whether sequences originated from bacterial chromosomes, plasmids, or bacteriophages. The classifier was trained on a diverse collection of 23,211 chromosomal, plasmid, and bacteriophage sequences from hundreds of bacterial species. In order to adapt the classifier to incomplete sequences, each complete sequence was subsampled into 5,000 nucleotide fragments and further subdivided into k-mers. This three-class classifier succeeded in identifying chromosomes, plasmids, and bacteriophages using k-mer distributions of complete and partial genome sequences, including simulated metagenomic scaffolds, with a minimum performance of 0.939 area under the receiver operating characteristic curve (AUC). This classifier, implemented as SourceFinder, has been made available as an online web service to help the community with predicting the chromosomal, plasmid, and bacteriophage sources of assembled bacterial sequence data (https://cge.food.dtu.dk/services/SourceFinder/). IMPORTANCE Extra-chromosomal genes encoding antimicrobial resistance, metal resistance, and virulence provide selective advantages for bacterial survival under stress conditions and pose serious threats to human and animal health. These accessory genes can impact the composition of microbiomes by providing selective advantages to their hosts. Accurately identifying extra-chromosomal elements in genome sequence data is critical for understanding gene dissemination trajectories and taking preventative measures. Therefore, in this study, we developed a random forest classifier for identifying the source of bacterial chromosomal, plasmid, and bacteriophage sequences.

Subject(s)

Bacteriophages; Genome, Bacterial; Humans; Bacteriophages/genetics; Plasmids/genetics; Chromosomes, Bacterial/genetics; Machine Learning
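
The preprocessing described in the SourceFinder abstract above, cutting sequences into 5,000-nucleotide fragments and then into k-mers, can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: the abstract does not give the value of k or say how fragments were sampled, so the non-overlapping windows, the default `k=4`, and the function names `split_into_fragments` and `kmer_counts` are all hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def split_into_fragments(sequence, size=5000):
    """Cut a sequence into consecutive, non-overlapping fragments.

    The final fragment may be shorter than `size` (assumption: it is kept).
    """
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def kmer_counts(fragment, k=4):
    """Count overlapping k-mers, skipping any that contain a non-ACGT base."""
    counts = Counter()
    for i in range(len(fragment) - k + 1):
        kmer = fragment[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


# Hypothetical toy sequence, far shorter than a real contig.
toy_fragments = split_into_fragments("ACGT" * 3, size=5)
toy_counts = kmer_counts("ACGTACGT", k=4)
```

Each fragment's k-mer counts would then form one feature vector for a classifier such as a random forest; normalizing counts to frequencies is a common further step that the abstract does not detail.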

A roadmap for the generation of benchmarking resources for antimicrobial resistance detection using next generation sequencing.

Petrillo, Mauro; Fabbri, Marco; Kagkli, Dafni Maria; Querci, Maddalena; Van den Eede, Guy; Alm, Erik; Aytan-Aktug, Derya; Capella-Gutierrez, Salvador; Carrillo, Catherine; Cestaro, Alessandro; Chan, Kok-Gan; Coque, Teresa; Endrullat, Christoph; Gut, Ivo; Hammer, Paul; Kay, Gemma L; Madec, Jean-Yves; Mather, Alison E; McHardy, Alice Carolyn; Naas, Thierry; Paracchini, Valentina; Peter, Silke; Pightling, Arthur; Raffael, Barbara; Rossen, John; Ruppé, Etienne; Schlaberg, Robert; Vanneste, Kevin; Weber, Lukas M; Westh, Henrik; Angers-Loustau, Alexandre.

F1000Res ; 10: 80, 2021.

Article in English | MEDLINE | ID: mdl-35847383

ABSTRACT

Next Generation Sequencing technologies significantly impact the field of Antimicrobial Resistance (AMR) detection and monitoring, with immediate uses in diagnosis and risk assessment. For this application and in general, considerable challenges remain in demonstrating sufficient trust to act upon the meaningful information produced from raw data, partly because of the reliance on bioinformatics pipelines, which can produce different results and therefore lead to different interpretations. With the constant evolution of the field, it is difficult to identify, harmonise and recommend specific methods for large-scale implementations over time. In this article, we propose to address this challenge through establishing a transparent, performance-based, evaluation approach to provide flexibility in the bioinformatics tools of choice, while demonstrating proficiency in meeting common performance standards. The approach is two-fold: first, a community-driven effort to establish and maintain "live" (dynamic) benchmarking platforms to provide relevant performance metrics, based on different use-cases, that would evolve together with the AMR field; second, agreed and defined datasets to allow the pipelines' implementation, validation, and quality-control over time. Following previous discussions on the main challenges linked to this approach, we provide concrete recommendations and future steps, related to different aspects of the design of benchmarks, such as the selection and the characteristics of the datasets (quality, choice of pathogens and resistances, etc.), the evaluation criteria of the pipelines, and the way these resources should be deployed in the community.

Subject(s)

Benchmarking; High-Throughput Nucleotide Sequencing; Anti-Bacterial Agents/pharmacology; Computational Biology/methods; Drug Resistance, Bacterial/genetics; High-Throughput Nucleotide Sequencing/methods

Understanding and predicting ciprofloxacin minimum inhibitory concentration in Escherichia coli with machine learning.

Pataki, Bálint Ármin; Matamoros, Sébastien; van der Putten, Boas C L; Remondini, Daniel; Giampieri, Enrico; Aytan-Aktug, Derya; Hendriksen, Rene S; Lund, Ole; Csabai, István; Schultsz, Constance.

Sci Rep ; 10(1): 15026, 2020 09 14.

Article in English | MEDLINE | ID: mdl-32929164

ABSTRACT

It is important that antibiotic prescriptions are based on antimicrobial susceptibility data to ensure effective treatment outcomes. With the increasing availability of next-generation sequencing, bacterial whole genome sequencing (WGS) can facilitate a more reliable and faster alternative to traditional phenotyping for the detection and surveillance of AMR. This work proposes a machine learning approach that can predict the minimum inhibitory concentration (MIC) for a given antibiotic, here ciprofloxacin, on the basis of both genome-wide mutation profiles and profiles of acquired antimicrobial resistance genes. We analysed 704 Escherichia coli genomes combined with their respective MIC measurements for ciprofloxacin originating from different countries. The four most important predictors found by the model, mutations in gyrA residues Ser83 and Asp87, a mutation in parC residue Ser80 and presence of the qnrS1 gene, have been experimentally validated before. Using only these four predictors in a linear regression model, 65% and 93% of the test samples' MIC were correctly predicted within a two- and a four-fold dilution range, respectively. The presented work does not treat machine learning merely as a black box, but also identifies the genomic features that determine susceptibility. The recent progress in WGS technology in combination with machine learning analysis approaches indicates that in the near future WGS of bacteria might become cheaper and faster than a MIC measurement.

Subject(s)

Anti-Bacterial Agents/toxicity; Ciprofloxacin/toxicity; Drug Resistance, Bacterial; Genes, Bacterial; Machine Learning; DNA Gyrase/genetics; Escherichia coli/drug effects; Escherichia coli/genetics; Escherichia coli Proteins/genetics; Inhibitory Concentration 50; Intracellular Signaling Peptides and Proteins/genetics; Mutation; Toxicity Tests/methods
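
The dilution-range accuracy quoted in the ciprofloxacin abstract above (65% within two-fold, 93% within four-fold) can be illustrated with a small sketch. It assumes that "within a two-fold dilution range" means the predicted MIC lies between half and twice the measured MIC, which is one common reading and is not confirmed by the abstract. The function name `fraction_within_fold` and the toy MIC values are hypothetical.

```python
import math


def fraction_within_fold(measured, predicted, fold):
    """Fraction of predictions within `fold`-fold of the measured MIC.

    Comparison is done on a log2 scale, since MICs are read from
    doubling-dilution series. All MIC values must be positive.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")

    limit = math.log2(fold)
    tolerance = 1e-9  # guards against floating-point rounding at the boundary
    hits = sum(
        abs(math.log2(p) - math.log2(m)) <= limit + tolerance
        for m, p in zip(measured, predicted)
    )
    return hits / len(measured)


# Hypothetical MICs in mg/L (not taken from the study).
toy_measured = [0.25, 0.5, 1.0, 8.0]
toy_predicted = [0.25, 1.0, 4.0, 1.0]
within_two = fraction_within_fold(toy_measured, toy_predicted, fold=2)
within_four = fraction_within_fold(toy_measured, toy_predicted, fold=4)
```

Working on the log2 scale treats an over-prediction and an under-prediction by the same number of dilution steps as equally wrong.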

Data integration for prediction of weight loss in randomized controlled dietary trials.

Nielsen, Rikke Linnemann; Helenius, Marianne; Garcia, Sara L; Roager, Henrik M; Aytan-Aktug, Derya; Hansen, Lea Benedicte Skov; Lind, Mads Vendelbo; Vogt, Josef K; Dalgaard, Marlene Danner; Bahl, Martin I; Jensen, Cecilia Bang; Muktupavela, Rasa; Warinner, Christina; Aaskov, Vincent; Gøbel, Rikke; Kristensen, Mette; Frøkiær, Hanne; Sparholt, Morten H; Christensen, Anders F; Vestergaard, Henrik; Hansen, Torben; Kristiansen, Karsten; Brix, Susanne; Petersen, Thomas Nordahl; Lauritzen, Lotte; Licht, Tine Rask; Pedersen, Oluf; Gupta, Ramneek.

Sci Rep ; 10(1): 20103, 2020 11 18.

Article in English | MEDLINE | ID: mdl-33208769

ABSTRACT

Diet is an important component in weight management strategies, but heterogeneous responses to the same diet make it difficult to foresee individual weight-loss outcomes. Omics-based technologies now allow for analysis of multiple factors for weight loss prediction at the individual level. Here, we classify weight loss responders (N = 106) and non-responders (N = 97) of overweight non-diabetic middle-aged Danes to two earlier reported dietary trials over 8 weeks. Random forest models integrated gut microbiome, host genetics, urine metabolome, measures of physiology and anthropometrics measured prior to any dietary intervention to identify individual predisposing features of weight loss in combination with diet. The most predictive models for weight loss included features of diet, gut bacterial species and urine metabolites (ROC-AUC: 0.84-0.88) compared to a diet-only model (ROC-AUC: 0.62). A model ensemble integrating multi-omics identified 64% of the non-responders with 80% confidence. Such models will be useful to assist in selecting appropriate weight management strategies, as individual predisposition to diet response varies.

Subject(s)

Diet Therapy/methods; Gastrointestinal Microbiome; Weight Loss; Biomarkers/blood; Biomarkers/urine; Female; Genome-Wide Association Study; Humans; Machine Learning; Male; Postprandial Period; ROC Curve; Randomized Controlled Trials as Topic; Reproducibility of Results; Treatment Outcome; Whole Grains
