Results 1 - 20 of 188
2.
Nucleic Acids Res ; 49(D1): D319-D324, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33166383

ABSTRACT

The majority of naturally occurring proteins have evolved to function under mild conditions inside living organisms. One of the critical obstacles for the use of proteins in biotechnological applications is their insufficient stability at elevated temperatures or in the presence of salts. Since experimental screening for stabilizing mutations is typically laborious and expensive, in silico predictors are often used for narrowing down the mutational landscape. The recent advances in machine learning and artificial intelligence further facilitate the development of such computational tools. However, the accuracy of these predictors strongly depends on the quality and amount of data used for training and testing, which have often been reported as the current bottleneck of the approach. To address this problem, we present a novel database of experimental thermostability data for single-point mutants, FireProtDB. The database combines the published datasets, data extracted manually from the recent literature, and the data collected in our laboratory. Its user interface is designed to facilitate both types of expected use: (i) the interactive exploration of individual entries on the level of a protein or mutation and (ii) the construction of highly customized and machine learning-friendly datasets using advanced searching and filtering. The database is freely available at https://loschmidt.chemi.muni.cz/fireprotdb.


Subject(s)
Computational Biology/methods , Databases, Protein , Machine Learning/statistics & numerical data , Point Mutation , Proteins/chemistry , Datasets as Topic , Internet , Models, Molecular , Molecular Sequence Annotation , Protein Stability , Proteins/genetics , Software
3.
Nucleic Acids Res ; 49(D1): D1334-D1346, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156327

ABSTRACT

In 2014, the National Institutes of Health (NIH) initiated the Illuminating the Druggable Genome (IDG) program to identify and improve our understanding of poorly characterized proteins that can potentially be modulated using small molecules or biologics. Two resources produced from these efforts are: The Target Central Resource Database (TCRD) (http://juniper.health.unm.edu/tcrd/) and Pharos (https://pharos.nih.gov/), a web interface to browse the TCRD. The ultimate goal of these resources is to highlight and facilitate research into currently understudied proteins, by aggregating a multitude of data sources, and ranking targets based on the amount of data available, and presenting data in machine learning ready format. Since the 2017 release, both TCRD and Pharos have produced two major releases, which have incorporated or expanded an additional 25 data sources. Recently incorporated data types include human and viral-human protein-protein interactions, protein-disease and protein-phenotype associations, and drug-induced gene signatures, among others. These aggregated data have enabled us to generate new visualizations and content sections in Pharos, in order to empower users to find new areas of study in the druggable genome.


Subject(s)
Databases, Factual , Genome, Human , Neurodegenerative Diseases/genetics , Proteomics/methods , Software , Virus Diseases/genetics , Animals , Anticonvulsants/chemistry , Anticonvulsants/therapeutic use , Antiviral Agents/chemistry , Antiviral Agents/therapeutic use , Biological Products/chemistry , Biological Products/therapeutic use , Data Mining/statistics & numerical data , Host-Pathogen Interactions/drug effects , Host-Pathogen Interactions/genetics , Humans , Internet , Machine Learning/statistics & numerical data , Mice , Mice, Knockout , Molecular Targeted Therapy/methods , Neurodegenerative Diseases/classification , Neurodegenerative Diseases/drug therapy , Neurodegenerative Diseases/virology , Protein Interaction Mapping , Proteome/agonists , Proteome/antagonists & inhibitors , Proteome/genetics , Proteome/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/therapeutic use , Virus Diseases/classification , Virus Diseases/drug therapy , Virus Diseases/virology
4.
J Hepatol ; 76(3): 600-607, 2022 03.
Article in English | MEDLINE | ID: mdl-34793867

ABSTRACT

BACKGROUND & AIMS: Saliva and stool microbiota are altered in cirrhosis. Since stool is logistically difficult to collect compared to saliva, it is important to determine their relative diagnostic and prognostic capabilities. We aimed to determine the ability of stool vs. saliva microbiota to differentiate between groups based on disease severity using machine learning (ML). METHODS: Controls and outpatients with cirrhosis underwent saliva and stool microbiome analysis. Controls vs. cirrhosis and within cirrhosis (based on hepatic encephalopathy [HE], proton pump inhibitor [PPI] and rifaximin use) were classified using 4 ML techniques (random forest [RF], support vector machine, logistic regression, and gradient boosting) with AUC comparisons for stool, saliva or both sample types. Individual microbial contributions were computed using feature importance of RF and Shapley additive explanations. Finally, thresholds for including microbiota were varied between 2.5% and 10%, and core microbiome (DESeq2) analysis was performed. RESULTS: Two hundred and sixty-nine participants, including 87 controls and 182 patients with cirrhosis, of whom 57 had HE, 78 were on PPIs and 29 on rifaximin were included. Regardless of the ML model, stool microbiota had a significantly higher AUC in differentiating groups vs. saliva. Regarding individual microbiota: autochthonous taxa drove the difference between controls vs. patients with cirrhosis, oral-origin microbiota the difference between PPI users/non-users, and pathobionts and autochthonous taxa the difference between rifaximin users/non-users and patients with/without HE. These were consistent with the core microbiome analysis results. CONCLUSIONS: On ML analysis, stool microbiota composition is significantly more informative in differentiating between controls and patients with cirrhosis, and those with varying cirrhosis severity, compared to saliva. 
Despite logistic challenges, stool should be preferred over saliva for microbiome analysis. LAY SUMMARY: Since it is harder to collect stool than saliva, we wanted to test whether microbes from saliva were better than stool in differentiating between healthy people and those with cirrhosis and, among those with cirrhosis, those with more severe disease. Using machine learning, we found that microbes in stool were more accurate than saliva alone or in combination, therefore, stool should be preferred for analysis and collection wherever possible.


Subject(s)
Feces/microbiology , Hepatic Encephalopathy/diagnosis , Liver Cirrhosis/diagnosis , Mass Screening/standards , Saliva/microbiology , Aged , Female , Hepatic Encephalopathy/physiopathology , Humans , Liver Cirrhosis/physiopathology , Machine Learning/standards , Machine Learning/statistics & numerical data , Male , Mass Screening/methods , Mass Screening/statistics & numerical data , Microbiota/physiology , Middle Aged , Prognosis
5.
Crit Care Med ; 50(2): e162-e172, 2022 02 01.
Article in English | MEDLINE | ID: mdl-34406171

ABSTRACT

OBJECTIVES: Prognostication of neurologic status among survivors of in-hospital cardiac arrests remains a challenging task for physicians. Although models such as the Cardiac Arrest Survival Post-Resuscitation In-hospital score are useful for predicting neurologic outcomes, they were developed using traditional statistical techniques. In this study, we derive and compare the performance of several machine learning models with each other and with the Cardiac Arrest Survival Post-Resuscitation In-hospital score for predicting the likelihood of favorable neurologic outcomes among survivors of resuscitation. DESIGN: Analysis of the Get With The Guidelines-Resuscitation registry. SETTING: Seven-hundred fifty-five hospitals participating in Get With The Guidelines-Resuscitation from January 1, 2001, to January 28, 2017. PATIENTS: Adult in-hospital cardiac arrest survivors. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Of 117,674 patients in our cohort, 28,409 (24%) had a favorable neurologic outcome, defined as survival with a Cerebral Performance Category score of less than or equal to 2 at discharge. Using patient characteristics, pre-existing conditions, prearrest interventions, and periarrest variables, we constructed logistic regression, support vector machines, random forests, gradient boosted machines, and neural network machine learning models to predict favorable neurologic outcome. Events prior to October 20, 2009, were used for model derivation, and all subsequent events were used for validation. The gradient boosted machine predicted favorable neurologic status at discharge significantly better than the Cardiac Arrest Survival Post-Resuscitation In-hospital score (C-statistic: 0.81 vs 0.73; p < 0.001) and outperformed all other machine learning models in terms of discrimination, calibration, and accuracy measures. 
Variables that were consistently most important for prediction across all models were duration of arrest, initial cardiac arrest rhythm, admission Cerebral Performance Category score, and age. CONCLUSIONS: The gradient boosted machine algorithm was the most accurate for predicting favorable neurologic outcomes in in-hospital cardiac arrest survivors. Our results highlight the utility of machine learning for predicting neurologic outcomes in resuscitated patients.


Subject(s)
Forecasting/methods , Heart Arrest/complications , Machine Learning/standards , Outcome Assessment, Health Care/statistics & numerical data , Aged , Area Under Curve , Cohort Studies , Female , Heart Arrest/epidemiology , Heart Arrest/mortality , Humans , Machine Learning/statistics & numerical data , Male , Middle Aged , Outcome Assessment, Health Care/methods , Prognosis , ROC Curve , Survivors/statistics & numerical data
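
Note: the abstract of this record describes a temporal split, with events before October 20, 2009 used for derivation and later events used for validation. A minimal sketch of such a split follows. The function name `temporal_split` and the toy events are hypothetical, and placing an event dated exactly on the cutoff into the validation set is an assumption, since the abstract does not say how that day was handled.

```python
from datetime import date

# Cutoff date taken from the abstract.
CUTOFF = date(2009, 10, 20)

def temporal_split(events, cutoff=CUTOFF):
    """Split (event_date, record_id) pairs into derivation and validation sets.

    Assumption: an event dated exactly on the cutoff goes to validation.
    """
    derivation = [e for e in events if e[0] < cutoff]
    validation = [e for e in events if e[0] >= cutoff]
    return derivation, validation

# Hypothetical toy events; the registry data are not reproduced here.
events = [
    (date(2001, 1, 1), "a"),
    (date(2009, 10, 19), "b"),
    (date(2009, 10, 20), "c"),
    (date(2017, 1, 28), "d"),
]
derivation, validation = temporal_split(events)
```

Splitting by date rather than at random keeps every validation event later than every derivation event, so the validation estimate reflects performance on future patients.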
6.
Phys Chem Chem Phys ; 24(3): 1326-1337, 2022 Jan 19.
Article in English | MEDLINE | ID: mdl-34718360

ABSTRACT

We combined our generalized energy-based fragmentation (GEBF) approach and machine learning (ML) technique to construct quantum mechanics (QM) quality force fields for proteins. In our scheme, the training sets for a protein are only constructed from its small subsystems, which capture all short-range interactions in the target system. The energy of a given protein is expressed as the summation of atomic contributions from QM calculations of various subsystems, corrected by long-range Coulomb and van der Waals interactions. With the Gaussian approximation potential (GAP) method, our protocol can automatically generate training sets with high efficiency. To facilitate the construction of training sets for proteins, we store all trained subsystem data in a library. If subsystems in the library are detected in a new protein, corresponding datasets can be directly reused as a part of the training set on this new protein. With two polypeptides, 4ZNN and 1XQ8 segment, as examples, the energies and forces predicted by GEBF-GAP are in good agreement with those from conventional QM calculations, and dihedral angle distributions from GEBF-GAP molecular dynamics (MD) simulations can also well reproduce those from ab initio MD simulations. In addition, with the training set generated from GEBF-GAP, we also demonstrate that GEBF-ML force fields constructed by neural network (NN) methods can also show QM quality. Therefore, the present work provides an efficient and systematic way to build QM quality force fields for biological systems.


Subject(s)
Peptide Fragments/chemistry , alpha-Synuclein/chemistry , Databases, Chemical , Datasets as Topic , Humans , Machine Learning/statistics & numerical data , Molecular Dynamics Simulation/statistics & numerical data , Quantum Theory , Thermodynamics
7.
Am J Epidemiol ; 190(9): 1830-1840, 2021 09 01.
Article in English | MEDLINE | ID: mdl-33517416

ABSTRACT

Although variables are often measured with error, the impact of measurement error on machine-learning predictions is seldom quantified. The purpose of this study was to assess the impact of measurement error on the performance of random-forest models and variable importance. First, we assessed the impact of misclassification (i.e., measurement error of categorical variables) of predictors on random-forest model performance (e.g., accuracy, sensitivity) and variable importance (mean decrease in accuracy) using data from the National Comorbidity Survey Replication (2001-2003). Second, we created simulated data sets in which we knew the true model performance and variable importance measures and could verify that quantitative bias analysis was recovering the truth in misclassified versions of the data sets. Our findings showed that measurement error in the data used to construct random forests can distort model performance and variable importance measures and that bias analysis can recover the correct results. This study highlights the utility of applying quantitative bias analysis in machine learning to quantify the impact of measurement error on study results.


Subject(s)
Bias , Scientific Experimental Error/statistics & numerical data , Computer Simulation , Datasets as Topic , Humans , Machine Learning/statistics & numerical data , Probability , Suicide, Attempted/statistics & numerical data
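
Note: the abstract of this record reports that misclassification of predictors can distort model performance. The toy simulation below illustrates that effect only. It is not the authors' analysis: a single binary predictor and a trivial rule (predict the outcome equal to the predictor) stand in for the random forest, and the sensitivity, specificity, agreement rate, and sample size are all assumed values.

```python
import random

def simulate(n=10000, sensitivity=0.8, specificity=0.8, seed=0):
    """Compare accuracy of a trivial rule using the true vs. the misclassified predictor.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    correct_true = 0
    correct_observed = 0
    for _ in range(n):
        x_true = rng.random() < 0.5
        # The outcome agrees with the true predictor 90% of the time.
        y = x_true if rng.random() < 0.9 else (not x_true)
        # The observed predictor is a misclassified version of the true one.
        if x_true:
            x_obs = rng.random() < sensitivity
        else:
            x_obs = not (rng.random() < specificity)
        correct_true += int(x_true == y)
        correct_observed += int(x_obs == y)
    return correct_true / n, correct_observed / n

acc_true, acc_observed = simulate()
```

Under these assumptions the expected accuracy falls from about 0.90 with the true predictor to about 0.74 with the misclassified one (0.9 × 0.8 + 0.1 × 0.2).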
8.
Br J Haematol ; 193(1): 171-175, 2021 04.
Article in English | MEDLINE | ID: mdl-33620089

ABSTRACT

Disease relapse is the greatest cause of treatment failure in paediatric B-cell acute lymphoblastic leukaemia (B-ALL). Current risk stratifications fail to capture all patients at risk of relapse. Herein, we used a machine-learning approach to identify B-ALL blast-secreted factors that are associated with poor survival outcomes. Using this approach, we identified a two-gene expression signature (CKLF and IL1B) that allowed identification of high-risk patients at diagnosis. This two-gene expression signature enhances the predictive value of current at diagnosis or end-of-induction risk stratification suggesting the model can be applied continuously to help guide implementation of risk-adapted therapies.


Subject(s)
Chemokines/genetics , Interleukin-1beta/genetics , MARVEL Domain-Containing Proteins/genetics , Machine Learning/statistics & numerical data , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Acute Disease , Adolescent , Child , Child, Preschool , Female , Humans , Infant , Male , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/mortality , Predictive Value of Tests , Recurrence , Risk Assessment/standards , Survival Analysis , Transcriptome/genetics , Treatment Failure
9.
Crit Care Med ; 49(12): e1212-e1222, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34374503

ABSTRACT

OBJECTIVES: Prognostication of outcome is an essential step in defining therapeutic goals after cardiac arrest. Gray-white-matter ratio obtained from brain CT can predict poor outcome. However, manual placement of regions of interest is a potential source of error and interrater variability. Our objective was to assess the performance of poor outcome prediction by automated quantification of changes in brain CTs after cardiac arrest. DESIGN: Observational, derivation/validation cohort study design. Outcome was determined using the Cerebral Performance Category upon hospital discharge. Poor outcome was defined as death or unresponsive wakefulness syndrome/coma. CTs were automatically decomposed using coregistration with a brain atlas. SETTING: ICUs at a large, academic hospital with circulatory arrest center. PATIENTS: We identified 433 cardiac arrest patients from a large previously established database with brain CTs within 10 days after cardiac arrest. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Five hundred sixteen brain CTs were evaluated (derivation cohort n = 309, validation cohort n = 207). Patients with poor outcome had significantly lower radiodensities in gray matter regions. Automated GWR_si (putamen/posterior limb of internal capsule) was performed with an area under the curve of 0.86 (95%-CI: 0.80-0.93) for CTs taken later than 24 hours after cardiac arrest (similar performance in the validation cohort). Poor outcome (Cerebral Performance Category 4-5) was predicted with a specificity of 100% (95% CI, 87-100%, derivation; 88-100%, validation) at a threshold of less than 1.10 and a sensitivity of 49% (95% CI, 36-58%, derivation) and 38% (95% CI, 27-50%, validation) for CTs later than 24 hours after cardiac arrest. Sensitivity and area under the curve were lower for CTs performed within 24 hours after cardiac arrest. 
CONCLUSIONS: Automated gray-white-matter ratio from brain CT is a promising tool for prediction of poor neurologic outcome after cardiac arrest with high specificity and low-to-moderate sensitivity. Prediction by gray-white-matter ratio at the basal ganglia level performed best. Sensitivity increased considerably for CTs performed later than 24 hours after cardiac arrest.


Subject(s)
Brain/diagnostic imaging , Heart Arrest/complications , Machine Learning/standards , Tomography, X-Ray Computed/instrumentation , Aged , Cohort Studies , Female , Heart Arrest/diagnostic imaging , Humans , Machine Learning/statistics & numerical data , Male , Middle Aged , ROC Curve , Tomography, X-Ray Computed/methods , Validation Studies as Topic
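
Note: the abstract of this record describes a gray-white-matter ratio (putamen over posterior limb of internal capsule) and a threshold of less than 1.10 for predicting poor outcome. The decision rule can be sketched as below. The function names and the radiodensity values are hypothetical, and the automated atlas-based segmentation that produces the regional values is not shown.

```python
# Threshold reported in the abstract for CTs taken later than 24 hours after arrest.
GWR_THRESHOLD = 1.10

def gray_white_ratio(putamen_hu, plic_hu):
    """Ratio of putamen to posterior-limb-of-internal-capsule radiodensity (Hounsfield units)."""
    return putamen_hu / plic_hu

def predicts_poor_outcome(putamen_hu, plic_hu, threshold=GWR_THRESHOLD):
    """Flag a scan when the ratio falls below the threshold."""
    return gray_white_ratio(putamen_hu, plic_hu) < threshold

# Hypothetical radiodensities, for illustration only.
flag_a = predicts_poor_outcome(36.0, 30.0)  # ratio 1.20, not flagged
flag_b = predicts_poor_outcome(31.0, 30.0)  # ratio about 1.03, flagged
```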
10.
J Cutan Pathol ; 48(8): 1061-1068, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33421167

ABSTRACT

Artificial intelligence (AI) utilizes computer algorithms to carry out tasks with human-like intelligence. Convolutional neural networks, a type of deep learning AI, can classify basal cell carcinoma, seborrheic keratosis, and conventional nevi, highlighting the potential for deep learning algorithms to improve diagnostic workflow in dermatopathology of highly routine diagnoses. Additionally, convolutional neural networks can support the diagnosis of melanoma and may help predict disease outcomes. Capabilities of machine learning in dermatopathology can extend beyond clinical diagnosis to education and research. Intelligent tutoring systems can teach visual diagnoses in inflammatory dermatoses, with measurable cognitive effects on learners. Natural language interfaces can instruct dermatopathology trainees to produce diagnostic reports that capture relevant detail for diagnosis in compliance with guidelines. Furthermore, deep learning can power computation- and population-based research. However, there are many limitations of deep learning that need to be addressed before broad incorporation into clinical practice. The current potential of AI in dermatopathology is to supplement diagnosis, and dermatopathologist guidance is essential for the development of useful deep learning algorithms. Herein, the recent progress of AI in dermatopathology is reviewed with emphasis on how deep learning can influence diagnosis, education, and research.


Subject(s)
Artificial Intelligence/statistics & numerical data , Dermatology/education , Pathology/education , Skin Neoplasms/diagnosis , Algorithms , Carcinoma, Basal Cell/diagnosis , Carcinoma, Basal Cell/pathology , Deep Learning/statistics & numerical data , Dermatology/instrumentation , Diagnosis, Differential , Diagnostic Tests, Routine/instrumentation , Humans , Keratosis, Seborrheic/diagnosis , Keratosis, Seborrheic/pathology , Machine Learning/statistics & numerical data , Melanoma/diagnosis , Melanoma/pathology , Neural Networks, Computer , Nevus/diagnosis , Nevus/pathology , Observer Variation , Pathology/instrumentation , Research/instrumentation , Skin Neoplasms/pathology
11.
Prenat Diagn ; 41(4): 505-516, 2021 03.
Article in English | MEDLINE | ID: mdl-33462877

ABSTRACT

OBJECTIVE: To investigate the performance of the machine learning (ML) model in predicting small-for-gestational-age (SGA) at birth, using second-trimester data. METHODS: Retrospective data of 347 patients, consisting of maternal demographics and ultrasound parameters collected between the 20th and 25th gestational weeks, were studied. ML models were applied to different combinations of the parameters to predict SGA and severe SGA at birth (defined as 10th and third centile birth weight, respectively). RESULTS: Using second-trimester measurements, ML models achieved accuracies of 70% and 73% in predicting SGA and severe SGA, whereas clinical guidelines had accuracies of 64% and 48%. Uterine PI (Ut PI) was found to be an important predictor, corroborating existing literature, but surprisingly, so was nuchal fold thickness (NF). Logistic regression showed that Ut PI and NF were significant predictors and statistical comparisons showed that these parameters were significantly different in disease. Further, including NF was found to improve ML model performance, and vice versa. CONCLUSION: ML could potentially improve the prediction of SGA at birth from second-trimester measurements, and demonstrated reduced NF to be an important predictor. Early prediction of SGA allows closer clinical monitoring, which provides an opportunity to discover any underlying diseases associated with SGA.


Subject(s)
Infant, Small for Gestational Age/growth & development , Machine Learning/standards , Nuchal Translucency Measurement/classification , Predictive Value of Tests , Female , Gestational Age , Humans , Infant, Newborn , Logistic Models , Machine Learning/statistics & numerical data , Male , Nuchal Translucency Measurement/statistics & numerical data , Retrospective Studies , Singapore/epidemiology
12.
Nucleic Acids Res ; 47(8): e45, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30773592

ABSTRACT

Although rapid progress has been made in computational approaches for prioritizing cancer driver genes, research is far from achieving the ultimate goal of discovering a complete catalog of genes truly associated with cancer. Driver gene lists predicted from these computational tools lack consistency and are prone to false positives. Here, we developed an approach (DriverML) integrating Rao's score test and supervised machine learning to identify cancer driver genes. The weight parameters in the score statistics quantified the functional impacts of mutations on the protein. To obtain optimized weight parameters, the score statistics of prior driver genes were maximized on pan-cancer training data. We conducted rigorous and unbiased benchmark analysis and comparisons of DriverML with 20 other existing tools in 31 independent datasets from The Cancer Genome Atlas (TCGA). Our comprehensive evaluations demonstrated that DriverML was robust and powerful among various datasets and outperformed the other tools with a better balance of precision and sensitivity. In vitro cell-based assays further proved the validity of the DriverML prediction of novel driver genes. In summary, DriverML uses an innovative, machine learning-based approach to prioritize cancer driver genes and provides dramatic improvements over currently existing methods. Its source code is available at https://github.com/HelloYiHan/DriverML.


Subject(s)
Gene Expression Regulation, Neoplastic , Machine Learning/statistics & numerical data , Neoplasm Proteins/genetics , Neoplasms/genetics , Oncogenes , Software , Atlases as Topic , Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism , Cell Line, Tumor , Cell Movement , Cell Proliferation , Datasets as Topic , Humans , Monte Carlo Method , Mutation , Neoplasm Proteins/metabolism , Neoplasms/diagnosis , Neoplasms/pathology , Nuclear Proteins/genetics , Nuclear Proteins/metabolism
13.
Acta Radiol ; 62(12): 1601-1609, 2021 Dec.
Article in English | MEDLINE | ID: mdl-33203215

ABSTRACT

BACKGROUND: Cardiomegaly is a relatively common incidental finding on chest X-rays; if left untreated, it can result in significant complications. Using Artificial Intelligence for diagnosing cardiomegaly could be beneficial, as this pathology may be underreported, or overlooked, especially in busy or under-staffed settings. PURPOSE: To explore the feasibility of applying four different transfer learning methods to identify the presence of cardiomegaly in chest X-rays and to compare their diagnostic performance using the radiologists' report as the gold standard. MATERIAL AND METHODS: Two thousand chest X-rays were utilized in the current study: 1000 were normal and 1000 had confirmed cardiomegaly. Of these exams, 80% were used for training and 20% as a holdout test dataset. A total of 2048 deep features were extracted using Google's Inception V3, VGG16, VGG19, and SqueezeNet networks. A logistic regression algorithm optimized in regularization terms was used to classify chest X-rays into those with presence or absence of cardiomegaly. RESULTS: Diagnostic accuracy is reported by means of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), with the VGG19 network providing the best values of sensitivity (84%), specificity (83%), PPV (83%), NPV (84%), and overall accuracy (84.5%). The other networks presented sensitivity at 64.1%-82%, specificity at 77.1%-81.1%, PPV at 74%-81.4%, NPV at 68%-82%, and overall accuracy at 71%-81.3%. CONCLUSION: Deep learning using transfer learning methods based on VGG19 network can be used for the automatic detection of cardiomegaly on chest X-ray images. However, further validation and training of each method is required before application to clinical cases.


Subject(s)
Cardiomegaly/diagnostic imaging , Machine Learning , Radiography, Thoracic , Algorithms , Artificial Intelligence , Cross-Sectional Studies , Datasets as Topic , Feasibility Studies , Humans , Logistic Models , Machine Learning/statistics & numerical data , Predictive Value of Tests , Radiography, Thoracic/statistics & numerical data , Reference Standards , Sensitivity and Specificity
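
Note: the abstract of this record reports sensitivity, specificity, PPV, NPV, and overall accuracy. The sketch below shows how these five measures follow from the four cells of a confusion matrix. The counts are hypothetical: they assume a balanced 200-image test set and are chosen only to land near the reported sensitivity and specificity, so they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only (100 positives, 100 negatives).
metrics = diagnostic_metrics(tp=84, fp=17, tn=83, fn=16)
```

With these counts the sensitivity is 0.84, the specificity is 0.83, and the overall accuracy is 167/200 = 0.835.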
14.
Molecules ; 26(15)2021 Jul 29.
Article in English | MEDLINE | ID: mdl-34361751

ABSTRACT

Species of Mycobacteriaceae cause disease in animals and humans, including tuberculosis and leprosy. Individuals infected with organisms in the Mycobacterium tuberculosis complex (MTBC) or non-tuberculous mycobacteria (NTM) may present identical symptoms; however, the treatment for each can be different. Although the NTM infection is considered less vital due to the chronicity of the disease and the infrequency of occurrence in healthy populations, diagnosis and differentiation among Mycobacterium species currently require culture isolation, which can take several weeks. The use of volatile organic compounds (VOCs) is a promising approach for species identification and in recent years has shown promise for use in the rapid analysis of both in vitro cultures and ex vivo diagnosis using breath or sputum. The aim of this contribution is to analyze VOCs in the culture headspace of seven different species of mycobacteria and to define the volatilome profiles that are discriminant for each species. For the pre-concentration of VOCs, solid-phase micro-extraction (SPME) was employed and samples were subsequently analyzed using gas chromatography-quadrupole mass spectrometry (GC-qMS). A machine learning approach was applied for the selection of the 13 discriminatory features, which might represent clinically translatable bacterial biomarkers.


Subject(s)
Metabolome , Mycobacterium abscessus/chemistry , Mycobacterium avium Complex/chemistry , Mycobacterium avium/chemistry , Mycobacterium bovis/chemistry , Mycobacterium/chemistry , Volatile Organic Compounds/isolation & purification , Biomarkers/analysis , Gas Chromatography-Mass Spectrometry/methods , Machine Learning/statistics & numerical data , Mycobacterium/metabolism , Mycobacterium abscessus/metabolism , Mycobacterium avium/metabolism , Mycobacterium avium Complex/metabolism , Mycobacterium bovis/metabolism , Principal Component Analysis , Solid Phase Microextraction , Volatile Organic Compounds/classification , Volatile Organic Compounds/metabolism
15.
Adv Skin Wound Care ; 34(8): 1-12, 2021 Aug 01.
Article in English | MEDLINE | ID: mdl-34260423

ABSTRACT

OBJECTIVE: Wound infection is prevalent in home healthcare (HHC) and often leads to hospitalizations. However, none of the previous studies of wounds in HHC have used data from clinical notes. Therefore, the authors created a more accurate description of a patient's condition by extracting risk factors from clinical notes to build predictive models to identify a patient's risk of wound infection in HHC. METHODS: The structured data (eg, standardized assessments) and unstructured information (eg, narrative-free text charting) were retrospectively reviewed for HHC patients with wounds who were served by a large HHC agency in 2014. Wound infection risk factors were identified through bivariate analysis and stepwise variable selection. Risk predictive performance of three machine learning models (logistic regression, random forest, and artificial neural network) was compared. RESULTS: A total of 754 of 54,316 patients (1.39%) had a hospitalization or ED visit related to wound infection. In the bivariate logistic regression, language describing wound type in the patient's clinical notes was strongly associated with risk (odds ratio, 9.94; P < .05). The areas under the curve were 0.82 in logistic regression, 0.75 in random forest, and 0.78 in artificial neural network. Risk prediction performance of the models improved (by up to 13.2%) after adding risk factors extracted from clinical notes. CONCLUSIONS: Logistic regression showed the best risk prediction performance in prediction of wound infection-related hospitalization or ED visits in HHC. The use of data extracted from clinical notes can improve the performance of risk prediction models.


Subject(s)
Home Care Services/standards , Machine Learning/standards , Risk Assessment/methods , Wound Infection/prevention & control , Aged , Algorithms , Emergency Service, Hospital/organization & administration , Emergency Service, Hospital/statistics & numerical data , Female , Forecasting/methods , Home Care Services/statistics & numerical data , Hospitalization/statistics & numerical data , Humans , Logistic Models , Machine Learning/statistics & numerical data , Male , Middle Aged , Retrospective Studies , Risk Assessment/standards , Risk Assessment/statistics & numerical data , Risk Factors , Wound Infection/epidemiology
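
Note: the abstract of this record compares models by area under the curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case, counting ties as one half. The function name `auc_from_scores` and the scores are hypothetical, and the pairwise loop is meant for small illustrations rather than large cohorts.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted risks for patients with and without the outcome.
example_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
```

Here 11 of the 12 positive/negative pairs are ordered correctly, so the AUC is 11/12.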
16.
J Med Syst ; 45(4): 48, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33646459

ABSTRACT

Early identification of patients with life-threatening risks such as delirium is crucial in order to initiate preventive actions as quickly as possible. Despite intense research on machine learning for the prediction of clinical outcomes, the acceptance of the integration of such complex models in clinical routine remains unclear. The aim of this study was to evaluate user acceptance of an already implemented machine learning-based application predicting the risk of delirium for in-patients. We applied a mixed methods design to collect opinions and concerns from health care professionals including physicians and nurses who regularly used the application. The evaluation was framed by the Technology Acceptance Model assessing perceived ease of use, perceived usefulness, actual system use and output quality of the application. Questionnaire results from 47 nurses and physicians as well as qualitative results of four expert group meetings rated the overall usefulness of the delirium prediction positively. For healthcare professionals, the visualization and presented information was understandable, the application was easy to use and the additional information for delirium management was appreciated. The application did not increase their workload, but the actual system use was still low during the pilot study. Our study provides insights into the user acceptance of a machine learning-based application supporting delirium management in hospitals. In order to improve quality and safety in healthcare, computerized decision support should predict actionable events and be highly accepted by users.


Subject(s)
Algorithms , Clinical Decision-Making , Delirium/diagnosis , Diagnostic Errors/statistics & numerical data , Machine Learning/statistics & numerical data , Australia , Diagnosis, Differential , Electronic Health Records/standards , Female , Humans , Male , Middle Aged , Pilot Projects , Psychiatric Status Rating Scales
17.
BMC Genomics ; 21(1): 6, 2020 Jan 02.
Article in English | MEDLINE | ID: mdl-31898477

ABSTRACT

BACKGROUND: To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, according to the goal of the experiment they are investigating. Despite this being a crucial issue in machine learning, no widespread consensus has yet been reached on a single preferred measure. Accuracy and F1 score computed on confusion matrices have been (and still are) among the most widely adopted metrics in binary classification tasks. However, these statistical measures can dangerously show overoptimistic, inflated results, especially on imbalanced datasets. RESULTS: The Matthews correlation coefficient (MCC), instead, is a more reliable statistical rate which produces a high score only if the prediction obtained good results in all four confusion matrix categories (true positives, false negatives, true negatives, and false positives), proportionally both to the size of positive elements and the size of negative elements in the dataset. CONCLUSIONS: In this article, we show how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, by first explaining the mathematical properties, and then the advantages of MCC in six synthetic use cases and in a real genomics scenario. We believe that the Matthews correlation coefficient should be preferred to accuracy and F1 score in evaluating binary classification tasks by all scientific communities.


Subject(s)
Correlation of Data , Data Interpretation, Statistical , Machine Learning/statistics & numerical data , Algorithms , Computational Biology/statistics & numerical data
18.
J Biomed Sci ; 27(1): 80, 2020 Jul 15.
Article in English | MEDLINE | ID: mdl-32664906

ABSTRACT

BACKGROUND: Recent trials have shown promise in intra-arterial thrombectomy after the first 6-24 h of stroke onset. Quick and precise identification of the salvageable tissue is essential for successful stroke management. In this study, we examined the feasibility of machine learning (ML) approaches for differentiating the ischemic penumbra (IP) from the infarct core (IC) by using diffusion tensor imaging (DTI)-derived metrics. METHODS: Fourteen male rats subjected to permanent middle cerebral artery occlusion (pMCAO) were included in this study. Using 7 T magnetic resonance imaging, DTI metrics such as fractional anisotropy, pure anisotropy, diffusion magnitude, mean diffusivity (MD), axial diffusivity, and radial diffusivity were derived. The MD and relative cerebral blood flow maps were coregistered to define the IP and IC at 0.5 h after pMCAO. A 2-level classifier was proposed based on DTI-derived metrics to classify stroke hemispheres into the IP, IC, and normal tissue (NT). The classification performance was evaluated using leave-one-out cross validation. RESULTS: The IC and non-IC can be accurately segmented by the proposed 2-level classifier with an area under the receiver operating characteristic curve (AUC) between 0.99 and 1.00, and with accuracies between 96.3 and 96.7%. For the training dataset, the non-IC can be further classified into the IP and NT with an AUC between 0.96 and 0.98, and with accuracies between 95.0 and 95.9%. For the testing dataset, the classification accuracy for IC and non-IC was 96.0 ± 2.3%, whereas for IP and NT it was 80.1 ± 8.0%. Overall, we achieved an accuracy of 88.1 ± 6.7% for classifying three tissue subtypes (IP, IC, and NT) in the stroke hemisphere, and the estimated lesion volumes were not significantly different from those of the ground truth (p = .56, .94, and .78, respectively). CONCLUSIONS: Our method achieved comparable results to the conventional approach using perfusion-diffusion mismatch. We suggest that a single DTI sequence along with ML algorithms is capable of dichotomizing ischemic tissue into the IC and IP.
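
The two-level scheme and leave-one-out evaluation named in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which classifier or features the authors used, so a nearest-centroid rule stands in for it, one sample is held out at a time, and the two-feature points are synthetic values invented for the example rather than DTI measurements. All function names are hypothetical.

```python
from statistics import mean


def fit_centroids(samples, labels):
    """Stand-in classifier: store the mean feature vector of each class."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in groups.items()}


def predict_centroid(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def fit_two_level(samples, labels):
    # Level 1 separates IC from everything else (non-IC).
    level1 = fit_centroids(
        samples, ["IC" if y == "IC" else "non-IC" for y in labels])
    # Level 2 is trained on non-IC samples only and separates IP from NT.
    rest = [(x, y) for x, y in zip(samples, labels) if y != "IC"]
    level2 = fit_centroids([x for x, _ in rest], [y for _, y in rest])
    return level1, level2


def predict_two_level(model, x):
    level1, level2 = model
    if predict_centroid(level1, x) == "IC":
        return "IC"
    return predict_centroid(level2, x)


def leave_one_out_accuracy(samples, labels):
    """Hold out each sample once, train on the rest, count correct labels."""
    hits = 0
    for i in range(len(samples)):
        model = fit_two_level(samples[:i] + samples[i + 1:],
                              labels[:i] + labels[i + 1:])
        hits += predict_two_level(model, samples[i]) == labels[i]
    return hits / len(samples)


# Synthetic, well-separated toy points (not real measurements).
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],   # IC
     [5.0, 0.0], [5.2, 0.1], [4.9, 0.2],   # IP
     [5.0, 5.0], [5.1, 4.8], [4.8, 5.1]]   # NT
y = ["IC"] * 3 + ["IP"] * 3 + ["NT"] * 3
print(leave_one_out_accuracy(X, y))
```

Splitting the decision into two binary steps lets each level be tuned and scored separately, which matches the way the abstract reports IC versus non-IC and IP versus NT results as distinct figures.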


Subject(s)
Diffusion Tensor Imaging/methods , Infarction, Middle Cerebral Artery/pathology , Ischemia/diagnostic imaging , Machine Learning/statistics & numerical data , Algorithms , Animals , Benchmarking , Disease Models, Animal , Male , ROC Curve , Rats , Rats, Sprague-Dawley
19.
Mol Pharm ; 17(7): 2660-2671, 2020 07 06.
Article in English | MEDLINE | ID: mdl-32496787

ABSTRACT

There has been much recent interest in machine learning (ML) and molecular quantitative structure property relationships (QSPR). The present research evaluated modern ML-based methods implemented in commercial software (COSMOquick and Molecular Modeling Pro), compared to a classical group contribution approach (Joback and Reid method), to estimate melting points and enthalpy of fusion values. A broad data set of market compounds was gathered from the literature, together with new data measured by differential scanning calorimetry for drug candidates. The highest prediction accuracy was achieved by QSPR using stochastic gradient boosting. The model deviations were discussed, particularly the implications on thermodynamic solubility modeling, as this typically requires estimation of both melting point and enthalpy of fusion. The results suggested that despite considerable advancement in prediction accuracy, there are still limitations especially with complex drug candidates. It is recommended that in such cases, melting properties obtained in silico should be used carefully as input data for thermodynamic solubility modeling. Future research will show how the prediction limits of thermophysical drug properties can be further advanced by even larger data sets and other ML algorithms or also by using molecular simulations.


Subject(s)
Machine Learning , Pharmaceutical Preparations/chemistry , Algorithms , Calorimetry, Differential Scanning , Computer Simulation , Freezing , Machine Learning/statistics & numerical data , Models, Chemical , Models, Molecular , Quantitative Structure-Activity Relationship , Software , Solubility , Thermodynamics , Transition Temperature , Water/chemistry
20.
PLoS Comput Biol ; 15(4): e1006931, 2019 04.
Article in English | MEDLINE | ID: mdl-30933970

ABSTRACT

Increasing evidence has indicated that microRNAs (miRNAs) play vital roles in various pathological processes and thus are closely related to many complex human diseases. The identification of potential disease-related miRNAs offers new opportunities to understand disease etiology and pathogenesis. Although numerous computational methods have been proposed to predict reliable miRNA-disease associations, they suffer from various limitations that affect the prediction accuracy and their applicability. In this study, we develop a novel method to discover disease-related candidate miRNAs based on Adaptive Multi-View Multi-Label learning (AMVML). Specifically, considering the inherent noise in the current dataset, we propose to learn a new affinity graph adaptively for both diseases and miRNAs from multiple similarity profiles. We then simultaneously update the miRNA-disease associations predicted from both spaces based on multi-label learning. In particular, we prove the convergence of AMVML theoretically, and the corresponding analysis indicates that it has a fast convergence rate. To comprehensively illustrate the prediction performance of our method, we compared AMVML with four state-of-the-art methods under different validation frameworks. As a result, our method achieved comparable performance under various evaluation metrics, which suggests that our method is capable of discovering a greater number of true miRNA-disease associations. The case study conducted on thyroid neoplasms further identified a potential diagnostic biomarker. Together, the experimental results confirm the utility of our method, and we anticipate that our method could serve as a reliable and efficient tool for uncovering novel disease-related miRNAs.


Subject(s)
Genetic Predisposition to Disease , Machine Learning , MicroRNAs/genetics , Algorithms , Biomarkers, Tumor/genetics , Computational Biology , Genetic Association Studies/statistics & numerical data , Humans , Machine Learning/statistics & numerical data , Models, Genetic , Models, Statistical , Thyroid Neoplasms/genetics