Results 1 - 12 of 12
1.
Commun Biol ; 7(1): 407, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38570615

ABSTRACT

The interpretation of complex biological datasets requires the identification of representative variables that describe the data without critical information loss. This is particularly important in the analysis of large phenotypic datasets (phenomics). Here we introduce Multi-Attribute Subset Selection (MASS), an algorithm which separates a matrix of phenotypes (e.g., yield across microbial species and environmental conditions) into predictor and response sets of conditions. Using mixed integer linear programming, MASS expresses the response conditions as a linear combination of the predictor conditions, while simultaneously searching for the optimally descriptive set of predictors. We apply the algorithm to three microbial datasets and identify environmental conditions that predict phenotypes under other conditions, providing biologically interpretable axes for strain discrimination. MASS could be used to reduce the number of experiments needed to identify species or to map their metabolic capabilities. The generality of the algorithm allows addressing subset selection problems in areas beyond biology.
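The predictor/response split described in this abstract can be illustrated with a small sketch. It is not the MASS algorithm: the abstract states that MASS uses mixed integer linear programming, whereas this example swaps in an exhaustive search over column subsets with an ordinary least-squares fit, which is only practical for tiny matrices. The function names (`solve`, `residual`, `best_predictors`) and the toy matrix are assumptions made for illustration.

```python
from itertools import combinations


def solve(a, b):
    """Solve the small square system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residual(data, predictors, response):
    """Squared error of a least-squares fit of one response column
    on the chosen predictor columns (no intercept)."""
    xs = [[row[j] for j in predictors] for row in data]
    y = [row[response] for row in data]
    k = len(predictors)
    gram = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    rhs = [sum(x[i] * t for x, t in zip(xs, y)) for i in range(k)]
    w = solve(gram, rhs)
    return sum((t - sum(wi * xi for wi, xi in zip(w, x))) ** 2
               for x, t in zip(xs, y))


def best_predictors(data, k):
    """Exhaustively pick the k columns that best predict all other columns.
    Assumption: brute force stands in for the MILP search in the abstract."""
    n_cols = len(data[0])
    best = None
    for subset in combinations(range(n_cols), k):
        try:
            err = sum(residual(data, subset, r)
                      for r in range(n_cols) if r not in subset)
        except ValueError:
            continue  # skip collinear predictor sets
        if best is None or err < best[1]:
            best = (subset, err)
    return best


# Hypothetical phenotype matrix: rows are strains, columns are conditions.
# Column 2 is the sum of columns 0 and 1; column 3 is twice column 0.
toy = [[1, 2, 3, 2], [2, 1, 3, 4], [3, 0, 3, 6], [4, 1, 5, 8], [5, 3, 8, 10]]
print(best_predictors(toy, 2))
```

In the toy matrix two conditions are enough to reconstruct the other two, so the reported error is essentially zero, while no single condition can do the same.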


Subjects
Algorithms , Phenotype
3.
Front Endocrinol (Lausanne) ; 15: 1298628, 2024.
Article in English | MEDLINE | ID: mdl-38356959

ABSTRACT

Introduction: Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. Methods: This is a retrospective cohort study from a safety-net hospital's electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82%, respectively, in Models I, II, III and IV. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. Conclusion: Machine learning algorithms were used to predict PCOS based on a large at-risk population. 
This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.


Subjects
Polycystic Ovary Syndrome , Humans , Female , Polycystic Ovary Syndrome/diagnosis , Retrospective Studies , Artificial Intelligence , Electronic Health Records , Luteinizing Hormone , Algorithms , Machine Learning
4.
Artif Intell Med ; 146: 102715, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38042602

ABSTRACT

BACKGROUND: Ventilator-associated pneumonia (VAP) is a leading cause of morbidity and mortality in intensive care units (ICUs). Early identification of patients at risk of VAP enables early intervention, which in turn improves patient outcomes. We developed a predictive model for individualized risk assessment utilizing machine learning to identify patients at risk of developing VAP. METHODS: The Philips eRI dataset, a multi-institution electronic medical record (EMR), was used for model development. For adult (≥18y) patients, we propose a set of criteria using indications of the start of a new antibiotic treatment temporally contiguous to a microbiological test to mark suspected infection events, of which those with a positive culture are labeled as presumed VAP if 1) the event occurs at least 48 h after intubation, and 2) there are no indications of community-acquired pneumonia (CAP) or other hospital-acquired infections (HAI) in the patient charts. The resulting VAP and no-VAP (control) cases were then used to build an ensemble of decision trees to predict the risk of VAP in the next 24 h using data on patients' demographics, vitals, labs, and ventilator settings. RESULTS: The resulting model predicts the development of VAP 24 h in advance with an AUC of 76 % and AUPRC of 75 %. Additionally, we group hospitals that are similar in healthcare processes into distinct clusters and characterize VAP prediction for the identified hospital clusters. We show inter-hospital (teaching status and healthcare processes) and cohort-specific (age groups, gender, early vs late VAP, ICU mortality status) differences in VAP prediction and associated symptomologies. CONCLUSIONS: Our proposed VAP criteria use clinical actions to mark incidences of presumed VAP infection, which enables the development of models for early detection of these events. 
We curated a patient cohort using these criteria and used it to build a model for predicting impending VAP events prior to clinical suspicions. We present a clustering approach for tailoring the VAP prediction model for different hospital types based on their EMR data characteristics. The model provides an instantaneous risk score that allows early interventions and confirmatory diagnostic actions.


Subjects
Cross Infection , Ventilator-Associated Pneumonia , Adult , Humans , Ventilator-Associated Pneumonia/diagnosis , Ventilator-Associated Pneumonia/epidemiology , Ventilator-Associated Pneumonia/drug therapy , Cross Infection/drug therapy , Anti-Bacterial Agents/therapeutic use , Intensive Care Units , Hospitals , Machine Learning
5.
medRxiv ; 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37577593

ABSTRACT

Introduction: Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. Methods: This is a retrospective cohort study from a safety-net hospital's electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved AUC of 85%, 81%, 80%, and 82%, respectively, in Models I, II, III and IV. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. Conclusions: Machine learning algorithms were used to predict PCOS based on a large at-risk population. 
This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.

6.
Hum Reprod ; 37(3): 565-576, 2022 Mar 01.
Article in English | MEDLINE | ID: mdl-35024824

ABSTRACT

STUDY QUESTION: Can we derive adequate models to predict the probability of conception among couples actively trying to conceive? SUMMARY ANSWER: Leveraging data collected from female participants in a North American preconception cohort study, we developed models to predict pregnancy with performance of ∼70% in the area under the receiver operating characteristic curve (AUC). WHAT IS KNOWN ALREADY: Earlier work has focused primarily on identifying individual risk factors for infertility. Several predictive models have been developed in subfertile populations, with relatively low discrimination (AUC: 59-64%). STUDY DESIGN, SIZE, DURATION: Study participants were female, aged 21-45 years, residents of the USA or Canada, not using fertility treatment, and actively trying to conceive at enrollment (2013-2019). Participants completed a baseline questionnaire at enrollment and follow-up questionnaires every 2 months for up to 12 months or until conception. We used data from 4133 participants with no more than one menstrual cycle of pregnancy attempt at study entry. PARTICIPANTS/MATERIALS, SETTING, METHODS: On the baseline questionnaire, participants reported data on sociodemographic factors, lifestyle and behavioral factors, diet quality, medical history and selected male partner characteristics. A total of 163 predictors were considered in this study. We implemented regularized logistic regression, support vector machines, neural networks and gradient boosted decision trees to derive models predicting the probability of pregnancy: (i) within fewer than 12 menstrual cycles of pregnancy attempt time (Model I), and (ii) within 6 menstrual cycles of pregnancy attempt time (Model II). Cox models were used to predict the probability of pregnancy within each menstrual cycle for up to 12 cycles of follow-up (Model III). We assessed model performance using the AUC and the weighted-F1 score for Models I and II, and the concordance index for Model III. 
MAIN RESULTS AND THE ROLE OF CHANCE: Model I and II AUCs were 70% and 66%, respectively, in parsimonious models, and the concordance index for Model III was 63%. The predictors that were positively associated with pregnancy in all models were: having previously breastfed an infant and using multivitamins or folic acid supplements. The predictors that were inversely associated with pregnancy in all models were: female age, female BMI and history of infertility. Among nulligravid women with no history of infertility, the most important predictors were: female age, female BMI, male BMI, use of a fertility app, attempt time at study entry and perceived stress. LIMITATIONS, REASONS FOR CAUTION: Reliance on self-reported predictor data could have introduced misclassification, which would likely be non-differential with respect to the pregnancy outcome given the prospective design. In addition, we cannot be certain that all relevant predictor variables were considered. Finally, though we validated the models using split-sample replication techniques, we did not conduct an external validation study. WIDER IMPLICATIONS OF THE FINDINGS: Given a wide range of predictor data, machine learning algorithms can be leveraged to analyze epidemiologic data and predict the probability of conception with discrimination that exceeds earlier work. STUDY FUNDING/COMPETING INTEREST(S): The research was partially supported by the U.S. National Science Foundation (under grants DMS-1664644, CNS-1645681 and IIS-1914792) and the National Institutes of Health (under grants R01 GM135930 and UL54 TR004130). In the last 3 years, L.A.W. has received in-kind donations for primary data collection in PRESTO from FertilityFriend.com, Kindara.com, Sandstone Diagnostics and Swiss Precision Diagnostics. L.A.W. also serves as a fibroid consultant to AbbVie, Inc. The other authors declare no competing interests. TRIAL REGISTRATION NUMBER: N/A.
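
Several abstracts in this listing report discrimination as the area under the ROC curve (AUC). As a reading aid, the sketch below computes AUC from its pairwise definition: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name `auc` and the toy labels and scores are assumptions for illustration and do not come from any of the studies.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. conceived), 0 otherwise.
    scores: model outputs, where a higher score means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: three of the four positive/negative pairs are
# ranked correctly.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why an AUC near 70% is described in the abstract as an improvement over the 59-64% reported earlier.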


Subjects
Fertility , Infertility , Cohort Studies , Female , Humans , Male , Pregnancy , Prospective Studies , Surveys and Questionnaires
7.
Elife ; 9, 2020 Oct 12.
Article in English | MEDLINE | ID: mdl-33044170

ABSTRACT

This study examined records of 2566 consecutive COVID-19 patients at five Massachusetts hospitals and sought to predict level-of-care requirements based on clinical and laboratory data. Several classification methods were applied and compared against standard pneumonia severity scores. The need for hospitalization, ICU care, and mechanical ventilation were predicted with a validation accuracy of 88%, 87%, and 86%, respectively. Pneumonia severity scores achieved respective accuracies of 73% and 74% for ICU care and ventilation. When predictions were limited to patients with more complex disease, the ICU and ventilation prediction models achieved accuracies of 83% and 82%, respectively. Vital signs, age, BMI, dyspnea, and comorbidities were the most important predictors of hospitalization. Opacities on chest imaging, age, admission vital signs and symptoms, male gender, admission laboratory results, and diabetes were the most important risk factors for ICU admission and mechanical ventilation. The factors identified collectively form a signature of the novel COVID-19 disease.


The new coronavirus (now named SARS-CoV-2) causing the disease pandemic in 2019 (COVID-19), has so far infected over 35 million people worldwide and killed more than 1 million. Most people with COVID-19 have no symptoms or only mild symptoms. But some become seriously ill and need hospitalization. The sickest are admitted to an Intensive Care Unit (ICU) and may need mechanical ventilation to help them breathe. Being able to predict which patients with COVID-19 will become severely ill could help hospitals around the world manage the huge influx of patients caused by the pandemic and save lives. Now, Hao, Sotudian, Wang, Xu et al. show that computer models using artificial intelligence technology can help predict which COVID-19 patients will be hospitalized, admitted to the ICU, or need mechanical ventilation. Using data from 2,566 COVID-19 patients from five Massachusetts hospitals, Hao et al. created three separate models that can predict hospitalization, ICU admission, and the need for mechanical ventilation with more than 86% accuracy, based on patient characteristics, clinical symptoms, laboratory results and chest x-rays. Hao et al. found that the patients' vital signs, age, obesity, difficulty breathing, and underlying diseases like diabetes, were the strongest predictors of the need for hospitalization. Being male, having diabetes, cloudy chest x-rays, and certain laboratory results were the most important risk factors for intensive care treatment and mechanical ventilation. Laboratory results suggesting tissue damage, severe inflammation or oxygen deprivation in the body's tissues were important warning signs of severe disease. The results provide a more detailed picture of the patients who are likely to suffer from severe forms of COVID-19. Using the predictive models may help physicians identify patients who appear okay but need closer monitoring and more aggressive treatment. 
The models may also help policy makers decide who needs workplace accommodations such as being allowed to work from home, which individuals may benefit from more frequent testing, and who should be prioritized for vaccination when a vaccine becomes available.


Subjects
Betacoronavirus , Coronavirus Infections/therapy , Health Services Needs and Demand , Pandemics , Viral Pneumonia/therapy , Adult , Aged , Area Under Curve , Body Mass Index , COVID-19 , Comorbidity , Coronavirus Infections/epidemiology , Diabetes Mellitus/epidemiology , Female , Hospitalization/statistics & numerical data , Humans , Intensive Care Units/statistics & numerical data , Intensive Care Units/supply & distribution , Male , Massachusetts/epidemiology , Middle Aged , Nonlinear Dynamics , Viral Pneumonia/epidemiology , Procedures and Techniques Utilization , ROC Curve , Artificial Respiration/statistics & numerical data , Risk Factors , SARS-CoV-2 , Mechanical Ventilators/supply & distribution
8.
JMIR Med Inform ; 8(10): e21788, 2020 Oct 15.
Article in English | MEDLINE | ID: mdl-33055061

ABSTRACT

BACKGROUND: The novel coronavirus SARS-CoV-2 and its associated disease, COVID-19, have caused worldwide disruption, leading countries to take drastic measures to address the progression of the disease. As SARS-CoV-2 continues to spread, hospitals are struggling to allocate resources to patients who are most at risk. In this context, it has become important to develop models that can accurately predict the severity of infection of hospitalized patients to help guide triage, planning, and resource allocation. OBJECTIVE: The aim of this study was to develop accurate models to predict the mortality of hospitalized patients with COVID-19 using basic demographics and easily obtainable laboratory data. METHODS: We performed a retrospective study of 375 hospitalized patients with COVID-19 in Wuhan, China. The patients were randomly split into derivation and validation cohorts. Regularized logistic regression and support vector machine classifiers were trained on the derivation cohort, and accuracy metrics (F1 scores) were computed on the validation cohort. Two types of models were developed: the first type used laboratory findings from the entire length of the patient's hospital stay, and the second type used laboratory findings that were obtained no later than 12 hours after admission. The models were further validated on a multicenter external cohort of 542 patients. RESULTS: Of the 375 patients with COVID-19, 174 (46.4%) died of the infection. The study cohort was composed of 224/375 men (59.7%) and 151/375 women (40.3%), with a mean age of 58.83 years (SD 16.46). The models developed using data from throughout the patients' length of stay demonstrated accuracies as high as 97%, whereas the models with admission laboratory variables possessed accuracies of up to 93%. The latter models predicted patient outcomes an average of 11.5 days in advance. 
Key variables such as lactate dehydrogenase, high-sensitivity C-reactive protein, and percentage of lymphocytes in the blood were indicated by the models. In line with previous studies, age was also found to be an important variable in predicting mortality. In particular, the mean age of patients who survived COVID-19 infection (50.23 years, SD 15.02) was significantly lower than the mean age of patients who died of the infection (68.75 years, SD 11.83; P<.001). CONCLUSIONS: Machine learning models can be successfully employed to accurately predict outcomes of patients with COVID-19. Our models achieved high accuracies and could predict outcomes more than one week in advance; this promising result suggests that these models can be highly useful for resource allocation in hospitals.
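
The evaluation protocol in this abstract (a random split into derivation and validation cohorts, with F1 scores computed on the validation cohort) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `split_cohort` and `f1_score`, the 70/30 split fraction, and the fixed seed are hypothetical, and no classifier from the study is reproduced here.

```python
import random


def split_cohort(patients, validation_fraction=0.3, seed=0):
    """Randomly split a list of patients into derivation and validation sets.
    The 30% validation fraction and the seed are illustrative assumptions."""
    shuffled = patients[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no correct positive predictions
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical validation labels (1 = died) and model predictions.
derivation, validation = split_cohort(list(range(10)))
print(len(derivation), len(validation))
print(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Unlike plain accuracy, F1 ignores true negatives, so it stays informative when one outcome is much rarer than the other.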

9.
PLoS One ; 15(9): e0238118, 2020.
Article in English | MEDLINE | ID: mdl-32903282

ABSTRACT

INTRODUCTION: New financial incentives, such as reduced Medicare reimbursements, have led hospitals to closely monitor their readmission rates and initiate efforts aimed at reducing them. In this context, many surgical departments participate in the American College of Surgeons National Surgical Quality Improvement Program (NSQIP), which collects detailed demographic, laboratory, clinical, procedure and perioperative occurrence data. The availability of such data enables the development of data science methods which predict readmissions and, as done in this paper, offer specific recommendations aimed at preventing readmissions. MATERIALS AND METHODS: This study leverages NSQIP data for 722,101 surgeries to develop predictive and prescriptive models, predicting readmissions and offering real-time, personalized treatment recommendations for surgical patients during their hospital stay, aimed at reducing the risk of a 30-day readmission. We applied a variety of classification methods to predict 30-day readmissions and developed two prescriptive methods to recommend pre-operative blood transfusions to increase the patient's hematocrit with the objective of preventing readmissions. The effect of these interventions was evaluated using several predictive models. RESULTS: Predictions of 30-day readmissions based on the entire collection of NSQIP variables achieve an out-of-sample accuracy of 87% (Area Under the Curve-AUC). Predictions based only on pre-operative variables have an accuracy of 74% AUC, out-of-sample. Personalized interventions, in the form of pre-operative blood transfusions identified by the prescriptive methods, reduce readmissions by 12%, on average, for patients considered as candidates for pre-operative transfusion (pre-operative hematocrit <30). The prediction accuracy of the proposed models exceeds results in the literature. 
CONCLUSIONS: This study is among the first to develop a methodology for making specific, data-driven, personalized treatment recommendations to reduce the 30-day readmission rate. The reported predicted reduction in readmissions can lead to more than $20 million in savings in the U.S. annually.


Subjects
Statistical Models , Patient Readmission/statistics & numerical data , Operative Surgical Procedures/statistics & numerical data , Blood Transfusion , Factual Databases , Hematocrit , Humans , Risk Assessment
10.
mSystems ; 4(2), 2019.
Article in English | MEDLINE | ID: mdl-30984871

ABSTRACT

Microbes face a trade-off between being metabolically independent and relying on neighboring organisms for the supply of some essential metabolites. This balance of conflicting strategies affects microbial community structure and dynamics, with important implications for microbiome research and synthetic ecology. A "gedanken" (thought) experiment to investigate this trade-off would involve monitoring the rise of mutual dependence as the number of metabolic reactions allowed in an organism is increasingly constrained. The expectation is that below a certain number of reactions, no individual organism would be able to grow in isolation and cross-feeding partnerships and division of labor would emerge. We implemented this idealized experiment using in silico genome-scale models. In particular, we used mixed-integer linear programming to identify trade-off solutions in communities of Escherichia coli strains. The strategies that we found revealed a large space of opportunities in nuanced and nonintuitive metabolic division of labor, including, for example, splitting the tricarboxylic acid (TCA) cycle into two separate halves. The systematic computation of possible solutions in division of labor for 1-, 2-, and 3-strain consortia resulted in a rich and complex landscape. This landscape displayed a nonlinear boundary, indicating that the loss of an intracellular reaction was not necessarily compensated for by a single imported metabolite. Different regions in this landscape were associated with specific solutions and patterns of exchanged metabolites. Our approach also predicts the existence of regions in this landscape where independent bacteria are viable but are outcompeted by cross-feeding pairs, providing a possible incentive for the rise of division of labor. IMPORTANCE Understanding how microbes assemble into communities is a fundamental open issue in biology, relevant to human health, metabolic engineering, and environmental sustainability. 
A possible mechanism for interactions of microbes is through cross-feeding, i.e., the exchange of small molecules. These metabolic exchanges may allow different microbes to specialize in distinct tasks and evolve division of labor. To systematically explore the space of possible strategies for division of labor, we applied advanced optimization algorithms to computational models of cellular metabolism. Specifically, we searched for communities able to survive under constraints (such as a limited number of reactions) that would not be sustainable by individual species. We found that predicted consortia partition metabolic pathways in ways that would be difficult to identify manually, possibly providing a competitive advantage over individual organisms. In addition to helping understand diversity in natural microbial communities, our approach could assist in the design of synthetic consortia.

11.
Stat Methods Med Res ; 28(12): 3667-3682, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30474497

ABSTRACT

Objective: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. Methods: A variety of supervised machine learning classification methods were tested and a new method that discovers hidden patient clusters in the positive class (hospitalized) was developed while, at the same time, sparse linear support vector machine classifiers were derived to separate positive samples from the negative ones (non-hospitalized). The convergence of the new method was established and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. Results: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety-net hospital in New England. Our new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters which can help interpret the classification results, thus increasing physicians' trust in the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and a lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. Conclusions: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant, as it has been estimated that in the USA alone, about $5.8 billion is spent each year on diabetes-related hospitalizations that could be prevented.


Subjects
Type 2 Diabetes Mellitus , Electronic Health Records , Hospitalization/trends , Boston , Cluster Analysis , Cost-Benefit Analysis , Forecasting , Humans
12.
Proc IEEE Inst Electr Electron Eng ; 106(4): 690-707, 2018 Apr.
Article in English | MEDLINE | ID: mdl-30886441

ABSTRACT

Urban living in modern large cities has significant adverse effects on health, increasing the risk of several chronic diseases. We focus on the two leading clusters of chronic disease, heart disease and diabetes, and develop data-driven methods to predict hospitalizations due to these conditions. We base these predictions on the patients' medical history, recent and more distant, as described in their Electronic Health Records (EHR). We formulate the prediction problem as a binary classification problem and consider a variety of machine learning methods, including kernelized and sparse Support Vector Machines (SVM), sparse logistic regression, and random forests. To strike a balance between accuracy and interpretability of the prediction, which is important in a medical setting, we propose two novel methods: K-LRT, a likelihood ratio test-based method, and a Joint Clustering and Classification (JCC) method which identifies hidden patient clusters and adapts classifiers to each cluster. We develop theoretical out-of-sample guarantees for the latter method. We validate our algorithms on large datasets from the Boston Medical Center, the largest safety-net hospital system in New England.
