1.
Am J Respir Crit Care Med ; 205(11): 1330-1336, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35258444

ABSTRACT

Rationale: Care of emergency department (ED) patients with pneumonia can be challenging. Clinical decision support may decrease unnecessary variation and improve care. Objectives: To report patient outcomes and processes of care after deployment of electronic pneumonia clinical decision support (ePNa): a comprehensive, open loop, real-time clinical decision support embedded within the electronic health record. Methods: We conducted a pragmatic, stepped-wedge, cluster-controlled trial with deployment at 2-month intervals in 16 community hospitals. ePNa extracts real-time and historical data to guide diagnosis, risk stratification, microbiological studies, site of care, and antibiotic therapy. We included all adult ED patients with pneumonia over the course of 3 years identified by International Classification of Diseases, 10th Revision discharge coding confirmed by chest imaging. Measurements and Main Results: The median age of the 6,848 patients was 67 years (interquartile range, 50-79), and 48% were female; 64.8% were hospital admitted. Unadjusted mortality was 8.6% before and 4.8% after deployment. A mixed effects logistic regression model adjusting for severity of illness with hospital cluster as the random effect showed an adjusted odds ratio of 0.62 (0.49-0.79; P < 0.001) for 30-day all-cause mortality after deployment. Lower mortality was consistent across hospital clusters. ePNa-concordant antibiotic prescribing increased from 83.5% to 90.2% (P < 0.001). The mean time from ED admission to first antibiotic was 159.4 (156.9-161.9) minutes at baseline and 150.9 (144.1-157.8) minutes after deployment (P < 0.001). Outpatient disposition from the ED increased from 29.2% to 46.9%, whereas 7-day secondary hospital admission was unchanged (5.2% vs. 6.1%). ePNa was used by ED clinicians in 67% of eligible patients. Conclusions: ePNa deployment was associated with improved processes of care and lower mortality. 
Clinical trial registered with www.clinicaltrials.gov (NCT03358342).
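
As a rough arithmetic check on the figures above, the crude odds ratio implied by the reported unadjusted mortality (8.6% before, 4.8% after deployment) can be computed directly. This is only a sketch under a strong simplification: it ignores the severity adjustment and the hospital-cluster random effect of the trial's mixed effects model, so it is not expected to reproduce the adjusted odds ratio of 0.62. The function name `odds_ratio` is illustrative.

```python
def odds_ratio(p_after: float, p_before: float) -> float:
    """Crude odds ratio comparing two event proportions (no adjustment)."""
    odds_after = p_after / (1.0 - p_after)
    odds_before = p_before / (1.0 - p_before)
    return odds_after / odds_before


# Unadjusted 30-day mortality proportions as reported in the abstract.
crude_or = odds_ratio(p_after=0.048, p_before=0.086)
print(f"crude odds ratio: {crude_or:.3f}")
```

The crude value comes out below the adjusted estimate, which is consistent with part of the raw difference being explained by severity of illness.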


Subjects
Decision Support Systems, Clinical; Pneumonia; Adult; Aged; Anti-Bacterial Agents/therapeutic use; Emergency Service, Hospital; Female; Hospitalization; Humans; Male; Pneumonia/diagnosis
2.
BMC Public Health ; 20(1): 608, 2020 May 01.
Article in English | MEDLINE | ID: mdl-32357871

ABSTRACT

BACKGROUND: Risk adjustment models are employed to prevent adverse selection, anticipate budgetary reserve needs, and offer care management services to high-risk individuals. We aimed to address two unknowns about risk adjustment: whether machine learning (ML) and inclusion of social determinants of health (SDH) indicators improve prospective risk adjustment for health plan payments. METHODS: We employed a 2-by-2 factorial design comparing: (i) linear regression versus ML (gradient boosting) and (ii) demographics and diagnostic codes alone, versus additional ZIP code-level SDH indicators. Healthcare claims from privately-insured US adults (2016-2017), and Census data were used for analysis. Data from 1.02 million adults were used for derivation, and data from 0.26 million to assess performance. Model performance was measured using coefficient of determination (R²), discrimination (C-statistic), and mean absolute error (MAE) for the overall population, and predictive ratio and net compensation for vulnerable subgroups. We provide 95% confidence intervals (CI) around each performance measure. RESULTS: Linear regression without SDH indicators achieved moderate determination (R² 0.327, 95% CI: 0.300, 0.353), error ($6992; 95% CI: $6889, $7094), and discrimination (C-statistic 0.703; 95% CI: 0.701, 0.705). ML without SDH indicators improved all metrics (R² 0.388; 95% CI: 0.357, 0.420; error $6637; 95% CI: $6539, $6735; C-statistic 0.717; 95% CI: 0.715, 0.718), reducing misestimation of cost by $3.5 M per 10,000 members. Among people living in areas with high poverty, high wealth inequality, or high prevalence of uninsured, SDH indicators reduced underestimation of cost, improving the predictive ratio by 3% (~$200/person/year). CONCLUSIONS: ML improved risk adjustment models and the incorporation of SDH indicators reduced underpayment in several vulnerable populations.
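
Two of the performance measures named above can be sketched in a few lines. The example assumes the usual risk-adjustment definition of the predictive ratio (total predicted cost divided by total actual cost within a subgroup, so values below 1 indicate underestimation); the abstract does not spell out its formula, and the cost figures below are invented purely for illustration.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted cost."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def predictive_ratio(actual, predicted):
    """Total predicted cost over total actual cost (assumed definition)."""
    return sum(predicted) / sum(actual)


# Hypothetical annual costs (USD) for five members of one subgroup.
actual = [1200.0, 300.0, 15000.0, 0.0, 4500.0]
predicted = [2000.0, 900.0, 9000.0, 600.0, 4000.0]

mae = mean_absolute_error(actual, predicted)
ratio = predictive_ratio(actual, predicted)
print(f"MAE: ${mae:.0f}, predictive ratio: {ratio:.3f}")
```

In this toy subgroup the ratio is below 1, the pattern the abstract describes as underestimation of cost.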


Subjects
Health Promotion/economics; Health Promotion/statistics & numerical data; Insurance, Health/economics; Insurance, Health/statistics & numerical data; Machine Learning/economics; Machine Learning/statistics & numerical data; Social Determinants of Health/economics; Social Determinants of Health/statistics & numerical data; Adult; Cost-Benefit Analysis; Female; Humans; Male; Middle Aged; Prospective Studies; Risk Adjustment
3.
PLoS Med ; 15(11): e1002699, 2018 11.
Article in English | MEDLINE | ID: mdl-30481176

ABSTRACT

BACKGROUND: Magnetic resonance imaging (MRI) of the knee is the preferred method for diagnosing knee injuries. However, interpretation of knee MRI is time-intensive and subject to diagnostic error and variability. An automated system for interpreting knee MRI could prioritize high-risk patients and assist clinicians in making diagnoses. Deep learning methods, in being able to automatically learn layers of features, are well suited for modeling the complex relationships between medical images and their interpretations. In this study we developed a deep learning model for detecting general abnormalities and specific diagnoses (anterior cruciate ligament [ACL] tears and meniscal tears) on knee MRI exams. We then measured the effect of providing the model's predictions to clinical experts during interpretation. METHODS AND FINDINGS: Our dataset consisted of 1,370 knee MRI exams performed at Stanford University Medical Center between January 1, 2001, and December 31, 2012 (mean age 38.0 years; 569 [41.5%] female patients). The majority vote of 3 musculoskeletal radiologists established reference standard labels on an internal validation set of 120 exams. We developed MRNet, a convolutional neural network for classifying MRI series and combined predictions from 3 series per exam using logistic regression. In detecting abnormalities, ACL tears, and meniscal tears, this model achieved area under the receiver operating characteristic curve (AUC) values of 0.937 (95% CI 0.895, 0.980), 0.965 (95% CI 0.938, 0.993), and 0.847 (95% CI 0.780, 0.914), respectively, on the internal validation set. We also obtained a public dataset of 917 exams with sagittal T1-weighted series and labels for ACL injury from Clinical Hospital Centre Rijeka, Croatia. 
On the external validation set of 183 exams, the MRNet trained on Stanford sagittal T2-weighted series achieved an AUC of 0.824 (95% CI 0.757, 0.892) in the detection of ACL injuries with no additional training, while an MRNet trained on the rest of the external data achieved an AUC of 0.911 (95% CI 0.864, 0.958). We additionally measured the specificity, sensitivity, and accuracy of 9 clinical experts (7 board-certified general radiologists and 2 orthopedic surgeons) on the internal validation set both with and without model assistance. Using a 2-sided Pearson's chi-squared test with adjustment for multiple comparisons, we found no significant differences between the performance of the model and that of unassisted general radiologists in detecting abnormalities. General radiologists achieved significantly higher sensitivity in detecting ACL tears (p-value = 0.002; q-value = 0.019) and significantly higher specificity in detecting meniscal tears (p-value = 0.003; q-value = 0.019). Using a 1-tailed t test on the change in performance metrics, we found that providing model predictions significantly increased clinical experts' specificity in identifying ACL tears (p-value < 0.001; q-value = 0.006). The primary limitations of our study include lack of surgical ground truth and the small size of the panel of clinical experts. CONCLUSIONS: Our deep learning model can rapidly generate accurate clinical pathology classifications of knee MRI exams from both internal and external datasets. Moreover, our results support the assertion that deep learning models can improve the performance of clinical experts during medical imaging interpretation. Further research is needed to validate the model prospectively and to determine its utility in the clinical setting.
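
The reference-standard and reader-performance steps described above can be illustrated as follows. This is a minimal sketch, not the study's code: it assumes binary labels (1 = finding present), an odd number of readers, and invented toy data; the helper names are hypothetical.

```python
def majority_vote(reader_labels):
    """Binary majority label from an odd number of 0/1 reader labels."""
    return int(sum(reader_labels) * 2 > len(reader_labels))


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of predictions against a reference."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels from 3 radiologists for 6 exams.
votes = [(1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0), (1, 0, 1), (0, 1, 0)]
reference = [majority_vote(v) for v in votes]

# Hypothetical binarized model predictions for the same exams.
model_predictions = [1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(reference, model_predictions)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```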


Subjects
Anterior Cruciate Ligament Injuries/diagnostic imaging; Deep Learning; Diagnosis, Computer-Assisted/methods; Image Interpretation, Computer-Assisted/methods; Knee/diagnostic imaging; Magnetic Resonance Imaging/methods; Tibial Meniscus Injuries/diagnostic imaging; Adult; Automation; Databases, Factual; Female; Humans; Male; Middle Aged; Predictive Value of Tests; Reproducibility of Results; Retrospective Studies; Young Adult
4.
PLoS Med ; 15(11): e1002686, 2018 11.
Article in English | MEDLINE | ID: mdl-30457988

ABSTRACT

BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists. METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. 
The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution. CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.
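
The AUC used for the comparisons above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison (the Mann-Whitney formulation), counting ties as one half. The labels and scores are invented, and the confidence intervals reported in the abstract are not reproduced here.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model scores for 7 radiographs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.4]
example_auc = auc(labels, scores)
print(f"AUC = {example_auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a validation set of a few hundred images; a rank-based implementation would be preferable for larger sets.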


Subjects
Clinical Competence; Deep Learning; Diagnosis, Computer-Assisted/methods; Pneumonia/diagnostic imaging; Radiographic Image Interpretation, Computer-Assisted/methods; Radiography, Thoracic/methods; Radiologists; Humans; Predictive Value of Tests; Reproducibility of Results; Retrospective Studies
5.
Pac Symp Biocomput ; 29: 120-133, 2024.
Article in English | MEDLINE | ID: mdl-38160274

ABSTRACT

Lack of diagnosis coding is a barrier to leveraging veterinary notes for medical and public health research. Previous work has been limited to developing specialized rule-based or customized supervised learning models to predict diagnosis coding, which is tedious and not easily transferable. In this work, we show that open-source large language models (LLMs) pretrained on a general corpus can achieve reasonable performance in a zero-shot setting. Alpaca-7B can achieve a zero-shot F1 of 0.538 on CSU test data and 0.389 on PP test data, two standard benchmarks for coding from veterinary notes. Furthermore, with appropriate fine-tuning, the performance of LLMs can be substantially boosted, exceeding that of strong state-of-the-art supervised models. VetLLM, which is fine-tuned from Alpaca-7B using just 5000 veterinary notes, can achieve an F1 of 0.747 on CSU test data and 0.637 on PP test data. Notably, our fine-tuning is data-efficient: a model fine-tuned on 200 notes can outperform supervised models trained with more than 100,000 notes. The findings demonstrate the great potential of leveraging LLMs for language processing tasks in medicine, and we advocate this new paradigm for processing clinical text.


Subjects
Camelids, New World; Humans; Animals; Natural Language Processing; Computational Biology; Language
6.
AMIA Annu Symp Proc ; 2023: 1007-1016, 2023.
Article in English | MEDLINE | ID: mdl-38222438

ABSTRACT

Low-yield repetitive laboratory diagnostics burden patients and inflate cost of care. In this study, we assess whether stability in repeated laboratory diagnostic measurements is predictable with uncertainty estimates using electronic health record data available before the diagnostic is ordered. We use probabilistic regression to predict a distribution of plausible values, allowing use-time customization for various definitions of "stability" given dynamic ranges and clinical scenarios. After converting distributions into "stability" scores, the models achieve a sensitivity of 29% for white blood cells, 60% for hemoglobin, 100% for platelets, 54% for potassium, 99% for albumin and 35% for creatinine for predicting stability at 90% precision, suggesting those fractions of repetitive tests could be reduced with low risk of missing important changes. The findings demonstrate the feasibility of using electronic health record data to identify low-yield repetitive tests and offer personalized guidance for better usage of testing while ensuring high quality care.
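
The headline metric above, sensitivity at 90% precision, can be sketched as a threshold sweep over "stability" scores. The sketch assumes each repeat test already has a scalar score (higher = more likely stable) and a binary outcome (1 = the result was in fact stable). How the study derives scores from its predicted distributions is not shown, and the data are invented.

```python
def recall_at_precision(labels, scores, min_precision=0.90):
    """Highest recall over score thresholds whose precision >= min_precision."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive cases")
    ranked = sorted(zip(scores, labels), reverse=True)
    best, tp, fp, i = 0.0, 0, 0, 0
    while i < len(ranked):
        # Admit all cases sharing the current score as one threshold step.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        if tp / (tp + fp) >= min_precision:
            best = max(best, tp / total_pos)
        i = j
    return best


# Hypothetical stability scores and outcomes for 10 repeat tests.
scores = [0.99, 0.95, 0.92, 0.90, 0.85, 0.80, 0.75, 0.60, 0.50, 0.30]
labels = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
sens_at_90 = recall_at_precision(labels, scores)
print(f"sensitivity at 90% precision: {sens_at_90:.3f}")
```

Read this way, a sensitivity of 60% for hemoglobin means that 60% of truly stable repeat tests could be flagged while 90% of the flagged tests are in fact stable.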


Subjects
Clinical Laboratory Techniques; Hemoglobins; Humans
7.
J Thorac Imaging ; 37(3): 162-167, 2022 May 01.
Article in English | MEDLINE | ID: mdl-34561377

ABSTRACT

PURPOSE: Patients with pneumonia often present to the emergency department (ED) and require prompt diagnosis and treatment. Clinical decision support systems for the diagnosis and management of pneumonia are commonly utilized in EDs to improve patient care. The purpose of this study is to investigate whether a deep learning model for detecting radiographic pneumonia and pleural effusions can improve functionality of a clinical decision support system (CDSS) for pneumonia management (ePNa) operating in 20 EDs. MATERIALS AND METHODS: In this retrospective cohort study, a dataset of 7434 prior chest radiographic studies from 6551 ED patients was used to develop and validate a deep learning model to identify radiographic pneumonia, pleural effusions, and evidence of multilobar pneumonia. Model performance was evaluated against 3 radiologists' adjudicated interpretation and compared with performance of the natural language processing of radiology reports used by ePNa. RESULTS: The deep learning model achieved an area under the receiver operating characteristic curve of 0.833 (95% confidence interval [CI]: 0.795, 0.868) for detecting radiographic pneumonia, 0.939 (95% CI: 0.911, 0.962) for detecting pleural effusions and 0.847 (95% CI: 0.800, 0.890) for identifying multilobar pneumonia. On all 3 tasks, the model achieved higher agreement with the adjudicated radiologist interpretation compared with ePNa. CONCLUSIONS: A deep learning model demonstrated higher agreement with radiologists than the ePNa CDSS in detecting radiographic pneumonia and related findings. Incorporating deep learning models into pneumonia CDSS could enhance diagnostic performance and improve pneumonia management.


Subjects
Decision Support Systems, Clinical; Deep Learning; Pleural Effusion; Pneumonia; Emergency Service, Hospital; Humans; Pleural Effusion/diagnostic imaging; Pneumonia/diagnostic imaging; Radiography, Thoracic; Retrospective Studies
8.
JAMA Netw Open ; 3(6): e206653, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32568399

ABSTRACT

Importance: Despite the high prevalence and potential outcomes of major depressive disorder, whether and how patients will respond to antidepressant medications is not easily predicted. Objective: To identify the extent to which a machine learning approach, using gradient-boosted decision trees, can predict acute improvement for individual depressive symptoms with antidepressants based on pretreatment symptom scores and electroencephalographic (EEG) measures. Design, Setting, and Participants: This prognostic study analyzed data collected as part of the International Study to Predict Optimized Treatment in Depression, a randomized, prospective open-label trial to identify clinically useful predictors and moderators of response to commonly used first-line antidepressant medications. Data collection was conducted at 20 sites spanning 5 countries and including 518 adult outpatients (18-65 years of age) from primary care or specialty care practices who received a diagnosis of current major depressive disorder between December 1, 2008, and September 30, 2013. Patients were antidepressant medication naive or willing to undergo a 1-week washout period of any nonprotocol antidepressant medication. Statistical analysis was conducted from January 5 to June 30, 2019. Exposures: Participants with major depressive disorder were randomized in a 1:1:1 ratio to undergo 8 weeks of treatment with escitalopram oxalate (n = 162), sertraline hydrochloride (n = 176), or extended-release venlafaxine hydrochloride (n = 180). Main Outcomes and Measures: The primary objective was to predict improvement in individual symptoms, defined as the difference in score for each of the symptoms on the 21-item Hamilton Rating Scale for Depression from baseline to week 8, evaluated using the C index. Results: The resulting data set contained 518 patients (274 women; mean [SD] age, 39.0 [12.6] years; mean [SD] 21-item Hamilton Rating Scale for Depression score improvement, 13.0 [7.0]). 
With the use of 5-fold cross-validation for evaluation, the machine learning model achieved C index scores of 0.8 or higher on 12 of 21 clinician-rated symptoms, with the highest C index score of 0.963 (95% CI, 0.939-1.000) for loss of insight. The importance of any single EEG feature was higher than 5% for prediction of 7 symptoms, with the most important EEG features being the absolute delta band power at the occipital electrode sites (O1, 18.8%; Oz, 6.7%) for loss of insight. Over and above the use of baseline symptom scores alone, the use of both EEG and baseline symptom features was associated with a significant increase in the C index for improvement in 4 symptoms: loss of insight (C index increase, 0.012 [95% CI, 0.001-0.020]), energy loss (C index increase, 0.035 [95% CI, 0.011-0.059]), appetite changes (C index increase, 0.017 [95% CI, 0.003-0.030]), and psychomotor retardation (C index increase, 0.020 [95% CI, 0.008-0.032]). Conclusions and Relevance: This study suggests that machine learning may be used to identify independent associations of symptoms and EEG features to predict antidepressant-associated improvements in specific symptoms of depression. The approach should next be prospectively validated in clinical trials and settings. Trial Registration: ClinicalTrials.gov Identifier: NCT00693849.


Subjects
Antidepressive Agents/therapeutic use; Citalopram/therapeutic use; Depressive Disorder, Major/drug therapy; Electroencephalography/methods; Machine Learning/statistics & numerical data; Sertraline/therapeutic use; Venlafaxine Hydrochloride/therapeutic use; Adult; Algorithms; Clinical Decision Rules; Depressive Disorder, Major/diagnosis; Depressive Disorder, Major/psychology; Female; Humans; Male; Middle Aged; Prognosis; Prospective Studies; Treatment Outcome
9.
Sci Rep ; 10(1): 3958, 2020 03 03.
Article in English | MEDLINE | ID: mdl-32127625

ABSTRACT

The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of fewer than 500 training CT exams. We explored whether pretraining the model on a large collection of natural videos would improve the performance of the model over training the model from scratch. AppendiXNet was pretrained on a large collection of YouTube videos called Kinetics, consisting of approximately 500,000 video clips and annotated for one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved the performance of the model from an AUC of 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). The application of deep learning to detect abnormalities on CT examinations using video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data is limited.


Subjects
Algorithms; Appendicitis/diagnosis; Appendicitis/metabolism; Deep Learning; Adult; Cross-Sectional Studies; Female; Humans; Male; Middle Aged
11.
NPJ Digit Med ; 3: 61, 2020.
Article in English | MEDLINE | ID: mdl-32352039

ABSTRACT

Pulmonary embolism (PE) is a life-threatening clinical problem and computed tomography pulmonary angiography (CTPA) is the gold standard for diagnosis. Prompt diagnosis and immediate treatment are critical to avoid high morbidity and mortality rates, yet PE remains among the diagnoses most frequently missed or delayed. In this study, we developed a deep learning model, PENet, to automatically detect PE on volumetric CTPA scans as an end-to-end solution for this purpose. PENet is a 77-layer 3D convolutional neural network (CNN) pretrained on the Kinetics-600 dataset and fine-tuned on a retrospective CTPA dataset collected from a single academic institution. PENet's performance in detecting PE was evaluated on data from two different institutions: a hold-out dataset from the same institution as the training data, and a second dataset collected from an external institution to evaluate model generalizability to an unrelated population. PENet achieved an AUROC of 0.84 [0.82-0.87] for detecting PE on the hold-out internal test set and 0.85 [0.81-0.88] on the external dataset. PENet also outperformed current state-of-the-art 3D CNN models. The results represent a successful application of an end-to-end 3D CNN model to the complex task of PE diagnosis without requiring computationally intensive and time-consuming preprocessing, and demonstrate sustained performance on data from an external institution. Our model could be applied as a triage tool to automatically identify clinically important PEs, allowing for prioritization for diagnostic radiology interpretation and improved care pathways via more efficient diagnosis.

12.
NPJ Digit Med ; 2: 111, 2019.
Article in English | MEDLINE | ID: mdl-31754637

ABSTRACT

Human-in-the-loop (HITL) AI may enable an ideal symbiosis of human experts and AI models, harnessing the advantages of both while at the same time overcoming their respective limitations. The purpose of this study was to investigate a novel collective intelligence technology designed to amplify the diagnostic accuracy of networked human groups by forming real-time systems modeled on biological swarms. Using small groups of radiologists, the swarm-based technology was applied to the diagnosis of pneumonia on chest radiographs and compared against human experts alone, as well as two state-of-the-art deep learning AI models. Our work demonstrates that both the swarm-based technology and the deep learning technology achieved higher diagnostic accuracy than the human experts alone. Our work further demonstrates that, when used in combination, the swarm-based technology and the deep learning technology outperformed either method alone. The superior diagnostic accuracy of the combined HITL AI solution compared to radiologists and AI alone has broad implications for the surging clinical AI deployment and implementation strategies in future practice.
