Results 1 - 20 of 25
1.
BMC Public Health ; 20(1): 608, 2020 May 01.
Article in English | MEDLINE | ID: mdl-32357871

ABSTRACT

BACKGROUND: Risk adjustment models are employed to prevent adverse selection, anticipate budgetary reserve needs, and offer care management services to high-risk individuals. We aimed to address two unknowns about risk adjustment: whether machine learning (ML) and inclusion of social determinants of health (SDH) indicators improve prospective risk adjustment for health plan payments. METHODS: We employed a 2-by-2 factorial design comparing: (i) linear regression versus ML (gradient boosting) and (ii) demographics and diagnostic codes alone, versus additional ZIP code-level SDH indicators. Healthcare claims from privately-insured US adults (2016-2017), and Census data were used for analysis. Data from 1.02 million adults were used for derivation, and data from 0.26 million to assess performance. Model performance was measured using coefficient of determination (R2), discrimination (C-statistic), and mean absolute error (MAE) for the overall population, and predictive ratio and net compensation for vulnerable subgroups. We provide 95% confidence intervals (CI) around each performance measure. RESULTS: Linear regression without SDH indicators achieved moderate determination (R2 0.327, 95% CI: 0.300, 0.353), error ($6992; 95% CI: $6889, $7094), and discrimination (C-statistic 0.703; 95% CI: 0.701, 0.705). ML without SDH indicators improved all metrics (R2 0.388; 95% CI: 0.357, 0.420; error $6637; 95% CI: $6539, $6735; C-statistic 0.717; 95% CI: 0.715, 0.718), reducing misestimation of cost by $3.5 M per 10,000 members. Among people living in areas with high poverty, high wealth inequality, or high prevalence of uninsured, SDH indicators reduced underestimation of cost, improving the predictive ratio by 3% (~$200/person/year). CONCLUSIONS: ML improved risk adjustment models and the incorporation of SDH indicators reduced underpayment in several vulnerable populations.
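The subgroup payment-fit measures quoted above (predictive ratio, net compensation) have simple definitions; a minimal sketch with hypothetical function names, where a predictive ratio below 1.0 indicates the subgroup's costs are underestimated:

```python
def predictive_ratio(predicted, actual):
    """Total predicted cost over total actual cost for a subgroup.
    Values below 1.0 indicate systematic underpayment."""
    return sum(predicted) / sum(actual)

def net_compensation(predicted, actual):
    """Mean per-member overpayment (positive) or underpayment
    (negative) for a subgroup."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)
```

For example, a subgroup predicted at $80 and $90 against actual costs of $100 each has a predictive ratio of 0.85 and a net compensation of -$15 per member.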


Subject(s)
Health Promotion/economics , Health Promotion/statistics & numerical data , Health Insurance/economics , Health Insurance/statistics & numerical data , Machine Learning/economics , Machine Learning/statistics & numerical data , Social Determinants of Health/economics , Social Determinants of Health/statistics & numerical data , Adult , Cost-Benefit Analysis , Female , Humans , Male , Middle Aged , Prospective Studies , Risk Adjustment
2.
PLoS Med ; 15(11): e1002699, 2018 11.
Article in English | MEDLINE | ID: mdl-30481176

ABSTRACT

BACKGROUND: Magnetic resonance imaging (MRI) of the knee is the preferred method for diagnosing knee injuries. However, interpretation of knee MRI is time-intensive and subject to diagnostic error and variability. An automated system for interpreting knee MRI could prioritize high-risk patients and assist clinicians in making diagnoses. Deep learning methods, in being able to automatically learn layers of features, are well suited for modeling the complex relationships between medical images and their interpretations. In this study we developed a deep learning model for detecting general abnormalities and specific diagnoses (anterior cruciate ligament [ACL] tears and meniscal tears) on knee MRI exams. We then measured the effect of providing the model's predictions to clinical experts during interpretation. METHODS AND FINDINGS: Our dataset consisted of 1,370 knee MRI exams performed at Stanford University Medical Center between January 1, 2001, and December 31, 2012 (mean age 38.0 years; 569 [41.5%] female patients). The majority vote of 3 musculoskeletal radiologists established reference standard labels on an internal validation set of 120 exams. We developed MRNet, a convolutional neural network for classifying MRI series and combined predictions from 3 series per exam using logistic regression. In detecting abnormalities, ACL tears, and meniscal tears, this model achieved area under the receiver operating characteristic curve (AUC) values of 0.937 (95% CI 0.895, 0.980), 0.965 (95% CI 0.938, 0.993), and 0.847 (95% CI 0.780, 0.914), respectively, on the internal validation set. We also obtained a public dataset of 917 exams with sagittal T1-weighted series and labels for ACL injury from Clinical Hospital Centre Rijeka, Croatia. 
On the external validation set of 183 exams, the MRNet trained on Stanford sagittal T2-weighted series achieved an AUC of 0.824 (95% CI 0.757, 0.892) in the detection of ACL injuries with no additional training, while an MRNet trained on the rest of the external data achieved an AUC of 0.911 (95% CI 0.864, 0.958). We additionally measured the specificity, sensitivity, and accuracy of 9 clinical experts (7 board-certified general radiologists and 2 orthopedic surgeons) on the internal validation set both with and without model assistance. Using a 2-sided Pearson's chi-squared test with adjustment for multiple comparisons, we found no significant differences between the performance of the model and that of unassisted general radiologists in detecting abnormalities. General radiologists achieved significantly higher sensitivity in detecting ACL tears (p-value = 0.002; q-value = 0.019) and significantly higher specificity in detecting meniscal tears (p-value = 0.003; q-value = 0.019). Using a 1-tailed t test on the change in performance metrics, we found that providing model predictions significantly increased clinical experts' specificity in identifying ACL tears (p-value < 0.001; q-value = 0.006). The primary limitations of our study include lack of surgical ground truth and the small size of the panel of clinical experts. CONCLUSIONS: Our deep learning model can rapidly generate accurate clinical pathology classifications of knee MRI exams from both internal and external datasets. Moreover, our results support the assertion that deep learning models can improve the performance of clinical experts during medical imaging interpretation. Further research is needed to validate the model prospectively and to determine its utility in the clinical setting.
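The exam-level combination step described above (logistic regression over per-series MRNet outputs) can be sketched as follows; the weights and bias are stand-ins for coefficients that would be fit on training data:

```python
import math

def combine_series(series_probs, weights, bias):
    """Fuse per-series abnormality probabilities (e.g. from the
    sagittal, coronal, and axial series of one exam) into a single
    exam-level probability via a fitted logistic regression."""
    z = bias + sum(w * p for w, p in zip(weights, series_probs))
    return 1.0 / (1.0 + math.exp(-z))
```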


Subject(s)
Anterior Cruciate Ligament Injuries/diagnostic imaging , Deep Learning , Computer-Assisted Diagnosis/methods , Computer-Assisted Image Interpretation/methods , Knee/diagnostic imaging , Magnetic Resonance Imaging/methods , Tibial Meniscus Injuries/diagnostic imaging , Adult , Automation , Factual Databases , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Reproducibility of Results , Retrospective Studies , Young Adult
3.
PLoS Med ; 15(11): e1002686, 2018 11.
Article in English | MEDLINE | ID: mdl-30457988

ABSTRACT

BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists. METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. 
The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution. CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.
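The AUC comparisons above reduce to the Mann-Whitney interpretation of the ROC curve; a small self-contained sketch:

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```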


Subject(s)
Clinical Competence , Deep Learning , Computer-Assisted Diagnosis/methods , Pneumonia/diagnostic imaging , Computer-Assisted Radiographic Image Interpretation/methods , Thoracic Radiography/methods , Radiologists , Humans , Predictive Value of Tests , Reproducibility of Results , Retrospective Studies
4.
Pac Symp Biocomput ; 29: 120-133, 2024.
Article in English | MEDLINE | ID: mdl-38160274

ABSTRACT

Lack of diagnosis coding is a barrier to leveraging veterinary notes for medical and public health research. Previous work has been limited to developing specialized rule-based or custom supervised learning models to predict diagnosis codes, an approach that is tedious and not easily transferable. In this work, we show that open-source large language models (LLMs) pretrained on a general corpus can achieve reasonable performance in a zero-shot setting. Alpaca-7B achieves a zero-shot F1 of 0.538 on CSU test data and 0.389 on PP test data, two standard benchmarks for coding from veterinary notes. Furthermore, with appropriate fine-tuning, the performance of LLMs can be substantially boosted, exceeding that of strong state-of-the-art supervised models. VetLLM, which is fine-tuned from Alpaca-7B using just 5000 veterinary notes, achieves an F1 of 0.747 on CSU test data and 0.637 on PP test data. Notably, our fine-tuning is data-efficient: using 200 notes can outperform supervised models trained with more than 100,000 notes. These findings demonstrate the great potential of leveraging LLMs for language processing tasks in medicine, and we advocate this new paradigm for processing clinical text.
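One common way to score predicted code sets against gold sets is a micro-averaged F1; a minimal sketch of that evaluation (the paper's exact aggregation may differ, and benchmark loading is not shown):

```python
def micro_f1(gold_code_sets, pred_code_sets):
    """Micro-averaged F1 between per-note sets of gold and
    predicted diagnosis codes."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_code_sets, pred_code_sets):
        tp += len(gold & pred)   # codes correctly predicted
        fp += len(pred - gold)   # spurious codes
        fn += len(gold - pred)   # missed codes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```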


Subject(s)
New World Camelids , Humans , Animals , Natural Language Processing , Computational Biology , Language
5.
AMIA Annu Symp Proc ; 2023: 1007-1016, 2023.
Article in English | MEDLINE | ID: mdl-38222438

ABSTRACT

Low-yield repetitive laboratory diagnostics burden patients and inflate cost of care. In this study, we assess whether stability in repeated laboratory diagnostic measurements is predictable with uncertainty estimates using electronic health record data available before the diagnostic is ordered. We use probabilistic regression to predict a distribution of plausible values, allowing use-time customization for various definitions of "stability" given dynamic ranges and clinical scenarios. After converting distributions into "stability" scores, the models achieve a sensitivity of 29% for white blood cells, 60% for hemoglobin, 100% for platelets, 54% for potassium, 99% for albumin and 35% for creatinine for predicting stability at 90% precision, suggesting those fractions of repetitive tests could be reduced with low risk of missing important changes. The findings demonstrate the feasibility of using electronic health record data to identify low-yield repetitive tests and offer personalized guidance for better usage of testing while ensuring high quality care.
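Converting a predicted distribution into a use-time "stability" score, as described above, amounts to integrating the predictive density over a clinician-chosen band. A sketch assuming a Gaussian predictive distribution (the study's models may use a different distribution family):

```python
import math

def stability_score(mu, sigma, low, high):
    """Probability that the next measurement falls inside the band
    [low, high], under a Normal(mu, sigma) predictive distribution."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return cdf(high) - cdf(low)
```

Because the band is supplied at call time, the same predicted distribution supports different definitions of "stability" for different dynamic ranges and clinical scenarios.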


Subject(s)
Clinical Laboratory Techniques , Hemoglobins , Humans
6.
Patterns (N Y) ; 4(9): 100802, 2023 Sep 08.
Article in English | MEDLINE | ID: mdl-37720336

ABSTRACT

Artificial intelligence (AI) models for automatic generation of narrative radiology reports from images have the potential to enhance efficiency and reduce the workload of radiologists. However, evaluating the correctness of these reports requires metrics that can capture clinically pertinent differences. In this study, we investigate the alignment between automated metrics and radiologists' scoring of errors in report generation. We address the limitations of existing metrics by proposing new metrics, RadGraph F1 and RadCliQ, which demonstrate stronger correlation with radiologists' evaluations. In addition, we analyze the failure modes of the metrics to understand their limitations and provide guidance for metric selection and interpretation. This study establishes RadGraph F1 and RadCliQ as meaningful metrics for guiding future research in radiology report generation.
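RadGraph F1 scores how well the clinical entities of a generated report overlap those of the reference. A much-simplified sketch, assuming the entity sets have already been extracted (the real metric also matches entity labels and relations):

```python
def entity_f1(ref_entities, gen_entities):
    """F1 overlap between sets of clinical entities extracted from
    a reference report and a generated report."""
    if not ref_entities or not gen_entities:
        return 0.0
    tp = len(ref_entities & gen_entities)
    precision = tp / len(gen_entities)
    recall = tp / len(ref_entities)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```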

7.
Patterns (N Y) ; 3(1): 100400, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35079716

ABSTRACT

Data labeling is often the limiting step in machine learning because it requires time from trained experts. To address the limitation on labeled data, contrastive learning, among other unsupervised learning methods, leverages unlabeled data to learn representations of data. Here, we propose a contrastive learning framework that utilizes metadata for selecting positive and negative pairs when training on unlabeled data. We demonstrate its application in the healthcare domain on heart and lung sound recordings. The increasing availability of heart and lung sound recordings due to adoption of digital stethoscopes lends itself as an opportunity to demonstrate the application of our contrastive learning method. Compared to contrastive learning with augmentations, the contrastive learning model leveraging metadata for pair selection utilizes clinical information associated with lung and heart sound recordings. This approach uses shared context of the recordings on the patient level using clinical information including age, sex, weight, location of sounds, etc. We show improvement in downstream tasks for diagnosing heart and lung sounds when leveraging patient-specific representations in selecting positive and negative pairs. This study paves the path for medical applications of contrastive learning that leverage clinical information. We have made our code available here: https://github.com/stanfordmlgroup/selfsupervised-lungandheartsounds.
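The pair-selection idea, choosing contrastive positives by shared patient-level metadata rather than by augmentation, can be sketched as follows (the data layout is hypothetical):

```python
from itertools import combinations

def select_pairs(recording_patient):
    """Partition all pairs of recordings into contrastive positives
    (same patient) and negatives (different patients), using
    metadata instead of augmented views.
    `recording_patient` maps recording id -> patient id."""
    positives, negatives = [], []
    for (a, pa), (b, pb) in combinations(recording_patient.items(), 2):
        (positives if pa == pb else negatives).append((a, b))
    return positives, negatives
```

In practice the shared context could be refined further with clinical fields such as age, sex, weight, or recording location, as the abstract describes.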

8.
Nat Biomed Eng ; 6(12): 1399-1406, 2022 12.
Article in English | MEDLINE | ID: mdl-36109605

ABSTRACT

In tasks involving the interpretation of medical images, suitably trained machine-learning models often exceed the performance of medical experts. Yet such a high-level of performance typically requires that the models be trained with relevant datasets that have been painstakingly annotated by experts. Here we show that a self-supervised model trained on chest X-ray images that lack explicit annotations performs pathology-classification tasks with accuracies comparable to those of radiologists. On an external validation dataset of chest X-rays, the self-supervised model outperformed a fully supervised model in the detection of three pathologies (out of eight), and the performance generalized to pathologies that were not explicitly annotated for model training, to multiple image-interpretation tasks and to datasets from multiple institutions.


Subject(s)
Machine Learning , Supervised Machine Learning , X-Rays
9.
J Am Med Inform Assoc ; 29(11): 1908-1918, 2022 10 07.
Article in English | MEDLINE | ID: mdl-35994003

ABSTRACT

OBJECTIVE: Chest pain is common, and current risk-stratification methods, requiring 12-lead electrocardiograms (ECGs) and serial biomarker assays, are static and restricted to highly resourced settings. Our objective was to predict myocardial injury using continuous single-lead ECG waveforms similar to those obtained from wearable devices and to evaluate the potential of transfer learning from labeled 12-lead ECGs to improve these predictions. METHODS: We studied 10 874 Emergency Department (ED) patients who received continuous ECG monitoring and troponin testing from 2020 to 2021. We defined myocardial injury as newly elevated troponin in patients with chest pain or shortness of breath. We developed deep learning models of myocardial injury using continuous lead II ECG from bedside monitors as well as conventional 12-lead ECGs from triage. We pretrained single-lead models on a pre-existing corpus of labeled 12-lead ECGs. We compared model predictions to those of ED physicians. RESULTS: A transfer learning strategy, whereby models for continuous single-lead ECGs were first pretrained on 12-lead ECGs from a separate cohort, predicted myocardial injury as accurately as models using patients' own 12-lead ECGs: area under the receiver operating characteristic curve 0.760 (95% confidence interval [CI], 0.721-0.799) and area under the precision-recall curve 0.321 (95% CI, 0.251-0.397). Models demonstrated a high negative predictive value for myocardial injury among patients with chest pain or shortness of breath, exceeding the predictive performance of ED physicians, while attending to known stigmata of myocardial injury. CONCLUSIONS: Deep learning models pretrained on labeled 12-lead ECGs can predict myocardial injury from noisy, continuous monitor data early in a patient's presentation. 
The utility of continuous single-lead ECG in the risk stratification of chest pain has implications for wearable devices and preclinical settings, where external validation of the approach is needed.
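The negative-predictive-value claim above is the fraction of "rule-out" predictions that are correct; a small sketch:

```python
def negative_predictive_value(labels, preds):
    """Fraction of negative predictions that are true negatives,
    i.e. how safely a negative prediction rules out the condition."""
    predicted_negative = [y for y, p in zip(labels, preds) if p == 0]
    true_negative = sum(1 for y in predicted_negative if y == 0)
    return true_negative / len(predicted_negative)
```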


Subject(s)
Chest Pain , Electrocardiography , Biomarkers , Chest Pain/diagnosis , Chest Pain/etiology , Dyspnea/diagnosis , Dyspnea/etiology , Electrocardiography/methods , Hospital Emergency Service , Humans , Machine Learning , Troponin
10.
J Thorac Imaging ; 37(3): 162-167, 2022 May 01.
Article in English | MEDLINE | ID: mdl-34561377

ABSTRACT

PURPOSE: Patients with pneumonia often present to the emergency department (ED) and require prompt diagnosis and treatment. Clinical decision support systems for the diagnosis and management of pneumonia are commonly utilized in EDs to improve patient care. The purpose of this study is to investigate whether a deep learning model for detecting radiographic pneumonia and pleural effusions can improve functionality of a clinical decision support system (CDSS) for pneumonia management (ePNa) operating in 20 EDs. MATERIALS AND METHODS: In this retrospective cohort study, a dataset of 7434 prior chest radiographic studies from 6551 ED patients was used to develop and validate a deep learning model to identify radiographic pneumonia, pleural effusions, and evidence of multilobar pneumonia. Model performance was evaluated against 3 radiologists' adjudicated interpretation and compared with performance of the natural language processing of radiology reports used by ePNa. RESULTS: The deep learning model achieved an area under the receiver operating characteristic curve of 0.833 (95% confidence interval [CI]: 0.795, 0.868) for detecting radiographic pneumonia, 0.939 (95% CI: 0.911, 0.962) for detecting pleural effusions and 0.847 (95% CI: 0.800, 0.890) for identifying multilobar pneumonia. On all 3 tasks, the model achieved higher agreement with the adjudicated radiologist interpretation compared with ePNa. CONCLUSIONS: A deep learning model demonstrated higher agreement with radiologists than the ePNa CDSS in detecting radiographic pneumonia and related findings. Incorporating deep learning models into pneumonia CDSS could enhance diagnostic performance and improve pneumonia management.


Subject(s)
Clinical Decision Support Systems , Deep Learning , Pleural Effusion , Pneumonia , Hospital Emergency Service , Humans , Pleural Effusion/diagnostic imaging , Pneumonia/diagnostic imaging , Thoracic Radiography , Retrospective Studies
11.
Sci Data ; 8(1): 135, 2021 05 20.
Article in English | MEDLINE | ID: mdl-34017010

ABSTRACT

Diffuse Large B-Cell Lymphoma (DLBCL) is the most common non-Hodgkin lymphoma. Though histologically DLBCL shows varying morphologies, no morphologic features have been consistently demonstrated to correlate with prognosis. We present a morphologic analysis of histology sections from 209 DLBCL cases with associated clinical and cytogenetic data. Duplicate tissue core sections were arranged in tissue microarrays (TMAs), and replicate sections were stained with H&E and immunohistochemical stains for CD10, BCL6, MUM1, BCL2, and MYC. The TMAs are accompanied by pathologist-annotated regions-of-interest (ROIs) that identify areas of tissue representative of DLBCL. We used a deep learning model to segment all tumor nuclei in the ROIs, and computed several geometric features for each segmented nucleus. We fit a Cox proportional hazards model to demonstrate the utility of these geometric features in predicting survival outcome, and found that it achieved a C-index (95% CI) of 0.635 (0.574,0.691). Our finding suggests that geometric features computed from tumor nuclei are of prognostic importance, and should be validated in prospective studies.
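The C-index reported for the Cox model is Harrell's concordance: among comparable patient pairs, how often the higher predicted risk experienced the event first. A self-contained sketch (censoring is handled only through the basic comparability rule):

```python
def c_index(times, events, risks):
    """Harrell's concordance index. A pair (i, j) is comparable when
    i has an observed event before j's time; it is concordant when
    i also has the higher predicted risk (ties count 0.5)."""
    concordant = comparable = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:  # censored subjects cannot anchor a pair
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable
```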


Subject(s)
Deep Learning , Diffuse Large B-Cell Lymphoma/genetics , Diffuse Large B-Cell Lymphoma/pathology , Cell Nucleus/ultrastructure , Eosine Yellowish-(YS) , Hematoxylin , Humans , Prognosis , Staining and Labeling , Tissue Array Analysis
12.
JAMA Netw Open ; 4(7): e2117391, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34297075

ABSTRACT

Importance: Physicians are required to work with rapidly growing amounts of medical data. Approximately 62% of time per patient is devoted to reviewing electronic health records (EHRs), with clinical data review being the most time-consuming portion. Objective: To determine whether an artificial intelligence (AI) system developed to organize and display new patient referral records would improve a clinician's ability to extract patient information compared with the current standard of care. Design, Setting, and Participants: In this prognostic study, an AI system was created to organize patient records and improve data retrieval. To evaluate the system on time and accuracy, a nonblinded, prospective study was conducted at a single academic medical center. Recruitment emails were sent to all physicians in the gastroenterology division, and 12 clinicians agreed to participate. Each of the clinicians participating in the study received 2 referral records: 1 AI-optimized patient record and 1 standard (non-AI-optimized) patient record. For each record, clinicians were asked 22 questions requiring them to search the assigned record for clinically relevant information. Clinicians reviewed records from June 1 to August 30, 2020. Main Outcomes and Measures: The time required to answer each question, along with accuracy, was measured for both records, with and without AI optimization. Participants were asked to assess overall satisfaction with the AI system, their preferred review method (AI-optimized vs standard), and other topics to assess clinical utility. Results: Twelve gastroenterology physicians/fellows completed the study. Compared with standard (non-AI-optimized) patient record review, the AI system saved first-time physician users 18% of the time used to answer the clinical questions (10.5 [95% CI, 8.5-12.6] vs 12.8 [95% CI, 9.4-16.2] minutes; P = .02). 
There was no significant decrease in accuracy when physicians retrieved important patient information (83.7% [95% CI, 79.3%-88.2%] with the AI-optimized vs 86.0% [95% CI, 81.8%-90.2%] without the AI-optimized record; P = .81). Survey responses from physicians were generally positive across all questions. Eleven of 12 physicians (92%) preferred the AI-optimized record review to standard review. Despite a learning curve pointed out by respondents, 11 of 12 physicians believed that the technology would save them time to assess new patient records and were interested in using this technology in their clinic. Conclusions and Relevance: In this prognostic study, an AI system helped physicians extract relevant patient information in a shorter time while maintaining high accuracy. This finding is particularly germane to the ever-increasing amounts of medical data and increased stressors on clinicians. Increased user familiarity with the AI system, along with further enhancements in the system itself, hold promise to further improve physician data extraction from large quantities of patient health records.


Subject(s)
Artificial Intelligence , Information Storage and Retrieval/methods , Medical Records , Physicians/psychology , User-Centered Design , Academic Medical Centers , Adult , Female , Humans , Job Satisfaction , Male , Middle Aged , Prospective Studies , Referral and Consultation , Task Performance and Analysis , Time Factors , Workload/psychology
13.
EBioMedicine ; 71: 103546, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34419924

ABSTRACT

BACKGROUND: Respiratory virus infections are significant causes of morbidity and mortality, and may induce host metabolite alterations by infecting respiratory epithelial cells. We investigated the use of liquid chromatography quadrupole time-of-flight mass spectrometry (LC/Q-TOF) combined with machine learning for the diagnosis of influenza infection. METHODS: We analyzed nasopharyngeal swab samples by LC/Q-TOF to identify distinct metabolic signatures for diagnosis of acute illness. Machine learning models were performed for classification, followed by Shapley additive explanation (SHAP) analysis to analyze feature importance and for biomarker discovery. FINDINGS: A total of 236 samples were tested in the discovery phase by LC/Q-TOF, including 118 positive samples (40 influenza A 2009 H1N1, 39 influenza H3 and 39 influenza B) as well as 118 age and sex-matched negative controls with acute respiratory illness. Analysis showed an area under the receiver operating characteristic curve (AUC) of 1.00 (95% confidence interval [95% CI] 0.99, 1.00), sensitivity of 1.00 (95% CI 0.86, 1.00) and specificity of 0.96 (95% CI 0.81, 0.99). The metabolite most strongly associated with differential classification was pyroglutamic acid. Independent validation of a biomarker signature based on the top 20 differentiating ion features was performed in a prospective cohort of 96 symptomatic individuals including 48 positive samples (24 influenza A 2009 H1N1, 5 influenza H3 and 19 influenza B) and 48 negative samples. Testing performed using a clinically-applicable targeted approach, liquid chromatography triple quadrupole mass spectrometry, showed an AUC of 1.00 (95% CI 0.998, 1.00), sensitivity of 0.94 (95% CI 0.83, 0.98), and specificity of 1.00 (95% CI 0.93, 1.00). Limitations include lack of sample suitability assessment, and need to validate these findings in additional patient populations. 
INTERPRETATION: This metabolomic approach has potential for diagnostic applications in infectious diseases testing, including other respiratory viruses, and may eventually be adapted for point-of-care testing. FUNDING: None.


Subject(s)
Human Influenza/diagnosis , Machine Learning , Metabolome , Molecular Diagnostic Techniques/methods , Adolescent , Adult , Child , Preschool Child , Female , Gas Chromatography-Mass Spectrometry/methods , Humans , Human Influenza/metabolism , Human Influenza/virology , Male , Metabolomics/methods , Nasal Mucosa/metabolism , Nasal Mucosa/virology , Orthomyxoviridae/pathogenicity , Pyrrolidonecarboxylic Acid/analysis
14.
NPJ Digit Med ; 4(1): 88, 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-34075194

ABSTRACT

Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven clinical value of CAC, the current clinical practice implementation for CAC has limitations such as the lack of insurance coverage for the test, need for capital-intensive CT machines, specialized imaging protocols, and accredited 3D imaging labs for analysis (including personnel and software). Perhaps the greatest gap is the millions of patients who undergo routine chest CT exams and demonstrate coronary artery calcification, but their presence is not often reported or quantitation is not feasible. We present two deep learning models that automate CAC scoring demonstrating advantages in automated scoring for both dedicated gated coronary CT exams and routine non-gated chest CTs performed for other reasons to allow opportunistic screening. First, we trained a gated coronary CT model for CAC scoring that showed near perfect agreement (mean difference in scores = -2.86; Cohen's Kappa = 0.89, P < 0.0001) with current conventional manual scoring on a retrospective dataset of 79 patients and was found to perform the task faster (average time for automated CAC scoring using a graphics processing unit (GPU) was 3.5 ± 2.1 s vs. 261 s for manual scoring) in a prospective trial of 55 patients with little difference in scores compared to three technologists (mean difference in scores = 3.24, 5.12, and 5.48, respectively). 
Then using CAC scores from paired gated coronary CT as a reference standard, we trained a deep learning model on our internal data and a cohort from the Multi-Ethnic Study of Atherosclerosis (MESA) study (total training n = 341, Stanford test n = 42, MESA test n = 46) to perform CAC scoring on routine non-gated chest CT exams with validation on external datasets (total n = 303) obtained from four geographically disparate health systems. On identifying patients with any CAC (i.e., CAC ≥ 1), sensitivity and PPV was high across all datasets (ranges: 80-100% and 87-100%, respectively). For CAC ≥ 100 on routine non-gated chest CTs, which is the latest recommended threshold to initiate statin therapy, our model showed sensitivities of 71-94% and positive predictive values in the range of 88-100% across all the sites. Adoption of this model could allow more patients to be screened with CAC scoring, potentially allowing opportunistic early preventive interventions.
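Agreement between automated and manual scoring is summarized above with Cohen's kappa; a sketch of the statistic for categorical labels (e.g. CAC risk categories):

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same
    items: (observed agreement - expected agreement) / (1 - expected)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Expected agreement if each rater labeled independently at
    # their own marginal category rates.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)
```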

15.
Sci Rep ; 10(1): 3958, 2020 03 03.
Artículo en Inglés | MEDLINE | ID: mdl-32127625

ABSTRACT

The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of fewer than 500 CT exams. We explored whether pretraining the model on a large collection of natural videos would improve its performance over training from scratch. AppendiXNet was pretrained on Kinetics, a large collection of approximately 500,000 YouTube video clips each annotated with one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved performance, from an AUC of 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). Deep learning for detecting abnormalities on CT examinations with video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data are limited.
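The AUC used above to compare the pretrained and from-scratch models can be computed directly from its pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. This is a generic sketch, not the study's evaluation code:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    P(score of a random positive > score of a random negative),
    counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates 4 positive-negative pairs, of which 3 are correctly ordered, giving 0.75; an uninformative classifier scores 0.5.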


Subject(s)
Algorithms, Appendicitis/diagnosis, Appendicitis/metabolism, Deep Learning, Adult, Cross-Sectional Studies, Female, Humans, Male, Middle Aged
16.
NPJ Digit Med ; 3: 115, 2020.
Article in English | MEDLINE | ID: mdl-32964138

ABSTRACT

Tuberculosis (TB) is the leading cause of preventable death in HIV-positive patients, yet it often remains undiagnosed and untreated. Chest x-ray is often used to assist in diagnosis, but this presents additional challenges due to atypical radiographic presentation and radiologist shortages in regions where co-infection is most common. We developed a deep learning algorithm to diagnose TB using clinical information and chest x-ray images from 677 HIV-positive patients with suspected TB from two hospitals in South Africa. We then sought to determine whether the algorithm, deployed as a web-based diagnostic assistant, could assist clinicians in diagnosing TB in HIV-positive patients. Use of the algorithm resulted in a modest but statistically significant improvement in clinician accuracy (p = 0.002), increasing the mean clinician accuracy from 0.60 (95% CI 0.57, 0.63) without assistance to 0.65 (95% CI 0.60, 0.70) with assistance. However, the accuracy of assisted clinicians was significantly lower (p < 0.001) than that of the stand-alone algorithm, which had an accuracy of 0.79 (95% CI 0.77, 0.82) on the same unseen test cases. These results suggest that deep learning assistance may improve clinician accuracy in TB diagnosis using chest x-rays, which would be valuable in settings with a high burden of HIV/TB co-infection. Moreover, the high accuracy of the stand-alone algorithm suggests potential value, particularly in settings with a scarcity of radiological expertise.

18.
NPJ Digit Med ; 3: 61, 2020.
Article in English | MEDLINE | ID: mdl-32352039

ABSTRACT

Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomography pulmonary angiography (CTPA) is the gold standard for diagnosis. Prompt diagnosis and immediate treatment are critical to avoid high morbidity and mortality, yet PE remains among the diagnoses most frequently missed or delayed. In this study, we developed a deep learning model, PENet, to automatically detect PE on volumetric CTPA scans as an end-to-end solution. PENet is a 77-layer 3D convolutional neural network (CNN) pretrained on the Kinetics-600 dataset and fine-tuned on a retrospective CTPA dataset collected from a single academic institution. Model performance was evaluated on data from two institutions: a hold-out dataset from the same institution as the training data, and a second dataset collected from an external institution to evaluate generalizability to an unrelated population. PENet achieved an AUROC of 0.84 [0.82-0.87] for detecting PE on the hold-out internal test set and 0.85 [0.81-0.88] on the external dataset, and outperformed current state-of-the-art 3D CNN models. These results represent a successful application of an end-to-end 3D CNN to the complex task of PE diagnosis without computationally intensive and time-consuming preprocessing, and demonstrate sustained performance on data from an external institution. Our model could be applied as a triage tool to automatically identify clinically important PEs, allowing prioritization for diagnostic radiology interpretation and improved care pathways via more efficient diagnosis.

19.
NPJ Digit Med ; 3: 23, 2020.
Article in English | MEDLINE | ID: mdl-32140566

ABSTRACT

Artificial intelligence (AI) algorithms continue to rival human performance on a variety of clinical tasks, while their actual impact on human diagnosticians, when incorporated into clinical workflows, remains relatively unexplored. In this study, we developed a deep learning-based assistant to help pathologists differentiate between two subtypes of primary liver cancer, hepatocellular carcinoma and cholangiocarcinoma, on hematoxylin and eosin-stained whole-slide images (WSI), and evaluated its effect on the diagnostic performance of 11 pathologists with varying levels of expertise. Our model achieved accuracies of 0.885 on a validation set of 26 WSI, and 0.842 on an independent test set of 80 WSI. Although use of the assistant did not change the mean accuracy of the 11 pathologists (p = 0.184, OR = 1.281), it significantly improved the accuracy (p = 0.045, OR = 1.499) of a subset of nine pathologists who fell within well-defined experience levels (GI subspecialists, non-GI subspecialists, and trainees). In the assisted state, model accuracy significantly impacted the diagnostic decisions of all 11 pathologists. As expected, when the model's prediction was correct, assistance significantly improved accuracy (p = 0.000, OR = 4.289), whereas when the model's prediction was incorrect, assistance significantly decreased accuracy (p = 0.000, OR = 0.253), with both effects holding across all pathologist experience levels and case difficulty levels. Our results highlight the challenges of translating AI models into the clinical setting, and emphasize the importance of taking into account potential unintended negative consequences of model assistance when designing and testing medical AI-assistance tools.
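The odds ratios reported above come from 2x2 contingency tables of correct versus incorrect diagnoses in the assisted and unassisted states. A minimal sketch of the computation, with hypothetical counts rather than the study's data:

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 contingency table:

                   correct  incorrect
      assisted        a         b
      unassisted      c         d

    OR > 1 means the odds of a correct diagnosis are higher
    with assistance than without.
    """
    if b == 0 or c == 0:
        raise ValueError("table has a zero cell; OR is undefined")
    return (a * d) / (b * c)
```

With hypothetical counts of 30 correct / 10 incorrect assisted reads versus 20 / 20 unassisted, `odds_ratio(30, 10, 20, 20)` gives an OR of 3.0.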

20.
IEEE Trans Pattern Anal Mach Intell ; 31(5): 824-40, 2009 May.
Article in English | MEDLINE | ID: mdl-19299858

ABSTRACT

We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate and visually pleasing. For each small homogeneous patch in the image, we use a Markov Random Field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues and the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3D structure than prior art and to give a much richer experience in 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.
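A plane parameterization of the kind described above admits a compact depth formula: if a patch's plane is encoded by a vector alpha such that alpha · q = 1 for every point q on the plane, then the point hit by a unit viewing ray r̂ is q = d·r̂, so the depth is d = 1/(alpha · r̂). The sketch below uses our own notation and is not the paper's code:

```python
def patch_depth(alpha, ray):
    """Depth along a unit-length viewing ray for a patch whose plane
    satisfies alpha . q = 1; substituting q = d * ray gives
    d = 1 / (alpha . ray)."""
    dot = sum(a * r for a, r in zip(alpha, ray))
    if abs(dot) < 1e-12:
        raise ValueError("ray is (near-)parallel to the plane")
    return 1.0 / dot
```

For example, the plane z = 2 corresponds to alpha = (0, 0, 0.5), and a camera ray straight down the z-axis recovers depth 2; a slanted unit ray (0.6, 0, 0.8) hits the same plane at depth 2.5.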


Subject(s)
Algorithms, Artificial Intelligence, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Information Storage and Retrieval/methods, Pattern Recognition, Automated/methods, Photography/methods, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity