Results 1 - 20 of 29
1.
Article in English | MEDLINE | ID: mdl-38630580

ABSTRACT

OBJECTIVE: To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. METHODS: We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 billion parameters. We adopted soft prompts (ie, trainable vectors) with frozen LLM, where the LLM parameters were not updated (ie, frozen) and only the vectors of soft prompts were updated, known as prompt tuning. We added additional soft prompts as a prefix to the input layer, which were optimized during the prompt tuning. We evaluated the proposed method using 7 clinical NLP tasks and compared them with previous task-specific solutions based on Transformer models. RESULTS AND CONCLUSION: The proposed approach achieved state-of-the-art performance for 5 out of 7 major clinical NLP tasks using one unified generative LLM. Our approach outperformed previous task-specific transformer models by ∼3% for concept extraction and 7% for relation extraction applied to social determinants of health, 3.4% for clinical concept normalization, 3.4%-10% for clinical abbreviation disambiguation, and 5.5%-9% for natural language inference. Our approach also outperformed a previously developed prompt-based machine reading comprehension (MRC) model, GatorTron-MRC, for clinical concept and relation extraction. The proposed approach can deliver the "one model for all" promise from training to deployment using a unified generative LLM.
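The prompt-tuning setup described in this abstract (trainable soft-prompt vectors prepended to the input while the model weights stay frozen) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "model" is a fixed linear scorer over mean-pooled embeddings, not GatorTronGPT, and the names `frozen_w`, `forward`, `train_soft_prompt`, and `log_loss` are hypothetical.

```python
import math

def forward(frozen_w, soft_prompt, token_embs):
    """Prepend the soft prompts, mean-pool all rows, apply the frozen linear layer."""
    rows = soft_prompt + token_embs            # prefix the trainable vectors
    dim = len(frozen_w)
    pooled = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
    logit = sum(w * p for w, p in zip(frozen_w, pooled))
    return 1.0 / (1.0 + math.exp(-logit))      # sigmoid probability

def train_soft_prompt(frozen_w, soft_prompt, data, lr=0.5, epochs=200):
    """Gradient descent on the soft prompt only; frozen_w is never modified."""
    prompt = [row[:] for row in soft_prompt]   # work on a copy
    for _ in range(epochs):
        for token_embs, label in data:
            n_rows = len(prompt) + len(token_embs)
            err = forward(frozen_w, prompt, token_embs) - label
            # For log loss, d(loss)/d(prompt[j][d]) = err * frozen_w[d] / n_rows
            for row in prompt:
                for d, w in enumerate(frozen_w):
                    row[d] -= lr * err * w / n_rows
    return prompt

def log_loss(frozen_w, prompt, data):
    """Mean binary cross-entropy over the toy dataset."""
    total = 0.0
    for token_embs, label in data:
        p = forward(frozen_w, prompt, token_embs)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)

frozen_w = [1.0, -1.0]                         # stands in for frozen LLM weights
data = [([[0.2, 0.1], [0.3, 0.0]], 1),         # (token embeddings, label), made up
        ([[0.1, 0.0], [0.4, 0.2]], 1)]
init_prompt = [[0.0, 0.0], [0.0, 0.0]]         # two soft-prompt vectors
tuned = train_soft_prompt(frozen_w, init_prompt, data)
```

Only `tuned` changes during training; comparing `log_loss(frozen_w, init_prompt, data)` with `log_loss(frozen_w, tuned, data)` shows the effect of updating the prefix alone.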

2.
NPJ Digit Med ; 6(1): 210, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37973919

ABSTRACT

There is enormous enthusiasm for, and concern about, applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical natural language processing. We apply GatorTronGPT to generate 20 billion words of synthetic text. Synthetic NLP models trained using synthetic text generated by GatorTronGPT outperform models trained using real-world clinical text. Physicians' Turing test using 1 (worst) to 9 (best) scale shows that there are no significant differences in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.

3.
Nature ; 619(7969): 357-362, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37286606

ABSTRACT

Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment [1-3]. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing [4,5] to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7-94.9%, with an improvement of 5.36-14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.


Subjects
Clinical Decision-Making , Electronic Health Records , Natural Language Processing , Physicians , Humans , Clinical Decision-Making/methods , Patient Readmission , Hospital Mortality , Comorbidity , Length of Stay , Insurance Coverage , Area Under Curve , Point-of-Care Systems/trends , Clinical Trials as Topic
4.
NPJ Digit Med ; 5(1): 194, 2022 Dec 26.
Article in English | MEDLINE | ID: mdl-36572766

ABSTRACT

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, and the largest one trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model, GatorTron, using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og .

5.
PLoS One ; 17(10): e0273262, 2022.
Article in English | MEDLINE | ID: mdl-36240135

ABSTRACT

The fundamental challenge in machine learning is ensuring that trained models generalize well to unseen data. We developed a general technique for ameliorating the effect of dataset shift using generative adversarial networks (GANs) on a dataset of 149,298 handwritten digits and a dataset of 868,549 chest radiographs obtained from four academic medical centers. Efficacy was assessed by comparing area under the curve (AUC) pre- and post-adaptation. On the digit recognition task, the baseline CNN achieved an average internal test AUC of 99.87% (95% CI, 99.87-99.87%), which decreased to an average external test AUC of 91.85% (95% CI, 91.82-91.88%), with an average salvage of 35% from baseline upon adaptation. On the lung pathology classification task, the baseline CNN achieved an average internal test AUC of 78.07% (95% CI, 77.97-78.17%) and an average external test AUC of 71.43% (95% CI, 71.32-71.60%), with a salvage of 25% from baseline upon adaptation. Adversarial domain adaptation leads to improved model performance on radiographic data derived from multiple out-of-sample healthcare populations. This work can be applied to other medical imaging domains to help shape the deployment toolkit of machine learning in medicine.


Subjects
Deep Learning , Machine Learning , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography
6.
Math Biosci Eng ; 19(7): 6795-6813, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35730283

ABSTRACT

A significant amount of clinical research is observational by nature and derived from medical records, clinical trials, and large-scale registries. While there is no substitute for randomized, controlled experimentation, such experiments or trials are often costly, time consuming, and even ethically or practically impossible to execute. Combining classical regression and structural equation modeling with matching techniques can leverage the value of observational data. Nevertheless, identifying variables of greatest interest in high-dimensional data is frequently challenging, even with application of classical dimensionality reduction and/or propensity scoring techniques. Here, we demonstrate that projecting high-dimensional medical data onto a lower-dimensional manifold using deep autoencoders and post-hoc generation of treatment/control cohorts based on proximity in the lower-dimensional space results in better matching of confounding variables compared to classical propensity score matching (PSM) in the original high-dimensional space (P<0.0001) and performs similarly to PSM models constructed by experts with prior knowledge of the underlying pathology when evaluated on predicting risk ratios from real-world clinical data. Thus, in cases when the underlying problem is poorly understood and the data is high-dimensional in nature, matching in the autoencoder latent space might be of particular benefit.
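The cohort-generation step in this abstract (pairing treated and control patients by proximity in the lower-dimensional space) might be sketched as follows. The abstract does not say which matching algorithm was used, so greedy 1:1 nearest-neighbour matching without replacement is an assumption here, as are the function name `match_in_latent_space` and the made-up latent vectors, which stand in for the output of an already-trained autoencoder.

```python
import math

def match_in_latent_space(treated, controls):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a patient id to its latent vector
    (assumed to come from the encoder half of a trained autoencoder).
    Returns a list of (treated_id, control_id, distance) tuples.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_vec in treated.items():
        if not available:
            break                              # ran out of controls
        c_id = min(available, key=lambda c: math.dist(t_vec, available[c]))
        pairs.append((t_id, c_id, math.dist(t_vec, available[c_id])))
        del available[c_id]                    # each control is used once
    return pairs

# Hypothetical 2-D latent coordinates, not data from the study.
treated = {"t1": (0.0, 0.0), "t2": (1.0, 1.0)}
controls = {"c1": (0.9, 1.1), "c2": (0.1, -0.1), "c3": (5.0, 5.0)}
pairs = match_in_latent_space(treated, controls)
```

Greedy matching depends on the order in which treated patients are visited; an optimal-assignment method would remove that dependence at higher computational cost.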


Subjects
Research Design , Cohort Studies , Humans , Propensity Score
7.
Sci Rep ; 11(1): 7482, 2021 04 05.
Article in English | MEDLINE | ID: mdl-33820942

ABSTRACT

Real-time seizure detection is a resource-intensive process as it requires continuous monitoring of patients on stereoelectroencephalography. This study improves real-time seizure detection in drug-resistant epilepsy (DRE) patients by developing patient-specific deep learning models that utilize a novel self-supervised dynamic thresholding approach. Deep neural networks were constructed on over 2000 h of high-resolution, multichannel SEEG and video recordings from 14 DRE patients. Consensus labels from a panel of epileptologists were used to evaluate model efficacy. Self-supervised dynamic thresholding exhibited improvements in positive predictive value (PPV; difference: 39.0%; 95% CI 4.5-73.5%; Wilcoxon-Mann-Whitney test; N = 14; p = 0.03) with similar sensitivity (difference: 14.3%; 95% CI -21.7 to 50.3%; Wilcoxon-Mann-Whitney test; N = 14; p = 0.42) compared to static thresholds. In some models, training on as little as 10 min of SEEG data yielded robust detection. Cross-testing experiments reduced PPV (difference: 56.5%; 95% CI 25.8-87.3%; Wilcoxon-Mann-Whitney test; N = 14; p = 0.002), while multimodal detection significantly improved sensitivity (difference: 25.0%; 95% CI 0.2-49.9%; Wilcoxon-Mann-Whitney test; N = 14; p < 0.05). Self-supervised dynamic thresholding improved the efficacy of real-time seizure predictions. Multimodal models demonstrated potential to improve detection. These findings are promising for future deployment in epilepsy monitoring units to enable real-time seizure detection without annotated data and only minimal training time in individual patients.


Subjects
Electroencephalography , Seizures/diagnostic imaging , Stereotaxic Techniques , Video Recording , Algorithms , Electrophysiological Phenomena , Female , Humans , Male , Multimodal Imaging , Neural Networks, Computer , Seizures/physiopathology , Young Adult
8.
Radiol Artif Intell ; 3(2): e200098, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33928257

ABSTRACT

PURPOSE: To train a deep learning classification algorithm to predict chest radiograph severity scores and clinical outcomes in patients with coronavirus disease 2019 (COVID-19). MATERIALS AND METHODS: In this retrospective cohort study, patients aged 21-50 years who presented to the emergency department (ED) of a multicenter urban health system from March 10 to 26, 2020, with COVID-19 confirmation at real-time reverse-transcription polymerase chain reaction screening were identified. The initial chest radiographs, clinical variables, and outcomes, including admission, intubation, and survival, were collected within 30 days (n = 338; median age, 39 years; 210 men). Two fellowship-trained cardiothoracic radiologists examined chest radiographs for opacities and assigned a clinically validated severity score. A deep learning algorithm was trained to predict outcomes on a holdout test set composed of patients with confirmed COVID-19 who presented between March 27 and 29, 2020 (n = 161; median age, 60 years; 98 men) for both younger (age range, 21-50 years; n = 51) and older (age >50 years, n = 110) populations. Bootstrapping was used to compute CIs. RESULTS: The model trained on the chest radiograph severity score produced the following areas under the receiver operating characteristic curves (AUCs): 0.80 (95% CI: 0.73, 0.88) for the chest radiograph severity score, 0.76 (95% CI: 0.68, 0.84) for admission, 0.66 (95% CI: 0.56, 0.75) for intubation, and 0.59 (95% CI: 0.49, 0.69) for death. The model trained on clinical variables produced an AUC of 0.64 (95% CI: 0.55, 0.73) for intubation and an AUC of 0.59 (95% CI: 0.50, 0.68) for death. Combining chest radiography and clinical variables increased the AUC of intubation and death to 0.88 (95% CI: 0.79, 0.96) and 0.82 (95% CI: 0.72, 0.91), respectively. CONCLUSION: The combination of imaging and clinical information improves outcome predictions. Supplemental material is available for this article. © RSNA, 2020.
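The abstract states only that bootstrapping was used to compute CIs. One common way to do that for an AUC is the percentile bootstrap, sketched below; the percentile method, the number of resamples, and the function names `auc` and `bootstrap_auc_ci` are assumptions for illustration, and the labels and scores are made up.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement, recompute AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                    # need both classes to define AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

labels = [0, 0, 1, 1, 0, 1, 0, 1]              # made-up outcomes
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
```

With a test set as small as this toy one the interval is very wide, which is why the sketch discards resamples that happen to contain a single class.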

9.
Sci Rep ; 11(1): 1381, 2021 01 14.
Article in English | MEDLINE | ID: mdl-33446890

ABSTRACT

Early admission to the neurosciences intensive care unit (NSICU) is associated with improved patient outcomes. Natural language processing offers new possibilities for mining free text in electronic health record data. We sought to develop a machine learning model using both tabular and free text data to identify patients requiring NSICU admission shortly after arrival to the emergency department (ED). We conducted a single-center, retrospective cohort study of adult patients at the Mount Sinai Hospital, an academic medical center in New York City. All patients presenting to our institutional ED between January 2014 and December 2018 were included. Structured (tabular) demographic, clinical, bed movement record data, and free text data from triage notes were extracted from our institutional data warehouse. A machine learning model was trained to predict likelihood of NSICU admission at 30 min from arrival to the ED. We identified 412,858 patients presenting to the ED over the study period, of whom 1900 (0.5%) were admitted to the NSICU. The daily median number of ED presentations was 231 (IQR 200-256) and the median time from ED presentation to the decision for NSICU admission was 169 min (IQR 80-324). A model trained only with text data had an area under the receiver-operating curve (AUC) of 0.90 (95% confidence interval (CI) 0.87-0.91). A structured data-only model had an AUC of 0.92 (95% CI 0.91-0.94). A combined model trained on structured and text data had an AUC of 0.93 (95% CI 0.92-0.95). At a false positive rate of 1:100 (99% specificity), the combined model was 58% sensitive for identifying NSICU admission. A machine learning model using structured and free text data can predict NSICU admission soon after ED arrival. This may potentially improve ED and NSICU resource allocation. Further studies should validate our findings.
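The operating point quoted in this abstract (sensitivity at a 1:100 false positive rate, i.e. 99% specificity) can be read off a set of model scores as sketched below. This is an illustrative calculation on made-up scores, not the study's evaluation code; `sensitivity_at_specificity` is a hypothetical helper.

```python
def sensitivity_at_specificity(labels, scores, specificity=0.99):
    """Find the lowest threshold whose false positive rate stays within
    1 - specificity, then report the share of positives scoring above it."""
    neg = sorted(s for y, s in zip(labels, scores) if y == 0)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    # Number of false positives tolerated (small epsilon guards float rounding).
    allowed_fp = int((1 - specificity) * len(neg) + 1e-9)
    # Negatives strictly above this score are the only false positives.
    threshold = neg[len(neg) - allowed_fp - 1]
    sensitivity = sum(s > threshold for s in pos) / len(pos)
    return threshold, sensitivity

# 100 made-up negatives scored 0.00 .. 0.99 and four made-up positives.
neg_scores = [i / 100 for i in range(100)]
pos_scores = [0.5, 0.985, 0.995, 0.999]
labels = [0] * len(neg_scores) + [1] * len(pos_scores)
threshold, sens = sensitivity_at_specificity(labels, neg_scores + pos_scores)
```

Fixing specificity first reflects the setting in the abstract, where only about 0.5% of ED patients are admitted to the NSICU and false alarms dominate unless the threshold is strict.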


Subjects
Emergency Service, Hospital , Hospitalization , Machine Learning , Natural Language Processing , Nervous System Diseases/diagnosis , Triage , Adult , Female , Humans , Male , Neurosciences , New York City , Retrospective Studies
10.
J Neurointerv Surg ; 12(1): 72-76, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31273074

ABSTRACT

INTRODUCTION: Improved functional outcomes after mechanical thrombectomy for emergent large vessel occlusion depend on expedient reperfusion after clinical presentation. Device technology has improved substantially over the years, and several commercial options exist for both large-bore aspiration catheters and suction pump systems. OBJECTIVE: To compare various vacuum pumps and examine the aspiration forces they generate as well as the force of catheter tip detachment from an artificial thrombus. METHODS: Using an artificial thrombus made from polyvinyl alcohol gel, we tested various mechanical characteristics of commercially available suction pumps, including the Penumbra Jet Engine, Penumbra Max, Stryker Medela AXS, Microvention Gomco, and a 60 cc syringe. Both aspiration pressure and tip force generated were analyzed. Subsequently, a cohort of thrombectomy catheters were assessed using the Penumbra Jet Engine to determine tip forces generated on an artificial thrombus. One-way analysis of variance was used to assess statistical significance. RESULTS: The Penumbra Jet Engine system generated both the highest maximum aspiration pressures (28.8 inches Hg) and the highest tip force (23.68 grams force (gf)) on an artificial thrombus, with statistical significance compared with the other pump systems. Using the Jet Engine, the largest-bore catheter was associated with the highest tip force (32.12 gf). The overall correlation coefficient between catheter inner diameter and tip force was 0.98. CONCLUSIONS: The Penumbra Jet Engine pump generates significantly higher vacuum pressures and tip forces than the other commercially available aspiration pump systems. Furthermore, catheters with a larger inner diameter generate higher tip suction forces on aspiration. Whether these mechanical features lead to improved clinical outcomes is yet to be determined.


Subjects
Thrombectomy/instrumentation , Thrombectomy/methods , Vacuum Curettage/instrumentation , Vacuum Curettage/methods , Catheters , Humans , Suction/instrumentation , Suction/methods , Syringes , Treatment Outcome
11.
J Neuroimaging ; 30(1): 40-44, 2020 01.
Article in English | MEDLINE | ID: mdl-31721362

ABSTRACT

BACKGROUND AND PURPOSE: We aimed to evaluate the feasibility of an ultrafast whole-head contrast-enhanced MRA (CE-MRA) in morphometric assessment of intracranial aneurysms in comparison to routinely used time-of-flight (TOF)-MRA. METHODS: In this prospective single-institution study, patients with known untreated intracranial aneurysm underwent MRA. Routine multislab TOF-MRA was obtained with a 3D voxel size of .6 × .6 × 1 mm³ (6-minute acquisition time). CE-MRA of the whole head was obtained using Differential Subsampling with Cartesian Ordering (DISCO) and 2D Auto-calibrating Reconstruction for Cartesian imaging with a 3D voxel size of .75 × .75 × 1 mm³ at a 6-second temporal resolution. Morphometric features of intracranial aneurysms, including size, aneurysm sac morphology, and the presence of intraluminal thrombosis, were assessed on both techniques. Statistical analysis was performed using a combination of Kappa test, Bland-Altman, and correlation coefficient analysis. RESULTS: A total of 34 aneurysms in 28 patients were included. Aneurysm size measurements (mean ± SD) were similar between DISCO-MRA (4.1 ± 2.3 mm) and TOF-MRA (4.3 ± 2.8 mm) (P = .27). Bland-Altman analysis showed a mean difference of .4 mm and there was excellent correlation r = .91 (95% CI: .87-.96). In six aneurysms (17.6%), TOF-MRA could not confidently exclude intraluminal thrombosis. In seven aneurysms (20%), TOF-MRA could not depict aneurysm sac morphology, or could not do so confidently. CONCLUSIONS: The described ultrafast high spatial-resolution MRA is superior to routinely used TOF-MRA in assessment of morphometric features of intracranial aneurysms, such as intraluminal thrombosis and aneurysm morphology, and is obtained in a fraction of the time (6 seconds).
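The Bland-Altman analysis mentioned in this abstract summarises agreement between two measurement methods by the mean of the paired differences (the bias) and limits of agreement at bias ± 1.96 standard deviations. A minimal sketch follows; the measurements are invented for illustration and `bland_altman` is a hypothetical helper, not the authors' code.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)              # mean difference between methods
    sd = statistics.stdev(diffs)               # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up aneurysm sizes in mm, not data from the study.
tof = [4.0, 5.0, 6.0]
disco = [3.5, 5.5, 5.5]
bias, lower, upper = bland_altman(tof, disco)
```

A bias near zero with narrow limits indicates that the two techniques can be used interchangeably for sizing; the correlation coefficient alone does not show this, since two methods can correlate strongly while differing by a constant offset.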


Subjects
Brain/diagnostic imaging , Image Processing, Computer-Assisted/methods , Intracranial Aneurysm/diagnostic imaging , Magnetic Resonance Angiography/methods , Adult , Aged , Aged, 80 and over , Contrast Media , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prospective Studies , Sensitivity and Specificity
12.
PLoS One ; 14(2): e0211057, 2019.
Article in English | MEDLINE | ID: mdl-30759094

ABSTRACT

This study trained long short-term memory (LSTM) recurrent neural networks (RNNs) incorporating an attention mechanism to predict daily sepsis, myocardial infarction (MI), and vancomycin antibiotic administration over two week patient ICU courses in the MIMIC-III dataset. These models achieved next-day predictive AUC of 0.876 for sepsis, 0.823 for MI, and 0.833 for vancomycin administration. Attention maps built from these models highlighted those times when input variables most influenced predictions and could provide a degree of interpretability to clinicians. These models appeared to attend to variables that were proxies for clinician decision-making, demonstrating a challenge of using flexible deep learning approaches trained with EHR data to build clinical decision support. While continued development and refinement is needed, we believe that such models could one day prove useful in reducing information overload for ICU physicians by providing needed clinical decision support for a variety of clinically important tasks.


Subjects
Clinical Decision-Making , Deep Learning , Diagnosis, Computer-Assisted , Intensive Care Units , Models, Biological , Myocardial Infarction/diagnosis , Sepsis/diagnosis , Anti-Bacterial Agents/administration & dosage , Clinical Decision-Making/methods , Humans , Myocardial Infarction/pathology , Retrospective Studies , Sepsis/drug therapy , Sepsis/pathology , Vancomycin/administration & dosage
13.
Radiol Artif Intell ; 1(1): e180019, 2019 Jan.
Article in English | MEDLINE | ID: mdl-33937782

ABSTRACT

PURPOSE: To determine if weakly supervised learning with surrogate metrics and active transfer learning can hasten clinical deployment of deep learning models. MATERIALS AND METHODS: By leveraging Liver Tumor Segmentation (LiTS) challenge 2017 public data (n = 131 studies), natural language processing of reports, and an active learning method, a model was trained to segment livers on 239 retrospectively collected portal venous phase abdominal CT studies obtained between January 1, 2014, and December 31, 2016. Absolute volume differences between predicted and originally reported liver volumes were used to guide active learning and assess accuracy. Overall survival based on liver volumes predicted by this model (n = 34 patients) versus radiology reports and Model for End-Stage Liver Disease with sodium (MELD-Na) scores was assessed. Differences in absolute liver volume were compared by using the paired Student t test, Bland-Altman analysis, and intraclass correlation; survival analysis was performed with the Kaplan-Meier method and a Mantel-Cox test. RESULTS: Data from patients with poor liver volume prediction (n = 10) with a model trained only with publicly available data were incorporated into an active learning method that trained a new model (LiTS data plus over- and underestimated active learning cases [LiTS-OU]) that performed significantly better on a held-out institutional test set (absolute volume difference of 231 vs 176 mL, P = .0005). In overall survival analysis, predicted liver volumes using the best active learning-trained model (LiTS-OU) were at least comparable with liver volumes extracted from radiology reports and MELD-Na scores in predicting survival. CONCLUSION: Active transfer learning using surrogate metrics facilitated deployment of deep learning models for clinically meaningful liver segmentation at a major liver transplant center. © RSNA, 2019. Supplemental material is available for this article.

14.
PLoS Med ; 15(11): e1002683, 2018 11.
Article in English | MEDLINE | ID: mdl-30399157

ABSTRACT

BACKGROUND: There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task. METHODS AND FINDINGS: A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had an age mean (SD) of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong's test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855-0.866) on the joint MSH-NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). 
The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927-0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745-0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH-NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect hospital system of a radiograph for 99.95% NIH (22,050/22,062) and 99.98% MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system-specific biases. CONCLUSION: Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.
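The observation that merely sorting by hospital system already yields a high AUC follows from the prevalence gap between sites. The back-of-the-envelope sketch below uses the radiograph counts and rounded prevalences quoted in the abstract; because those inputs are rounded and the authors' joint MSH-NIH dataset may have been constructed differently, it only approximates the reported 0.861. The function name `auc_from_site_only` is hypothetical.

```python
def auc_from_site_only(n_a, prev_a, n_b, prev_b):
    """AUC of the rule 'predict positive if the image comes from site A'.

    For a binary predictor, AUC = (1 + TPR - FPR) / 2.
    """
    pos_a, pos_b = n_a * prev_a, n_b * prev_b
    neg_a, neg_b = n_a - pos_a, n_b - pos_b
    tpr = pos_a / (pos_a + pos_b)              # share of positives found at site A
    fpr = neg_a / (neg_a + neg_b)              # share of negatives found at site A
    return (1 + tpr - fpr) / 2

# Site A = MSH (42,396 radiographs, 34.2% pneumonia),
# site B = NIH (112,120 radiographs, 1.2% pneumonia).
auc_site = auc_from_site_only(42_396, 0.342, 112_120, 0.012)
```

Under these inputs the site-only rule scores roughly 0.86 without looking at a single pixel, which illustrates why a CNN that can identify the hospital system may appear to perform well internally and then fail to generalize.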


Subjects
Deep Learning , Diagnosis, Computer-Assisted/methods , Pneumonia/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography, Thoracic/methods , Adult , Aged , Cross-Sectional Studies , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Radiology Information Systems , Reproducibility of Results , Retrospective Studies , United States
15.
Oper Neurosurg (Hagerstown) ; 15(2): 184-193, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29040677

ABSTRACT

BACKGROUND: The use of intraoperative navigation during microscope cases can be limited when attention needs to be divided between the operative field and the navigation screens. Heads-up display (HUD), also referred to as augmented reality, permits visualization of navigation information during surgery workflow. OBJECTIVE: To detail our initial experience with HUD. METHODS: We retrospectively reviewed patients who underwent HUD-assisted surgery from April 2016 through April 2017. All lesions were assessed for accuracy and those from the latter half of the study were assessed for utility. RESULTS: Seventy-nine patients with 84 pathologies were included. Pathologies included aneurysms (14), arteriovenous malformations (6), cavernous malformations (5), intracranial stenosis (3), meningiomas (27), metastasis (4), craniopharyngiomas (4), gliomas (4), schwannomas (3), epidermoid/dermoids (3), pituitary adenomas (2), hemangioblastoma (2), choroid plexus papilloma (1), lymphoma (1), osteoblastoma (1), clival chordoma (1), cerebrospinal fluid leak (1), abscess (1), and a cerebellopontine angle Teflon granuloma (1). Fifty-nine lesions were deep and 25 were superficial. Structures identified included the lesion (81), vessels (48), and nerves/brain tissue (31). Accuracy was deemed excellent (71.4%), good (20.2%), or poor (8.3%). Deep lesions were less likely to have excellent accuracy (P = .029). HUD was used during bed/head positioning (50.0%), skin incision (17.3%), craniotomy (23.1%), dural opening (26.9%), corticectomy (13.5%), arachnoid opening (36.5%), and intracranial drilling (13.5%). HUD was deactivated at some point during the surgery in 59.6% of cases. There were no complications related to HUD use. CONCLUSION: HUD can be safely used for a wide variety of vascular and oncologic intracranial pathologies and can be utilized during multiple stages of surgery.


Subjects
Brain Neoplasms/surgery , Intracranial Aneurysm/surgery , Neuronavigation/methods , Surgery, Computer-Assisted/methods , Brain Neoplasms/diagnostic imaging , Female , Humans , Imaging, Three-Dimensional , Intracranial Aneurysm/diagnostic imaging , Magnetic Resonance Imaging , Male , Middle Aged , Neurosurgical Procedures/methods , Retrospective Studies
16.
Phys Rev E ; 95(2-1): 022102, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28297958

ABSTRACT

According to the van der Waals picture, attractive and repulsive forces play distinct roles in the structure of simple fluids. Here, we examine their roles in dynamics; specifically, in the degree of deterministic chaos using the Kolmogorov-Sinai (KS) entropy rate and the spectra of Lyapunov exponents. With computer simulations of three-dimensional Lennard-Jones and Weeks-Chandler-Andersen fluids, we find repulsive forces dictate these dynamical properties, with attractive forces reducing the KS entropy at a given thermodynamic state. Regardless of interparticle forces, the maximal Lyapunov exponent is intensive for systems ranging from 200 to 2000 particles. Our finite-size scaling analysis also shows that the KS entropy is both extensive (a linear function of system-size) and additive. Both temperature and density control the "dynamical chemical potential," the rate of linear growth of the KS entropy with system size. At fixed system-size, both the KS entropy and the largest exponent exhibit a maximum as a function of density. We attribute the maxima to the competition between two effects: as particles are forced to be in closer proximity, there is an enhancement from the sharp curvature of the repulsive potential and a suppression from the diminishing free volume and particle mobility. The extensivity and additivity of the KS entropy and the intensivity of the largest Lyapunov exponent, however, hold over a range of temperatures and densities across the liquid and liquid-vapor coexistence regimes.

17.
World Neurosurg ; 89: 1-8, 2016 05.
Article in English | MEDLINE | ID: mdl-26724633

ABSTRACT

OBJECTIVES: Although technical skills are fundamental in neurosurgery, there is little agreement on how to describe, measure, or compare skills among surgeons. The primary goal of this study was to develop a quantitative grading scale for technical surgical performance that distinguishes operator skill when graded by domain experts (residents, attendings, and nonsurgeons). Scores provided by raters should be highly reliable with respect to scores from other observers. METHODS: Neurosurgery residents were fitted with a head-mounted video camera while performing craniotomies under attending supervision. Seven videos, 1 from each postgraduate year (PGY) level (1-7), were anonymized and scored by 16 attendings, 8 residents, and 7 nonsurgeons using a grading scale. Seven skills were graded: incision, efficiency of instrument use, cauterization, tissue handling, drilling/craniotomy, confidence, and training level. RESULTS: A strong correlation was found between skills score and PGY level (P < 0.001, analysis of variance). Junior residents (PGY 1-3) had significantly lower scores than did senior residents (PGY 4-7, P < 0.001, t test). Significant variation among junior residents was observed, and senior residents' scores were not significantly different from one another. Interrater reliability, measured against other observers, was high (r = 0.581 ± 0.245, Spearman), as was assessment of resident training level (r = 0.583 ± 0.278, Spearman). Both variables were strongly correlated (r = 0.90, Pearson). Attendings, residents, and nonsurgeons did not score differently (P = 0.46, analysis of variance). CONCLUSIONS: Technical skills of neurosurgery residents recorded during craniotomy can be measured with high interrater reliability. Surgeons and nonsurgeons alike readily distinguish different skill levels. This type of assessment could be used to coach residents, to track performance over time, and potentially to compare skill levels. Developing an objective tool to evaluate surgical performance would be useful in several areas of neurosurgery education.
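The Spearman coefficients reported above compare one rater's scores with another's by rank. The following is a minimal sketch of that computation, assuming tied scores receive their average rank. The function names and the example scores are hypothetical, and the abstract does not say how the authors handled ties or aggregated rater pairs.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores from two raters for the same seven videos.
rater_a = [2, 3, 5, 6, 7, 8, 9]
rater_b = [1, 4, 4, 6, 8, 7, 9]
agreement = spearman(rater_a, rater_b)
```

An interrater summary like the one in the abstract would then average such coefficients over rater pairs; that aggregation step is not shown here.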


Subjects
Clinical Competence , Craniotomy/education , Internship and Residency , Neurosurgery/education , Videotape Recording , Humans , Observer Variation , Reproducibility of Results , Videotape Recording/instrumentation , Videotape Recording/methods
18.
Int J Comput Assist Radiol Surg ; 10(11): 1853-62, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25805306

ABSTRACT

PURPOSE: Develop measures to differentiate between experienced and inexperienced neurosurgeons in a virtual reality brain surgery simulator environment. METHODS: Medical students (n = 71) and neurosurgery residents (n = 12) completed four simulated glioblastoma multiforme resections. Simulated surgeries took place over four days with intermittent spacing in between (average time between surgeries of 4.77 ± 0.73 days). The volume of tumor removed (cc), volume of healthy brain removed (cc), and instrument path length (mm) were recorded. Additionally, surgical effectiveness (% tumor removed divided by % healthy brain removed) and efficiency (% tumor removed divided by instrument movement in mm) were calculated. Performance was compared (1) between groups, and (2) for each participant over time to assess the learning curve. In addition, the effect of real-time instruction ("coaching") was assessed with a randomly selected group of medical students. RESULTS: Neurosurgery residents removed less healthy brain, were more effective in removing tumor and sparing healthy brain tissue, required less instrument movement, and were more efficient in removing tumor tissue than medical students. Medical students approached the resident level of performance over serial sessions. Coached medical students showed more conservative surgical behavior, removing both less tumor and less healthy brain. In sum, neurosurgery residents removed more tumor, removed less healthy brain, and required less instrument movement than medical students. Coaching modified medical student performance. CONCLUSIONS: Virtual reality brain surgery can differentiate operators based on both recent and long-term experience and may be useful in the acquisition and assessment of neurosurgical skills. Coaching alters the learning curve of naive, inexperienced individuals.
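The two derived measures defined in the methods can be written out as a short sketch. It assumes the percentages of tumor and healthy brain removed are already available as inputs (the abstract records volumes in cc and does not say how they were converted). The class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SimulatedResection:
    tumor_removed_pct: float      # % of tumor volume removed
    healthy_removed_pct: float    # % of healthy brain volume removed
    path_length_mm: float         # total instrument path length in mm

    def effectiveness(self) -> float:
        """% tumor removed divided by % healthy brain removed."""
        return self.tumor_removed_pct / self.healthy_removed_pct

    def efficiency(self) -> float:
        """% tumor removed divided by instrument movement in mm."""
        return self.tumor_removed_pct / self.path_length_mm


# Made-up trial, not data from the study.
trial = SimulatedResection(tumor_removed_pct=80.0,
                           healthy_removed_pct=4.0,
                           path_length_mm=2000.0)
effectiveness = trial.effectiveness()
efficiency = trial.efficiency()
```

Both ratios are undefined when the denominator is zero (for example, a trial with no healthy brain removed), so a real analysis would need an explicit rule for that case.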


Subjects
Brain Neoplasms/surgery , Computer Simulation , Glioblastoma/surgery , Internship and Residency , Learning Curve , Neurosurgery/standards , Neurosurgical Procedures/standards , Students, Medical , User-Computer Interface , Clinical Competence , Computers , Female , Humans , Male , Models, Anatomic , Neurosurgery/education , Neurosurgical Procedures/education
19.
Proc Natl Acad Sci U S A ; 110(41): 16339-43, 2013 Oct 08.
Article in English | MEDLINE | ID: mdl-24065832

ABSTRACT

Connections between microscopic dynamical observables and macroscopic nonequilibrium (NE) properties have been pursued in statistical physics since Boltzmann, Gibbs, and Maxwell. The simulations we describe here establish a relationship between the Kolmogorov-Sinai entropy and the energy dissipated as heat from a NE system to its environment. First, we show that the Kolmogorov-Sinai or dynamical entropy can be separated into system and bath components and that the entropy of the system characterizes the dynamics of energy dissipation. Second, we find that the average change in the system dynamical entropy is linearly related to the average change in the energy dissipated to the bath. The constant energy and time scales of the bath fix the dynamical relationship between these two quantities. These results provide a link between microscopic dynamical variables and the macroscopic energetics of NE processes.


Subjects
Entropy , Hot Temperature , Models, Theoretical , Thermodynamics , Computer Simulation , Nanostructures
20.
Anal Bioanal Chem ; 401(6): 1949-61, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21789488

ABSTRACT

There has been a recent surge in applications of mass spectrometry (MS) to tissue analysis, particularly lipid-based tissue imaging using ambient ionization techniques. This recent growth highlights the need to examine the effects of sample handling, storage conditions, and experimental protocols on the quality of the data obtained. Variables such as time before freezing after organ removal, storage time at -80 °C, time stored at room temperature, heating, and freeze/thaw cycles were investigated for their effect on the data quality obtained in desorption electrospray ionization (DESI)-MS using mouse brain. In addition, analytical variables such as tissue thickness, drying times, and instrumental conditions were also examined for their impact on DESI-MS data. While no immediate changes were noted in the DESI-MS lipid profiles of the mouse brain tissue after spending 1 h at room temperature when compared to being frozen immediately following removal, minor changes were noted between the tissue samples after 7 months of storage at -80 °C. In tissue sections stored at room temperature, degradation was noted in 24 h by the appearance of fatty acid dimers, which are indicative of high fatty acid concentrations, while in contrast, those sections stored at -80 °C for 7 months showed no significant degradation. Tissue sections were also subjected to up to six freeze/thaw cycles and showed increasing degradation following each cycle. In addition, tissue pieces were subjected to 50 °C temperatures and analyzed at specific time points. In as little as 2 h, degradation was observed in the form of increased fatty acid dimer formation, indicating that enzymatic processes forming free fatty acids were still active in the tissue. We have associated these dimers with high concentrations of free fatty acids present in the tissue during DESI-MS experiments. 
Analytical variables such as tissue thickness and time left to dry under nitrogen were also investigated, with no change in the resulting profiles at thicknesses from 10 to 25 µm and with optimal signal obtained after just 20 min in the desiccator. Experimental conditions such as source parameters, spray solvents, and sample surfaces are all shown to impact the quality of the data. Inter-section (relative standard deviation (%RSD), 0.44-7.2%) and intra-sample (%RSD, 4.0-8.0%) reproducibility data show the high-quality information DESI-MS provides. Overall, the many variables investigated here showed DESI-MS to be a robust technique, with sample storage conditions having the greatest effect on the data obtained, and with unacceptable sample degradation occurring during room temperature storage.
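The %RSD figures quoted above follow the usual definition: the standard deviation expressed as a percentage of the mean. Below is a minimal sketch, assuming the sample (n - 1) standard deviation is used, which the abstract does not specify. The function name and the intensity values are hypothetical.

```python
from statistics import mean, stdev


def percent_rsd(measurements):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(measurements) / mean(measurements)


# Made-up replicate peak intensities for one lipid ion across tissue sections.
intensities = [98.0, 100.0, 102.0]
rsd = percent_rsd(intensities)
```

At least two measurements are needed, since `statistics.stdev` raises an error for a single value.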


Subjects
Brain Chemistry , Lipids/analysis , Spectrometry, Mass, Electrospray Ionization/methods , Animals , Fatty Acids/analysis , Freezing , Mice , Reproducibility of Results , Sensitivity and Specificity