2.
Epilepsy Res ; 207: 107451, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39276641

ABSTRACT

OBJECTIVES: Monitoring seizure control metrics is key to clinical care of patients with epilepsy. Manually abstracting these metrics from unstructured text in electronic health records (EHR) is laborious. We aimed to abstract the date of last seizure and seizure frequency from clinical notes of patients with epilepsy using natural language processing (NLP). METHODS: We extracted seizure control metrics from notes of patients seen in epilepsy clinics from two hospitals in Boston. Extraction was performed with the pretrained model RoBERTa_for_seizureFrequency_QA, for both date of last seizure and seizure frequency, combined with regular expressions. We designed the algorithm to categorize the timing of last seizure ("today", "1-6 days ago", "1-4 weeks ago", "more than 1-3 months ago", "more than 3-6 months ago", "more than 6-12 months ago", "more than 1-2 years ago", "more than 2 years ago") and seizure frequency ("innumerable", "multiple", "daily", "weekly", "monthly", "once per year", "less than once per year"). Our ground truth consisted of structured questionnaires filled out by physicians. Model performance was measured using the areas under the receiver operating characteristic curve (AUROC) and precision-recall curve (AUPRC) for categorical labels, and the median absolute error (MAE) for ordinal labels, with 95% confidence intervals (CI) estimated via bootstrapping. RESULTS: Our cohort included 1773 adult patients with a total of 5658 visits with reported seizure control metrics, seen in epilepsy clinics between December 2018 and May 2022. The cohort's average age was 42 years; the majority were female (57%), White (81%), and non-Hispanic (85%). The models achieved an MAE (95% CI) of 4 (4.00-4.86) weeks for date of last seizure and 0.02 (0.02-0.02) seizures per day for seizure frequency. CONCLUSIONS: Our NLP approach demonstrates that extraction of seizure control metrics from the EHR is feasible, allowing for large-scale EHR research.
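
The bootstrapped confidence intervals described above can be pictured with a short sketch. Everything below (data, error model, variable names) is invented for illustration and is not the study's code.

# Minimal sketch, assuming a nonparametric bootstrap over visits to get a 95% CI
# for the median absolute error between abstracted and physician-reported values.
import numpy as np

rng = np.random.default_rng(0)

def median_absolute_error(y_true, y_pred):
    return np.median(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))   # resample visits with replacement
        stats.append(median_absolute_error(y_true[idx], y_pred[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return median_absolute_error(y_true, y_pred), (lo, hi)

# toy example: weeks since last seizure, NLP-abstracted vs. questionnaire ground truth
truth = rng.integers(0, 104, 200)
pred = truth + rng.normal(0, 5, 200).round()
print(bootstrap_ci(truth, pred))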

3.
Ann Am Thorac Soc ; 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39288402

ABSTRACT

RATIONALE: Multiple mechanisms are involved in the pathogenesis of obstructive sleep apnea (OSA). Elevated loop gain (LG) is a key target for precision OSA care and may be associated with treatment intolerance when the upper airway is the sole therapeutic target. Morphological or computational estimation of LG is not yet widely available or fully validated; there is a need for improved phenotyping/endotyping of apnea to advance its therapy and prognosis. OBJECTIVES: This study proposes a new algorithm to assess self-similarity (SS) as a signature of elevated loop gain using respiratory effort signals and presents its use to predict the probability of acute failure (high residual event counts) of continuous positive airway pressure (CPAP) therapy. METHODS: Effort signals from 2145 split-night polysomnography studies from the Massachusetts General Hospital were analyzed for SS and used to predict acute CPAP therapy effectiveness. Logistic regression models were trained and evaluated using 5-fold cross-validation. RESULTS: Receiver operating characteristic (ROC) and precision-recall (PR) curves with AUC values of 0.82 and 0.84, respectively, were obtained. Self-similarity combined with the central apnea index (CAI) and hypoxic burden outperformed CAI alone. Even in those with a low CAI by conventional scoring criteria or only mild desaturation, SS was related to poor therapy outcomes. CONCLUSIONS: The proposed algorithm for assessing SS as a measure of expressed high loop gain is accurate, non-invasive, and has the potential to improve phenotyping/endotyping of apnea, leading to more precise sleep apnea treatment strategies.
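
The evaluation design (5-fold cross-validated logistic regression scored with ROC and precision-recall AUCs) can be sketched as follows; the synthetic features stand in for the self-similarity score, CAI, and hypoxic burden and are not the study's data.

# Illustrative sketch only: cross-validated logistic regression predicting acute
# CPAP failure from stand-in features (self-similarity, CAI, hypoxic burden).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score, average_precision_score

X, y = make_classification(n_samples=2000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=cv, method="predict_proba")[:, 1]
print("ROC AUC:", roc_auc_score(y, proba))
print("PR AUC :", average_precision_score(y, proba))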

4.
Sleep ; 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39155830

ABSTRACT

The ability to assess sleep at home, capture sleep stages, and detect the occurrence of apnea (without on-body sensors) simply by analyzing the radio waves bouncing off people's bodies while they sleep is quite powerful. Such a capability would allow for longitudinal data collection in patients' homes, informing our understanding of sleep and its interaction with various diseases and their therapeutic responses, both in clinical trials and routine care. In this article, we develop an advanced machine learning algorithm for passively monitoring sleep and nocturnal breathing from radio waves reflected off people while asleep. Validation results in comparison with the gold standard (i.e., polysomnography) (n=880) demonstrate that the model captures the sleep hypnogram (with an accuracy of 80.5% for 30-second epochs categorized into Wake, Light Sleep, Deep Sleep, or REM), detects sleep apnea (AUROC = 0.89), and measures the patient's Apnea-Hypopnea Index (ICC=0.90; 95% CI = [0.88, 0.91]). Notably, the model exhibits equitable performance across race, sex, and age. Moreover, the model uncovers informative interactions between sleep stages and a range of diseases including neurological, psychiatric, cardiovascular, and immunological disorders. These findings not only hold promise for clinical practice and interventional trials but also underscore the significance of sleep as a fundamental component in understanding and managing various diseases.
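
Agreement between a device-estimated Apnea-Hypopnea Index and polysomnography is often summarized with an intraclass correlation; the sketch below assumes a two-way random, single-measure ICC(2,1) formulation and synthetic data, and is not the authors' implementation.

# Assumed formulation: Shrout-Fleiss ICC(2,1) for device vs. PSG AHI values.
import numpy as np

def icc_2_1(ratings):
    # ratings: (n_subjects, k_raters); here k = 2 (device estimate, PSG reference)
    n, k = ratings.shape
    mean_subj = ratings.mean(axis=1, keepdims=True)
    mean_rater = ratings.mean(axis=0, keepdims=True)
    grand = ratings.mean()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((mean_subj - grand) ** 2).sum()     # between-subject
    ss_cols = n * ((mean_rater - grand) ** 2).sum()    # between-rater
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

rng = np.random.default_rng(0)
psg = rng.gamma(2.0, 8.0, 880)                 # synthetic reference AHI values
device = psg + rng.normal(0, 3.0, 880)         # synthetic device estimates
print(icc_2_1(np.column_stack([device, psg])))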

5.
Sleep ; 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39215679

ABSTRACT

STUDY OBJECTIVES: This study aimed to 1) improve sleep staging accuracy through transfer learning, to achieve or exceed human inter-expert agreement; and 2) introduce a scorability model to assess the quality and trustworthiness of automated sleep staging. METHODS: A deep neural network (base model) was trained on a large multi-site polysomnography (PSG) dataset from the United States. Transfer learning was used to calibrate the model to a reduced montage and limited samples from the Korean Genome and Epidemiology Study (KoGES) dataset. Model performance was compared to inter-expert reliability among three human experts. A scorability assessment was developed to predict the agreement between the model and human experts. RESULTS: Initial sleep staging by the base model showed lower agreement with experts (κ=0.55) compared to inter-expert agreement (κ=0.62). Calibration with 324 randomly sampled training cases matched expert agreement levels. Further targeted sampling improved performance, with models exceeding inter-expert agreement (κ=0.70). The scorability assessment, combining biosignal quality and model confidence features, predicted model-expert agreement moderately well (R²=0.42). Recordings with higher scorability scores demonstrated greater model-expert agreement than inter-expert agreement. Even with lower scorability scores, model performance was comparable to inter-expert agreement. CONCLUSIONS: Fine-tuning a pre-trained neural network through targeted transfer learning significantly enhances sleep staging performance for an atypical montage, achieving and surpassing human expert agreement levels. The introduction of a scorability assessment provides a robust measure of reliability, ensuring quality control and enhancing the practical application of the system before deployment. This approach marks an important advancement in automated sleep analysis, demonstrating the potential for AI to exceed human performance in clinical settings.
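
Agreement of the kind reported here (model versus expert, expert versus expert) is typically summarized at the epoch level with Cohen's kappa; this toy sketch uses synthetic hypnograms and is not the KoGES analysis.

# Minimal sketch: epoch-level Cohen's kappa between two 5-class hypnograms.
import numpy as np
from sklearn.metrics import cohen_kappa_score

stages = np.array(["W", "N1", "N2", "N3", "REM"])
rng = np.random.default_rng(0)
expert = rng.choice(stages, size=1000, p=[0.2, 0.1, 0.4, 0.15, 0.15])
model = expert.copy()
flip = rng.random(1000) < 0.25                 # model disagrees on ~25% of epochs
model[flip] = rng.choice(stages, size=flip.sum())
print(cohen_kappa_score(expert, model))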

6.
Front Psychiatry ; 15: 1373797, 2024.
Article in English | MEDLINE | ID: mdl-39109366

ABSTRACT

Introduction: The 21-point Brain Care Score (BCS) is a novel tool designed to motivate individuals and care providers to take action to reduce the risk of stroke and dementia by encouraging lifestyle changes. Given that late-life depression is increasingly recognized to share risk factors with stroke and dementia, and is an important clinical endpoint for brain health, we tested the hypothesis that a higher BCS is associated with a reduced incidence of future depression. Additionally, we examined its association with a brain health composite outcome comprising stroke, dementia, and late-life depression. Methods: The BCS was derived from the United Kingdom Biobank baseline evaluation in participants with complete data on BCS items. Associations of BCS with the risk of subsequent incident late-life depression and the composite brain health outcome were estimated using multivariable Cox proportional hazard models. These models were adjusted for age at baseline and sex assigned at birth. Results: A total of 363,323 participants were included in this analysis, with a median BCS at baseline of 12 (IQR: 11-14). There were 6,628 incident cases of late-life depression during a median follow-up period of 13 years. Each five-point increase in baseline BCS was associated with a 33% lower risk of incident late-life depression (95% CI: 29%-36%) and a 27% lower risk of the incident composite outcome (95% CI: 24%-30%). Discussion: These data further demonstrate the shared risk factors across depression, dementia, and stroke. The findings suggest that a higher BCS, indicative of healthier lifestyle choices, is significantly associated with a lower incidence of late-life depression and a composite brain health outcome. Additional validation of the BCS is warranted to assess the weighting of its components, its motivational aspects, and its acceptability and adaptability in routine clinical care worldwide.
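
The analysis structure (Cox proportional hazards for incident late-life depression as a function of baseline BCS, adjusted for age and sex) can be sketched with synthetic data; the effect sizes and the lifelines library are assumptions for illustration, not the study's pipeline.

# Hedged sketch: hazard ratio per 5-point BCS increase from a Cox model on synthetic data.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "bcs_per5": rng.integers(0, 22, n) / 5.0,       # BCS rescaled so the HR is per 5 points
    "age": rng.normal(57, 8, n),
    "sex_male": rng.integers(0, 2, n),
})
hazard = np.exp(-0.4 * df["bcs_per5"] + 0.03 * (df["age"] - 57))   # invented effects
df["time"] = rng.exponential(1 / (0.01 * hazard))
df["event"] = (df["time"] < 13).astype(int)                        # events within 13 years
df["time"] = df["time"].clip(upper=13)

cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.summary.loc["bcs_per5", "exp(coef)"])                    # hazard ratio per 5 points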

7.
Am J Epidemiol ; 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39060160

ABSTRACT

Fall-related injuries (FRIs) are a major cause of hospitalizations among older patients, but identifying them in unstructured clinical notes poses challenges for large-scale research. In this study, we developed and evaluated Natural Language Processing (NLP) models to address this issue. We utilized all available clinical notes from Mass General Brigham for 2,100 older adults, identifying 154,949 paragraphs of interest through automatic scanning for FRI-related keywords. Two clinical experts directly labeled 5,000 paragraphs to generate benchmark-standard labels, while 3,689 validated patterns were annotated, indirectly labeling 93,157 paragraphs as validated-standard labels. Five NLP models (vanilla BERT, RoBERTa, Clinical-BERT, Distil-BERT, and SVM) were trained using 2,000 benchmark paragraphs and all validated paragraphs. BERT-based models were trained in three stages: Masked Language Modeling, General Boolean Question Answering (QA), and QA for FRI. For validation, 500 benchmark paragraphs were used, and the remaining 2,500 for testing. Performance metrics (precision, recall, F1 score, and area under the ROC [AUROC] or precision-recall [AUPR] curves) were used for comparison, with RoBERTa showing the best performance: precision 0.90 [0.88-0.91], recall [0.90-0.93], F1 score 0.90 [0.89-0.92], and AUROC and AUPR both 0.96 [0.95-0.97]. These NLP models accurately identify FRIs from unstructured clinical notes, potentially enhancing the efficiency of clinical notes-based research.
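
The keyword-screening step can be pictured as a simple regular-expression filter over note paragraphs; the pattern and example notes below are hypothetical, not the study's keyword list.

# Hypothetical keyword screen for paragraphs that may describe a fall-related injury (FRI).
import re

FRI_PATTERN = re.compile(
    r"\b(fall|fell|falling|slipped|tripped)\b.*?"
    r"\b(fracture|injur\w*|laceration|contusion)\b",
    re.IGNORECASE | re.DOTALL,
)

def flag_fri_paragraphs(paragraphs):
    """Return indices of paragraphs containing candidate FRI language."""
    return [i for i, p in enumerate(paragraphs) if FRI_PATTERN.search(p)]

notes = [
    "Patient fell in the bathroom and sustained a hip fracture.",
    "Routine follow-up for hypertension; no acute complaints.",
]
print(flag_fri_paragraphs(notes))   # -> [0]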

8.
Neurology ; 103(4): e209687, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39052961

ABSTRACT

OBJECTIVES: To investigate associations between health-related behaviors as measured using the Brain Care Score (BCS) and neuroimaging markers of white matter injury. METHODS: This prospective cohort study in the UK Biobank assessed the BCS, a novel tool designed to empower patients to address 12 dementia and stroke risk factors. The BCS ranges from 0 to 21, with higher scores suggesting better brain care. Outcomes included white matter hyperintensity (WMH) volume, fractional anisotropy (FA), and mean diffusivity (MD) obtained during 2 imaging assessments, as well as their progression between assessments, using multivariable linear regression adjusted for age and sex. RESULTS: We included 34,509 participants (average age 55 years, 53% female) with no history of stroke or dementia. Every 5-point increase in baseline BCS was associated with significantly lower WMH volume (25% [95% CI 23%-27%] at the first imaging assessment; 33% [27%-39%] at the repeat assessment), higher FA (18% [16%-20%] first; 22% [15%-28%] repeat), and lower MD (9% [7%-11%] first; 10% [4%-16%] repeat). In addition, a higher baseline BCS was associated with a 10% [3%-17%] reduction in WMH progression and FA decline over time. DISCUSSION: This study extends the impact of the BCS to neuroimaging markers of clinically silent cerebrovascular disease. Our results suggest that improving one's BCS could be a valuable intervention to prevent early brain health decline.
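
The regression structure (log-transformed WMH volume on BCS, adjusted for age and sex) can be sketched with synthetic data; variable names and effect sizes below are assumptions, not UK Biobank estimates.

# Sketch under assumptions: OLS of log WMH on BCS (per 5 points), age, and sex.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "bcs_per5": rng.integers(0, 22, n) / 5.0,
    "age": rng.normal(55, 7, n),
    "sex_male": rng.integers(0, 2, n),
})
df["log_wmh"] = 1.5 - 0.25 * df["bcs_per5"] + 0.04 * (df["age"] - 55) + rng.normal(0, 0.6, n)

fit = smf.ols("log_wmh ~ bcs_per5 + age + sex_male", data=df).fit()
print(fit.params["bcs_per5"])                  # coefficient on the log scale
print(1 - np.exp(fit.params["bcs_per5"]))      # approximate % lower WMH per 5 BCS points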


Subject(s)
Neuroimaging, Humans, Female, Male, Middle Aged, Neuroimaging/methods, Prospective Studies, Brain/diagnostic imaging, White Matter/diagnostic imaging, White Matter/pathology, Magnetic Resonance Imaging, Cohort Studies, Diffusion Tensor Imaging, Risk Factors, Aged, Adult
9.
Neurocrit Care ; 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39043984

ABSTRACT

BACKGROUND: Identical bursts on electroencephalography (EEG) are considered a specific predictor of poor outcomes in cardiac arrest, but their relationship with the severity of structural brain injury on magnetic resonance imaging (MRI) is not known. METHODS: This was a retrospective analysis of clinical, EEG, and MRI data from adult comatose patients after cardiac arrest. Burst similarity in the first 72 h from the time of return of spontaneous circulation was calculated using dynamic time warping (DTW) for bursts of equal (i.e., 500 ms) and varying (i.e., 100-500 ms) lengths and cross-correlation for bursts of equal lengths. Structural brain injury severity was measured using the whole-brain mean apparent diffusion coefficient (ADC) on MRI. Pearson's correlation coefficients were calculated between mean burst similarity across consecutive 12-24-h time blocks and mean whole-brain ADC values. Good outcome was defined as a Cerebral Performance Category of 1-2 (i.e., independence for activities of daily living) at the time of hospital discharge. RESULTS: Of 113 patients with cardiac arrest, 45 had burst suppression (mean cardiac arrest to MRI time 4.3 days). Three study participants with burst suppression had a good outcome. Burst similarity calculated using DTW with bursts of varying lengths was correlated with mean ADC value in the first 36 h after cardiac arrest: Pearson's r: 0-12 h: -0.69 (p = 0.039), 12-24 h: -0.54 (p = 0.002), 24-36 h: -0.41 (p = 0.049). Burst similarity measured with bursts of equal lengths was not associated with mean ADC value using cross-correlation or DTW, except for DTW at 60-72 h (-0.96, p = 0.04). CONCLUSIONS: Burst similarity on EEG after cardiac arrest may be associated with acute brain injury severity on MRI. This association was time dependent when measured using DTW.
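
A minimal dynamic time warping sketch shows how bursts of unequal length can be compared; this simplified, length-normalized distance is an illustration, not the paper's exact similarity measure.

# Toy DTW distance between two bursts; smaller values indicate more similar bursts.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)                     # length-normalized alignment cost

rng = np.random.default_rng(0)
burst1 = np.sin(np.linspace(0, 4 * np.pi, 100)) + 0.1 * rng.normal(size=100)
burst2 = np.sin(np.linspace(0, 4 * np.pi, 80)) + 0.1 * rng.normal(size=80)    # shorter burst
print(dtw_distance(burst1, burst2))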

10.
Ann Clin Transl Neurol ; 11(7): 1681-1690, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38867375

ABSTRACT

BACKGROUND/OBJECTIVES: Epileptiform activity (EA), including seizures and periodic patterns, worsens outcomes in patients with acute brain injuries (e.g., aneurysmal subarachnoid hemorrhage [aSAH]). Randomized controlled trials (RCTs) assessing anti-seizure interventions are needed. Because of scant drug efficacy data, ethical reservations about placebo use, and the complex physiology of acute brain injury, such trials are lacking or hindered by design constraints. We used a pharmacological model-guided simulator to design and determine the feasibility of RCTs evaluating EA treatment. METHODS: In a single-center cohort of adults (age >18) with aSAH and EA, we employed a mechanistic pharmacokinetic-pharmacodynamic framework to model treatment response using observational data. We subsequently simulated RCTs for levetiracetam and propofol, each with three treatment arms mirroring clinical practice and an additional placebo arm. Using our framework, we simulated EA trajectories across treatment arms. We predicted discharge modified Rankin Scale as a function of baseline covariates, EA burden, and drug doses using a double machine learning model learned from observational data. Differences in outcomes across arms were used to estimate the required sample size. RESULTS: Sample sizes ranged from 500 for levetiracetam 7 mg/kg versus placebo to >4000 for levetiracetam 15 mg/kg versus 7 mg/kg to achieve 80% power (5% type I error). For propofol 1 mg/kg/h versus placebo, 1200 participants were needed. Simulations comparing propofol at varying doses did not reach 80% power even at samples >1200. CONCLUSIONS: Our simulations using modeled drug efficacy show that the required sample sizes are infeasible, even for potentially unethical placebo-controlled trials. We highlight the strength of simulations with observational data for informing the null hypothesis and propose using this simulation-based RCT paradigm to assess the feasibility of future trials of anti-seizure treatment in acute brain injury.
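
The simulation-based sample-size logic can be sketched as repeated simulated trials at increasing enrollment; the outcome model, effect size, and test below are invented placeholders, not the pharmacokinetic-pharmacodynamic framework used in the study.

# Toy power calculation by simulation: find the n per arm at which ~80% of simulated
# trials detect an assumed difference in a binary outcome.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def power_at_n(n_per_arm, effect=0.10, sims=500, alpha=0.05):
    hits = 0
    for _ in range(sims):
        placebo = rng.binomial(1, 0.40, n_per_arm)            # assumed control "good outcome" rate
        treated = rng.binomial(1, 0.40 + effect, n_per_arm)
        _, p = stats.ttest_ind(treated, placebo)
        hits += p < alpha
    return hits / sims

for n in (100, 250, 500, 1000):
    print(n, power_at_n(n))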


Subject(s)
Anticonvulsants, Levetiracetam, Seizures, Humans, Anticonvulsants/administration & dosage, Levetiracetam/administration & dosage, Seizures/drug therapy, Seizures/etiology, Adult, Middle Aged, Male, Female, Propofol/administration & dosage, Randomized Controlled Trials as Topic/methods, Brain Injuries/drug therapy, Brain Injuries/complications, Subarachnoid Hemorrhage/drug therapy, Subarachnoid Hemorrhage/complications, Aged, Research Design
11.
NEJM AI ; 1(6)2024 Jun.
Article in English | MEDLINE | ID: mdl-38872809

ABSTRACT

BACKGROUND: In intensive care units (ICUs), critically ill patients are monitored with electroencephalography (EEG) to prevent serious brain injury. EEG monitoring is constrained by clinician availability, and EEG interpretation can be subjective and prone to interobserver variability. Automated deep-learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, existing uninterpretable (black-box) deep-learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of both trust and adoption by clinicians. METHODS: We developed an interpretable deep-learning system that accurately classifies six patterns of potentially harmful EEG activity - seizure, lateralized periodic discharges (LPDs), generalized periodic discharges (GPDs), lateralized rhythmic delta activity (LRDA), generalized rhythmic delta activity (GRDA), and other patterns - while providing faithful case-based explanations of its predictions. The model was trained on 50,697 total 50-second continuous EEG samples collected from 2711 patients in the ICU between July 2006 and March 2020 at Massachusetts General Hospital. EEG samples were labeled as one of the six EEG patterns by 124 domain experts and trained annotators. To evaluate the model, we asked eight medical professionals with relevant backgrounds to classify 100 EEG samples into the six pattern categories - once with and once without artificial intelligence (AI) assistance - and we assessed the assistive power of this interpretable system by comparing the diagnostic accuracy of the two methods. The model's discriminatory performance was evaluated with area under the receiver-operating characteristic curve (AUROC) and area under the precision-recall curve. The model's interpretability was measured with task-specific neighborhood agreement statistics that interrogated the similarities of samples and features. In a separate analysis, the latent space of the neural network was visualized by using dimension reduction techniques to examine whether the ictal-interictal injury continuum hypothesis, which asserts that seizures and seizure-like patterns of brain activity lie along a spectrum, is supported by data. RESULTS: The performance of all users significantly improved when provided with AI assistance. Mean user diagnostic accuracy improved from 47 to 71% (P<0.04). The model achieved AUROCs of 0.87, 0.93, 0.96, 0.92, 0.93, and 0.80 for the classes seizure, LPD, GPD, LRDA, GRDA, and other patterns, respectively. This performance was significantly higher than that of a corresponding uninterpretable black-box model (with P<0.0001). Videos traversing the ictal-interictal injury manifold from dimension reduction (a two-dimensional representation of the original high-dimensional feature space) give insight into the layout of EEG patterns within the network's latent space and illuminate relationships between EEG patterns that were previously hypothesized but had not yet been shown explicitly. These results indicate that the ictal-interictal injury continuum hypothesis is supported by data. CONCLUSIONS: Users showed significant pattern classification accuracy improvement with the assistance of this interpretable deep-learning model. The interpretable design facilitates effective human-AI collaboration; this system may improve diagnosis and patient care in clinical settings. 
The model may also provide a better understanding of how EEG patterns relate to each other along the ictal-interictal injury continuum. (Funded by the National Science Foundation, National Institutes of Health, and others.).
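
The latent-space visualization described above can be sketched with any off-the-shelf dimension-reduction method; the embeddings and labels below are random stand-ins, and t-SNE is an assumed choice rather than the paper's specific technique.

# Illustration only: project pretend penultimate-layer embeddings to 2-D for plotting.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
latent = rng.normal(size=(600, 64))
labels = rng.choice(["Seizure", "LPD", "GPD", "LRDA", "GRDA", "Other"], size=600)

xy = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(latent)
print(xy.shape)   # (600, 2) coordinates to scatter-plot, colored by label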

12.
Epilepsia ; 65(7): e104-e112, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38776216

ABSTRACT

Studies suggest that self-reported seizure diaries suffer from 50% under-reporting on average. It is unknown to what extent this impacts medication management. This study used simulation to predict the seizure outcomes of a large heterogeneous clinic population treated with a standardized algorithm based on self-reported seizures. Using CHOCOLATES, a state-of-the-art realistic seizure diary simulator, 100,000 patients were simulated over 10 years. A standard algorithm for medication management was employed at 3-month intervals for all patients. The impact on true seizure rates, expected seizure rates, and time-to-steady-dose was computed for self-reporting sensitivities of 0%-100%. Time-to-steady-dose and medication use mostly did not depend on sensitivity. The true seizure rate decreased minimally with increasing self-reporting sensitivity in a non-linear fashion, with the largest decreases at low sensitivities (0%-10%). This study suggests that an extremely wide range of sensitivities will yield similar seizure outcomes when patients are clinically treated using an algorithm similar to the one presented. Conversely, patients with sensitivity ≤10% would be expected to benefit (via lower seizure rates) from objective devices that provide even small improvements in seizure sensitivity.
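
The effect of diary sensitivity can be pictured with a toy thinning model in which each true seizure is reported with probability equal to the sensitivity; this is an illustrative stand-in, not the CHOCOLATES simulator.

# Toy under-reporting model: compare the diary-observed rate to the true rate.
import numpy as np

rng = np.random.default_rng(0)

def simulate_diary(true_monthly_rate=4.0, months=120, sensitivity=0.5):
    true_counts = rng.poisson(true_monthly_rate, months)
    reported = rng.binomial(true_counts, sensitivity)   # each seizure reported with prob = sensitivity
    return true_counts.mean(), reported.mean()

for s in (0.0, 0.1, 0.5, 1.0):
    print(s, simulate_diary(sensitivity=s))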


Subject(s)
Algorithms, Anticonvulsants, Epilepsy, Seizures, Self Report, Humans, Anticonvulsants/therapeutic use, Epilepsy/drug therapy, Seizures/drug therapy, Seizures/diagnosis, Male, Female, Treatment Outcome, Computer Simulation, Adult
13.
J Neuropsychiatry Clin Neurosci ; : appineuropsych20230174, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38720623

ABSTRACT

OBJECTIVE: Generalized periodic discharges are a repeated and generalized electroencephalography (EEG) pattern that can be seen in the context of altered mental status. This article describes a series of five individuals with generalized periodic discharges who demonstrated signs and symptoms of catatonia, a treatable neuropsychiatric condition. METHODS: Inpatients with a clinical diagnosis of catatonia, determined with the Bush-Francis Catatonia Rating Scale (BFCRS), and EEG recordings with generalized periodic discharges were analyzed in a retrospective case series. RESULTS: Five patients with catatonia and generalized periodic discharges on EEG were evaluated from among 106 patients with catatonia and contemporaneous EEG measurements. Four of these patients showed an improvement in catatonia severity when treated with benzodiazepines, with an average reduction of 6.75 points on the BFCRS. CONCLUSIONS: Among patients with generalized periodic discharges, catatonia should be considered in the appropriate clinical context. Patients with generalized periodic discharges and catatonia may benefit from empiric trials of benzodiazepines.

14.
medRxiv ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38798669

ABSTRACT

Work is ongoing to advance seizure forecasting, but the performance metrics used to evaluate model effectiveness can sometimes lead to misleading outcomes. For example, some metrics improve when tested on patients with a particular range of seizure frequencies (SF). This study illustrates the connection between SF and metrics. Additionally, we compared benchmarks for testing performance: a moving average (MA) or the commonly used permutation benchmark. Three data sets were used for the evaluations: (1) Self-reported seizure diaries of 3,994 Seizure Tracker patients; (2) Automatically detected (and sometimes manually reported or edited) generalized tonic-clonic seizures from 2,350 Empatica Embrace 2 and Mate App seizure diary users, and (3) Simulated datasets with varying SFs. Metrics of calibration and discrimination were computed for each dataset, comparing MA and permutation performance across SF values. Most metrics were found to depend on SF. The MA model outperformed or matched the permutation model in all cases. The findings highlight SF's role in seizure forecasting accuracy and the MA model's suitability as a benchmark. This underscores the need for considering patient SF in forecasting studies and suggests the MA model may provide a better standard for evaluating future seizure forecasting models.
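
A moving-average benchmark of the kind compared here can be written in a few lines; the window length, simulated diary, and scoring below are assumptions for illustration only.

# Toy MA benchmark: forecast next-day seizure risk as the mean of the last k daily counts.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
counts = rng.poisson(0.3, 365)                    # one simulated patient's daily seizure counts
k = 30
risk = np.array([counts[max(0, t - k):t].mean() if t else 0.0 for t in range(len(counts))])
had_seizure = (counts > 0).astype(int)
print(roc_auc_score(had_seizure[k:], risk[k:]))   # discrimination of the MA benchmark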

15.
Epilepsia ; 65(6): 1730-1736, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38606580

ABSTRACT

OBJECTIVE: Recently, a deep learning artificial intelligence (AI) model forecasted seizure risk using retrospective seizure diaries with higher accuracy than random forecasts. The present study sought to prospectively evaluate the same algorithm. METHODS: We recruited a prospective cohort of 46 people with epilepsy; 25 completed sufficient data entry for analysis (median = 5 months). We used the same AI method as in our prior study. Group-level and individual-level Brier Skill Scores (BSSs) compared random forecasts and simple moving average forecasts to the AI. RESULTS: The AI had an area under the receiver operating characteristic curve of .82. At the group level, the AI outperformed random forecasting (BSS = .53). At the individual level, the AI outperformed random forecasting in 28% of cases. At both the group and individual levels, the moving average outperformed the AI. If pre-enrollment (nonverified) diaries (with presumed underreporting) were included, the AI significantly outperformed both comparators. Surveys showed most participants did not mind poor-quality LOW-RISK or HIGH-RISK forecasts, yet 91% wanted access to these forecasts. SIGNIFICANCE: The previously developed AI forecasting tool did not outperform a very simple moving average forecast in this prospective cohort, suggesting that the AI model should be replaced.
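
The Brier Skill Score used above can be illustrated with a small sketch; for simplicity the reference below is a constant base-rate forecast (the study compared against random and moving-average forecasts), and all data are synthetic.

# Assumed formulation: BSS = 1 - Brier(model) / Brier(reference).
import numpy as np
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.2, 500)                               # seizure in each forecast window?
p_model = np.clip(0.2 + 0.3 * (y - 0.2) + rng.normal(0, 0.05, 500), 0, 1)
p_ref = np.full_like(p_model, y.mean())                     # naive constant reference forecast

bss = 1 - brier_score_loss(y, p_model) / brier_score_loss(y, p_ref)
print(bss)                                                  # > 0 means the model beats the reference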


Subject(s)
Forecasting, Seizures, Humans, Female, Male, Prospective Studies, Adult, Seizures/diagnosis, Middle Aged, Forecasting/methods, Epilepsy/diagnosis, Artificial Intelligence/trends, Young Adult, Deep Learning/trends, Algorithms, Diaries as Topic, Cohort Studies, Aged
16.
medRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38562831

ABSTRACT

Importance: The analysis of electronic medical records at scale to learn from clinical experience is currently very challenging. The integration of artificial intelligence (AI), specifically foundational large language models (LLMs), into an analysis pipeline may overcome some of the current limitations of modest input sizes, inaccuracies, biases, and incomplete knowledge bases. Objective: To explore the effectiveness of using one LLM to generate realistic clinical data and other LLMs to summarize and synthesize information in a model system, simulating a randomized clinical trial (RCT) in epilepsy to demonstrate the potential of inductive reasoning via medical chart review. Design: An LLM-generated simulated RCT based on an RCT of treatment with the antiseizure medication cenobamate, including a placebo arm and a full-strength drug arm, evaluated by an LLM-based pipeline versus a human reader. Setting: Simulation based on realistic seizure diaries, treatment effects, reported symptoms, and clinical notes generated by LLMs with multiple different neurologist writing styles. Participants: Simulated cohort of 240 patients, divided 1:1 into placebo and drug arms. Intervention: Utilization of LLMs for the generation of clinical notes and for the synthesis of data from these notes, aiming to evaluate the efficacy and safety of cenobamate in seizure control with either a human evaluator or an AI pipeline. Measures: The AI and human analyses focused on identifying the number of seizures, symptom reports, and treatment efficacy, with statistical analysis comparing the 50%-responder rate and median percentage change between the placebo and drug arms, as well as side effect rates in each arm. Results: AI closely mirrored human analysis, demonstrating the drug's efficacy with marginal differences (<3%) in identifying both drug efficacy and reported symptoms. Conclusions and Relevance: This study showcases the potential of LLMs to accurately simulate and analyze clinical trials. Notably, it highlights the ability of LLMs to reconstruct essential trial elements, identify treatment effects, and recognize reported symptoms within a realistic clinical framework. The findings underscore the relevance of LLMs in future clinical research, offering a scalable, efficient alternative to traditional data mining methods without the need for specialized medical language training.
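
The trial-level statistics named above can be sketched with assumed definitions of the 50%-responder rate and median percent change; the seizure counts are synthetic, not the simulated cohort's data.

# Toy per-arm summaries: 50%-responder rate and median % change in seizure counts.
import numpy as np

rng = np.random.default_rng(0)

def arm_summary(baseline, post):
    change = 100 * (post - baseline) / baseline
    return (change <= -50).mean(), np.median(change)

base_drug, base_plc = rng.poisson(10, 120) + 1, rng.poisson(10, 120) + 1
post_drug = rng.binomial(base_drug, 0.5)        # drug roughly halves seizure counts
post_plc = rng.binomial(base_plc, 0.9)          # placebo changes counts little

print("drug   :", arm_summary(base_drug, post_drug))
print("placebo:", arm_summary(base_plc, post_plc))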

17.
Epileptic Disord ; 26(4): 444-459, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38669007

ABSTRACT

OBJECTIVE: To assess the effectiveness of an educational program leveraging technology-enhanced learning and retrieval practice to teach trainees how to correctly identify interictal epileptiform discharges (IEDs). METHODS: This was a bi-institutional prospective randomized controlled educational trial involving junior neurology residents. The intervention consisted of three video tutorials focused on the six IFCN criteria for IED identification and rating 500 candidate IEDs with instant feedback either on a web browser (intervention 1) or an iOS app (intervention 2). The control group underwent no educational intervention ("inactive control"). All residents completed a survey and a test at the onset and offset of the study. Performance metrics were calculated for each participant. RESULTS: Twenty-one residents completed the study: control (n = 8); intervention 1 (n = 6); intervention 2 (n = 7). All but two had no prior EEG experience. Intervention 1 residents improved from baseline (mean) in multiple metrics including AUC (.74; .85; p < .05), sensitivity (.53; .75; p < .05), and level of confidence (LOC) in identifying IEDs/committing patients to therapy (1.33; 2.33; p < .05). Intervention 2 residents improved in multiple metrics including AUC (.81; .86; p < .05) and LOC in identifying IEDs (2.00; 3.14; p < .05) and spike-wave discharges (2.00; 3.14; p < .05). Controls had no significant improvements in any measure. SIGNIFICANCE: This program led to significant subjective and objective improvements in IED identification. Rating candidate IEDs with instant feedback on a web browser (intervention 1) generated greater objective improvement in comparison to rating candidate IEDs on an iOS app (intervention 2). This program can complement trainee education concerning IED identification.


Subject(s)
Electroencephalography, Internship and Residency, Neurology, Humans, Pilot Projects, Neurology/education, Electroencephalography/methods, Epilepsy/physiopathology, Epilepsy/diagnosis, Prospective Studies, Clinical Competence, Adult, Male, Female
18.
medRxiv ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38559062

ABSTRACT

BACKGROUND: Multi-center electronic health records (EHR) can support quality improvement initiatives and comparative effectiveness research in stroke care. However, limitations of EHR-based research include challenges in abstracting key clinical variables from non-structured data at scale, further compounded by missing data. Here we develop a natural language processing (NLP) model that automatically reads EHR notes to determine the NIH Stroke Scale (NIHSS) score of patients with acute stroke. METHODS: The study included notes from acute stroke patients (≥18 years) admitted to Massachusetts General Hospital (MGH) (2015-2022). The MGH data were divided into training (70%) and hold-out test (30%) sets. A two-stage model was developed to predict the admission NIHSS. A linear model with the least absolute shrinkage and selection operator (LASSO) was trained within the training set. For notes in the test set where the NIHSS was documented, the scores were extracted using regular expressions (stage 1); for notes where the NIHSS was not documented, LASSO was used for prediction (stage 2). The reference standard for NIHSS was obtained from the Get With The Guidelines Stroke Registry. The two-stage model was tested on the hold-out test set and validated in the MIMIC-III dataset (Medical Information Mart for Intensive Care III, 2001-2012, v1.4), using root mean squared error (RMSE) and Spearman correlation (SC). RESULTS: We included 4,163 patients (MGH = 3,876; MIMIC = 287); average age 69 [SD 15] years; 53% male and 72% White. Ninety percent of patients had ischemic stroke and 10% hemorrhagic stroke. The two-stage model achieved an RMSE [95% CI] of 3.13 [2.86-3.41] (SC = 0.90 [0.88-0.91]) in the MGH hold-out test set and 2.01 [1.58-2.38] (SC = 0.96 [0.94-0.97]) in the MIMIC validation set. CONCLUSIONS: This automatic NLP-based model can enable large-scale stroke severity phenotyping from the EHR and thereby support real-world quality improvement and comparative effectiveness studies in stroke.
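
Stage 1 of such a pipeline can be pictured as a regular-expression extractor with a fallback to the regression model; the pattern and example notes below are hypothetical, not the study's implementation.

# Hypothetical stage-1 extractor: pull a documented NIHSS score from free text.
import re

NIHSS_RE = re.compile(r"\bNIHSS\b[^0-9]{0,20}(\d{1,2})", re.IGNORECASE)

def extract_nihss(note):
    m = NIHSS_RE.search(note)
    if m:
        score = int(m.group(1))
        if 0 <= score <= 42:                 # valid NIHSS range
            return score
    return None                              # no documented score: defer to the LASSO model (stage 2)

print(extract_nihss("Admission NIHSS was 14, improved to 9."))   # -> 14
print(extract_nihss("Exam notable for right-sided weakness."))   # -> None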

19.
medRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38370813

ABSTRACT

Background: Benzodiazepine use in older adults following acute ischemic stroke (AIS) is common, yet short-term safety concerning falls or fall-related injuries remains unexplored. Methods: We emulated a hypothetical randomized trial of benzodiazepine use during the acute post-stroke recovery period to assess the incidence of falls or fall-related injuries in older adults. Using linked data from the Get With The Guidelines Registry and Mass General Brigham's electronic health records, we selected patients aged 65 and older admitted for AIS between 2014 and 2021 with no documented prior stroke and no benzodiazepine prescriptions in the previous 3 months. Potential immortal-time and confounding biases were addressed via separate inverse-probability weighting strategies. Results: The study included 495 patients who initiated inpatient benzodiazepines within three days of admission and 2,564 who did not. After standardization, the estimated 10-day risk of falls or fall-related injuries was 694 events per 1000 (95% confidence interval [CI]: 676-709) for the benzodiazepine initiation strategy and 584 events per 1000 (95% CI: 575-595) for the non-initiation strategy. Subgroup analyses showed risk differences of 142 events per 1000 (95% CI: 111-165) for patients aged 65 to 74 years and 85 events per 1000 (95% CI: 64-107) for those aged 75 years or older. Risk differences were 187 events per 1000 (95% CI: 159-206) for patients with minor (NIHSS ≤4) AIS and 32 events per 1000 (95% CI: 10-58) for those with moderate-to-severe AIS. Conclusions: Initiating inpatient benzodiazepines within three days of AIS is associated with an elevated 10-day risk of falls or fall-related injuries, particularly for patients aged 65 to 74 years and for those with minor strokes. This underscores the need for caution with benzodiazepines, especially among individuals likely to be ambulatory during the acute and subacute post-stroke periods.
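
The confounding adjustment can be pictured with a simple inverse-probability-of-treatment weighting sketch; the covariates, effect sizes, and data-generating process below are invented, and this omits the separate weighting the authors used for immortal-time bias.

# Toy IPW: propensity model for benzodiazepine initiation, then weighted 10-day fall risks.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 3000
df = pd.DataFrame({
    "age": rng.normal(78, 7, n),
    "nihss": rng.integers(0, 20, n),
})
p_treat = 1 / (1 + np.exp(-(-3 + 0.02 * df["age"] + 0.05 * df["nihss"])))
df["benzo"] = rng.binomial(1, p_treat)
p_fall = 1 / (1 + np.exp(-(-2 + 0.03 * df["age"] + 0.08 * df["nihss"] + 0.5 * df["benzo"])))
df["fall_10d"] = rng.binomial(1, p_fall)

ps = LogisticRegression(max_iter=1000).fit(df[["age", "nihss"]], df["benzo"])
prob = ps.predict_proba(df[["age", "nihss"]])[:, 1]
w = np.where(df["benzo"] == 1, 1 / prob, 1 / (1 - prob))    # inverse-probability weights

def weighted_risk(arm):
    mask = (df["benzo"] == arm).to_numpy()
    return np.average(df.loc[mask, "fall_10d"], weights=w[mask])

print("risk difference per 1000:", 1000 * (weighted_risk(1) - weighted_risk(0)))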

20.
J Alzheimers Dis ; 98(1): 209-220, 2024.
Article in English | MEDLINE | ID: mdl-38393904

ABSTRACT

Background: Fractal motor activity regulation (FMAR), characterized by self-similar temporal patterns in motor activity across timescales, is robust in healthy young humans but degrades with aging and in Alzheimer's disease (AD). Objective: To determine the timescales where alterations of FMAR can best predict the clinical onset of AD. Methods: FMAR was assessed from actigraphy at baseline in 1,077 participants who had annual follow-up clinical assessments for up to 15 years. Survival analysis combined with deep learning (DeepSurv) was used to examine how baseline FMAR at different timescales from 3 minutes up to 6 hours contributed differently to the risk for incident clinical AD. Results: Clinical AD occurred in 270 participants during the follow-up. DeepSurv identified three potential regions of timescales in which FMAR alterations were significantly linked to the risk for clinical AD: <10, 20-40, and 100-200 minutes. Confirmed by the Cox and random survival forest models, the effect of FMAR alterations in the timescale of <10 minutes was the strongest, after adjusting for covariates. Conclusions: Subtle changes in motor activity fluctuations predicted the clinical onset of AD, with the strongest association observed in activity fluctuations at timescales <10 minutes. These findings suggest that short actigraphy recordings may be used to assess the risk of AD.
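
FMAR-type analyses are commonly built on detrended fluctuation analysis (DFA); the sketch below assumes that approach, with toy actigraphy data, and is not the study's pipeline.

# Minimal DFA sketch: scaling exponent alpha of activity fluctuations across window sizes.
import numpy as np

def dfa_alpha(x, scales):
    y = np.cumsum(x - np.mean(x))                             # integrated signal
    flucts = []
    for n in scales:
        f2 = []
        for w in range(len(y) // n):
            seg = y[w * n:(w + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)      # linear detrend per window
            f2.append(np.mean((seg - trend) ** 2))
        flucts.append(np.sqrt(np.mean(f2)))
    slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return slope

rng = np.random.default_rng(0)
activity = np.cumsum(rng.normal(size=10000)) * 0.01 + rng.normal(size=10000)  # toy actigraphy
print(dfa_alpha(activity, scales=[4, 8, 16, 32, 64, 128]))    # timescales in samples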


Subject(s)
Alzheimer Disease, Humans, Alzheimer Disease/diagnosis, Alzheimer Disease/complications, Aging, Motor Activity