ABSTRACT
Large language models (LLMs) have significantly impacted various fields with their ability to understand and generate human-like text. This study explores the potential benefits and limitations of integrating LLMs, such as ChatGPT, into haematology practice. Using systematic review methodology, we analysed studies published after 1 December 2022 from databases including PubMed, Web of Science and Scopus, and assessed each for bias with the QUADAS-2 tool. We reviewed 10 studies that applied LLMs in various haematology contexts. These models demonstrated proficiency in specific tasks, such as achieving 76% diagnostic accuracy for haemoglobinopathies. However, the research highlighted inconsistencies in performance and reference accuracy, indicating variability in reliability across different uses. Additionally, the limited scope of these studies and constraints on datasets may limit the generalizability of our findings. The findings suggest that, while LLMs offer notable advantages in enhancing diagnostic processes and educational resources within haematology, their integration into clinical practice requires careful consideration. Before implementation in haematology, rigorous testing and domain-specific adaptation are essential, including validation of accuracy and reliability across different scenarios. Given the field's complexity, continuous monitoring and responsive adaptation of these models are also critical.
ABSTRACT
BACKGROUND: Most patients with lower-risk myelodysplastic neoplasms or syndromes (MDSs) become RBC transfusion-dependent, resulting in iron overload, which is associated with an increased oxidative stress state. Iron-chelation therapy is applied to attenuate the toxic effects of this state. Deferiprone (DFP) is an oral iron chelator that is not commonly used in this patient population due to safety concerns, mainly agranulocytosis. The purpose of this study was to assess the effect of DFP on oxidative stress parameters in iron-overloaded, RBC transfusion-dependent patients with lower-risk MDSs. METHODS: Adult lower-risk MDS patients with a cumulative transfusion burden of >20 red blood cell units and evidence of iron overload (serum ferritin >1,000 ng/mL) were included in this study. DFP was administered (100 mg/kg/day) for 4 months. Blood samples for oxidative stress and iron overload parameters were collected at baseline and monthly: reactive oxygen species (ROS), phosphatidylserine, reduced glutathione, membrane lipid peroxidation, serum ferritin (SF), and cellular labile iron pool. The primary efficacy variable was ROS. Tolerability and side effects were recorded as well. A paired t test was applied for statistical analyses. RESULTS: Eighteen patients were treated with DFP. ROS significantly decreased in all cell lineages: median decrease of 58.6% in RBCs, 33.3% in PMNs, and 39.8% in platelets (p < 0.01 for all). Other oxidative stress markers improved: phosphatidylserine decreased by 57.95%, lipid peroxidation decreased by 141.3%, and reduced glutathione increased by 72.8% (p < 0.01 for all). The cellular labile iron pool, a marker of iron overload, decreased by 35% in RBCs, 44.3% in PMNs, and 46.3% in platelets (p < 0.01 for all). No significant changes were observed in SF levels. There were no events of agranulocytosis. All adverse events (AEs) were grade 1-2.
CONCLUSIONS: Herein, we showed preliminary evidence that DFP decreases iron-induced oxidative stress in MDS patients with a good tolerability profile (albeit with a short follow-up period). No cases of severe neutropenia or agranulocytosis were reported. The future challenge is to prove that reduction in iron toxicity will eventually translate into a clinically meaningful improvement.
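The paired before/after comparison described in the methods above can be sketched in a few lines; all numbers below are invented for illustration and are not study data.

```python
# Illustrative only: a paired t-test on before/after oxidative-stress values,
# mirroring the analysis described in the abstract. All numbers are invented.
from scipy.stats import ttest_rel

# Hypothetical ROS readings (arbitrary units) for the same 8 patients
baseline = [10.2, 12.5, 9.8, 14.1, 11.3, 13.0, 10.9, 12.2]
month_4 = [4.1, 5.8, 4.0, 6.5, 4.9, 5.2, 4.6, 5.5]

# Paired (related-samples) t-test: each patient is their own control
t_stat, p_value = ttest_rel(baseline, month_4)
print(f"t = {t_stat:.2f}, p = {p_value:.5f}")
```

A paired test is appropriate here because the same patients are measured at both time points, removing between-patient variability from the comparison.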
Subject(s)
Deferiprone, Iron Chelating Agents, Iron Overload, Myelodysplastic Syndromes, Oxidative Stress, Humans, Deferiprone/therapeutic use, Deferiprone/pharmacology, Oxidative Stress/drug effects, Iron Chelating Agents/therapeutic use, Iron Chelating Agents/pharmacology, Iron Overload/drug therapy, Iron Overload/etiology, Myelodysplastic Syndromes/drug therapy, Myelodysplastic Syndromes/metabolism, Male, Female, Aged, Middle Aged, Pyridones/therapeutic use, Pyridones/adverse effects, Pyridones/administration & dosage, Aged, 80 and over, Adult, Israel, Administration, Oral, Reactive Oxygen Species/metabolism, Erythrocyte Transfusion, Ferritins/blood
ABSTRACT
OBJECTIVES: With smartphones and wearable devices becoming ubiquitous, they offer an opportunity for large-scale voice sampling. This systematic review explores the application of deep learning models for the automated analysis of voice samples to detect vocal cord pathologies. METHODS: We conducted a systematic literature review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting guidelines. We searched MEDLINE and Embase databases for original publications on deep learning applications for diagnosing vocal cord pathologies between 2002 and 2022. Risk of bias was assessed using Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2). RESULTS: Out of the 14 studies that met the inclusion criteria, data from a total of 3037 patients were analyzed. All studies were retrospective. Deep learning applications targeted Reinke's edema, nodules, polyps, cysts, unilateral cord paralysis, and vocal fold cancer detection. Most pathologies had detection accuracy above 90%. Thirteen studies (93%) exhibited a high risk of bias and concerns about applicability. CONCLUSIONS: Technology holds promise for enhancing the screening and diagnosis of vocal cord pathologies. While current research is limited, the presented studies offer proof of concept for developing larger-scale solutions.
Subject(s)
Deep Learning, Laryngeal Edema, Vocal Cord Paralysis, Humans, Vocal Cords/pathology, Retrospective Studies, Vocal Cord Paralysis/diagnosis, Vocal Cord Paralysis/surgery
ABSTRACT
INTRODUCTION: In the past, HIV infection was a common complication of haemophilia therapy. Gene therapy trials in haemophilia patients using rAAV have shown promising results; unfortunately, the majority of gene therapy trials have excluded HIV-positive patients. We therefore systematically reviewed the published clinical trials using rAAV for HIV prevention. METHODS: A comprehensive literature search was performed to identify studies evaluating clinical trials using rAAV for HIV. The search was conducted using the MEDLINE/PubMed databases. Search keywords included 'gene therapy', 'adeno-associated virus', 'HIV' and 'clinical trial'. RESULTS: Three studies met our inclusion criteria. Two were phase 1 studies and one was a phase 2 study. One study examined an AAV coding for a human monoclonal IgG1 antibody, whereas the other two studies delivered a vector coding for viral protease and part of reverse transcriptase. All studies administered the vaccine intramuscularly and showed a response as well as a good safety profile. DISCUSSION: The concept of using a viral vector to prevent a viral infection is revolutionary. Given the paucity of information on the application of gene therapy in HIV patients, the potential future use of gene therapy in haemophilia patients with HIV warrants attention.
Subject(s)
HIV Infections, Hemophilia A, Humans, Hemophilia A/therapy, Hemophilia A/drug therapy, HIV Infections/complications, HIV Infections/therapy, Dependovirus/genetics, Genetic Therapy/methods, Genetic Vectors/therapeutic use
ABSTRACT
OBJECTIVE: This study aimed to investigate the accuracy of convolutional neural network models in the assessment of embryos using time-lapse monitoring. DATA SOURCES: A systematic search was conducted in the PubMed and Web of Science databases from January 2016 to December 2022. The search strategy used key words and MeSH (Medical Subject Headings) terms. STUDY ELIGIBILITY CRITERIA: Studies were included if they reported the accuracy of convolutional neural network models for embryo evaluation using time-lapse monitoring. The review was registered with PROSPERO (International Prospective Register of Systematic Reviews; identification number CRD42021275916). METHODS: Two reviewers independently screened results using the Covidence systematic review software. Full-text articles were reviewed when studies met the inclusion criteria or in cases of uncertainty. Disagreements were resolved by a third reviewer. Risk of bias and applicability were evaluated using the QUADAS-2 tool and the modified Joanna Briggs Institute (JBI) checklist. RESULTS: Following a systematic search of the literature, 22 studies were identified as eligible for inclusion. All studies were retrospective. A total of 522,516 images of 222,998 embryos were analyzed. Three main outcomes were evaluated: successful in vitro fertilization, blastocyst stage classification, and blastocyst quality. Most studies reported >80% accuracy, and in some the models outperformed embryologists. Ten studies had a high risk of bias, mostly because of patient selection. CONCLUSION: The application of artificial intelligence in time-lapse monitoring has the potential to provide more efficient, accurate, and objective embryo evaluation. Models that examined blastocyst stage classification showed the best predictions. Models that predicted live birth had a low risk of bias, used the largest databases, and had external validation, which heightens their relevance to clinical application.
Our systematic review is limited by the high heterogeneity among the included studies. Researchers should share databases and standardize reporting.
Subject(s)
Artificial Intelligence, Deep Learning, Pregnancy, Female, Humans, Pregnancy Rate, Retrospective Studies, Time-Lapse Imaging/methods, Systematic Reviews as Topic, Diagnostic Tests, Routine
ABSTRACT
PURPOSE: Sarcoidosis is a complex disease which can affect nearly every organ system with manifestations ranging from asymptomatic imaging findings to sudden cardiac death. As such, diagnosis and prognostication are topics of continued investigation. Recent technological advancements have introduced multiple modalities of artificial intelligence (AI) to the study of sarcoidosis. Machine learning, deep learning, and radiomics have predominantly been used to study sarcoidosis. METHODS: Articles were collected by searching online databases using keywords such as sarcoid, machine learning, artificial intelligence, radiomics, and deep learning. Article titles and abstracts were reviewed for relevance by a single reviewer. Articles written in languages other than English were excluded. CONCLUSIONS: Machine learning may be used to help diagnose pulmonary sarcoidosis and prognosticate in cardiac sarcoidosis. Deep learning is most comprehensively studied for diagnosis of pulmonary sarcoidosis and has less frequently been applied to prognostication in cardiac sarcoidosis. Radiomics has primarily been used to differentiate sarcoidosis from malignancy. To date, the use of AI in sarcoidosis is limited by the rarity of this disease, leading to small, suboptimal training sets. Nevertheless, there are applications of AI that have been used to study other systemic diseases, which may be adapted for use in sarcoidosis. These applications include discovery of new disease phenotypes, discovery of biomarkers of disease onset and activity, and treatment optimization.
Subject(s)
Sarcoidosis, Pulmonary, Sarcoidosis, Humans, Artificial Intelligence, Sarcoidosis/diagnostic imaging, Machine Learning, Databases, Factual
ABSTRACT
BACKGROUND: Jejunal disease is associated with worse prognosis in Crohn's disease. Data on the added value of diffusion-weighted imaging for evaluating jejunal inflammation related to Crohn's disease are scarce. OBJECTIVES: To compare diffusion-weighted imaging, video capsule endoscopy, and inflammatory biomarkers in the assessment of Crohn's disease involving the jejunum. METHODS: Crohn's disease patients in clinical remission were prospectively recruited and underwent magnetic resonance (MR)-enterography and video capsule endoscopy. C-reactive protein and fecal calprotectin levels were obtained. MR-enterography images were evaluated for restricted diffusion, and apparent diffusion coefficient values were measured. The video capsule endoscopy-based Lewis score was calculated. Associations between diffusion-weighted imaging, apparent diffusion coefficient, Lewis score, and inflammatory biomarkers were evaluated. RESULTS: The study included 51 patients; video capsule endoscopy showed jejunal mucosal inflammation in 27/51 (52.9%). Sensitivity and specificity of restricted diffusion for video capsule endoscopy mucosal inflammation were 59.3% and 37.5% for the first reader, and 66.7% and 37.5% for the second reader, respectively. Diffusion-weighted imaging was not statistically associated with jejunal video capsule endoscopy inflammation (P = 0.813). CONCLUSIONS: Diffusion-weighted imaging was not an effective test for evaluating jejunal inflammation, as seen by video capsule endoscopy, in patients with quiescent Crohn's disease.
Subject(s)
Capsule Endoscopy, Crohn Disease, Humans, Crohn Disease/diagnosis, Crohn Disease/diagnostic imaging, Capsule Endoscopy/methods, Jejunum/diagnostic imaging, Diffusion Magnetic Resonance Imaging/methods, Inflammation/diagnosis, Magnetic Resonance Imaging, Biomarkers/analysis
ABSTRACT
BACKGROUND: Research regarding the association between severe obesity and in-hospital mortality is inconsistent. We evaluated the impact of body mass index (BMI) levels on mortality in the medical wards. The analysis was performed separately before and during the COVID-19 pandemic. METHODS: We retrospectively retrieved data of adult patients admitted to the medical wards at the Mount Sinai Health System in New York City. The study was conducted from January 1, 2011, to March 23, 2021. Patients were divided into two sub-cohorts: pre-COVID-19 and during-COVID-19. Patients were then clustered into groups based on BMI ranges. A multivariate logistic regression analysis compared the mortality rate among the BMI groups, before and during the pandemic. RESULTS: Overall, 179,288 patients were admitted to the medical wards and had a recorded BMI measurement. Of these, 149,098 were admitted before the COVID-19 pandemic and 30,190 during the pandemic. Pre-pandemic, multivariate analysis showed a "J curve" between BMI and mortality. Severe obesity (BMI > 40) had an adjusted odds ratio (aOR) of 0.8 (95% CI 0.7-1.0, p = 0.018) compared to the normal BMI group. In contrast, during the pandemic, the analysis showed a "U curve" between BMI and mortality. Severe obesity had an aOR of 1.7 (95% CI 1.3-2.4, p < 0.001) compared to the normal BMI group. CONCLUSIONS: Medical ward patients with severe obesity had a lower risk for mortality compared to patients with normal BMI. However, this did not apply during COVID-19, when obesity was a leading risk factor for mortality in the medical wards. It is important for the internal medicine physician to understand the intricacies of the association between obesity and medical ward mortality.
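As a toy illustration of the odds-ratio outcome reported above, the sketch below computes an unadjusted OR from invented counts (the study itself reported adjusted ORs from multivariate logistic regression, which this does not reproduce).

```python
# Hypothetical sketch: unadjusted odds ratio of mortality for one BMI group
# versus the normal-BMI reference. All counts are invented, not study data.
def odds_ratio(deaths_exp, surv_exp, deaths_ref, surv_ref):
    """OR = (deaths_exp / surv_exp) / (deaths_ref / surv_ref)."""
    return (deaths_exp / surv_exp) / (deaths_ref / surv_ref)

# Invented counts: severe obesity vs normal BMI, pre-pandemic direction (OR < 1)
or_severe = odds_ratio(deaths_exp=40, surv_exp=1960, deaths_ref=125, surv_ref=4875)
print(round(or_severe, 2))  # 0.8
```

An OR below 1 means lower odds of death than the reference group; adjustment for confounders (age, comorbidities) requires a regression model rather than this raw 2×2 calculation.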
Subject(s)
Body Mass Index, COVID-19/mortality, Hospital Mortality/trends, Hospitalization/statistics & numerical data, Obesity/physiopathology, SARS-CoV-2/isolation & purification, Aged, COVID-19/epidemiology, COVID-19/pathology, COVID-19/virology, Case-Control Studies, Female, Humans, Male, Middle Aged, New York City/epidemiology, Prognosis, Retrospective Studies, Risk Factors, Survival Rate
ABSTRACT
OBJECTIVES: Physicians continuously make tough decisions when discharging patients. Alerting on poor outcomes may help in this decision. This study evaluates a machine learning model for predicting 30-day mortality in emergency department (ED) discharged patients. METHODS: We retrospectively analysed visits of adult patients discharged from a single ED (1/2014-12/2018). Data included demographics, evaluation and treatment in the ED, and discharge diagnosis, comprising both structured and free-text fields. A gradient boosting model was trained to predict mortality within 30 days of release from the ED. The model was trained on data from the years 2014-2017 and validated on data from the year 2018. To reduce potential end-of-life bias, a subgroup analysis was performed for non-oncological patients. RESULTS: Overall, 363 635 ED visits of discharged patients were analysed. The 30-day mortality rate was 0.8%. A majority of the mortality cases (65.3%) had a known oncological disease. The model yielded an area under the curve (AUC) of 0.97 (95% CI 0.96 to 0.97) for predicting 30-day mortality. For a sensitivity of 84% (95% CI 0.81 to 0.86), the model had a false positive rate of 1:20. For patients without a known malignancy, the model yielded an AUC of 0.94 (95% CI 0.92 to 0.95). CONCLUSIONS: Although not frequent, patients may die following ED discharge. Machine learning-based tools may help ED physicians identify patients at risk. An optimised decision for hospitalisation or palliative management may improve patient care and system resource allocation.
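The modelling approach named above (a gradient boosting classifier evaluated by AUC on held-out data, with a rare positive class like the 0.8% mortality rate) can be sketched with scikit-learn on synthetic data; nothing below reflects the real ED features or results.

```python
# Illustrative sketch only: gradient boosting on a synthetic, imbalanced
# classification task, evaluated with AUC as in the abstract.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for ED visit features; ~1% positives mimics a rare outcome
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.2f}")
```

With heavily imbalanced outcomes, AUC is computed on predicted probabilities rather than hard labels, which is why `predict_proba` is used here; the study's temporal split (train 2014-2017, validate 2018) is replaced by a random stratified split in this toy version.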
Subject(s)
Emergency Service, Hospital, Patient Discharge, Adult, Hospitalization, Humans, Machine Learning, Retrospective Studies
ABSTRACT
BACKGROUND: Acute mesenteric ischemia (AMI) is a medical condition with high levels of morbidity and mortality. However, most patients suspected of AMI will eventually have a different diagnosis. Nevertheless, these patients remain at high risk due to co-morbidities. OBJECTIVES: To analyze patients with suspected AMI who had an alternative final diagnosis, and to evaluate a machine learning algorithm for prognosis prediction in this population. METHODS: In a retrospective search, we retrieved patient charts of those who underwent computed tomography angiography (CTA) for suspected AMI between January 2012 and December 2015. Non-AMI patients were defined as patients with negative CTA and a final clinical diagnosis other than AMI. Correlations of past medical history, laboratory values, and mortality rates were evaluated. We evaluated a gradient boosting (XGBoost) model for mortality prediction. RESULTS: The non-AMI group comprised 325 patients. The two most common groups of diseases included gastrointestinal (33%) and biliary-pancreatic diseases (27%). The mortality rate was 24.6% for the entire cohort. Patients with a medical history of chronic kidney disease (CKD) had a higher risk for mortality (odds ratio 2.2). Laboratory studies revealed that lactate dehydrogenase (LDH) had the highest diagnostic ability for predicting mortality in the entire cohort (AUC 0.70). The gradient boosting model showed an area under the curve of 0.82 for predicting mortality. CONCLUSIONS: Patients with suspected AMI and an alternative final diagnosis showed a 25% mortality rate. A past medical history of CKD and elevated LDH were associated with increased mortality. Non-linear machine learning algorithms can augment single-variable inputs for predicting mortality.
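The single-variable discrimination reported above (e.g., AUC 0.70 for LDH) is simply the AUC of the raw lab value ranked against the outcome; a minimal sketch with invented values follows.

```python
# Illustrative only: AUC of a single lab value as a mortality predictor.
# Both the LDH values and the outcomes below are invented.
from sklearn.metrics import roc_auc_score

ldh = [180, 220, 250, 300, 340, 410, 520, 610]  # invented LDH values (U/L)
died = [0, 0, 0, 1, 0, 1, 1, 1]                 # invented outcomes: 1 = died

# AUC = probability that a randomly chosen death has higher LDH than a survivor
auc = roc_auc_score(died, ldh)
print(round(auc, 2))  # 0.94
```

No threshold is needed for AUC: it summarizes ranking ability across all possible cut-offs, which is why a continuous lab value can be scored directly.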
Subject(s)
Mesenteric Ischemia, Renal Insufficiency, Chronic, Humans, Mesenteric Ischemia/diagnostic imaging, Computed Tomography Angiography, Retrospective Studies, Angiography, Ischemia
ABSTRACT
BACKGROUND AND AIMS: Capsule endoscopy (CE) is an important modality for diagnosis and follow-up of Crohn's disease (CD). The severity of ulcers at endoscopy is significant for predicting the course of CD. Deep learning has been proven accurate in detecting ulcers on CE. However, endoscopic classification of ulcers by deep learning has not been attempted. The aim of our study was to develop a deep learning algorithm for automated grading of CD ulcers on CE. METHODS: We retrospectively collected CE images of CD ulcers from our CE database. In experiment 1, the severity of each ulcer was graded by 2 capsule readers based on the PillCam CD classification (grades 1-3, from mild to severe), and the inter-reader variability was evaluated. In experiment 2, a consensus reading by 3 capsule readers was used to train an ordinal convolutional neural network (CNN) to automatically grade images of ulcers, and the resulting algorithm was tested against the consensus reading. A pretraining stage included training the network on images of normal mucosa and ulcerated mucosa. RESULTS: Overall, our dataset included 17,640 CE images from 49 patients: 7,391 images with mucosal ulcers and 10,249 normal images. A total of 2,598 randomly selected pathologic images were further graded from 1 to 3 according to ulcer severity in the 2 experiments. In experiment 1, overall inter-reader agreement occurred for 31% of the images (345 of 1108) and 76% (752 of 989) for distinction of grades 1 and 3. In experiment 2, the algorithm was trained on 1,242 images. It achieved an overall agreement with the consensus reading of 67% (166 of 248) and 91% (158 of 173) for distinction of grades 1 and 3. The classification accuracy of the algorithm was 0.91 (95% confidence interval, 0.867-0.954) for grade 1 versus grade 3 ulcers, 0.78 (95% confidence interval, 0.716-0.844) for grade 2 versus grade 3, and 0.624 (95% confidence interval, 0.547-0.701) for grade 1 versus grade 2.
CONCLUSIONS: The CNN achieved high accuracy in detecting severe CD ulcerations. CNN-assisted CE reading in patients with CD can potentially facilitate and improve diagnosis and monitoring in these patients.
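The inter-reader agreement in experiment 1 above is simple percent agreement between two graders; a minimal sketch with invented grades (the study used the PillCam CD 1-3 scale) follows.

```python
# Illustrative only: percent agreement between two readers grading the same
# images on a 1-3 ordinal scale. The grades below are invented.
def percent_agreement(reader_a, reader_b):
    """Fraction of images on which both readers assigned the same grade."""
    matches = sum(a == b for a, b in zip(reader_a, reader_b))
    return matches / len(reader_a)

reader_a = [1, 2, 3, 3, 1, 2, 3, 1]
reader_b = [1, 3, 3, 2, 1, 2, 3, 2]
print(percent_agreement(reader_a, reader_b))  # 0.625
```

Percent agreement ignores chance agreement; studies often also report a kappa statistic for that reason, though only raw agreement is quoted in the abstract.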
Subject(s)
Capsule Endoscopy, Crohn Disease, Crohn Disease/diagnostic imaging, Humans, Intestine, Small, Neural Networks, Computer, Retrospective Studies, Ulcer/diagnostic imaging
ABSTRACT
BACKGROUND AND AIMS: While biopsy is the gold standard for liver fibrosis staging, it poses significant risks. Noninvasive assessment of liver fibrosis is a growing field. Recently, deep learning (DL) technology has revolutionized medical image analysis. This technology has the potential to enhance noninvasive fibrosis assessment. We systematically examined the application of DL in noninvasive liver fibrosis imaging. METHODS: Embase, MEDLINE, Web of Science, and IEEE Xplore databases were used to identify studies that reported on the accuracy of DL for classification of liver fibrosis on noninvasive imaging. The search keywords were "liver or hepatic," "fibrosis or cirrhosis," and "neural or DL networks." Risk of bias and applicability were evaluated using the QUADAS-2 tool. RESULTS: Sixteen studies were retrieved. Imaging modalities included ultrasound (n = 10), computed tomography (n = 3), and magnetic resonance imaging (n = 3). The studies analyzed a total of 40,405 radiological images from 15,853 patients. All but two of the studies were retrospective. In most studies the "ground truth" reference was the METAVIR score for pathological staging (n = 9; 56%). The majority of the studies reported an accuracy >85% when compared to histopathology. Fourteen studies (87.5%) had a high risk of bias and concerns regarding applicability. CONCLUSIONS: Deep learning has the potential to play an emerging role in liver fibrosis classification. Yet, it is still limited by a relatively small number of retrospective studies. Clinicians should facilitate the use of this technology by sharing databases and standardized reports. This may optimize the noninvasive evaluation of liver fibrosis on a large scale.
Subject(s)
Deep Learning, Elasticity Imaging Techniques, Humans, Liver Cirrhosis/diagnostic imaging, Magnetic Resonance Imaging, Retrospective Studies, Ultrasonography
ABSTRACT
BACKGROUND: Pediatric research is a diverse field that is constantly growing. Recent machine learning advancements have prompted a technique termed text-mining, in which information is extracted from texts using algorithms. This technique can be applied to analyze trends and to investigate the dynamics of a research field. We aimed to use text-mining to provide a high-level analysis of pediatric literature over the past two decades. METHODS: We retrieved all available MEDLINE/PubMed annual data sets until December 31, 2018. Included studies were categorized into topics using text-mining. RESULTS: Two hundred and twenty-five journals were categorized as Pediatrics, Perinatology, and Child Health based on the Scimago ranking for medicine journals. We included 201,141 pediatric papers published between 1999 and 2018. The most frequently cited publications were clinical guidelines and meta-analyses. We found a shift in the trend of topics: epidemiological studies are gaining more publications while other topics are relatively decreasing. CONCLUSIONS: The topics in pediatric literature have shifted in the past two decades, reflecting changing trends in the field. Text-mining enables analysis of trends in publications and can serve as a high-level academic tool. IMPACT: This is the first study using text-mining techniques to analyze pediatric publications. Our findings indicate that text-mining techniques enable better understanding of trends in publications and should be implemented when analyzing research.
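Text-mining topic categorization can be as simple as keyword matching; the study's actual pipeline is not described in the abstract, so the sketch below, with invented topics and keywords, is purely illustrative.

```python
# Toy sketch of keyword-based topic tagging of paper titles. The topics and
# keyword sets are invented and do not reflect the study's actual method.
TOPIC_KEYWORDS = {
    "epidemiology": {"cohort", "prevalence", "incidence", "population"},
    "genetics": {"gene", "mutation", "genomic"},
}

def tag_topics(title):
    """Return every topic whose keyword set overlaps the title's words."""
    words = set(title.lower().split())
    return sorted(t for t, kws in TOPIC_KEYWORDS.items() if words & kws)

print(tag_topics("Incidence of asthma in a pediatric population"))
```

Counting the tags assigned to each year's papers would then expose the kind of topic-share trend the study reports; real pipelines typically use stemming, n-grams, or topic models rather than exact word matches.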
Subject(s)
Data Mining/trends, Pediatrics, Algorithms, Humans, PubMed, Publications/trends
ABSTRACT
BACKGROUND: In the past decade, deep learning has revolutionized medical image processing. This technique may advance laparoscopic surgery. The study objective was to evaluate whether deep learning networks accurately analyze videos of laparoscopic procedures. METHODS: Medline, Embase, IEEE Xplore, and Web of Science databases were searched from January 2012 to May 5, 2020. Selected studies tested a deep learning model, specifically convolutional neural networks, for video analysis of laparoscopic surgery. Study characteristics, including the dataset source, type of operation, number of videos, and prediction application, were compared. A random effects model was used for estimating the pooled sensitivity and specificity of the computer algorithms. Summary receiver operating characteristic curves were calculated by the bivariate model of Reitsma. RESULTS: Thirty-two out of 508 studies identified met the inclusion criteria. Applications included instrument recognition and detection (45%), phase recognition (20%), anatomy recognition and detection (15%), action recognition (13%), surgery time prediction (5%), and gauze recognition (3%). The most commonly tested procedures were cholecystectomy (51%) and gynecological procedures, mainly hysterectomy and myomectomy (26%). A total of 3004 videos were analyzed. Publications in clinical journals increased in 2020 compared to bio-computational ones. Four studies provided enough data to construct 8 contingency tables, enabling calculation of test accuracy with a pooled sensitivity of 0.93 (95% CI 0.85-0.97) and specificity of 0.96 (95% CI 0.84-0.99). Yet, the majority of papers had a high risk of bias. CONCLUSIONS: Deep learning research holds potential in laparoscopic surgery but is limited in its methodologies. Clinicians may advance AI in surgery, specifically by offering standardized visual databases and reporting.
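Per-study sensitivity and specificity come straight from 2×2 contingency tables; the sketch below uses invented counts and naive count-pooling (the review itself used a random effects bivariate Reitsma model, which this simple pooling does not reproduce).

```python
# Illustrative only: sensitivity/specificity from 2x2 contingency tables with
# naive pooling across studies. All counts are invented.
tables = [
    {"tp": 90, "fn": 10, "tn": 85, "fp": 15},  # hypothetical study 1
    {"tp": 45, "fn": 5, "tn": 40, "fp": 10},   # hypothetical study 2
]

pooled_tp = sum(t["tp"] for t in tables)
pooled_fn = sum(t["fn"] for t in tables)
pooled_tn = sum(t["tn"] for t in tables)
pooled_fp = sum(t["fp"] for t in tables)

sensitivity = pooled_tp / (pooled_tp + pooled_fn)  # TP / (TP + FN)
specificity = pooled_tn / (pooled_tn + pooled_fp)  # TN / (TN + FP)
print(round(sensitivity, 2), round(specificity, 2))  # 0.9 0.83
```

Naive pooling weights studies by size and ignores between-study heterogeneity; bivariate random effects models exist precisely to handle the correlation between sensitivity and specificity across studies.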
Subject(s)
Deep Learning/standards, Diagnostic Tests, Routine/methods, Laparoscopy/methods, Female, Humans, Male
ABSTRACT
PURPOSE OF THE STUDY: Hypophosphataemia and hyperphosphataemia are frequently encountered in hospitalised patients and are associated with significant clinical consequences. However, the prognostic value of normal-range phosphorus levels on all-cause mortality and hospitalisations is not well established. Therefore, we examined the association between normal-range phosphorus levels, all-cause mortality and hospitalisations in patients presenting to the emergency department of a tertiary medical centre in Israel. STUDY DESIGN: A retrospective analysis of patients presenting to the Chaim Sheba Medical Center emergency department between 2012 and 2018. The cohort was divided into quartiles based on emergency department phosphorus levels: 'very-low-normal' (p ≥ 2 mg/dL and p ≤ 2.49 mg/dL), 'low-normal' (p ≥ 2.5 mg/dL and p ≤ 2.99 mg/dL), 'high-normal' (p ≥ 3 mg/dL and p ≤ 3.49 mg/dL) and 'very-high-normal' (p ≥ 3.5 mg/dL and p ≤ 4 mg/dL). We analysed the association between emergency department phosphorus levels, hospitalisation rate and 30-day and 90-day all-cause mortality. RESULTS: Our final analysis included 223 854 patients with normal-range phosphorus levels. Patients with 'very-low-normal' phosphorus levels had the highest mortality rate. Compared with patients with 'high-normal' phosphorus levels, patients with 'very-low-normal' levels had increased 30-day all-cause mortality (OR 1.3, 95% CI 1.1 to 1.4, p<0.001) and increased 90-day all-cause mortality (OR 1.2, 95% CI 1.1 to 1.3, p<0.001). Lower serum phosphorus levels were also associated with a higher hospitalisation rate, both for the internal medicine and general surgery wards (p<0.001). CONCLUSIONS: Lower phosphorus levels, within the normal range, are associated with higher 30-day and 90-day all-cause mortality and hospitalisation rate.
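The normal-range bands defined above translate directly into code; the cut-offs follow the abstract (half-open intervals stand in for the ≤2.49-style upper bounds) and the example values are invented.

```python
# Sketch of the abstract's phosphorus banding (mg/dL). Cut-offs follow the
# study's definitions; the sample values at the bottom are invented.
def phosphorus_band(p):
    """Assign a serum phosphorus value (mg/dL) to one of the study's bands."""
    if 2.0 <= p < 2.5:
        return "very-low-normal"
    if 2.5 <= p < 3.0:
        return "low-normal"
    if 3.0 <= p < 3.5:
        return "high-normal"
    if 3.5 <= p <= 4.0:
        return "very-high-normal"
    return "out of normal range"

print([phosphorus_band(p) for p in (2.1, 2.7, 3.2, 3.9, 4.6)])
```

Once each patient is banded this way, mortality can be compared across bands with ORs against the 'high-normal' reference, as the study reports.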
Subject(s)
Cause of Death, Emergency Service, Hospital, Phosphorus/blood, Adolescent, Adult, Aged, Aged, 80 and over, Female, Hospitalization/statistics & numerical data, Humans, Hyperphosphatemia/diagnosis, Hyperphosphatemia/mortality, Hypophosphatemia/diagnosis, Hypophosphatemia/mortality, Israel, Male, Middle Aged, Prognosis, Reference Values, Retrospective Studies
ABSTRACT
AIM: To identify clinical and laboratory parameters that can assist in the differential diagnosis of coronavirus disease 2019 (COVID-19), influenza, and respiratory syncytial virus (RSV) infections. METHODS: In this retrospective cohort study, we obtained basic demographics and laboratory data from all 685 hospitalized patients confirmed with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), influenza virus, or RSV from 2018 to 2020. Multiple logistic regression was employed to investigate the relationship between COVID-19 and laboratory parameters. RESULTS: SARS-CoV-2 patients were significantly younger than RSV (P=0.001) and influenza virus (P=0.022) patients. SARS-CoV-2 patients also displayed a significant male predominance over influenza virus patients (P=0.047). They also had a significantly lower white blood cell count (median 6.3×10⁶ cells/µL) compared with influenza virus (P<0.001) and RSV (P=0.001) patients. Differences were also observed in other laboratory values but were not significant in a multivariate analysis. CONCLUSIONS: Male sex, younger age, and low white blood cell count can assist in the diagnosis of COVID-19 over other viral infections. However, the differences between the groups were not substantial enough and would probably not suffice to distinguish between these viral illnesses in the emergency department.
Subject(s)
COVID-19, Influenza, Human, Respiratory Syncytial Virus Infections, Humans, Influenza, Human/diagnosis, Influenza, Human/epidemiology, Laboratories, Male, Respiratory Syncytial Virus Infections/diagnosis, Respiratory Syncytial Virus Infections/epidemiology, Retrospective Studies, SARS-CoV-2
ABSTRACT
BACKGROUND: Emergency departments (EDs) are becoming increasingly overwhelmed, contributing to poor outcomes. Triage scores aim to optimize waiting times and prioritize resource usage. Artificial intelligence (AI) algorithms offer advantages for creating predictive clinical applications. OBJECTIVE: To evaluate a state-of-the-art machine learning model for predicting mortality at the triage level and, by validating this automatic tool, improve the categorization of patients in the ED. DESIGN: Institutional review board (IRB) approval was granted for this retrospective study. Information on consecutive adult patients (ages 18-100) admitted to the emergency department (ED) of one hospital was retrieved (January 1, 2012-December 31, 2018). Features included the following: demographics, admission date, arrival mode, referral code, chief complaint, previous ED visits, previous hospitalizations, comorbidities, home medications, vital signs, and Emergency Severity Index (ESI). The following outcomes were evaluated: early mortality (up to 2 days post ED registration) and short-term mortality (2-30 days post ED registration). A gradient boosting model was trained on data from years 2012-2017 and examined on data from the final year (2018). The area under the curve (AUC) for mortality prediction was used as an outcome metric. Single-variable analysis was conducted to develop a nine-point triage score for early mortality. KEY RESULTS: Overall, 799,522 ED visits were available for analysis. The early and short-term mortality rates were 0.6% and 2.5%, respectively. Models trained on the full set of features yielded an AUC of 0.962 for early mortality and 0.923 for short-term mortality. A model that utilized the nine features with the highest single-variable AUC scores (age, arrival mode, chief complaint, five primary vital signs, and ESI) yielded an AUC of 0.962 for early mortality.
CONCLUSION: The gradient boosting model shows high predictive ability for screening patients at risk of early mortality utilizing data available at the time of triage in the ED.
Subject(s)
Artificial Intelligence, Triage, Adolescent, Adult, Aged, Aged, 80 and over, Emergency Service, Hospital, Hospital Mortality, Humans, Machine Learning, Middle Aged, Retrospective Studies, Young Adult
ABSTRACT
BACKGROUND AND AIMS: Deep learning is an innovative algorithm based on neural networks. Wireless capsule endoscopy (WCE) is considered the criterion standard for detecting small-bowel diseases. Manual examination of WCE is time-consuming and can benefit from automatic detection using artificial intelligence (AI). We aimed to perform a systematic review of the current literature pertaining to deep learning implementation in WCE. METHODS: We searched PubMed for all original publications on deep learning applications in WCE published between January 1, 2016 and December 15, 2019. Risk of bias was evaluated using a tailored Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Pooled sensitivity and specificity were calculated. Summary receiver operating characteristic curves were plotted. RESULTS: Of the 45 studies retrieved, 19 were included. All studies were retrospective. Deep learning applications for WCE included detection of ulcers, polyps, celiac disease, bleeding, and hookworm. Detection accuracy was above 90% for most studies and diseases. Pooled sensitivity and specificity for ulcer detection were .95 (95% confidence interval [CI], .89-.98) and .94 (95% CI, .90-.96), respectively. Pooled sensitivity and specificity for bleeding or bleeding source were .98 (95% CI, .96-.99) and .99 (95% CI, .97-.99), respectively. CONCLUSIONS: Deep learning has achieved excellent performance for the detection of a range of diseases in WCE. Notwithstanding, current research is based on retrospective studies with a high risk of bias. Thus, future prospective, multicenter studies are necessary before this technology can be implemented in the clinical use of WCE.
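As a rough illustration of what "pooled sensitivity and specificity" means, the sketch below sums per-study 2x2 counts and computes the pooled rates. The counts are invented for illustration, and this naive fixed pooling is a simplification: formal diagnostic meta-analyses like the one described typically fit bivariate random-effects models instead.

```python
# Naive pooling of sensitivity/specificity across studies by summing
# 2x2 confusion-matrix counts. Per-study counts below are invented.
studies = [
    # (true_pos, false_neg, true_neg, false_pos)
    (95, 5, 180, 12),
    (140, 9, 260, 15),
    (60, 4, 110, 8),
]
tp = sum(s[0] for s in studies)
fn = sum(s[1] for s in studies)
tn = sum(s[2] for s in studies)
fp = sum(s[3] for s in studies)
sensitivity = tp / (tp + fn)   # pooled true-positive rate
specificity = tn / (tn + fp)   # pooled true-negative rate
print(f"pooled sensitivity: {sensitivity:.3f}")
print(f"pooled specificity: {specificity:.3f}")
```

Bivariate models are preferred in practice because they preserve between-study variability and the correlation between sensitivity and specificity, which simple count pooling discards.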
Subject(s)
Capsule Endoscopy, Deep Learning, Artificial Intelligence, Humans, Neural Networks, Computer, Retrospective Studies
ABSTRACT
BACKGROUND AND AIMS: The aim of our study was to develop and evaluate a deep learning algorithm for the automated detection of small-bowel ulcers in Crohn's disease (CD) on capsule endoscopy (CE) images of individual patients. METHODS: We retrospectively collected CE images of known CD patients and control subjects. Each image was labeled by an expert gastroenterologist as either normal mucosa or containing mucosal ulcers. A convolutional neural network was trained to classify images into either normal mucosa or mucosal ulcers. First, we trained the network on 5-fold randomly split images (each fold with 80% of images for training and 20% for testing). We then conducted 10 experiments in which images from n - 1 patients were used to train a network and images from a different individual patient were used to test the network. Results of the networks were compared for randomly split images and for individual patients. Areas under the curve (AUCs) and accuracies were computed for each individual network. RESULTS: Overall, our dataset included 17,640 CE images from 49 patients: 7391 images with mucosal ulcers and 10,249 images of normal mucosa. For randomly split images, results were excellent, with AUCs of .99 and accuracies ranging from 95.4% to 96.7%. For individual patient-level experiments, the AUCs were also excellent (.94-.99). CONCLUSIONS: Deep learning technology provides accurate and fast automated detection of mucosal ulcers on CE images. Individual patient-level analysis provided high and consistent diagnostic accuracy with shortened reading time; in the future, deep learning algorithms may augment and facilitate CE reading.
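The patient-level evaluation scheme described here (train on images from n - 1 patients, test on all images from the held-out patient) is a leave-one-group-out split. The sketch below shows the grouping logic with scikit-learn; a logistic regression on synthetic features stands in for the convolutional network, and the data are invented, so the splitting scheme, not the classifier, is the point.

```python
# Sketch of leave-one-patient-out evaluation: every image from one patient
# forms the test set; images from the remaining patients form the train set.
# Synthetic features and labels; a simple classifier stands in for the CNN.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(42)
n_patients, imgs_per_patient = 10, 60
patient_id = np.repeat(np.arange(n_patients), imgs_per_patient)
y = rng.integers(0, 2, n_patients * imgs_per_patient)   # ulcer vs. normal
# Synthetic "image features" whose mean shifts with the label.
X = rng.normal(size=(len(y), 8)) + 1.5 * y[:, None]

aucs = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=patient_id):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))
print(f"per-patient AUCs: min={min(aucs):.2f}, mean={np.mean(aucs):.2f}")
```

Grouping by patient prevents near-duplicate frames from the same capsule study appearing in both train and test sets, which is why patient-level AUCs are the more honest estimate of real-world performance.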
Subject(s)
Capsule Endoscopy, Crohn Disease, Deep Learning, Intestine, Small/diagnostic imaging, Ulcer/diagnostic imaging, Algorithms, Automation, Capsule Endoscopy/methods, Crohn Disease/complications, Crohn Disease/diagnostic imaging, Humans, Intestinal Mucosa/diagnostic imaging, Neural Networks, Computer, Random Allocation, Reproducibility of Results, Retrospective Studies, Ulcer/etiology
ABSTRACT
PURPOSE: Natural language processing (NLP) can be used for automatic flagging of radiology reports. We assessed deep learning models for classifying non-English head CT reports. METHODS: We retrospectively collected head CT reports (2011-2018). Reports were written in Hebrew. Emergency department (ED) reports of adult patients from January to February of each year (2013-2018) were manually labeled. All other reports were used to pre-train an embedding layer. We explored two use cases: (1) a general labeling use case, in which reports were labeled as normal vs. pathological; (2) a specific labeling use case, in which reports were labeled as with or without intracranial hemorrhage. We tested long short-term memory (LSTM) and LSTM-attention (LSTM-ATN) networks for classifying reports. We also evaluated the improvement gained by adding Word2Vec word embedding. Deep learning models were compared with a bag-of-words (BOW) model. RESULTS: We retrieved 176,988 head CT reports for pre-training. We manually labeled 7784 reports as normal (46.3%) or pathological (53.7%); 7.1% were labeled with intracranial hemorrhage. For general labeling, LSTM-ATN-Word2Vec showed the best results (AUC = 0.967 ± 0.006, accuracy 90.8% ± 0.01). For specific labeling, all methods showed similar accuracies between 95.0% and 95.9%. Both LSTM-ATN-Word2Vec and BOW had the highest AUC (0.970). CONCLUSION: For a general use case, word embedding using a large cohort of non-English head CT reports and ATN improves NLP performance. For a more specific task, BOW and deep learning showed similar results. Models should be explored and tailored to the NLP task.
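A minimal sketch of the bag-of-words baseline this abstract compares against: TF-IDF features plus a logistic regression classifier for normal-vs-pathological report flagging. The toy reports below are invented English stand-ins for the Hebrew head CT reports, and this is not the study's pipeline.

```python
# BOW baseline sketch for radiology report flagging: TF-IDF + logistic
# regression. The toy "reports" and labels below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = [
    "no acute intracranial hemorrhage normal study",
    "brain parenchyma unremarkable no abnormality",
    "acute subdural hematoma with midline shift",
    "intraparenchymal hemorrhage right frontal lobe",
    "no evidence of bleed ventricles normal",
    "large epidural hematoma mass effect",
]
labels = [0, 0, 1, 1, 0, 1]   # 0 = normal, 1 = pathological

bow = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
bow.fit(reports, labels)
print("prediction:", bow.predict(["no hemorrhage normal brain"])[0])
```

Such a model has no word order or context, which is why it can match deep models on a narrow, keyword-driven task (hemorrhage flagging) while losing ground on the broader normal-vs-pathological distinction.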