Results 1 - 20 of 65
1.
Cureus ; 16(5): e59661, 2024 May.
Article in English | MEDLINE | ID: mdl-38836155

ABSTRACT

Heart failure (HF) is prevalent globally. It is a dynamic disease with varying definitions and classifications due to multiple pathophysiologies and etiologies. The diagnosis, clinical staging, and treatment of HF become complex and subjective, impacting patient prognosis and mortality. Technological advancements, like artificial intelligence (AI), have come to play a significant role in medicine and are increasingly used in cardiovascular medicine to transform drug discovery, clinical care, risk prediction, diagnosis, and treatment. Medical and surgical interventions specific to HF patients rely significantly on early identification of HF. Hospitalization and treatment costs for HF are high, with readmissions increasing the burden. AI can help improve diagnostic accuracy by recognizing patterns and using them in multiple areas of HF management. AI has shown promise in offering early detection and precise diagnoses through ECG analysis, advanced cardiac imaging, biomarkers, and cardiopulmonary stress testing. However, its challenges include data access, model interpretability, ethical concerns, and generalizability across diverse populations. Despite these challenges, ongoing efforts to refine AI models suggest a promising future for HF diagnosis. We searched PubMed, Google Scholar, and the Cochrane Library and, after applying inclusion and exclusion criteria, identified 150 relevant papers. This review focuses on AI's significant contribution to HF diagnosis in recent years, which is drastically altering HF treatment and outcomes.

2.
Front Pediatr ; 12: 1405780, 2024.
Article in English | MEDLINE | ID: mdl-38895195

ABSTRACT

Background: Necrotizing enterocolitis (NEC) is a severe neonatal intestinal disease, often occurring in preterm infants following the administration of hyperosmolar formula. It is one of the leading causes of neonatal mortality in the NICU, and currently, there are no clear standards for surgical intervention, which typically depends on the joint discretion of surgeons and neonatologists. In recent years, deep learning has been extensively applied in areas such as image segmentation, fracture and pneumonia classification, drug development, and pathological diagnosis. Objective: To investigate deep learning applications using bedside x-rays to help optimize surgical decision-making in neonatal NEC. Methods: Through a retrospective analysis of anteroposterior bedside chest and abdominal x-rays from 263 infants diagnosed with NEC between January 2015 and April 2023, including a surgery group (94 cases) and a non-surgery group (169 cases), the infants were divided into a training set and a validation set in a 7:3 ratio. Models were built based on ResNet18, DenseNet121, and SimpleViT to predict whether NEC patients required surgical intervention. Finally, the model's performance was tested using an additional 40 cases, including both surgical and non-surgical NEC cases, as a test group. To enhance the interpretability of the models, the study employed 2D-Grad-CAM technology to describe the models' focus on significant areas within the x-ray images. Results: ResNet18 demonstrated outstanding performance in binary diagnostic capability, achieving an accuracy of 0.919, with its precise lesion imaging and interpretability particularly highlighted. Its precision, specificity, sensitivity, and F1 score were significantly high, proving its advantages in optimizing surgical decision-making for neonatal NEC. Conclusion: The ResNet18 deep learning model, constructed using bedside chest and abdominal imaging, effectively assists clinical physicians in determining whether infants with NEC require surgical intervention.
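As an illustration of the modelling approach described above, here is a minimal PyTorch sketch of fine-tuning a ResNet18 for a binary surgery/non-surgery decision; the directory layout, preprocessing, and training settings are assumptions for illustration, not the authors' pipeline.

```python
# Minimal sketch (not the study's code): fine-tune ResNet18 on bedside x-rays
# for a binary surgery vs. non-surgery decision. Assumes images are organized
# as data/train/{surgery,non_surgery}/ (hypothetical layout).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # x-rays are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

train_ds = datasets.ImageFolder("data/train", transform=transform)
train_dl = DataLoader(train_ds, batch_size=16, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: surgery, non-surgery

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(10):
    for images, labels in train_dl:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Grad-CAM-style heatmaps, as used for interpretability in the study, can then be computed from the activations of the final convolutional block.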

3.
Cureus ; 16(5): e60318, 2024 May.
Article in English | MEDLINE | ID: mdl-38882956

ABSTRACT

BACKGROUND: The integration of artificial intelligence (AI) in medicine, particularly through AI-based language models like ChatGPT, offers a promising avenue for enhancing patient education and healthcare delivery. This study aims to evaluate the quality of medical information provided by Chat Generative Pre-trained Transformer (ChatGPT) regarding common orthopedic and trauma surgical procedures, assess its limitations, and explore its potential as a supplementary source for patient education. METHODS: Using the GPT-3.5-Turbo version of ChatGPT, simulated patient information was generated for 20 orthopedic and trauma surgical procedures. The study utilized standardized information forms as a reference for evaluating ChatGPT's responses. The accuracy and quality of the provided information were assessed using a modified DISCERN instrument, and a global medical assessment was conducted to categorize the information's usefulness and reliability. RESULTS: ChatGPT mentioned an average of 47% of relevant keywords across procedures, with mention rates ranging from 30.5% to 68.6%. The average modified DISCERN (mDISCERN) score was 2.4 out of 5, indicating a moderate to low quality of information. None of the ChatGPT-generated fact sheets were rated as "very useful," with 45% deemed "somewhat useful," 35% "not useful," and 20% classified as "dangerous." A positive correlation was found between higher mDISCERN scores and better physician ratings, suggesting that information quality directly impacts perceived utility. CONCLUSION: While AI-based language models like ChatGPT hold significant promise for medical education and patient care, the current quality of information provided in the field of orthopedics and trauma surgery is suboptimal. Further development and refinement of AI sources and algorithms are necessary to improve the accuracy and reliability of medical information. This study underscores the need for ongoing research and development in AI applications in healthcare, emphasizing the critical role of accurate, high-quality information in patient education and informed consent processes.
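The reported link between mDISCERN scores and physician ratings can be checked with a rank correlation; the snippet below is a hedged sketch using Spearman's rho (the abstract does not state which correlation statistic was used), with illustrative values only.

```python
# Illustrative sketch: correlate modified DISCERN scores with physician ratings.
# Values are made up for demonstration; they are not the study data.
from scipy.stats import spearmanr

mdiscern_scores = [2, 3, 1, 4, 2, 3, 2, 1]     # one score per ChatGPT fact sheet (1-5)
physician_ratings = [2, 3, 1, 4, 3, 3, 2, 1]   # ordinal usefulness rating per fact sheet

rho, p_value = spearmanr(mdiscern_scores, physician_ratings)
print(f"rho = {rho:.2f}, p = {p_value:.3f}")
```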

4.
Cureus ; 16(3): e56472, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38638735

ABSTRACT

This narrative literature review undertakes a comprehensive examination of the burgeoning field, tracing the development of artificial intelligence (AI)-powered tools for depression and anxiety detection from the level of intricate algorithms to practical applications. Delivering essential mental health care services is now a significant public health priority. In recent years, AI has become a game-changer in the early identification and intervention of these pervasive mental health disorders. AI tools can potentially empower behavioral healthcare services by helping psychiatrists collect objective data on patients' progress and tasks. This study emphasizes the current understanding of AI, the different types of AI, its current use in multiple mental health disorders, its advantages and disadvantages, and its future potential. As technology develops and the digitalization of the modern era increases, there will be a rise in the application of artificial intelligence in psychiatry; therefore, a comprehensive understanding will be needed. We searched PubMed, Google Scholar, and ScienceDirect using relevant keywords. In a recent review of studies using electronic health records (EHR) with AI and machine learning techniques for diagnosing all clinical conditions, roughly 99 publications were found. Out of these, 35 studies were identified for mental health disorders in all age groups, and among them, six studies utilized EHR data sources. By critically analyzing prominent scholarly works, we aim to illuminate the current state of this technology, exploring its successes, limitations, and future directions. In doing so, we hope to contribute to a nuanced understanding of AI's potential to revolutionize mental health diagnostics and pave the way for further research and development in this critically important domain.

5.
Cureus ; 16(3): e55991, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38606229

ABSTRACT

INTRODUCTION: Large language models (LLMs) have transformed various domains in medicine, aiding in complex tasks and clinical decision-making, with OpenAI's GPT-4, GPT-3.5, Google's Bard, and Anthropic's Claude among the most widely used. While GPT-4 has demonstrated superior performance in some studies, comprehensive comparisons among these models remain limited. Recognizing the significance of the National Board of Medical Examiners (NBME) exams in assessing the clinical knowledge of medical students, this study aims to compare the accuracy of popular LLMs on NBME clinical subject exam sample questions. METHODS: The questions used in this study were multiple-choice questions obtained from the official NBME website and are publicly available. Questions from the NBME subject exams in medicine, pediatrics, obstetrics and gynecology, clinical neurology, ambulatory care, family medicine, psychiatry, and surgery were used to query each LLM. The responses from GPT-4, GPT-3.5, Claude, and Bard were collected in October 2023. The response by each LLM was compared to the answer provided by the NBME and checked for accuracy. Statistical analysis was performed using one-way analysis of variance (ANOVA). RESULTS: A total of 163 questions were queried by each LLM. GPT-4 scored 163/163 (100%), GPT-3.5 scored 134/163 (82.2%), Bard scored 123/163 (75.5%), and Claude scored 138/163 (84.7%). The total performance of GPT-4 was statistically superior to that of GPT-3.5, Claude, and Bard by 17.8%, 15.3%, and 24.5%, respectively. The total performance of GPT-3.5, Claude, and Bard was not significantly different. GPT-4 significantly outperformed Bard in specific subjects, including medicine, pediatrics, family medicine, and ambulatory care, and GPT-3.5 in ambulatory care and family medicine. Across all LLMs, the surgery exam had the highest average score (18.25/20), while the family medicine exam had the lowest average score (3.75/5). CONCLUSION: GPT-4's superior performance on NBME clinical subject exam sample questions underscores its potential in medical education and practice. While LLMs exhibit promise, discernment in their application is crucial, considering occasional inaccuracies. As technological advancements continue, regular reassessments and refinements are imperative to maintain their reliability and relevance in medicine.
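A minimal sketch of the reported comparison is shown below, assuming per-question correctness (1 = correct, 0 = incorrect) was recorded for each model and compared with a one-way ANOVA; the correct-answer counts come from the abstract, but the per-question arrangement is illustrative.

```python
# Sketch: one-way ANOVA over per-question correctness for four LLMs.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n = 163
gpt4   = np.ones(n)                              # 163/163 correct
gpt35  = rng.permutation([1] * 134 + [0] * 29)   # 134/163 correct
claude = rng.permutation([1] * 138 + [0] * 25)   # 138/163 correct
bard   = rng.permutation([1] * 123 + [0] * 40)   # 123/163 correct

f_stat, p_value = f_oneway(gpt4, gpt35, claude, bard)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```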

6.
Cureus ; 16(3): e56187, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38618446

ABSTRACT

Background While large language models show potential as beneficial tools in medicine, their reliability, especially in the realm of obstetrics and gynecology (OB-GYN), is not fully comprehended. This study seeks to measure and contrast the performance of ChatGPT and HuggingChat in addressing OB-GYN-related medical examination questions, offering insights into their effectiveness in this specialized field. Methods ChatGPT and HuggingChat were subjected to two standardized multiple-choice question banks: Test 1, developed by the National Board of Medical Examiners (NBME), and Test 2, gathered from the Association of Professors of Gynecology & Obstetrics (APGO) Web-Based Interactive Self-Evaluation (uWISE). Responses were analyzed and compared for correctness. Results The two-proportion z-test revealed no statistically significant difference in performance between ChatGPT and HuggingChat on both medical examinations. For Test 1, ChatGPT scored 90%, while HuggingChat scored 85% (p = 0.6). For Test 2, ChatGPT correctly answered 70% of questions, while HuggingChat correctly answered 62% of questions (p = 0.4). Conclusion Awareness of the strengths and weaknesses of artificial intelligence allows for the proper and effective use of its knowledge. Our findings indicate that there is no statistically significant difference in performance between ChatGPT and HuggingChat in addressing medical inquiries. Nonetheless, both platforms demonstrate considerable promise for applications within the medical domain.
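For reference, the two-proportion z-test used here can be run with statsmodels as in the sketch below; the abstract reports only percentages, so the question counts are placeholders.

```python
# Sketch: two-proportion z-test comparing ChatGPT vs. HuggingChat accuracy.
from statsmodels.stats.proportion import proportions_ztest

n_questions = 100                              # hypothetical number of questions per test
chatgpt_correct = int(0.90 * n_questions)      # e.g. 90% on Test 1
huggingchat_correct = int(0.85 * n_questions)  # e.g. 85% on Test 1

z_stat, p_value = proportions_ztest(
    count=[chatgpt_correct, huggingchat_correct],
    nobs=[n_questions, n_questions],
)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```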

7.
J Am Coll Emerg Physicians Open ; 5(2): e13133, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38481520

ABSTRACT

Objectives: This study presents a design framework to enhance the accuracy with which large language models (LLMs), like ChatGPT, can extract insights from clinical notes. We highlight this framework via prompt refinement for the automated determination of HEART (History, ECG, Age, Risk factors, Troponin risk algorithm) scores in chest pain evaluation. Methods: We developed a pipeline for LLM prompt testing, employing stochastic repeat testing and quantifying response errors relative to physician assessment. We evaluated the pipeline for automated HEART score determination across a limited set of 24 synthetic clinical notes representing four simulated patients. To assess whether iterative prompt design could improve the LLMs' ability to extract complex clinical concepts and apply rule-based logic to translate them to HEART subscores, we monitored diagnostic performance during prompt iteration. Results: Validation included three iterative rounds of prompt improvement for three HEART subscores with 25 repeat trials totaling 1200 queries each for GPT-3.5 and GPT-4. For both LLM models, from initial to final prompt design, there was a decrease in the rate of responses with erroneous, non-numerical subscore answers. Accuracy of numerical responses for HEART subscores (discrete 0-2 point scale) improved for GPT-4 from the initial to final prompt iteration, with the mean error decreasing from 0.16 to 0.10 points (95% confidence interval: 0.07-0.14). Conclusion: We established a framework for iterative prompt design in the clinical space. Although the results indicate potential for integrating LLMs in structured clinical note analysis, translation to real, large-scale clinical data with appropriate data privacy safeguards is needed.
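The stochastic repeat-testing idea can be sketched as below; `query_llm` is a placeholder to be wired to whichever chat API is used, and the parsing and error metric are simplified assumptions rather than the authors' exact pipeline.

```python
# Sketch: repeat-query an LLM for a HEART subscore, parse the numeric answer,
# and measure error against a physician reference score.
import re
import statistics

def query_llm(prompt: str) -> str:
    """Placeholder: connect this to an LLM client (e.g. an OpenAI chat call)."""
    raise NotImplementedError

def parse_subscore(reply: str):
    match = re.search(r"\b([0-2])\b", reply)        # subscores lie on a 0-2 scale
    return int(match.group(1)) if match else None   # None = erroneous, non-numeric reply

def repeat_trial(note: str, prompt_template: str, physician_score: int, n_repeats: int = 25):
    errors, non_numeric = [], 0
    for _ in range(n_repeats):
        score = parse_subscore(query_llm(prompt_template.format(note=note)))
        if score is None:
            non_numeric += 1
        else:
            errors.append(abs(score - physician_score))
    mean_error = statistics.mean(errors) if errors else None
    return mean_error, non_numeric
```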

8.
Cureus ; 16(2): e53897, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38465158

ABSTRACT

BACKGROUND: Cochlear implantation is a critical surgical intervention for patients with severe hearing loss. Postoperative care is essential for successful rehabilitation, yet access to timely medical advice can be challenging, especially in remote or resource-limited settings. Integrating advanced artificial intelligence (AI) tools like Chat Generative Pre-trained Transformer (ChatGPT)-4 in post-surgical care could bridge the patient education and support gap. AIM: This study aimed to assess the effectiveness of ChatGPT-4 as a supplementary information resource for postoperative cochlear implant patients. The focus was on evaluating the AI chatbot's ability to provide accurate, clear, and relevant information, particularly in scenarios where access to healthcare professionals is limited. MATERIALS AND METHODS: Five common postoperative questions related to cochlear implant care were posed to ChatGPT-4. The AI chatbot's responses were analyzed for accuracy, response time, clarity, and relevance. The aim was to determine whether ChatGPT-4 could serve as a reliable source of information for patients in need, especially when the patients could not reach the hospital or the specialists at that moment. RESULTS: ChatGPT-4 provided responses aligned with current medical guidelines, demonstrating accuracy and relevance. The AI chatbot responded to each query within seconds, indicating its potential as a timely resource. Additionally, the responses were clear and understandable, making complex medical information accessible to non-medical audiences. These findings suggest that ChatGPT-4 could effectively supplement traditional patient education, providing valuable support in postoperative care. CONCLUSION: The study concluded that ChatGPT-4 has significant potential as a supportive tool for cochlear implant patients after surgery. While it cannot replace professional medical advice, ChatGPT-4 can provide immediate, accessible, and understandable information, which is particularly beneficial when the hospital or specialists cannot be reached promptly. This underscores the utility of AI in enhancing patient care and supporting cochlear implantation.

9.
Cureus ; 16(2): e53441, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38435177

ABSTRACT

Introduction Uncontrolled hypertension significantly contributes to the development and deterioration of various medical conditions, such as myocardial infarction, chronic kidney disease, and cerebrovascular events. Despite being the most common preventable risk factor for all-cause mortality, only a fraction of affected individuals maintain their blood pressure in the desired range. In recent times, there has been a growing reliance on online platforms for medical information. While providing a convenient source of information, differentiating reliable from unreliable information can be daunting for the layperson, and false information can potentially hinder timely diagnosis and management of medical conditions. The surge in accessibility of generative artificial intelligence (GeAI) technology has led to increased use in obtaining health-related information. This has sparked debates among healthcare providers about the potential for misuse and misinformation while recognizing the role of GeAI in improving health literacy. This study aims to investigate the accuracy of AI-generated information specifically related to hypertension. Additionally, it seeks to explore the reproducibility of information provided by GeAI. Method A nonhuman-subject qualitative study was devised to evaluate the accuracy of information provided by ChatGPT regarding hypertension and its secondary complications. Frequently asked questions on hypertension were compiled by three study staff, internal medicine residents at an ACGME-accredited program, and then reviewed by a physician experienced in treating hypertension, resulting in a final set of 100 questions. Each question was posed to ChatGPT three times, once by each study staff, and the majority response was then assessed against the recommended guidelines. A board-certified internal medicine physician with over eight years of experience further reviewed the responses and categorized them into two classes based on their clinical appropriateness: appropriate (in line with clinical recommendations) and inappropriate (containing errors). Descriptive statistical analysis was employed to assess ChatGPT responses for accuracy and reproducibility. Result Initially, a pool of 130 questions was gathered, of which a final set of 100 questions was selected for the purpose of this study. When assessed against acceptable standard responses, ChatGPT responses were found to be appropriate in 92.5% of cases and inappropriate in 7.5%. Furthermore, ChatGPT had a reproducibility score of 93%, meaning that it could consistently reproduce answers that conveyed similar meanings across multiple runs. Conclusion ChatGPT showcased commendable accuracy in addressing commonly asked questions about hypertension. These results underscore the potential of GeAI in providing valuable information to patients. However, continued research and refinement are essential to evaluate further the reliability and broader applicability of ChatGPT within the medical field.
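The majority-response and reproducibility tallies can be sketched as follows, assuming each question was posed three times and each response was labelled appropriate or inappropriate; the labels shown are illustrative, and the study's reproducibility judgment was made by reviewers rather than by exact string matching.

```python
# Sketch: majority vote across three runs per question, plus a simple
# reproducibility tally (all three runs receiving the same label).
from collections import Counter

runs_per_question = [
    ["appropriate", "appropriate", "appropriate"],
    ["appropriate", "inappropriate", "appropriate"],
    # ... one entry per question (100 questions in the study)
]

majority = [Counter(runs).most_common(1)[0][0] for runs in runs_per_question]
accuracy = majority.count("appropriate") / len(majority)
reproducibility = sum(len(set(runs)) == 1 for runs in runs_per_question) / len(runs_per_question)
print(f"appropriate: {accuracy:.1%}, reproducible: {reproducibility:.1%}")
```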

10.
Cureus ; 16(2): e55216, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38435218

ABSTRACT

Artificial intelligence (AI) has become a revolutionary influence in the field of ophthalmology, providing unparalleled capabilities in data analysis and pattern recognition. This narrative review delves into the crucial role that AI plays, particularly in the context of anterior segment diseases with a genetic basis. Corneal dystrophies (CDs) exhibit significant genetic diversity, manifested by irregular substance deposition in the cornea. AI-driven diagnostic tools exhibit promising accuracy in the identification and classification of corneal diseases. Importantly, chat generative pre-trained transformer (ChatGPT)-4.0 shows significant advancement over its predecessor, ChatGPT-3.5. In the realm of glaucoma, AI significantly contributes to precise diagnostics through inventive algorithms and machine learning models, surpassing conventional methods. The incorporation of AI in predicting glaucoma progression and its role in augmenting diagnostic efficiency is readily apparent. Additionally, AI-powered models prove beneficial for early identification and risk assessment in cases of congenital cataracts, characterized by diverse inheritance patterns. Machine learning models achieving exceptional discrimination in identifying congenital cataracts underscore AI's remarkable potential. The review concludes by emphasizing the promising implications of AI in managing anterior segment diseases, spanning from early detection to the tailoring of personalized treatment strategies. These advancements signal a paradigm shift in ophthalmic care, offering optimism for enhanced patient outcomes and more streamlined healthcare delivery.

11.
Cureus ; 16(1): e52748, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38384621

ABSTRACT

The recent integration of the latest image generation model DALL-E 3 into ChatGPT allows text prompts to easily generate the corresponding images, enabling multimodal output from ChatGPT. We explored the feasibility of DALL-E 3 for drawing a 12-lead electrocardiogram (ECG) and found that it can draw rudimentary 12-lead ECGs displaying some of the parameters, although the details are not completely accurate. We also explored DALL-E 3's capacity to create vivid illustrations for teaching resuscitation-related medical knowledge. DALL-E 3 produced accurate CPR illustrations emphasizing proper hand placement and technique. For ECG principles, it produced creative heart-shaped waveforms tying ECGs to the heart. With further training, DALL-E 3 shows promise for expanding easy-to-understand visual medical teaching materials and ECG simulations for different disease states. In conclusion, DALL-E 3 has the potential to generate realistic 12-lead ECGs and teaching schematics, but expert validation is still needed.
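For readers who want to reproduce this kind of prompt-to-image workflow programmatically, a minimal sketch using the OpenAI Python SDK is shown below; the study used DALL-E 3 inside ChatGPT, so the model name, parameters, and client usage here are assumptions about the current API rather than the authors' setup.

```python
# Sketch: generate a teaching illustration with DALL-E 3 via the OpenAI SDK
# (assumed API; requires OPENAI_API_KEY in the environment).
from openai import OpenAI

client = OpenAI()
result = client.images.generate(
    model="dall-e-3",
    prompt=("A clear educational illustration of a normal 12-lead ECG "
            "with each lead labelled"),
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # URL of the generated image
```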

12.
Cureus ; 16(1): e51859, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38327947

ABSTRACT

Artificial intelligence has experienced explosive growth in the past year that will have implications in all aspects of our lives, including medicine. In order to train a physician workforce that understands these new advancements, medical educators must take steps now to ensure that physicians are adequately trained in medical school, residency, and fellowship programs to become proficient in the usage of artificial intelligence in medical practice. This manuscript discusses the various considerations that leadership within medical training programs should be mindful of when deciding how to best integrate artificial intelligence into their curricula.

13.
Cureus ; 16(1): e51466, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38298326

ABSTRACT

Background Artificial intelligence (AI) has taken on a variety of functions in the medical field, and research has proven that it can address complicated issues in various applications. It is unknown whether Lebanese medical students and residents have a detailed understanding of this concept, and little is known about their attitudes toward AI. Aim This study fills a critical gap by revealing the knowledge and attitude of Lebanese medical students toward AI. Methods A multi-centric survey targeting 365 medical students from seven medical schools across Lebanon was conducted to assess their knowledge of and attitudes toward AI in medicine. The survey consists of five sections: the first part includes socio-demographic variables, while the second comprises the 'Medical Artificial Intelligence Readiness Scale' for medical students. The third part focuses on attitudes toward AI in medicine, the fourth assesses understanding of deep learning, and the fifth targets considerations of radiology as a specialization. Results There is a notable awareness of AI among students who are eager to learn about it. Despite this interest, there exists a gap in knowledge regarding deep learning, albeit alongside a positive attitude towards it. Students who are more open to embracing AI technology tend to have a better understanding of AI concepts (p=0.001). Additionally, a higher percentage of students from Mount Lebanon (71.6%) showed an inclination towards using AI compared to Beirut (63.2%) (p=0.03). Noteworthy are the Lebanese University and Saint Joseph University, where the highest proportions of students are willing to integrate AI into the medical field (79.4% and 76.7%, respectively; p=0.001). Conclusion It was concluded that most Lebanese medical students might not necessarily comprehend the core technological ideas of AI and deep learning. This lack of understanding was evident from the substantial amount of misinformation among the students. Consequently, there appears to be a significant demand for the inclusion of AI technologies in Lebanese medical school courses.

14.
Stud Health Technol Inform ; 310: 991-995, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269963

ABSTRACT

The use of Artificial Intelligence (AI) in medicine has attracted a great deal of attention in the medical literature, but less is known about how to assess the uncertainty of individual predictions in clinical applications. This paper demonstrates the use of Conformal Prediction (CP) to provide insight on racial stratification of uncertainty quantification for breast cancer risk prediction. The results presented here show that CP methods provide important information about the diminished quality of predictions for individuals of minority racial backgrounds.


Subject(s)
Breast Neoplasms, Medicine, Humans, Female, Artificial Intelligence, Uncertainty, Breast
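As a rough illustration of split conformal prediction, the sketch below builds prediction sets for a binary classifier and compares their average size across two hypothetical groups; the data, model, and group labels are synthetic stand-ins, not the study's breast cancer risk model.

```python
# Sketch: split conformal prediction with per-group inspection of prediction-set size.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
group = np.random.default_rng(0).integers(0, 2, size=len(y))  # hypothetical group labels

X_tr, X_rest, y_tr, y_rest, g_tr, g_rest = train_test_split(X, y, group, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te, g_cal, g_te = train_test_split(X_rest, y_rest, g_rest, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Nonconformity score: 1 - probability assigned to the true class on the calibration set.
cal_scores = 1 - model.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
alpha = 0.1
n_cal = len(cal_scores)
threshold = np.quantile(cal_scores, min(1.0, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal))

# Prediction sets: every class whose nonconformity score falls below the threshold.
pred_sets = (1 - model.predict_proba(X_te)) <= threshold

for g in (0, 1):
    mask = g_te == g
    print(f"group {g}: mean prediction-set size = {pred_sets[mask].sum(axis=1).mean():.2f}")
```

Larger average set sizes for a group indicate more uncertain predictions for that group, which is the kind of stratified signal the paper reports.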
15.
Cureus ; 15(11): e48788, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38098921

ABSTRACT

Large language models (LLMs) have broad potential applications in medicine, such as aiding with education, providing reassurance to patients, and supporting clinical decision-making. However, there is a notable gap in understanding their applicability and performance in the surgical domain and how their performance varies across specialties. This paper aims to evaluate the performance of LLMs in answering surgical questions relevant to clinical practice and to assess how this performance varies across different surgical specialties. We used the MedMCQA dataset, a large-scale multiple-choice question-answering (MCQA) dataset consisting of clinical questions across all areas of medicine. We extracted the relevant 23,035 surgical questions and submitted them to the popular LLMs Generative Pre-trained Transformers (GPT)-3.5 and GPT-4 (OpenAI OpCo, LLC, San Francisco, CA). Generative Pre-trained Transformer is a large language model that can generate human-like text by predicting subsequent words in a sentence based on the context of the words that come before it. It is pre-trained on a diverse range of texts and can perform a variety of tasks, such as answering questions, without needing task-specific training. The question-answering accuracy of GPT was calculated and compared between the two models and across surgical specialties. GPT-3.5 and GPT-4 achieved accuracies of 53.3% and 64.4%, respectively, on surgical questions, a statistically significant difference in performance. When compared to their performance on the full MedMCQA dataset, the two models performed differently: GPT-4 performed worse on surgical questions than on the dataset as a whole, while GPT-3.5 showed the opposite pattern. Significant variations in accuracy were also observed across different surgical specialties, with strong performances in anatomy, vascular, and paediatric surgery and worse performances in orthopaedics, ENT, and neurosurgery. Large language models exhibit promising capabilities in addressing surgical questions, although the variability in their performance between specialties cannot be ignored. The lower performance of the latest GPT-4 model on surgical questions relative to questions across all medicine highlights the need for targeted improvements and continuous updates to ensure relevance and accuracy in surgical applications. Further research and continuous monitoring of LLM performance in surgical domains are crucial to fully harnessing their potential and mitigating the risks of misinformation.
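The per-specialty accuracy comparison can be tabulated as in the sketch below, assuming each graded answer is stored as a (specialty, model, correct) record; the rows shown are illustrative.

```python
# Sketch: per-model, per-specialty accuracy from graded answers.
import pandas as pd

records = pd.DataFrame([
    {"specialty": "anatomy", "model": "gpt-4", "correct": 1},
    {"specialty": "orthopaedics", "model": "gpt-3.5", "correct": 0},
    # ... one row per question per model (23,035 surgical questions in the study)
])

accuracy = (
    records.groupby(["model", "specialty"])["correct"]
    .mean()
    .unstack("specialty")
)
print(accuracy.round(3))
```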

16.
Cureus ; 15(11): e49019, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38111405

ABSTRACT

Background Natural language processing models are increasingly used in scientific research, and their ability to perform various tasks in the research process is rapidly advancing. This study aims to investigate whether Generative Pre-trained Transformer 4 (GPT-4) is equal to humans in writing introduction sections for scientific articles. Methods This randomized non-inferiority study was reported according to the Consolidated Standards of Reporting Trials for non-inferiority trials and artificial intelligence (AI) guidelines. GPT-4 was instructed to synthesize 18 introduction sections based on the aim of previously published studies, and these sections were compared to the human-written introductions already published in a medical journal. Eight blinded assessors randomly evaluated the introduction sections using 1-10 Likert scales. Results There was no significant difference between GPT-4 and human introductions regarding publishability and content quality. GPT-4 scored one point higher in readability, a statistically significant difference that was considered non-relevant. The majority of assessors (59%) preferred GPT-4, while 33% preferred human-written introductions. Based on Lix and Flesch-Kincaid scores, GPT-4 introductions were 10 and 2 points higher, respectively, indicating that the sentences were longer and had longer words. Conclusion GPT-4 was found to be equal to humans in writing introductions regarding publishability, readability, and content quality. The majority of assessors preferred GPT-4 introductions, and fewer than half could determine whether an introduction was written by GPT-4 or by a human. These findings suggest that GPT-4 can be a useful tool for writing introduction sections, and further studies should evaluate its ability to write other parts of scientific articles.
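For context, the Lix readability index mentioned above combines average sentence length with the share of long words; the snippet below is a simple approximation of that formula, not the study's exact tooling (Flesch-Kincaid additionally requires syllable counting).

```python
# Sketch: approximate Lix score = words per sentence + percentage of words longer
# than six letters.
import re

def lix(text: str) -> float:
    words = re.findall(r"[A-Za-z]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    long_words = sum(len(w) > 6 for w in words)
    return len(words) / sentences + 100 * long_words / len(words)

sample = ("GPT-4 was instructed to synthesize introduction sections. "
          "Blinded assessors evaluated publishability, readability, and content quality.")
print(f"Lix = {lix(sample):.1f}")
```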

17.
Cureus ; 15(10): e47755, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38021699

ABSTRACT

Barrett's esophagus (BE) remains a significant precursor to esophageal adenocarcinoma, requiring accurate and efficient diagnosis and management. The increasing application of machine learning (ML) technologies presents a transformative opportunity for diagnosing and treating BE. This systematic review evaluates the effectiveness and accuracy of machine learning technologies in BE diagnosis and management by conducting a comprehensive search across PubMed, Scopus, and Web of Science databases up to the year 2023. The studies were organized into five categories: computer-aided systems, natural language processing and text-based systems, deep learning on histology and biopsy images, real-time and video analysis, and miscellaneous studies. Results indicate high sensitivity and specificity across machine learning applications. Specifically, computer-aided systems showed sensitivities ranging from 84% to 100% and specificities from 64% to 90.7%. Natural language processing and text-based systems achieved an accuracy as high as 98.7%. Deep learning techniques applied to histology and biopsy images displayed sensitivities greater than 90% and a specificity of up to 100%. Furthermore, real-time and video analysis technologies demonstrated high performance with assessment speeds of up to 48 frames per second (fps) and a mean average precision of 75.3%. Overall, the reviewed literature underscores the growing capability and efficiency of machine learning technologies in diagnosing and managing Barrett's esophagus, often outperforming traditional diagnostic methods. These findings highlight the promising future role of machine learning in enhancing clinical practice and improving patient care for Barrett's esophagus.
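The sensitivity and specificity figures summarized above are derived from binary confusion matrices; a minimal sketch of that calculation, with illustrative labels, is shown below.

```python
# Sketch: sensitivity and specificity from a binary confusion matrix.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = disease present (illustrative labels)
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]   # model output

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```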

18.
IEEE Trans Artif Intell ; 4(4): 764-777, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37954545

ABSTRACT

The black-box nature of machine learning models hinders the deployment of some high-accuracy medical diagnosis algorithms. It is risky to put one's life in the hands of models that medical researchers do not fully understand or trust. However, through model interpretation, black-box models can promptly reveal significant biomarkers that medical practitioners may have overlooked due to the surge of infected patients in the COVID-19 pandemic. This research leverages a database of 92 patients with confirmed SARS-CoV-2 laboratory tests between 18th January 2020 and 5th March 2020, in Zhuhai, China, to identify biomarkers indicative of infection severity prediction. Through the interpretation of four machine learning models, decision tree, random forests, gradient boosted trees, and neural networks, using permutation feature importance, partial dependence plots, individual conditional expectation, accumulated local effects, local interpretable model-agnostic explanations, and Shapley additive explanations, we identify that increases in N-terminal pro-brain natriuretic peptide, C-reactive protein, and lactate dehydrogenase, together with a decrease in lymphocytes, are associated with severe infection and an increased risk of death, which is consistent with recent medical research on COVID-19 and with other research using dedicated models. We further validate our methods on a large open dataset with 5644 confirmed patients from the Hospital Israelita Albert Einstein, at São Paulo, Brazil from Kaggle, and unveil leukocytes, eosinophils, and platelets as three indicative biomarkers for COVID-19.
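Two of the interpretation techniques named above, permutation feature importance and SHAP values, can be sketched as follows; the synthetic features stand in for the laboratory panel and are not the study's data.

```python
# Sketch: interpret a gradient-boosted classifier with permutation importance
# and SHAP values (requires the shap package).
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=200, n_features=6, random_state=0)
feature_names = ["NT-proBNP", "CRP", "LDH", "lymphocytes", "leukocytes", "platelets"]

model = GradientBoostingClassifier(random_state=0).fit(X, y)

perm = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, importance in sorted(zip(feature_names, perm.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {importance:.3f}")

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)    # per-sample, per-feature contributions
print(np.abs(shap_values).mean(axis=0))   # global importance from SHAP
```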

19.
Cureus ; 15(9): e46066, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37900468

ABSTRACT

Due to the increased burden of chronic medical conditions in recent years, artificial intelligence (AI) is suggested in the medical field to optimize health care. Physicians could implement these automated problem-solving tools for their benefit, reducing their workload, assisting in diagnostics, and supporting clinical decision-making. These tools are being considered for future medical assistance in real life. A literature review was performed to assess the impact of AI on the patient population with chronic medical conditions, using standardized guidelines. A MeSH strategy was created, and the databases were searched for appropriate studies using specific inclusion and exclusion criteria. The search yielded 93 results from various databases, of which 10 moderate- to high-quality studies were selected for inclusion in our systematic review after removing duplicates and screening titles and articles. Of the 10 studies, nine recommended using AI after considering potential limitations such as privacy protection, medicolegal implications, and psychosocial aspects. Due to its non-fatigable nature, AI was found to be of immense help in image recognition. It was also found to be valuable in various disciplines related to administration, physician burden, and patient adherence. The newer technologies of chatbots and eHealth applications are of great help when used safely and effectively after proper patient education. After a careful review conducted by our team members, it is safe to conclude that implementing AI in daily clinical practice could potentiate the cognitive ability of physicians and decrease their workload through automated technologies such as image, speech, and voice recognition, given AI's speed and non-fatigable nature compared with clinicians. Despite its vast benefits to the medical field, a few limitations could hinder its effective implementation into real-life practice, which requires enormous research and strict regulations to support its role as a physician's aid. However, AI should only be used as a medical support system to improve primary outcomes such as waiting time, healthcare costs, and workload; it is not meant to replace physicians.

20.
Cureus ; 15(9): e45911, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37885556

ABSTRACT

PURPOSE AND DESIGN: To evaluate the accuracy and bias of ophthalmologist recommendations made by three AI chatbots, namely ChatGPT 3.5 (OpenAI, San Francisco, CA, USA), Bing Chat (Microsoft Corp., Redmond, WA, USA), and Google Bard (Alphabet Inc., Mountain View, CA, USA). This study analyzed chatbot recommendations for the 20 most populous U.S. cities. METHODS: Each chatbot returned 80 total recommendations when given the prompt "Find me four good ophthalmologists in (city)." Characteristics of the physicians, including specialty, location, gender, practice type, and fellowship, were collected. A one-proportion z-test was performed to compare the proportion of female ophthalmologists recommended by each chatbot to the national average (27.2% per the Association of American Medical Colleges (AAMC)). Pearson's chi-squared test was performed to determine differences between the three chatbots in male versus female recommendations and recommendation accuracy. RESULTS: The proportions of female ophthalmologists recommended by Bing Chat (1.61%) and Bard (8.0%) were significantly lower than the national proportion of 27.2% practicing female ophthalmologists (p<0.001 and p<0.01, respectively). ChatGPT recommended fewer female (29.5%) than male ophthalmologists (p<0.722). ChatGPT (73.8%), Bing Chat (67.5%), and Bard (62.5%) gave high rates of inaccurate recommendations. Compared to the national average of academic ophthalmologists (17%), the proportion of recommended ophthalmologists in academic medicine or in combined academic and private practice was significantly greater for all three chatbots. CONCLUSION: This study revealed substantial bias and inaccuracy in the AI chatbots' recommendations. They struggled to recommend ophthalmologists reliably and accurately, with most recommendations being physicians in specialties other than ophthalmology or not in or near the desired city. Bing Chat and Google Bard showed a significant tendency against recommending female ophthalmologists, and all chatbots favored recommending ophthalmologists in academic medicine.
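The one-proportion z-test described above compares each chatbot's share of recommended female ophthalmologists with the 27.2% national benchmark; a hedged sketch with an illustrative count is given below.

```python
# Sketch: one-proportion z-test against the 27.2% national benchmark.
from statsmodels.stats.proportion import proportions_ztest

n_recommendations = 80   # 80 recommendations per chatbot in the study
n_female = 6             # hypothetical number of recommended female ophthalmologists

z_stat, p_value = proportions_ztest(count=n_female, nobs=n_recommendations, value=0.272)
print(f"z = {z_stat:.2f}, p = {p_value:.4g}")
```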
