ABSTRACT
BACKGROUND: From a healthcare professional's perspective, ChatGPT (OpenAI), a large language model (LLM), offers huge potential as a practical and economical digital assistant. However, ChatGPT has not yet been evaluated for the interpretation of polysomnographic results in patients with suspected obstructive sleep apnea (OSA). AIMS/OBJECTIVES: To evaluate the agreement in polysomnographic result interpretation between ChatGPT-4o and a board-certified sleep physician, and to shed light on the role of ChatGPT-4o in medical decision-making in sleep medicine. MATERIAL AND METHODS: For this proof-of-concept study, 40 comprehensive patient profiles were designed to represent a broad and typical spectrum of cases, ensuring a balanced distribution of demographics and clinical characteristics. After various prompts were tested, one prompt was used for the initial diagnosis of OSA and another for patients with positive airway pressure (PAP) therapy intolerance. Each polysomnographic result was independently evaluated by ChatGPT-4o and a board-certified sleep physician, and diagnosis and therapy suggestions were analyzed for agreement. RESULTS: ChatGPT-4o and the sleep physician showed 97% (29/30) concordance in the diagnosis of the simple cases and 100% (30/30) concordance in therapy suggestions for the same cases. For cases with PAP therapy intolerance, ChatGPT-4o and the sleep physician showed 70% (7/10) concordance in diagnosis and 44% (22/50) concordance in therapy suggestions. CONCLUSION AND SIGNIFICANCE: Precise prompting improves the output of ChatGPT-4o and yields sleep physician-like polysomnographic result interpretation. Although ChatGPT shows some shortcomings in offering treatment advice, our results provide evidence for AI-assisted automation and economization of polysomnographic interpretation by LLMs.
Further research should explore data protection issues and demonstrate reproducibility with real patient data on a larger scale.
ABSTRACT
BACKGROUND: Due to heterogeneous data, the indication for elective neck dissection (END) in patients with squamous cell carcinoma of the hypopharynx and oropharynx (HPSCC and OPSCC) in stages T1/2N0 remains somewhat unclear. Therefore, in this multicenter study, we performed a detailed analysis of the metastatic behavior of HPSCC and OPSCC. MATERIAL AND METHODS: The nodal metastatic patterns of 262 HPSCC and OPSCC patients who had undergone surgery were retrospectively investigated. In addition, recurrence-free and overall survival were recorded. Furthermore, a systematic literature review on the topic was completed. RESULTS: In patients with HPSCC, a discrepancy between clinical and pathologic N status was recorded in 62.1% of patients, vs. 52.4% for p16- OPSCC and 43.6% for p16+ OPSCC. The occult metastasis rate in cT1/2cN0 primary tumors was 38.9% for HPSCC vs. 17.8% (p16- OPSCC) and 11.1% (p16+ OPSCC). Contralateral metastases occurred in 22.2% of HPSCC cases at stages cT1/2cN0, compared to only 9.1% for p16- OPSCC and 0% for p16+ OPSCC patients. Patients with p16+ OPSCC had better recurrence-free and overall survival than p16- OPSCC and HPSCC patients. A direct association between patient survival and the extent of surgical neck therapy could not be demonstrated in our patients. CONCLUSION: Patients with HPSCC are at risk for bilateral neck metastases from stage cT1/2cN0 onward, justifying bilateral END. Patients with T1/2 OPSCC present with ipsilateral occult metastases in >20% of cases; however, the risk for contralateral occult metastasis is <10%. Hence, in strictly lateralized cT1/2cN0 tumors, omission of contralateral END may be considered.
ABSTRACT
Background: Current interest surrounding large language models (LLMs) will lead to an increase in their use for medical advice. Although LLMs offer huge potential, they also pose potential misinformation hazards. Objective: This study evaluates three LLMs answering urology-themed clinical case-based questions by comparing the quality of their answers to those provided by urology consultants. Methods: Forty-five case-based questions were answered by consultants and LLMs (ChatGPT 3.5, ChatGPT 4, Bard). Answers were blindly rated by four consultants on a six-point Likert scale in the categories 'medical adequacy', 'conciseness', 'coherence' and 'comprehensibility'. Possible misinformation hazards were identified, a modified Turing test was included, and the character count was compared. Results: Consultants received higher ratings in every category. The LLMs' overall performance in the language-focused categories (coherence and comprehensibility) was relatively high, whereas their medical adequacy was significantly poorer than that of the consultants. Possible misinformation hazards were identified in 2.8% to 18.9% of answers generated by LLMs, compared with <1% of consultants' answers. LLMs showed poorer conciseness ratings and a higher character count. Among the individual LLMs, ChatGPT 4 performed best in medical accuracy (p < 0.0001) and coherence (p = 0.001), whereas Bard received the lowest scores. Generated responses were correctly attributed to their source with 98% accuracy for LLMs and 99% for consultants. Conclusions: The quality of consultant answers was superior to that of LLMs in all categories. LLM answers achieved high semantic scores; however, their lack of medical accuracy creates potential misinformation hazards from LLM 'consultations'. Further investigation of newer LLM generations is necessary.
ABSTRACT
BACKGROUND: Large language models (LLMs), such as ChatGPT (OpenAI), are increasingly used in medicine and supplement standard search engines as information sources. This leads to more "consultations" of LLMs about personal medical symptoms. OBJECTIVE: This study aims to evaluate ChatGPT's performance in answering clinical case-based questions in otorhinolaryngology (ORL) in comparison to ORL consultants' answers. METHODS: We used 41 case-based questions from established ORL study books and past German state examinations for doctors. The questions were answered by both ORL consultants and ChatGPT 3. ORL consultants rated all responses, except their own, on medical adequacy, conciseness, coherence, and comprehensibility using a 6-point Likert scale. They also identified (in a blinded setting) whether an answer was created by an ORL consultant or by ChatGPT. Additionally, the character counts were compared. Due to the rapid pace of technological development, a comparison between responses generated by ChatGPT 3 and ChatGPT 4 was included to give an insight into the evolving potential of LLMs. RESULTS: Ratings in all categories were significantly higher for ORL consultants (P<.001). Although inferior to the ORL consultants' scores, ChatGPT's scores were relatively higher in the semantic categories (conciseness, coherence, and comprehensibility) than in medical adequacy. ORL consultants correctly identified ChatGPT as the source in 98.4% (121/123) of cases. ChatGPT's answers had a significantly higher character count than those of the ORL consultants (P<.001). The comparison between responses generated by ChatGPT 3 and ChatGPT 4 showed a slight improvement in medical accuracy as well as better coherence of the answers. In contrast, neither conciseness (P=.06) nor comprehensibility (P=.08) improved significantly, despite a significant 52.5% increase in the mean character count ((1470-964)/964; P<.001).
CONCLUSIONS: While ChatGPT provided longer answers to medical problems, medical adequacy and conciseness were significantly lower compared to ORL consultants' answers. LLMs have potential as augmentative tools for medical care, but their "consultation" for medical problems carries a high risk of misinformation as their high semantic quality may mask contextual deficits.
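As a quick sanity check (not part of the original study), the 52.5% increase in mean character count reported above can be reproduced from the stated means of 964 (ChatGPT 3) and 1470 (ChatGPT 4) characters:

```python
# Mean character counts as reported in the abstract above
mean_chars_v3 = 964
mean_chars_v4 = 1470

# Relative increase = (new - old) / old
increase = (mean_chars_v4 - mean_chars_v3) / mean_chars_v3
print(f"{increase:.1%}")  # 52.5%
```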
ABSTRACT
Toxicity tests in rodents remain controversial with regard to their ethical justifiability. The chick embryo chorioallantoic membrane (CAM) assay may offer a simple and inexpensive alternative: it is easy to perform, involves low bureaucratic hurdles, and allows the application of a broad variety of analytical methods in nanotoxicological research. We evaluated the CAM assay as a methodology for the determination of nanotoxicity. To this end, we calculated the median lethal dose (LD50) and performed in vivo microscopy and immunohistochemistry to identify organ-specific accumulation profiles, potential organ damage, and the kinetics of the in vivo circulation of the nanoparticles. Zinc oxide nanoparticles were injected intravascularly on day 10 of egg development and showed an LD50 of 17.5 µM (1.4 µg/mL egg content). In comparison, the LD50 of equivalent amounts of Zn2+ was 4.6 µM (0.6 µg/mL egg content). Silica-encapsulated ZnO@SiO2 nanoparticles conjugated with fluorescein circulated in the bloodstream for at least 24 h. Particles accumulated mostly in the liver and kidney. In immunohistochemical staining, organ damage was detected only in liver tissue after intravascular injection of zinc oxide nanoparticles at very high concentrations. Zinc oxide nanoparticles showed a pharmacokinetic profile different from that of Zn2+ ions. In conclusion, the CAM assay has proven to be a promising methodology for evaluating nanotoxicity and for assessing the in vivo accumulation profiles of nanoparticles. These findings may qualify the methodology for risk assessment of innovative nanotherapeutics in the future.
Subject(s)
Nanoparticles, Zinc Oxide, Animals, Biological Assay, Chick Embryo, Chorioallantoic Membrane, Nanoparticles/toxicity, Silicon Dioxide
ABSTRACT
The chorioallantoic membrane (CAM) assay is an established model for in vivo tumor research. In contrast to rodent xenograft models, the CAM assay does not require the breeding of immunodeficient strains, owing to the chick embryo's native immunodeficiency. This allows xenografts to grow on the non-innervated CAM without pain or impairment for the embryo. Given multidirectional tumor growth, the limited ability to monitor tumor size is the main methodological limitation of the CAM assay for tumor research. Enclosure of the tumor by the radiopaque eggshell and the small structural size allow monitoring only from above and challenge established imaging techniques. We report the suitability of ultrasonography for the repeated visualization of tumor growth and vascularization in the CAM assay. After tumor ingrowth, ultrasonography was performed repeatedly in ovo using a commercial ultrasonographic scanner. Finally, the tumor was excised and analyzed histologically. Tumor growth and angiogenesis were successfully monitored, and the ultrasonographic imaging findings correlated significantly with the results of the histological analysis. Ultrasonography is cost-efficient and widely available. Tumor imaging in ovo enables longitudinal monitoring of tumor development while allowing high quantitative output thanks to the CAM assay's simple and inexpensive methodology. Thus, this methodological novelty improves reproducibility in the field of in vivo tumor experimentation, emphasizing the CAM assay as an alternative to rodent xenograft models.