Results 1 - 5 of 5
1.
Br J Ophthalmol ; 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789133

ABSTRACT

PURPOSE: To evaluate the capabilities and limitations of a GPT-4V(ision)-based chatbot in interpreting ocular multimodal images. METHODS: We developed a digital ophthalmologist app using GPT-4V and evaluated its performance with a dataset (60 images, 60 ophthalmic conditions, 6 modalities) that included slit-lamp, scanning laser ophthalmoscopy, fundus photography of the posterior pole (FPP), optical coherence tomography, fundus fluorescein angiography and ocular ultrasound images. The chatbot was tested with ten open-ended questions per image, covering examination identification, lesion detection, diagnosis and decision support. The responses were manually assessed for accuracy, usability, safety and diagnostic repeatability. Auto-evaluation was performed using sentence similarity and GPT-4-based auto-evaluation. RESULTS: Of 600 responses, 30.6% were accurate, 21.5% were highly usable and 55.6% were rated as causing no harm. GPT-4V performed best with slit-lamp images, with 42.0%, 38.5% and 68.5% of responses rated accurate, highly usable and harmless, respectively. Performance was weakest on FPP images, with only 13.7%, 3.7% and 38.5% in the same categories. GPT-4V correctly identified 95.6% of the imaging modalities and showed varying accuracy in lesion identification (25.6%), diagnosis (16.1%) and decision support (24.0%). The overall repeatability of GPT-4V in diagnosing ocular images was 63.3% (38/60). The overall sentence similarity between responses generated by GPT-4V and human answers was 55.5%, with Spearman correlations of 0.569 for accuracy and 0.576 for usability. CONCLUSION: GPT-4V is not yet suitable for clinical decision-making in ophthalmology. Our study serves as a benchmark for enhancing ophthalmic multimodal models.
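The auto-evaluation described above compares sentence similarity scores against the manual ratings via Spearman correlation. A minimal sketch of that kind of computation, assuming the sentence-transformers and SciPy packages and an illustrative embedding model (the abstract does not specify the implementation); the answers and ratings are invented:

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

# Toy data standing in for chatbot responses, reference answers and manual ratings.
generated_answers = [
    "This is a slit-lamp photograph showing corneal edema.",
    "The fundus photograph shows drusen at the posterior pole.",
    "Ocular ultrasound demonstrating vitreous hemorrhage.",
]
human_answers = [
    "Slit-lamp image demonstrating corneal edema.",
    "Fundus photograph with macular drusen.",
    "B-scan ultrasound showing vitreous hemorrhage.",
]
manual_accuracy_scores = [2, 1, 2]  # ordinal ratings assigned by graders

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed general-purpose embedding model

def sentence_similarity(candidate: str, reference: str) -> float:
    """Cosine similarity between sentence embeddings of two answers."""
    emb = model.encode([candidate, reference], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

similarities = [sentence_similarity(g, r) for g, r in zip(generated_answers, human_answers)]
rho, _ = spearmanr(similarities, manual_accuracy_scores)
print(f"mean similarity = {sum(similarities) / len(similarities):.3f}, Spearman rho = {rho:.3f}")
```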

2.
Br J Ophthalmol ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38508675

ABSTRACT

BACKGROUND: Indocyanine green angiography (ICGA) is vital for diagnosing chorioretinal diseases, but its interpretation and the related patient communication require extensive expertise and time-consuming effort. We aimed to develop a bilingual ICGA report generation and question-answering (QA) system. METHODS: Our dataset comprised 213,129 ICGA images from 2,919 participants. The system comprised two stages: image-text alignment for report generation by a multimodal transformer architecture, and large language model (LLM)-based QA with ICGA text reports and human-input questions. Performance was assessed using both objective metrics (including Bilingual Evaluation Understudy (BLEU), Consensus-based Image Description Evaluation (CIDEr), Recall-Oriented Understudy for Gisting Evaluation-Longest Common Subsequence (ROUGE-L), Semantic Propositional Image Caption Evaluation (SPICE), accuracy, sensitivity, specificity, precision and F1 score) and subjective evaluation by three experienced ophthalmologists using 5-point scales (5 indicating high quality). RESULTS: We produced 8,757 ICGA reports covering 39 disease-related conditions after bilingual translation (66.7% English, 33.3% Chinese). The ICGA-GPT model's report generation performance was evaluated with BLEU scores (1-4) of 0.48, 0.44, 0.40 and 0.37; CIDEr of 0.82; ROUGE-L of 0.41 and SPICE of 0.18. For disease-based metrics, the average specificity, accuracy, precision, sensitivity and F1 score were 0.98, 0.94, 0.70, 0.68 and 0.64, respectively. In assessing the quality of 50 images (100 reports), three ophthalmologists achieved substantial agreement (kappa=0.723 for completeness, kappa=0.738 for accuracy), with scores ranging from 3.20 to 3.55. In an interactive QA scenario involving 100 generated answers, the ophthalmologists gave scores of 4.24, 4.22 and 4.10, with good consistency (kappa=0.779). CONCLUSION: This study introduces the ICGA-GPT model for report generation and interactive QA, underscoring the potential of LLMs in assisting with automated ICGA image interpretation.
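For context on the report-generation metrics named above, here is an illustrative sketch (not the authors' code) that computes corpus BLEU-1 to BLEU-4 with NLTK and ROUGE-L with the rouge-score package, on an invented report/reference pair:

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction
from rouge_score import rouge_scorer

# Invented example: one generated ICGA report sentence and its reference.
reference = "late phase shows hypofluorescent plaques consistent with central serous chorioretinopathy"
candidate = "late phase reveals hypofluorescent plaques suggesting central serous chorioretinopathy"

refs = [[reference.split()]]   # one list of reference token lists per candidate
cands = [candidate.split()]
smooth = SmoothingFunction().method1

for n in range(1, 5):
    weights = tuple(1.0 / n for _ in range(n)) + (0.0,) * (4 - n)
    bleu = corpus_bleu(refs, cands, weights=weights, smoothing_function=smooth)
    print(f"BLEU-{n}: {bleu:.3f}")

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score(reference, candidate)["rougeL"].fmeasure
print(f"ROUGE-L F1: {rouge_l:.3f}")
```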

3.
NPJ Digit Med ; 7(1): 111, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702471

ABSTRACT

Fundus fluorescein angiography (FFA) is a crucial diagnostic tool for chorioretinal diseases, but its interpretation requires significant expertise and time. Prior studies have used artificial intelligence (AI)-based systems to assist FFA interpretation, but these systems lack user interaction and comprehensive evaluation by ophthalmologists. Here, we used large language models (LLMs) to develop an automated interpretation pipeline for both report generation and medical question-answering (QA) for FFA images. The pipeline comprises two parts: an image-text alignment module (Bootstrapping Language-Image Pre-training) for report generation and an LLM (Llama 2) for interactive QA. The model was developed using 654,343 FFA images with 9,392 reports. It was evaluated both automatically, using language-based and classification-based metrics, and manually by three experienced ophthalmologists. The automatic evaluation of the generated reports showed that the system can produce coherent and comprehensible free-text reports, achieving a BERTScore of 0.70 and F1 scores ranging from 0.64 to 0.82 for detecting the top-5 retinal conditions. The manual evaluation revealed acceptable accuracy (68.3%, kappa 0.746) and completeness (62.3%, kappa 0.739) of the generated reports. The generated free-form answers were evaluated manually, with the majority meeting the ophthalmologists' criteria (error-free: 70.7%, complete: 84.0%, harmless: 93.7%, satisfied: 65.3%, kappa: 0.762-0.834). This study introduces a framework that combines multimodal transformers and LLMs, enhancing ophthalmic image interpretation and facilitating interactive communication during medical consultations.
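A minimal sketch of the BERTScore metric reported above for the generated FFA reports, assuming the bert-score package; the candidate and reference sentences are invented:

```python
from bert_score import score

# Invented generated report and reference report for a single FFA study.
candidates = ["Late-phase leakage in the macula consistent with choroidal neovascularization."]
references = ["Fluorescein leakage in the late phase of the macula suggests choroidal neovascularization."]

# Returns per-sentence precision, recall and F1 tensors.
precision, recall, f1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1: {f1.mean().item():.3f}")
```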

4.
iScience ; 27(7): 110021, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39055931

ABSTRACT

Existing automatic analysis of fundus fluorescein angiography (FFA) images faces limitations, including reliance on a predetermined set of possible image classifications and confinement to text-based question-answering (QA) approaches. This study addresses these limitations by developing an end-to-end unified model that uses synthetic data to train a visual question-answering model for FFA images. To achieve this, we employed ChatGPT to generate 4,110,581 QA pairs for a large FFA dataset comprising 654,343 FFA images from 9,392 participants. We then fine-tuned the Bootstrapping Language-Image Pre-training (BLIP) framework to handle vision and language simultaneously. The performance of the fine-tuned model (ChatFFA) was evaluated through automated and manual assessments, as well as case studies based on an external validation set, with satisfactory results. In conclusion, our ChatFFA system paves the way for more efficient and feasible medical imaging analysis by leveraging generative large language models.
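As a rough illustration of BLIP-style visual question answering of the kind ChatFFA performs, the sketch below runs a generic public BLIP VQA checkpoint from Hugging Face transformers (not the authors' fine-tuned model); the checkpoint name, image path and question are placeholder assumptions:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

# Generic public VQA checkpoint, standing in for a model fine-tuned on FFA data.
processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

image = Image.open("example_ffa.png").convert("RGB")  # placeholder FFA frame
question = "Is there leakage in the late phase?"      # example clinical question

inputs = processor(images=image, text=question, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```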

5.
JMIR Public Health Surveill ; 10: e47453, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38315527

ABSTRACT

BACKGROUND: Cough is a common symptom during and after COVID-19 infection; however, few studies have described the cough profile of COVID-19. OBJECTIVE: The aim of this study was to investigate the prevalence, severity, and risk factors associated with severe and persistent cough in individuals with COVID-19 during the latest wave of the Omicron variant in China. METHODS: In this nationwide cross-sectional study, we collected information on the characteristics of cough from individuals infected with the SARS-CoV-2 Omicron variant using an online questionnaire distributed between December 31, 2022, and January 11, 2023. RESULTS: There were 11,718 nonhospitalized respondents (n=7,978, 68.1% female), with a median age of 37 (IQR 30-47) years, who responded at a median of 16 (IQR 12-20) days from infection onset to the time of the survey. Cough was the most common symptom, occurring in 91.7% of participants, followed by fever, fatigue, and nasal congestion (68.8%-87.4%). The median cough visual analog scale (VAS) score was 70 (IQR 50-80) mm. Being female (odds ratio [OR] 1.31, 95% CI 1.20-1.43), having a COVID-19 vaccination history (OR 1.71, 95% CI 1.37-2.12), current smoking (OR 0.48, 95% CI 0.41-0.58), chronic cough (OR 2.04, 95% CI 1.69-2.45), coronary heart disease (OR 1.71, 95% CI 1.17-2.52), asthma (OR 1.22, 95% CI 1.02-1.46), and gastroesophageal reflux disease (GERD; OR 1.21, 95% CI 1.01-1.45) were independently associated with severe cough (VAS>70, 37.4%). Among all respondents, 35.0% reported a productive cough, which was associated with being female (OR 1.44, 95% CI 1.31-1.57), asthma (OR 1.84, 95% CI 1.52-2.22), chronic cough (OR 1.44, 95% CI 1.19-1.74), and GERD (OR 1.22, 95% CI 1.01-1.47). Persistent cough (>3 weeks) occurred in 13.0% of individuals and was associated with diabetes (OR 2.24, 95% CI 1.30-3.85), asthma (OR 1.70, 95% CI 1.11-2.62), and chronic cough (OR 1.97, 95% CI 1.32-2.94). CONCLUSIONS: Cough is the most common symptom in nonhospitalized individuals with Omicron SARS-CoV-2 infection. Being female, asthma, chronic cough, GERD, coronary heart disease, diabetes, and a COVID-19 vaccination history emerged as independent factors associated with severe, productive, and persistent cough.
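The adjusted odds ratios and 95% CIs quoted above are the kind of output produced by multivariable logistic regression; a small sketch with synthetic data (not the study's dataset), using statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "female": rng.integers(0, 2, n),
    "asthma": rng.integers(0, 2, n),
    "chronic_cough": rng.integers(0, 2, n),
})
# Synthetic outcome: severe cough (VAS > 70), loosely driven by the covariates.
logit_p = -1.0 + 0.3 * df["female"] + 0.2 * df["asthma"] + 0.7 * df["chronic_cough"]
df["severe_cough"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(df[["female", "asthma", "chronic_cough"]])
fit = sm.Logit(df["severe_cough"], X).fit(disp=False)

ci = fit.conf_int()
summary = pd.DataFrame({
    "OR": np.exp(fit.params),   # exponentiated coefficients are adjusted odds ratios
    "2.5%": np.exp(ci[0]),
    "97.5%": np.exp(ci[1]),
})
print(summary.round(2))
```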


Subjects
Asthma , COVID-19 , Coronary Disease , Diabetes Mellitus , Gastroesophageal Reflux , Female , Humans , Infant , Male , SARS-CoV-2 , Cross-Sectional Studies , COVID-19 Vaccines , COVID-19/complications , COVID-19/epidemiology , Cough/epidemiology , Risk Factors , Chronic Cough , China/epidemiology , Asthma/complications , Asthma/epidemiology