2.
Ann Diagn Pathol ; 73: 152359, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38972166

ABSTRACT

This study aimed to evaluate the performance of a customized version of the Chat Generative Pre-Trained Transformer (ChatGPT), based on GPT-4, against pathology residents in providing microscopic descriptions and diagnosing diseases from histopathological images. A dataset of representative photomicrographs from 70 diseases across 14 organ systems was analyzed by the customized GPT-4 and by pathology residents. Two pathologists independently evaluated the microscopic descriptions and diagnoses using a predefined scoring system (0-4 for microscopic descriptions and 0-2 for pathological diagnoses), with higher scores indicating greater accuracy. Microscopic descriptions that received perfect scores, meaning they included all relevant keywords and findings, were then presented to the standard version of ChatGPT to assess its diagnostic capabilities based on those descriptions. GPT-4 showed consistent microscopic description and diagnosis scores across five rounds, achieving median scores of 50% and 48.6%, respectively. Its performance, however, remained inferior to that of junior and senior pathology residents (description scores of 73.9% and 93.9%, and diagnosis scores of 63.9% and 87.9%, respectively). When the standard ChatGPT was given the residents' microscopic descriptions, it correctly diagnosed 35 cases (87.5%) from junior residents' descriptions and 44 (68.8%) from senior residents' descriptions, provided that those descriptions contained the relevant keywords and findings. While GPT-4 can accurately interpret some histopathological images, its overall performance is currently inferior to that of pathology residents. Nevertheless, ChatGPT's ability to interpret and diagnose diseases accurately from resident-provided descriptions suggests that this technology could serve as a valuable support tool in pathology diagnostics.
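
To make the percentage figures concrete: the rubric scores (0-4 and 0-2) appear to be reported as percentages of the maximum possible score. The sketch below is an illustration only, using hypothetical score lists rather than the study's data.

```python
import statistics

def percent_of_max(scores, max_score):
    """Convert raw rubric scores to percentages of the maximum possible score."""
    return [100 * s / max_score for s in scores]

# Hypothetical scores for illustration; the study scored microscopic
# descriptions 0-4 and pathological diagnoses 0-2.
description_scores = [2, 3, 1, 2, 4, 0, 2]  # out of 4
diagnosis_scores = [1, 2, 0, 1, 1, 2, 1]    # out of 2

print(statistics.median(percent_of_max(description_scores, 4)))  # 50.0
print(statistics.median(percent_of_max(diagnosis_scores, 2)))    # 50.0
```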

3.
Am J Clin Pathol ; 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39030695

ABSTRACT

OBJECTIVES: This research aimed to evaluate the effectiveness of ChatGPT in accurately diagnosing hepatobiliary tumors from histopathologic images. METHODS: The study compared the diagnostic accuracies of the GPT-4 model when given the same set of images with 2 different input prompts. The first prompt, the morphologic approach, was designed to mimic pathologists' approach to analyzing tissue morphology. The second prompt omitted this morphologic analysis feature. Diagnostic accuracy and consistency were analyzed. RESULTS: A total of 120 photomicrographs were used, comprising 60 images each of hepatobiliary tumors and nonneoplastic liver tissue. The findings revealed that the morphologic approach significantly enhanced the diagnostic accuracy and consistency of the artificial intelligence (AI). This approach was notably more accurate in identifying hepatocellular carcinoma (mean accuracy: 62.0% vs 27.3%), bile duct adenoma (10.7% vs 3.3%), and cholangiocarcinoma (68.7% vs 16.0%), as well as in distinguishing nonneoplastic liver tissue (77.3% vs 37.5%) (all P ≤ .01). It also demonstrated higher diagnostic consistency than the prompt without morphologic analysis (κ: 0.46 vs 0.27). CONCLUSIONS: This research emphasizes the importance of incorporating pathologists' diagnostic approaches into AI to enhance accuracy and consistency in medical diagnostics. It showcases the promise of AI in histopathology when it replicates expert diagnostic processes.
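
For readers unfamiliar with the κ statistic cited above, Cohen's kappa measures agreement corrected for chance. The following is a minimal sketch, not the study's code, using hypothetical diagnoses from two runs of a model on the same images.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical diagnoses from two repeated runs (HCC = hepatocellular carcinoma,
# CCA = cholangiocarcinoma, BDA = bile duct adenoma)
run_1 = ["HCC", "CCA", "BDA", "normal", "HCC", "normal"]
run_2 = ["HCC", "normal", "BDA", "normal", "CCA", "normal"]
print(round(cohens_kappa(run_1, run_2), 2))  # 0.54 for this toy example
```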

4.
Am J Clin Pathol ; 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39076014

ABSTRACT

OBJECTIVES: We sought to investigate the adoption and perception of large language model (LLM) applications among pathologists. METHODS: A cross-sectional survey was conducted, gathering data from pathologists on their usage and views concerning LLM tools. The survey, distributed globally through various digital platforms, included quantitative and qualitative questions. Patterns in the respondents' adoption of and perspectives on these artificial intelligence tools were analyzed. RESULTS: Of 215 respondents, 100 (46.5%) reported using LLMs, particularly ChatGPT (OpenAI), for professional purposes, predominantly for information retrieval, proofreading, academic writing, and drafting pathology reports, highlighting a significant time-saving benefit. Academic pathologists demonstrated a greater understanding of LLMs than their peers. Although chatbots sometimes provided incorrect general-domain information, they were considered moderately proficient in pathology-specific knowledge. The technology was mainly used for drafting educational materials and programming tasks. The most sought-after LLM feature was image analysis capability. Participants expressed concerns about information accuracy, privacy, and the need for regulatory approval. CONCLUSIONS: Large language model applications are gaining notable acceptance among pathologists, with nearly half of respondents indicating adoption less than a year after the tools' introduction to the market. Respondents see the benefits but remain concerned about these tools' reliability, ethical implications, and security.

5.
Am J Clin Pathol ; 2024 May 25.
Article in English | MEDLINE | ID: mdl-38795049

ABSTRACT

OBJECTIVES: To evaluate the effectiveness of ChatGPT 4 in generating multiple-choice questions (MCQs) with explanations for pathology board examinations, specifically for digestive system pathology. METHODS: A customized ChatGPT 4 model was developed for MCQ and explanation generation. Expert pathologists evaluated content accuracy and relevance. The MCQs were then administered to pathology residents, followed by an analysis of question difficulty, accuracy, item discrimination, and internal consistency. RESULTS: The customized ChatGPT 4 generated 80 MCQs covering various gastrointestinal and hepatobiliary topics. While the MCQs demonstrated moderate to high agreement on evaluation parameters such as content accuracy, clinical relevance, and overall quality, there were issues with cognitive level and distractor quality. The explanations were generally acceptable. Nine residents with a median experience of 1 year took the test; the average score was 57.4 of 80 (71.8%). Pairwise comparisons revealed a significant difference in performance between each year group (P < .01). Test analysis showed moderate difficulty, effective item discrimination (index = 0.15), and good internal consistency (Cronbach's α = 0.74). CONCLUSIONS: ChatGPT 4 demonstrated significant potential as a supplementary educational tool in medical education, especially for generating MCQs with explanations similar to those seen in board examinations. While the artificial intelligence-generated content was of high quality, it still required refinement and expert review.
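
For context on the psychometric measures reported above, the item discrimination index and Cronbach's α can both be derived from a residents-by-items matrix of correct/incorrect responses. The sketch below is illustrative only, with made-up answers rather than the study's data, and uses the common upper-minus-lower definition of discrimination, which may differ from the exact index the authors used.

```python
def cronbach_alpha(responses):
    """Cronbach's alpha (internal consistency) from a residents x items 0/1 matrix."""
    n_items = len(responses[0])
    n_resp = len(responses)
    totals = [sum(row) for row in responses]
    item_vars = []
    for j in range(n_items):
        col = [row[j] for row in responses]
        mean = sum(col) / n_resp
        item_vars.append(sum((x - mean) ** 2 for x in col) / (n_resp - 1))
    total_mean = sum(totals) / n_resp
    total_var = sum((t - total_mean) ** 2 for t in totals) / (n_resp - 1)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

def discrimination_index(responses, item):
    """Proportion correct in the top half of total scorers minus the bottom half."""
    ranked = sorted(responses, key=sum, reverse=True)
    half = len(ranked) // 2
    upper = sum(row[item] for row in ranked[:half]) / half
    lower = sum(row[item] for row in ranked[-half:]) / half
    return upper - lower

# Hypothetical 0/1 answers: 4 residents x 3 items
answers = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(answers), 2))           # 0.6
print(round(discrimination_index(answers, 0), 2))  # 1.0
```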

6.
Am J Clin Pathol ; 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38619043

ABSTRACT

OBJECTIVES: To evaluate the accuracy of ChatGPT and Bard in answering pathology examination questions that require image interpretation. METHODS: The study evaluated the performance of ChatGPT-4 and Bard on 86 multiple-choice questions, of which 17 (19.8%) covered general pathology and 69 (80.2%) systemic pathology. Of these, 62 (72.1%) included microscopic images, and 57 (66.3%) were first-order questions focused on diagnosing the disease. The authors presented these artificial intelligence (AI) tools with the questions, both with and without clinical context, and assessed their answers against a reference standard set by pathologists. RESULTS: ChatGPT-4 achieved 100% accuracy (n = 86) on questions with clinical context, surpassing Bard's 87.2% (n = 75). Without context, the accuracy of both AI tools declined significantly, with ChatGPT-4 at 52.3% (n = 45) and Bard at 38.4% (n = 33). ChatGPT-4 consistently outperformed Bard across categories, particularly in systemic pathology and first-order questions. A notable issue was Bard's tendency to "hallucinate," providing plausible but incorrect answers, especially without clinical context. CONCLUSIONS: This study demonstrated the potential of ChatGPT and Bard in pathology education, stressing the importance of clinical context for accurate AI interpretation of pathology images. It underlined the need for careful AI integration in medical education.

7.
Ann Diagn Pathol ; 70: 152284, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38422806

ABSTRACT

OBJECTIVES: This study aimed to evaluate the accuracy and interobserver reliability of diagnosing and subtyping gastric intestinal metaplasia (IM) among general pathologists and pathology residents at a university hospital in Thailand, focusing on the challenges that histopathologic evaluation of gastric IM poses for less experienced practitioners. METHODS: The study analyzed 44 non-neoplastic gastric biopsies, using a consensus diagnosis by gastrointestinal pathologists as the reference standard. Participants included 6 general pathologists and 9 pathology residents, who assessed gastric IM and categorized its subtype (complete, incomplete, or mixed) on digital slides. After the initial evaluations and feedback, participants reviewed specific images of gastric IM agreed on by the experts. Following a one-month washout period, the slides were reevaluated. RESULTS: Diagnostic accuracy, interobserver reliability, and time to diagnosis improved after training, with general pathologists showing higher accuracy than residents (median accuracy of gastric IM detection: 100% vs. 97.7%). More years of experience were associated with higher IM detection accuracy (P < .05). However, the overall median accuracy for diagnosing incomplete IM remained lower than that for complete IM (86.4% vs. 97.7%). After training, diagnostic errors persisted in 6 of 44 specimens (13.6%), with these errors made by over 40% of participants. The errors involved overlooking IM in 5 slides with incomplete IM and 1 with complete IM, all showing only a subtle presence of IM. CONCLUSIONS: The study highlights the diagnostic challenges of identifying incomplete gastric IM, showing notable discrepancies in accuracy and interobserver agreement. It underscores the need for better diagnostic protocols and training to enhance detection and management outcomes.


Subject(s)
Metaplasia , Observer Variation , Pathologists , Humans , Metaplasia/pathology , Biopsy/methods , Reproducibility of Results , Internship and Residency , Stomach/pathology , Thailand , Pathology, Clinical/methods , Pathology, Clinical/education , Female , Diagnostic Errors/statistics & numerical data , Diagnostic Errors/prevention & control , Stomach Neoplasms/pathology , Stomach Neoplasms/diagnosis , Male
8.
J Clin Pathol ; 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38199797

ABSTRACT

AIMS: To evaluate the accuracy of the Chat Generative Pre-trained Transformer (ChatGPT) powered by GPT-4 in the detection and classification of colorectal adenomas on histopathological images, using the diagnostic consensus of pathologists as the reference standard. METHODS: The study used 100 colorectal polyp photomicrographs, comprising equal numbers of adenomas and non-adenomas, classified by two pathologists. These images were analysed once by the standard GPT-4 in October 2023 and 20 times by a custom GPT-4 in December 2023. GPT-4's responses were compared against the reference standard using statistical measures to evaluate its proficiency in histopathological diagnosis, and the pathologists further assessed the model's descriptive accuracy. RESULTS: GPT-4 demonstrated a median sensitivity of 74% and specificity of 36% for adenoma detection. The median accuracy of polyp classification varied, ranging from 16% for non-specific changes to 36% for tubular adenomas. Its diagnostic consistency was low, with kappa values ranging from 0.06 to 0.11, indicating only poor to slight agreement. All of the microscopic descriptions corresponded with their diagnoses. GPT-4 also commented on the limitations of its diagnoses (eg, that slide diagnosis is best done by pathologists, that single-image diagnostic conclusions are inadequate, and that clinical data and higher-magnification views are needed). CONCLUSIONS: GPT-4 showed high sensitivity but low specificity in detecting adenomas, varied accuracy in polyp classification and low diagnostic consistency. This artificial intelligence tool acknowledged its diagnostic limitations, emphasising the need for a pathologist's expertise and additional clinical context.
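
For reference, the sensitivity and specificity figures above follow the standard confusion-matrix definitions. Below is a minimal sketch with hypothetical labels, not the study's data.

```python
def sensitivity_specificity(truths, predictions, positive="adenoma"):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)."""
    pairs = list(zip(truths, predictions))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical reference diagnoses and model outputs for illustration only
truth = ["adenoma", "adenoma", "adenoma", "non-adenoma", "non-adenoma", "non-adenoma"]
pred = ["adenoma", "adenoma", "non-adenoma", "adenoma", "adenoma", "non-adenoma"]
sens, spec = sensitivity_specificity(truth, pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # 0.67 and 0.33
```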
