Búsqueda | Biblioteca Virtual en Salud

1.

AI interprets the Central Dogma and Genetic Code.

Ille, Alexander M; Mathews, Michael B.

Trends Biochem Sci ; 48(12): 1014-1018, 2023 12.

Artículo en Inglés | MEDLINE | ID: mdl-37833131

RESUMEN

Generative artificial intelligence (AI) is a burgeoning field with widespread applications, including in science. Here, we explore two paradigms that provide insight into the capabilities and limitations of Chat Generative Pre-trained Transformer (ChatGPT): its ability to (i) define a core biological concept (the Central Dogma of molecular biology); and (ii) interpret the genetic code.

Asunto(s)

Inteligencia Artificial , Código Genético , Biología Molecular

2.

Uncovering the semantics of concepts using GPT-4.

Le Mens, Gaël; Kovács, Balázs; Hannan, Michael T; Pros, Guillem.

Proc Natl Acad Sci U S A ; 120(49): e2309350120, 2023 Dec 05.

Artículo en Inglés | MEDLINE | ID: mdl-38032930

RESUMEN

The ability of recent Large Language Models (LLMs) such as GPT-3.5 and GPT-4 to generate human-like texts suggests that social scientists could use these LLMs to construct measures of semantic similarity that match human judgment. In this article, we provide an empirical test of this intuition. We use GPT-4 to construct a measure of typicality-the similarity of a text document to a concept. We evaluate its performance against other model-based typicality measures in terms of the correlation with human typicality ratings. We conduct this comparative analysis in two domains: the typicality of books in literary genres (using an existing dataset of book descriptions) and the typicality of tweets authored by US Congress members in the Democratic and Republican parties (using a novel dataset). The typicality measure produced with GPT-4 meets or exceeds the performance of the previous state-of-the art typicality measure we introduced in a recent paper [G. Le Mens, B. Kovács, M. T. Hannan, G. Pros Rius, Sociol. Sci. 2023, 82-117 (2023)]. It accomplishes this without any training with the research data (it is zero-shot learning). This is a breakthrough because the previous state-of-the-art measure required fine-tuning an LLM on hundreds of thousands of text documents to achieve its performance.

3.

ChatGPT outperforms crowd workers for text-annotation tasks.

Gilardi, Fabrizio; Alizadeh, Meysam; Kubli, Maël.

Proc Natl Acad Sci U S A ; 120(30): e2305016120, 2023 Jul 25.

Artículo en Inglés | MEDLINE | ID: mdl-37463210

RESUMEN

Many NLP applications require manual text annotations for a variety of tasks, notably to train classifiers or evaluate the performance of unsupervised models. Depending on the size and degree of complexity, the tasks may be conducted by crowd workers on platforms such as MTurk as well as trained annotators, such as research assistants. Using four samples of tweets and news articles (n = 6,183), we show that ChatGPT outperforms crowd workers for several annotation tasks, including relevance, stance, topics, and frame detection. Across the four datasets, the zero-shot accuracy of ChatGPT exceeds that of crowd workers by about 25 percentage points on average, while ChatGPT's intercoder agreement exceeds that of both crowd workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003-about thirty times cheaper than MTurk. These results demonstrate the potential of large language models to drastically increase the efficiency of text classification.

4.

Large language models show human-like content biases in transmission chain experiments.

Acerbi, Alberto; Stubbersfield, Joseph M.

Proc Natl Acad Sci U S A ; 120(44): e2313790120, 2023 Oct 31.

Artículo en Inglés | MEDLINE | ID: mdl-37883432

RESUMEN

As the use of large language models (LLMs) grows, it is important to examine whether they exhibit biases in their output. Research in cultural evolution, using transmission chain experiments, demonstrates that humans have biases to attend to, remember, and transmit some types of content over others. Here, in five preregistered experiments using material from previous studies with human participants, we use the same, transmission chain-like methodology, and find that the LLM ChatGPT-3 shows biases analogous to humans for content that is gender-stereotype-consistent, social, negative, threat-related, and biologically counterintuitive, over other content. The presence of these biases in LLM output suggests that such content is widespread in its training data and could have consequential downstream effects, by magnifying preexisting human tendencies for cognitively appealing and not necessarily informative, or valuable, content.

Asunto(s)

Evolución Cultural , Lenguaje , Humanos , Recuerdo Mental , Sesgo , Teoría Ética

5.

Opportunities and challenges for ChatGPT and large language models in biomedicine and health.

Tian, Shubo; Jin, Qiao; Yeganova, Lana; Lai, Po-Ting; Zhu, Qingqing; Chen, Xiuying; Yang, Yifan; Chen, Qingyu; Kim, Won; Comeau, Donald C; Islamaj, Rezarta; Kapoor, Aadit; Gao, Xin; Lu, Zhiyong.

Brief Bioinform ; 25(1)2023 11 22.

Artículo en Inglés | MEDLINE | ID: mdl-38168838

RESUMEN

ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of biomedical domain presents unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in its generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.

Asunto(s)

Almacenamiento y Recuperación de la Información , Lenguaje , Humanos , Privacidad , Investigadores

6.

Comprehensive evaluation of molecule property prediction with ChatGPT.

Cai, Xibao; Lai, Houtim; Wang, Xing; Wang, Longyue; Liu, Wei; Wang, Yijun; Wang, Zixu; Cao, Dongsheng; Zeng, Xiangxiang.

Methods ; 222: 133-141, 2024 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-38242382

RESUMEN

The versatility of ChatGPT in performing a diverse range of tasks has elicited considerable interest on its potential applications within professional fields. Taking drug discovery as a testbed, this paper provides a comprehensive evaluation of ChatGPT's ability on molecule property prediction. The study focuses on three aspects: 1) Effects of different prompt settings, where we investigate the impact of varying prompts on the prediction outcomes of ChatGPT; 2) Comprehensive evaluation on molecule property prediction, where we conduct a comprehensive evaluation on 53 ADMET-related endpoints; 3) Analysis of ChatGPT's potential and limitations, where we make comparisons with models tailored for molecule property prediction, thus gaining a more accurate understanding of ChatGPT's capabilities and limitations in this area. Through comprehensive evaluation, we find that 1) With appropriate prompt settings, ChatGPT can attain satisfactory prediction outcomes that are competitive with specialized models designed for those tasks. 2) Prompt settings significantly affect ChatGPT's performance. Among all prompt settings, the strategy of selecting examples in few-shot has the greatest impact on results. Scaffold sampling greatly outperforms random sampling. 3) The capacity of ChatGPT to accomplish high-precision predictions is significantly influenced by the quality of examples provided, which may constrain its practical applicability in real-world scenarios. This work highlights ChatGPT's potential and limitations on molecule property prediction, which we hope can inspire future design and evaluation of Large Language Models within scientific domains.

Asunto(s)

Descubrimiento de Drogas , Proyectos de Investigación

7.

Special considerations for the use of AI tools by PEERs as a learning and communication aid.

Arango, Maria Camila; Hincapié-Otero, María; Hardeman, Keisha; Shao, Bryanna; Starbird, Leo; Starbird, Chrystal.

J Cell Physiol ; 239(7): e31339, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38924572

RESUMEN

There is no doubt that navigating academia is a formidable challenge, particularly for those from underrepresented backgrounds who face additional barriers at every turn. In such an environment, efforts to create learning and training environments that are diverse, equitable, and inclusive can feel like an uphill battle. We believe that harnessing the power of artificial intelligence (AI) tools can help in leveling the playing field. While AI cannot supplant the need for supportive mentorship, it can serve as a vital supplement, offering guidance and assistance to those who may lack access to adequate avenues of support. Embracing AI in this context should not be stigmatized, as it may represent a vital lifeline for underrepresented individuals who often face systemic biases while forging their own paths in pursuit of success and belonging in academia. AI tools should not be gatekept from these individuals, particularly by those in positions of power and privilege within the scientific community. Instead, we argue, institutions should make a strong commitment to educating their community members on how to ethically harness these tools.

Asunto(s)

Inteligencia Artificial , Aprendizaje , Humanos , Grupo Paritario , Comunicación , Mentores

8.

Black Box Warning: Large Language Models and the Future of Infectious Diseases Consultation.

Schwartz, Ilan S; Link, Katherine E; Daneshjou, Roxana; Cortés-Penfield, Nicolás.

Clin Infect Dis ; 78(4): 860-866, 2024 Apr 10.

Artículo en Inglés | MEDLINE | ID: mdl-37971399

RESUMEN

Large language models (LLMs) are artificial intelligence systems trained by deep learning algorithms to process natural language and generate text responses to user prompts. Some approach physician performance on a range of medical challenges, leading some proponents to advocate for their potential use in clinical consultation and prompting some consternation about the future of cognitive specialties. However, LLMs currently have limitations that preclude safe clinical deployment in performing specialist consultations, including frequent confabulations, lack of contextual awareness crucial for nuanced diagnostic and treatment plans, inscrutable and unexplainable training data and methods, and propensity to recapitulate biases. Nonetheless, considering the rapid improvement in this technology, growing calls for clinical integration, and healthcare systems that chronically undervalue cognitive specialties, it is critical that infectious diseases clinicians engage with LLMs to enable informed advocacy for how they should-and shouldn't-be used to augment specialist care.

Asunto(s)

Enfermedades Transmisibles , Etiquetado de Medicamentos , Humanos , Inteligencia Artificial , Enfermedades Transmisibles/diagnóstico , Lenguaje , Derivación y Consulta

9.

Can Chatbot Artificial Intelligence Replace Infectious Diseases Physicians in the Management of Bloodstream Infections? A Prospective Cohort Study.

Maillard, Alexis; Micheli, Giulia; Lefevre, Leila; Guyonnet, Cécile; Poyart, Claire; Canouï, Etienne; Belan, Martin; Charlier, Caroline.

Clin Infect Dis ; 78(4): 825-832, 2024 Apr 10.

Artículo en Inglés | MEDLINE | ID: mdl-37823416

RESUMEN

BACKGROUND: The development of chatbot artificial intelligence (AI) has raised major questions about their use in healthcare. We assessed the quality and safety of the management suggested by Chat Generative Pre-training Transformer 4 (ChatGPT-4) in real-life practice for patients with positive blood cultures. METHODS: Over a 4-week period in a tertiary care hospital, data from consecutive infectious diseases (ID) consultations for a first positive blood culture were prospectively provided to ChatGPT-4. Data were requested to propose a comprehensive management plan (suspected/confirmed diagnosis, workup, antibiotic therapy, source control, follow-up). We compared the management plan suggested by ChatGPT-4 with the plan suggested by ID consultants based on literature and guidelines. Comparisons were performed by 2 ID physicians not involved in patient management. RESULTS: Forty-four cases with a first episode of positive blood culture were included. ChatGPT-4 provided detailed and well-written responses in all cases. AI's diagnoses were identical to those of the consultant in 26 (59%) cases. Suggested diagnostic workups were satisfactory (ie, no missing important diagnostic tests) in 35 (80%) cases; empirical antimicrobial therapies were adequate in 28 (64%) cases and harmful in 1 (2%). Source control plans were inadequate in 4 (9%) cases. Definitive antibiotic therapies were optimal in 16 (36%) patients and harmful in 2 (5%). Overall, management plans were considered optimal in only 1 patient, as satisfactory in 17 (39%), and as harmful in 7 (16%). CONCLUSIONS: The use of ChatGPT-4 without consultant input remains hazardous when seeking expert medical advice in 2023, especially for severe IDs.

Asunto(s)

Médicos , Sepsis , Humanos , Inteligencia Artificial , Estudios Prospectivos , Programas Informáticos

10.

Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study.

Iannantuono, Giovanni Maria; Bracken-Clarke, Dara; Karzai, Fatima; Choo-Wosoba, Hyoyoung; Gulley, James L; Floudas, Charalampos S.

Oncologist ; 29(5): 407-414, 2024 May 03.

Artículo en Inglés | MEDLINE | ID: mdl-38309720

RESUMEN

BACKGROUND: The capability of large language models (LLMs) to understand and generate human-readable text has prompted the investigation of their potential as educational and management tools for patients with cancer and healthcare providers. MATERIALS AND METHODS: We conducted a cross-sectional study aimed at evaluating the ability of ChatGPT-4, ChatGPT-3.5, and Google Bard to answer questions related to 4 domains of immuno-oncology (Mechanisms, Indications, Toxicities, and Prognosis). We generated 60 open-ended questions (15 for each section). Questions were manually submitted to LLMs, and responses were collected on June 30, 2023. Two reviewers evaluated the answers independently. RESULTS: ChatGPT-4 and ChatGPT-3.5 answered all questions, whereas Google Bard answered only 53.3% (Pâ<â.0001). The number of questions with reproducible answers was higher for ChatGPT-4 (95%) and ChatGPT3.5 (88.3%) than for Google Bard (50%) (Pâ<â.0001). In terms of accuracy, the number of answers deemed fully correct were 75.4%, 58.5%, and 43.8% for ChatGPT-4, ChatGPT-3.5, and Google Bard, respectively (Pâ=â.03). Furthermore, the number of responses deemed highly relevant was 71.9%, 77.4%, and 43.8% for ChatGPT-4, ChatGPT-3.5, and Google Bard, respectively (Pâ=â.04). Regarding readability, the number of highly readable was higher for ChatGPT-4 and ChatGPT-3.5 (98.1%) and (100%) compared to Google Bard (87.5%) (Pâ=â.02). CONCLUSION: ChatGPT-4 and ChatGPT-3.5 are potentially powerful tools in immuno-oncology, whereas Google Bard demonstrated relatively poorer performance. However, the risk of inaccuracy or incompleteness in the responses was evident in all 3 LLMs, highlighting the importance of expert-driven verification of the outputs returned by these technologies.

Asunto(s)

Neoplasias , Humanos , Estudios Transversales , Neoplasias/inmunología , Neoplasias/terapia , Oncología Médica/métodos , Oncología Médica/normas , Encuestas y Cuestionarios , Lenguaje , Inmunoterapia/métodos

11.

ChatGPT4 Outperforms Endoscopists for Determination of Postcolonoscopy Rescreening and Surveillance Recommendations.

Chang, Patrick W; Amini, Maziar M; Davis, Rio O; Nguyen, Denis D; Dodge, Jennifer L; Lee, Helen; Sheibani, Sarah; Phan, Jennifer; Buxbaum, James L; Sahakian, Ara B.

Clin Gastroenterol Hepatol ; 2024 May 09.

Artículo en Inglés | MEDLINE | ID: mdl-38729387

RESUMEN

BACKGROUND & AIMS: Large language models including Chat Generative Pretrained Transformers version 4 (ChatGPT4) improve access to artificial intelligence, but their impact on the clinical practice of gastroenterology is undefined. This study compared the accuracy, concordance, and reliability of ChatGPT4 colonoscopy recommendations for colorectal cancer rescreening and surveillance with contemporary guidelines and real-world gastroenterology practice. METHODS: History of present illness, colonoscopy data, and pathology reports from patients undergoing procedures at 2 large academic centers were entered into ChatGPT4 and it was queried for the next recommended colonoscopy follow-up interval. Using the McNemar test and inter-rater reliability, we compared the recommendations made by ChatGPT4 with the actual surveillance interval provided in the endoscopist's procedure report (gastroenterology practice) and the appropriate US Multisociety Task Force (USMSTF) guidance. The latter was generated for each case by an expert panel using the clinical information and guideline documents as reference. RESULTS: Text input of de-identified data into ChatGPT4 from 505 consecutive patients undergoing colonoscopy between January 1 and April 30, 2023, elicited a successful follow-up recommendation in 99.2% of the queries. ChatGPT4 recommendations were in closer agreement with the USMSTF Panel (85.7%) than gastroenterology practice recommendations with the USMSTF Panel (75.4%) (P < .001). Of the 14.3% discordant recommendations between ChatGPT4 and the USMSTF Panel, recommendations were for later screening in 26 (5.1%) and for earlier screening in 44 (8.7%) cases. The inter-rater reliability was good for ChatGPT4 vs USMSTF Panel (Fleiss κ, 0.786; 95% CI, 0.734-0.838; P < .001). CONCLUSIONS: Initial real-world results suggest that ChatGPT4 can define routine colonoscopy screening intervals accurately based on verbatim input of clinical data. Large language models have potential for clinical applications, but further training is needed for broad use.

12.

The role of an artificial intelligence model in antiretroviral therapy counselling and advice for people living with HIV.

Koh, Matthew Chung Yi; Ngiam, Jinghao Nicholas; Yong, Joy; Tambyah, Paul Anantharajah; Archuleta, Sophia.

HIV Med ; 25(4): 504-508, 2024 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-38169077

RESUMEN

OBJECTIVES: People living with HIV may find personalized access to accurate information on antiretroviral therapy (ART) challenging given the stigma and costs potentially associated with attending physical consultations. Artificial intelligence (AI) chatbots such as ChatGPT may help to lower barriers to accessing information addressing concerns around ART initiation. However, the safety and accuracy of the information provided remains to be studied. METHODS: We instructed ChatGPT to answer questions that people living with HIV frequently ask about ART, covering i) knowledge of and access to ART; ii) ART initiation, side effects, and adherence, and iii) general sexual health practices while receiving ART. We checked the accuracy of the advice against international HIV clinical practice guidelines. RESULTS: ChatGPT answered all questions accurately and comprehensively. It recognized potentially life-threatening scenarios such as abacavir hypersensitivity reaction and gave appropriate advice. However, in certain contexts, such as specific geographic locations or for pregnant individuals, the advice lacked specificity to an individual's unique circumstances and may be inadequate. Nevertheless, ChatGPT consistently re-directed the individual to seek help from a healthcare professional to obtain targeted advice. CONCLUSIONS: ChatGPT may act as a useful adjunct in the process of ART counselling for people living with HIV. Improving access to information on and knowledge about ART may improve access and adherence to ART and outcomes for people living with HIV overall.

Asunto(s)

Infecciones por VIH , Embarazo , Femenino , Humanos , Infecciones por VIH/tratamiento farmacológico , Inteligencia Artificial , Consejo , Personal de Salud

13.

Evaluating the efficacy of few-shot learning for GPT-4Vision in neurodegenerative disease histopathology: A comparative analysis with convolutional neural network model.

Ono, Daisuke; Dickson, Dennis W; Koga, Shunsuke.

Neuropathol Appl Neurobiol ; 50(4): e12997, 2024 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-39010256

RESUMEN

AIMS: Recent advances in artificial intelligence, particularly with large language models like GPT-4Vision (GPT-4V)-a derivative feature of ChatGPT-have expanded the potential for medical image interpretation. This study evaluates the accuracy of GPT-4V in image classification tasks of histopathological images and compares its performance with a traditional convolutional neural network (CNN). METHODS: We utilised 1520 images, including haematoxylin and eosin staining and tau immunohistochemistry, from patients with various neurodegenerative diseases, such as Alzheimer's disease (AD), progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD). We assessed GPT-4V's performance using multi-step prompts to determine how textual context influences image interpretation. We also employed few-shot learning to enhance improvements in GPT-4V's diagnostic performance in classifying three specific tau lesions-astrocytic plaques, neuritic plaques and tufted astrocytes-and compared the outcomes with the CNN model YOLOv8. RESULTS: GPT-4V accurately recognised staining techniques and tissue origin but struggled with specific lesion identification. The interpretation of images was notably influenced by the provided textual context, which sometimes led to diagnostic inaccuracies. For instance, when presented with images of the motor cortex, the diagnosis shifted inappropriately from AD to CBD or PSP. However, few-shot learning markedly improved GPT-4V's diagnostic capabilities, enhancing accuracy from 40% in zero-shot learning to 90% with 20-shot learning, matching the performance of YOLOv8, which required 100-shot learning to achieve the same accuracy. CONCLUSIONS: Although GPT-4V faces challenges in independently interpreting histopathological images, few-shot learning significantly improves its performance. This approach is especially promising for neuropathology, where acquiring extensive labelled datasets is often challenging.

Asunto(s)

Redes Neurales de la Computación , Enfermedades Neurodegenerativas , Humanos , Enfermedades Neurodegenerativas/patología , Interpretación de Imagen Asistida por Computador/métodos , Enfermedad de Alzheimer/patología

14.

ChatGPT4's Proficiency in Addressing Patients' Questions on Systemic Lupus Erythematosus: A Blinded Comparative Study with Specialists.

Xu, Dan; Zhao, Jinxia; Liu, Rui; Dai, Yijun; Sun, Kai; Wong, Priscilla; Ming, Samuel Lee Shang; Wearn, Koh Li; Wang, Jiangyuan; Xie, Shasha; Zeng, Lin; Mu, Rong; Xu, Chuanhui.

Rheumatology (Oxford) ; 2024 Apr 22.

Artículo en Inglés | MEDLINE | ID: mdl-38648756

RESUMEN

OBJECTIVES: The efficacy of artificial intelligence (AI)-driven chatbots like ChatGPT4 in specialized medical consultations, particularly in rheumatology, remains underexplored. This study compares the proficiency of ChatGPT4' responses with practicing rheumatologists to inquiries from patients with systemic lupus erythematosus (SLE). METHODS: In this cross-sectional study, we curated 95 frequently asked questions (FAQs), including 55 in Chinese and 40 in English. Responses for FAQs from ChatGPT4 and 5 rheumatologists were scored separately by a panel of rheumatologists and a group of patients with SLE across 6 domains (scientific validity, logical consistency, comprehensibility, completeness, satisfaction level, and empathy) on a 0-10 scale (a score of 0 indicates entirely incorrect responses, while 10 indicates accurate and comprehensive answers). RESULTS: Rheumatologists' scoring revealed that ChatGPT4-generated responses outperformed those from rheumatologists in satisfaction level and empathy, with mean differences of 0.537 (95% CI, 0.252-0.823; p < 0.01) and 0.460 (95% CI, 0.227-0.693 p < 0.01), respectively. From the SLE patients' perspective, ChatGPT4-generated responses were comparable to the rheumatologist-provided answers in all 6 domains. Subgroup analysis revealed ChatGPT4 responses were more logically consistent and complete regardless of language, and exhibited greater comprehensibility, satisfaction, and empathy in Chinese. However, ChatGPT4 responses were inferior in comprehensibility for English FAQs. CONCLUSION: ChatGPT4 demonstrated comparable, possibly better in certain domains, to address FAQs from patients with SLE, when compared with the answers provided by specialists. This study showed the potential of applying ChatGPT4 to improve consultation in SLE patients.

15.

Performance of Artificial Intelligence Content Detectors Using Human and Artificial Intelligence-Generated Scientific Writing.

Flitcroft, Madelyn A; Sheriff, Salma A; Wolfrath, Nathan; Maddula, Ragasnehith; McConnell, Laura; Xing, Yun; Haines, Krista L; Wong, Sandra L; Kothari, Anai N.

Ann Surg Oncol ; 2024 Jun 22.

Artículo en Inglés | MEDLINE | ID: mdl-38909113

RESUMEN

BACKGROUND: Few studies have examined the performance of artificial intelligence (AI) content detection in scientific writing. This study evaluates the performance of publicly available AI content detectors when applied to both human-written and AI-generated scientific articles. METHODS: Articles published in Annals of Surgical Oncology (ASO) during the year 2022, as well as AI-generated articles using OpenAI's ChatGPT, were analyzed by three AI content detectors to assess the probability of AI-generated content. Full manuscripts and their individual sections were evaluated. Group comparisons and trend analyses were conducted by using ANOVA and linear regression. Classification performance was determined using area under the curve (AUC). RESULTS: A total of 449 original articles met inclusion criteria and were evaluated to determine the likelihood of being generated by AI. Each detector also evaluated 47 AI-generated articles by using titles from ASO articles. Human-written articles had an average probability of being AI-generated of 9.4% with significant differences between the detectors. Only two (0.4%) human-written manuscripts were detected as having a 0% probability of being AI-generated by all three detectors. Completely AI-generated articles were evaluated to have a higher average probability of being AI-generated (43.5%) with a range from 12.0 to 99.9%. CONCLUSIONS: This study demonstrates differences in the performance of various AI content detectors with the potential to label human-written articles as AI-generated. Any effort toward implementing AI detectors must include a strategy for continuous evaluation and validation as AI models and detectors rapidly evolve.

16.

ChatGPT: a reliable fertility decision-making tool?

Beilby, Kiri; Hammarberg, Karin.

Hum Reprod ; 39(3): 443-447, 2024 Mar 01.

Artículo en Inglés | MEDLINE | ID: mdl-38199794

RESUMEN

The internet is the primary source of infertility-related information for most people who are experiencing fertility issues. Although no longer shrouded in stigma, the privacy of interacting only with a computer provides a sense of safety when engaging with sensitive content and allows for diverse and geographically dispersed communities to connect and share their experiences. It also provides businesses with a virtual marketplace for their products. The introduction of ChatGPT, a conversational language model developed by OpenAI to understand and generate human-like text in response to user input, in November 2022, and other emerging generative artificial intelligence (AI) language models, has changed and will continue to change the way we interact with large volumes of digital information. When it comes to its application in health information seeking, specifically in relation to fertility in this case, is ChatGPT a friend or foe in helping people make well-informed decisions? Furthermore, if deemed useful, how can we ensure this technology supports fertility-related decision-making? After conducting a study into the quality of the information provided by ChatGPT to people seeking information on fertility, we explore the potential benefits and pitfalls of using generative AI as a tool to support decision-making.

Asunto(s)

Inteligencia Artificial , Infertilidad , Humanos , Fertilidad , Infertilidad/terapia , Comercio , Comunicación

17.

Bridging bytes and biopsies: A comparative analysis of ChatGPT and histopathologists in pathology diagnosis and collaborative potential.

Oon, Ming Liang; Syn, Nicholas L; Tan, Char Loo; Tan, Kong-Bing; Ng, Siok-Bian.

Histopathology ; 84(4): 601-613, 2024 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-38032062

RESUMEN

BACKGROUND AND AIMS: ChatGPT is a powerful artificial intelligence (AI) chatbot developed by the OpenAI research laboratory which is capable of analysing human input and generating human-like responses. Early research into the potential application of ChatGPT in healthcare has focused mainly on clinical and administrative functions. The diagnostic ability and utility of ChatGPT in histopathology is not well defined. We benchmarked the performance of ChatGPT against pathologists in diagnostic histopathology, and evaluated the collaborative potential between pathologists and ChatGPT to deliver more accurate diagnoses. METHODS AND RESULTS: In Part 1 of the study, pathologists and ChatGPT were subjected to a series of questions encompassing common diagnostic conundrums in histopathology. For Part 2, pathologists reviewed a series of challenging virtual slides and provided their diagnoses before and after consultation with ChatGPT. We found that ChatGPT performed worse than pathologists in reaching the correct diagnosis. Consultation with ChatGPT provided limited help and information generated from ChatGPT is dependent on the prompts provided by the pathologists and is not always correct. Finally, we surveyed pathologists who rated the diagnostic accuracy of ChatGPT poorly, but found it useful as an advanced search engine. CONCLUSIONS: The use of ChatGPT4 as a diagnostic tool in histopathology is limited by its inherent shortcomings. Judicious evaluation of the information and histopathology diagnosis generated from ChatGPT4 is essential and cannot replace the acuity and judgement of a pathologist. However, future advances in generative AI may expand its role in the field of histopathology.

Asunto(s)

Inteligencia Artificial , Patólogos , Humanos , Biopsia , Derivación y Consulta , Programas Informáticos

18.

The role of artificial intelligence in informed patient consent for radiotherapy treatments-a case report.

Moll, M; Heilemann, G; Georg, Dietmar; Kauer-Dorner, D; Kuess, P.

Strahlenther Onkol ; 200(6): 544-548, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38180493

RESUMEN

Recent advancements in large language models (LMM; e.g., ChatGPT (OpenAI, San Francisco, California, USA)) have seen widespread use in various fields, including healthcare. This case study reports on the first use of LMM in a pretreatment discussion and in obtaining informed consent for a radiation oncology treatment. Further, the reproducibility of the replies by ChatGPT 3.5 was analyzed. A breast cancer patient, following legal consultation, engaged in a conversation with ChatGPT 3.5 regarding her radiotherapy treatment. The patient posed questions about side effects, prevention, activities, medications, and late effects. While some answers contained inaccuracies, responses closely resembled doctors' replies. In a final evaluation discussion, the patient, however, stated that she preferred the presence of a physician and expressed concerns about the source of the provided information. The reproducibility was tested in ten iterations. Future guidelines for using such models in radiation oncology should be driven by medical professionals. While artificial intelligence (AI) supports essential tasks, human interaction remains crucial.

Asunto(s)

Inteligencia Artificial , Neoplasias de la Mama , Consentimiento Informado , Humanos , Femenino , Neoplasias de la Mama/radioterapia , Relaciones Médico-Paciente , Oncología por Radiación , Persona de Mediana Edad

19.

New Frontiers in Health Literacy: Using ChatGPT to Simplify Health Information for People in the Community.

Ayre, Julie; Mac, Olivia; McCaffery, Kirsten; McKay, Brad R; Liu, Mingyi; Shi, Yi; Rezwan, Atria; Dunn, Adam G.

J Gen Intern Med ; 39(4): 573-577, 2024 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-37940756

RESUMEN

BACKGROUND: Most health information does not meet the health literacy needs of our communities. Writing health information in plain language is time-consuming but the release of tools like ChatGPT may make it easier to produce reliable plain language health information. OBJECTIVE: To investigate the capacity for ChatGPT to produce plain language versions of health texts. DESIGN: Observational study of 26 health texts from reputable websites. METHODS: ChatGPT was prompted to 'rewrite the text for people with low literacy'. Researchers captured three revised versions of each original text. MAIN MEASURES: Objective health literacy assessment, including Simple Measure of Gobbledygook (SMOG), proportion of the text that contains complex language (%), number of instances of passive voice and subjective ratings of key messages retained (%). KEY RESULTS: On average, original texts were written at grade 12.8 (SD = 2.2) and revised to grade 11.0 (SD = 1.2), p < 0.001. Original texts were on average 22.8% complex (SD = 7.5%) compared to 14.4% (SD = 5.6%) in revised texts, p < 0.001. Original texts had on average 4.7 instances (SD = 3.2) of passive text compared to 1.7 (SD = 1.2) in revised texts, p < 0.001. On average 80% of key messages were retained (SD = 15.0). The more complex original texts showed more improvements than less complex original texts. For example, when original texts were ≥ grade 13, revised versions improved by an average 3.3 grades (SD = 2.2), p < 0.001. Simpler original texts (< grade 11) improved by an average 0.5 grades (SD = 1.4), p < 0.001. CONCLUSIONS: This study used multiple objective assessments of health literacy to demonstrate that ChatGPT can simplify health information while retaining most key messages. However, the revised texts typically did not meet health literacy targets for grade reading score, and improvements were marginal for texts that were already relatively simple.

Asunto(s)

Alfabetización en Salud , Humanos , Comprensión , Lenguaje , Lectura

20.

Comment on published article "A chat about bipolar disorder".

Daungsupawong, Hinpetch; Wiwanitkit, Viroj.

Bipolar Disord ; 26(2): 190, 2024 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-38238101

RESUMEN

To the Editor, we follow the topic "A chat about bipolar disorder1 ". According to the study's findings, ChatGPT proved its ability to deliver basic and informative information on bipolar disorders.

Asunto(s)

Inteligencia Artificial , Trastorno Bipolar , Humanos , Trastorno Bipolar/terapia

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

Detalles de la búsqueda