1.
J Med Internet Res ; 26: e54706, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38687566

ABSTRACT

BACKGROUND: There is a dearth of feasibility assessments regarding the use of large language models (LLMs) for responding to inquiries from autistic patients in a Chinese-language context. Although Chinese is one of the most widely spoken languages globally, research on applying these models in the medical field has focused predominantly on English-speaking populations. OBJECTIVE: This study aims to assess the effectiveness of LLM chatbots, specifically ChatGPT-4 (OpenAI) and ERNIE Bot (version 2.2.3; Baidu, Inc), one of the most advanced LLMs in China, in addressing inquiries from autistic individuals in a Chinese setting. METHODS: For this study, we gathered data from DXY, a widely acknowledged, web-based medical consultation platform in China with a user base of over 100 million individuals. A total of 100 patient consultation samples were rigorously selected from January 2018 to August 2023, amounting to 239 questions extracted from publicly available autism-related documents on the platform. To maintain objectivity, both the original questions and responses were anonymized and randomized. An evaluation team of 3 chief physicians assessed the responses across 4 dimensions: relevance, accuracy, usefulness, and empathy. The team completed 717 evaluations. The team first identified the best response and then rated each response on a Likert scale with 5 response categories, each representing a distinct level of quality. Finally, we compared the responses collected from different sources. RESULTS: Among the 717 evaluations conducted, physicians' responses were preferred in 46.86% (95% CI 43.21%-50.51%), ChatGPT's in 34.87% (95% CI 31.38%-38.36%), and ERNIE Bot's in 18.27% (95% CI 15.44%-21.10%). The average relevance scores for physicians, ChatGPT, and ERNIE Bot were 3.75 (95% CI 3.69-3.82), 3.69 (95% CI 3.63-3.74), and 3.41 (95% CI 3.35-3.46), respectively. Physicians (3.66, 95% CI 3.60-3.73) and ChatGPT (3.73, 95% CI 3.69-3.77) demonstrated higher accuracy ratings compared to ERNIE Bot (3.52, 95% CI 3.47-3.57). In terms of usefulness scores, physicians (3.54, 95% CI 3.47-3.62) received higher ratings than ChatGPT (3.40, 95% CI 3.34-3.47) and ERNIE Bot (3.05, 95% CI 2.99-3.12). Finally, concerning the empathy dimension, ChatGPT (3.64, 95% CI 3.57-3.71) outperformed physicians (3.13, 95% CI 3.04-3.21) and ERNIE Bot (3.11, 95% CI 3.04-3.18). CONCLUSIONS: In this cross-sectional study, physicians' responses exhibited superiority in the present Chinese-language context. Nonetheless, LLMs can provide valuable medical guidance to autistic patients and may even surpass physicians in demonstrating empathy. However, further optimization and research are needed before LLMs can be integrated effectively into clinical settings across diverse linguistic environments. TRIAL REGISTRATION: Chinese Clinical Trial Registry ChiCTR2300074655; https://www.chictr.org.cn/bin/project/edit?pid=199432.
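
The preference percentages and intervals reported above are consistent with a normal-approximation (Wald) interval over the 717 evaluations. The sketch below is illustrative only: the abstract does not say which interval method was used, and the counts 336, 250, and 131 are assumptions back-calculated from the reported percentages, not figures taken from the paper.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) for a normal-approximation interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Assumed counts, back-calculated from the reported percentages (not from the paper).
assumed_counts = {"physicians": 336, "ChatGPT": 250, "ERNIE Bot": 131}
n_evaluations = 717

for source, count in assumed_counts.items():
    p, lo, hi = wald_ci(count, n_evaluations)
    print(f"{source}: {p:.2%} (95% CI {lo:.2%}-{hi:.2%})")
```

Under these assumptions the printed intervals agree with the reported ones to two decimal places, which suggests, but does not prove, that a Wald interval was used.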


Subject(s)
Autistic Disorder, Female, Humans, Male, Autistic Disorder/psychology, China, Cross-Sectional Studies, East Asian People, Internet, Language, Physician-Patient Relations, Physicians/statistics & numerical data, Physicians/psychology, Artificial Intelligence
2.
J Med Syst ; 48(1): 38, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38568432

ABSTRACT

The aim of the study is to evaluate and compare the quality and readability of responses generated by five different artificial intelligence (AI) chatbots (ChatGPT, Bard, Bing, Ernie, and Copilot) to the most frequently searched queries about erectile dysfunction (ED). Google Trends was used to identify relevant ED-related phrases. Each AI chatbot received the same sequence of 25 frequently searched terms as input. Responses were evaluated using DISCERN, Ensuring Quality Information for Patients (EQIP), and Flesch-Kincaid Grade Level (FKGL) and Reading Ease (FKRE) metrics. The top three most frequently searched phrases were "erectile dysfunction cause", "how to erectile dysfunction," and "erectile dysfunction treatment." Zimbabwe, Zambia, and Ghana exhibited the highest level of interest in ED. None of the AI chatbots achieved the necessary degree of readability. However, Bard exhibited significantly higher FKRE and FKGL ratings (p = 0.001), and Copilot achieved better EQIP and DISCERN ratings than the other chatbots (p = 0.001). Bard exhibited the simplest linguistic framework and posed the least challenge in terms of readability and comprehension, and Copilot's text quality on ED was superior to that of the other chatbots. As new chatbots are introduced, their understandability and text quality increase, providing better guidance to patients.
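
For readers unfamiliar with the two readability metrics named above, the following sketch applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas to precomputed counts. The counts are hypothetical, and the study's own tooling (including how it counted syllables) is not described in the abstract.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for one chatbot answer (not data from the study).
words, sentences, syllables = 120, 8, 180

ease = flesch_reading_ease(words, sentences, syllables)
grade = flesch_kincaid_grade(words, sentences, syllables)
print(f"Reading Ease: {ease:.2f}, Grade Level: {grade:.2f}")
```

Both formulas depend only on average sentence length and average syllables per word, so they measure surface complexity rather than the medical quality that DISCERN and EQIP assess.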


Subject(s)
Artificial Intelligence, Erectile Dysfunction, Male, Humans, Software, Benchmarking, Linguistics
3.
Entropy (Basel) ; 25(4)2023 Apr 10.
Article in English | MEDLINE | ID: mdl-37190427

ABSTRACT

With the rapid rise of ChatGPT, interest in artificial intelligence question-answering systems has reached a peak. Intelligent question-answering uses machine learning to let computers emulate how people understand a corpus, so that they can answer questions in professional fields. Obtaining more accurate answers to personalized questions in professional fields is the core problem of intelligent question-answering research. Text matching is one of the key technologies of intelligent question-answering, and its accuracy bears directly on the development of question-answering communities. To address the polysemy of text, the Enhanced Representation through Knowledge Integration (ERNIE) model is used to obtain word vector representations of the text, which compensates for the lack of prior knowledge in traditional word vector representation models. Because Chinese also contains homophones and polyphones, this paper introduces the phonetic character sequence of the text to distinguish them. In addition, because the insurance field has many proper nouns that are difficult to identify, proper nouns are distinguished after conventional part-of-speech tagging by assigning them specially defined parts of speech. After these three text-based semantic feature extensions, this paper uses the Bi-directional Long Short-Term Memory (BiLSTM) and TextCNN models to extract the global and local features of the text, respectively, which yields a more comprehensive feature representation. On this basis, a text-matching model that integrates BiLSTM and TextCNN with multi-feature fusion (MFBT) is proposed for the insurance question-answering community. The MFBT model aims to solve the problems that affect answer selection in the insurance question-answering community, such as proper nouns, nonstandard sentences, and sparse features. Using the question-and-answer data of the insurance library as the sample, the MFBT text-matching model is compared and evaluated against other models. The experimental results show that the MFBT text-matching model achieves higher accuracy, recall, and F1 values than the other models. A model trained on historical search data can better help users in the insurance question-and-answer community obtain the answers they need and improve their satisfaction.

4.
Sensors (Basel) ; 22(3)2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35162015

ABSTRACT

Classifying the intention behind a user question is an important element of a task-engine-driven chatbot, and understanding that intention is essentially a text classification problem. Transfer learning with models such as BERT (Bidirectional Encoder Representations from Transformers) and ERNIE (Enhanced Representation through Knowledge Integration) has taken text classification to a new level, but BERT and ERNIE struggle to support intelligent dialogue systems with high QPS (queries per second) because of their computational cost. In practice, simple classification models are computationally efficient but limited by low accuracy. In this paper, we distill knowledge from the ERNIE model into the FastText model: ERNIE acts as the teacher model, predicting labels for massive amounts of unlabeled online data for data enhancement, and then guides the training of the more computationally efficient FastText student model. Applied to chatbot intention classification, this distillation preserves FastText's original computational performance while significantly improving its intention classification accuracy.
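
The teacher-labels-then-student-trains workflow described above can be sketched as a pseudo-labeling step. Everything here is hypothetical: `toy_teacher` merely stands in for a fine-tuned ERNIE classifier, the intent names are invented, and the confidence threshold is an assumption that the abstract does not mention.

```python
from typing import Callable, Iterable

Prediction = tuple[str, float]  # (intent label, teacher confidence)


def pseudo_label(
    texts: Iterable[str],
    teacher: Callable[[str], Prediction],
    min_confidence: float = 0.9,  # assumed filter; not stated in the abstract
) -> list[tuple[str, str]]:
    """Label unlabeled texts with the teacher, keeping only confident predictions."""
    labeled = []
    for text in texts:
        label, confidence = teacher(text)
        if confidence >= min_confidence:
            labeled.append((text, label))
    return labeled


def toy_teacher(text: str) -> Prediction:
    """Hypothetical stand-in for the teacher model."""
    if "refund" in text:
        return "refund_request", 0.97
    if "hello" in text:
        return "greeting", 0.95
    return "other", 0.40


unlabeled = ["hello there", "i want a refund", "hmm"]
student_training_set = pseudo_label(unlabeled, toy_teacher)
print(student_training_set)
```

The resulting pairs would then be used to train the lightweight student classifier, so that only the student has to run at serving time.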


Subject(s)
Deep Learning, Intention, Humans, Machine Learning
5.
Sensors (Basel) ; 22(14)2022 Jul 13.
Article in English | MEDLINE | ID: mdl-35890903

ABSTRACT

Sentiment analysis is a field of affective computing that detects and evaluates people's psychological states and sentiments through text analysis. It is an important application of text mining technology and is widely used to analyze comments. Bullet screen videos have become a popular way for people to interact and communicate while watching online videos. Existing studies have focused on the form, content, and function of bullet screen comments, but few have examined bullet screen comments using natural language processing. Bullet screen comments are short text messages of varying lengths with ambiguous emotional information, which makes them extremely challenging for natural language processing. Hence, it is important to understand how the characteristics of bullet screen comments and sentiment analysis can be used to understand the sentiments expressed and trends in bullet screen comments. This study poses the following research question: how can one analyze the sentiments expressed in bullet screen comments accurately and effectively? This study proposes an ERNIE-BiLSTM approach for sentiment analysis of bullet screen comments, which offers an effective and innovative way of approaching the problem. The experimental results show that the ERNIE-BiLSTM approach has a higher accuracy rate, precision rate, recall rate, and F1-score than other methods.


Subject(s)
Attitude, Data Mining, Emotions, Humans, Natural Language Processing
6.
J Biomed Inform ; 108: 103492, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32645382

ABSTRACT

Chest imaging reports describe the results of chest radiography procedures. Automatic extraction of abnormal imaging signs from chest imaging reports has a pivotal role in clinical research and a wide range of downstream medical tasks. However, there are few studies on information extraction from Chinese chest imaging reports. In this paper, we formulate chest abnormal imaging sign extraction as a sequence tagging and matching problem. On this basis, we propose a transferred abnormal imaging signs extractor with pretrained ERNIE as the backbone, named EASON (fine-tuning ERNIE with CRF for Abnormal Signs ExtractiON), which can address the problem of data insufficiency. In addition, to assign the attributes (the body part and degree) to corresponding abnormal imaging signs from the results of the sequence tagging model, we design a simple but effective tag2relation algorithm based on the nature of chest imaging report text. We evaluate our method on the corpus provided by a medical big data company, and the experimental results demonstrate that our method achieves significant and consistent improvement compared to other baselines.


Subject(s)
Electronic Health Records, Information Storage and Retrieval, Algorithms, Diagnostic Imaging, Research Report
7.
PeerJ Comput Sci ; 10: e2292, 2024.
Article in English | MEDLINE | ID: mdl-39314733

ABSTRACT

Indirect aggression has become a prevalent phenomenon that erodes the social media environment. Because of the expense and the difficulty of determining objectively what constitutes indirect aggression, the traditional self-report questionnaire is hard to employ in the current online environment. In this study, we present a model for predicting indirect aggression online based on pre-trained models. Building on Weibo users' social media activities, we constructed basic, dynamic, and content features and classified indirect aggression into three subtypes: social exclusion, malicious humour, and guilt induction. We then built the prediction model by combining these features with large-scale pre-trained models. The empirical evidence shows that this prediction model (ERNIE) outperforms the pre-trained models and predicts indirect aggression online much better than the models without extra pre-trained information. This study offers a practical model to predict users' indirect aggression. Furthermore, this work contributes to a better understanding of indirect aggression behaviors and can support social media platforms' organization and management.

8.
Digit Health ; 10: 20552076241284771, 2024.
Article in English | MEDLINE | ID: mdl-39386109

ABSTRACT

Purpose: Large language models (LLMs) are deep learning models designed to comprehend and generate meaningful responses, which have gained public attention in recent years. The purpose of this study is to evaluate and compare the performance of LLMs in answering questions regarding breast cancer in the Chinese context. Material and Methods: ChatGPT, ERNIE Bot, and ChatGLM were chosen to answer 60 questions related to breast cancer posed by two oncologists. Responses were scored as comprehensive, correct but inadequate, mixed with correct and incorrect data, completely incorrect, or unanswered. The accuracy, length, and readability among answers from different models were evaluated using statistical software. Results: ChatGPT answered 60 questions, with 40 (66.7%) comprehensive answers and six (10.0%) correct but inadequate answers. ERNIE Bot answered 60 questions, with 34 (56.7%) comprehensive answers and seven (11.7%) correct but inadequate answers. ChatGLM generated 60 answers, with 35 (58.3%) comprehensive answers and six (10.0%) correct but inadequate answers. The differences for chosen accuracy metrics among the three LLMs did not reach statistical significance, but only ChatGPT demonstrated a sense of human compassion. The accuracy of the three models in answering questions regarding breast cancer treatment was the lowest, with an average of 44.4%. ERNIE Bot's responses were significantly shorter compared to ChatGPT and ChatGLM (p < .001 for both). The readability scores of the three models showed no statistical significance. Conclusions: In the Chinese context, the capabilities of ChatGPT, ERNIE Bot, and ChatGLM are similar in answering breast cancer-related questions at present. These three LLMs may serve as adjunct informational tools for breast cancer patients in the Chinese context, offering guidance for general inquiries. 
However, for highly specialized issues, particularly in the realm of breast cancer treatment, LLMs cannot deliver reliable performance. It is necessary to utilize them under the supervision of healthcare professionals.

9.
Digit Health ; 9: 20552076231193213, 2023.
Article in English | MEDLINE | ID: mdl-37559830

ABSTRACT

Medical text classification, a fundamental medical natural language processing task, aims to identify the categories to which a short medical text belongs. Current research has focused on performing medical text classification by fine-tuning a pre-trained language model. However, this paradigm introduces additional parameters when training extra classifiers. Recent studies have shown that the "prompt-tuning" paradigm yields better performance in many natural language processing tasks because it bridges the gap between pre-training goals and downstream tasks. The main idea of prompt-tuning is to transform binary or multi-class classification tasks into mask prediction tasks by fully exploiting the features learned by pre-trained language models. This study explores, for the first time, how to classify medical texts using a discriminative pre-trained language model called ERNIE-Health through prompt-tuning. Specifically, we attempt to perform prompt-tuning based on the multi-token selection task, which is a pre-training task of ERNIE-Health. The raw text is wrapped into a new sequence with a template in which the category label is replaced by a [UNK] token. The model is then trained to calculate the probability distribution of the candidate categories. Our method is tested on the KUAKE-Question Intention Classification and CHiP-Clinical Trial Criterion datasets and achieves accuracies of 0.866 and 0.861, respectively. In addition, the loss values of our model decrease faster throughout the training period than with fine-tuning. The experimental results provide valuable insights to the community and suggest that prompt-tuning can be a promising approach to improving the performance of pre-trained models in domain-specific tasks.
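
The two steps described above, wrapping the raw text in a template whose label slot holds a [UNK] token and then turning per-category scores into a probability distribution, can be illustrated as follows. This is only a sketch: the template wording, the category names, and the scores are invented, and ERNIE-Health's multi-token selection head is not reproduced here.

```python
import math


def wrap_with_template(text: str, mask_token: str = "[UNK]") -> str:
    """Wrap raw text in a hypothetical template whose label slot is masked."""
    return f"{text} This question is about {mask_token}."


def candidate_probabilities(scores: dict[str, float]) -> dict[str, float]:
    """Softmax over the scores a model assigns to each candidate category."""
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - max_score) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


prompt = wrap_with_template("What are the side effects of metformin?")

# Made-up scores standing in for model outputs at the masked position.
scores = {"diagnosis": 0.2, "treatment": 2.1, "cause": -0.5}
probs = candidate_probabilities(scores)
predicted = max(probs, key=probs.get)
print(prompt)
print(predicted, round(probs[predicted], 3))
```

Because the classification is expressed as filling the masked slot, no separate classifier layer with new parameters is needed, which is the advantage the abstract attributes to prompt-tuning.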

10.
Z Gesundh Wiss ; : 1-12, 2023 May 11.
Article in English | MEDLINE | ID: mdl-37361279

ABSTRACT

Aim: The accessibility of social media data has allowed researchers to measure official-public interactions during COVID-19. However, previous work analyzing official posts or public comments has failed to explore the link between the two. Therefore, this study investigates the relationship between the communication strategies of public health agencies (PHAs) on TikTok and public emotional/sentiment tendencies in COVID-19 normalization. Subject and methods: This study uses the 2022 Shanghai city closure event as a public health communication case study in the context of COVID-19 normalization, using TikTok as a data source. We first analyze the communication strategies adopted by the PHA based on the Crisis and Emergency Risk Communication (CERC) model. Then, we classify the sentiment of public comments using the Large-Scale Knowledge Enhanced Pre-Training for Language Understanding and Generation (ERNIE) pre-training model. Finally, we explore the connection between PHA communication strategies and public sentiment tendencies. Results: First, the public's sentiment tendencies differ at different stages. Therefore, appropriate communication strategies should be developed stage-by-stage. Second, the public's emotional disposition to different communication strategies varies: government statements, vaccines, and prevention and control programs are more likely to produce a friendly comment environment, while policy and new cases per day are more likely to produce unfavorable comment content. However, this does not mean that policy and new cases per day should be avoided; the judicious use of these two strategies can help PHAs understand the current issues causing public dissatisfaction. Third, videos with celebrity appearances can significantly increase positive public sentiment and, thereby, public participation. Conclusion: We propose an improved CERC guideline for China based on the Shanghai lockdown case.

11.
Healthcare (Basel) ; 10(6)2022 Jun 15.
Article in English | MEDLINE | ID: mdl-35742169

ABSTRACT

(1) Background: Poor adherence to management behaviors in Chinese Type 2 diabetes mellitus (T2DM) patients leads to an uncontrolled prognosis of diabetes, which results in significant economic costs for China. It is imperative to quickly locate vulnerability factors in the management behavior of patients with T2DM. (2) Methods: In this study, a thematic analysis of the collected interview materials was conducted to construct the themes of T2DM management vulnerability. We explored the applicability of the pre-trained models based on the evaluation metrics in text classification. (3) Results: We constructed 12 themes of vulnerability related to the health and well-being of people with T2DM in Tianjin. We considered that Bidirectional Encoder Representation from Transformers (BERT) performed better in this Natural Language Processing (NLP) task with a shorter completion time. With the splitting ratio of 6:3:1 and batch size of 64 for BERT, the test accuracy was 97.71%, the completion time was 10 min 24 s, and the macro-F1 score was 0.9752. (4) Conclusions: Our results proved the applicability of NLP techniques in this specific Chinese-language medical environment. We filled the knowledge gap in the application of NLP technologies in diabetes management. Our study provided strong support for using NLP techniques to rapidly locate vulnerability factors in T2DM management.
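
As an illustration of the 6:3:1 splitting ratio mentioned above, the sketch below shuffles a dataset and partitions it in those proportions. The abstract does not say which part served as training, validation, or test data, so the names used here are assumptions, as are the seed and the dataset size.

```python
import random


def split_dataset(items, ratios=(6, 3, 1), seed=0):
    """Shuffle items and split them into three parts with the given ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_first = len(items) * ratios[0] // total
    n_second = len(items) * ratios[1] // total
    first = items[:n_first]
    second = items[n_first:n_first + n_second]
    third = items[n_first + n_second:]  # remainder goes to the last part
    return first, second, third


# Hypothetical dataset of 1000 example ids; the part names are assumed.
train, validation, test = split_dataset(range(1000))
print(len(train), len(validation), len(test))
```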

12.
Article in English | MEDLINE | ID: mdl-36231895

ABSTRACT

The occurrence of major health events can have a significant impact on public mood and mental health. In this study, we selected Shanghai during the 2019 novel coronavirus pandemic as a case study and Weibo texts as the data source. The ERNIE pre-training model was used to classify the text data into five emotional categories: gratitude, confidence, sadness, anger, and no emotion. The changes in public sentiment and potential influencing factors were analyzed with the emotional sequence diagram method. We also examined the causal relationship between the epidemic and public sentiment, as well as positive and negative emotions. The study found: (1) public sentiment during the epidemic was primarily affected by public behavior, government behavior, and the severity of the epidemic. (2) From the perspective of time series changes, the changes in public emotions during the epidemic were divided into emotional fermentation, emotional climax, and emotional chaos periods. (3) There was a clear causal relationship between the epidemic and the changes in public emotions, and the impact on negative emotions was greater than that of positive emotions. Additionally, positive emotions had a certain inhibitory effect on negative emotions.


Subject(s)
COVID-19, Social Media, Attitude, COVID-19/epidemiology, China/epidemiology, Emergencies, Emotions, Humans, Pandemics
13.
JMIR Med Inform ; 10(4): e35606, 2022 Apr 21.
Article in English | MEDLINE | ID: mdl-35451969

ABSTRACT

BACKGROUND: With the prevalence of online consultation, many patient-doctor dialogues have accumulated, which, in an authentic language environment, are of significant value to the research and development of intelligent question answering and automated triage in recent natural language processing studies. OBJECTIVE: The purpose of this study was to design a front-end task module for the network inquiry of intelligent medical services. Through the study of automatic labeling of real doctor-patient dialogue text on the internet, a method of identifying the negative and positive entities of dialogues with higher accuracy has been explored. METHODS: The data set used for this study was from the Spring Rain Doctor internet online consultation, which was downloaded from the official data set of Alibaba Tianchi Lab. We proposed a composite abutting joint model, which was able to automatically classify the types of clinical finding entities into the following 4 attributes: positive, negative, other, and empty. We adapted a downstream architecture in Chinese Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach (RoBERTa) with whole word masking (WWM) extended (RoBERTa-WWM-ext) combining a text convolutional neural network (CNN). We used RoBERTa-WWM-ext to express sentence semantics as a text vector and then extracted the local features of the sentence through the CNN, which was our new fusion model. To verify its knowledge learning ability, we chose Enhanced Representation through Knowledge Integration (ERNIE), original Bidirectional Encoder Representations from Transformers (BERT), and Chinese BERT with WWM to perform the same task, and then compared the results. Precision, recall, and macro-F1 were used to evaluate the performance of the methods. 
RESULTS: We found that the ERNIE model, which was trained with a large Chinese corpus, had a total score (macro-F1) of 65.78290014, while BERT and BERT-WWM had scores of 53.18247117 and 69.2795315, respectively. Our composite abutting joint model (RoBERTa-WWM-ext + CNN) had a macro-F1 value of 70.55936311, showing that our model outperformed the other models in the task. CONCLUSIONS: The accuracy of the original model can be greatly improved by giving priority to WWM and replacing the word-based mask with unit to classify and label medical entities. Better results can be obtained by effectively optimizing the downstream tasks of the model and the integration of multiple models later on. The study findings contribute to the translation of online consultation information into machine-readable information.
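
Macro-F1, the headline metric above, is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. A minimal pure-Python sketch follows; the labels and predictions are toy values, not data from the study, and the reported scores appear to be on a 0-100 scale whereas this function returns a value between 0 and 1.

```python
def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example (not study data): one "positive" finding is misread as "negative".
y_true = ["positive", "positive", "negative", "other"]
y_pred = ["positive", "negative", "negative", "other"]
score = macro_f1(y_true, y_pred)
print(f"macro-F1 = {score:.4f}")
```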

14.
Front Artif Intell ; 4: 732381, 2021.
Article in English | MEDLINE | ID: mdl-34988434

ABSTRACT

Recently, several studies have reported promising results with BERT-like methods on acronym tasks. In this study, we find that an older rule-based program, Ab3P, not only performs better, but that error analysis also suggests why. There is a well-known spelling convention in acronyms where each letter in the short form (SF) refers to "salient" letters in the long form (LF). The error analysis uses decision trees and logistic regression to show that there is an opportunity for many pre-trained models (BERT, T5, BioBert, BART, ERNIE) to take advantage of this spelling convention.
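
The spelling convention described above can be approximated with two simple checks. This is a hedged illustration of the idea only: it is not Ab3P's algorithm, and both function names are invented for this sketch.

```python
def letters_in_order(short_form: str, long_form: str) -> bool:
    """True if every SF letter appears in the LF, in order (case-insensitive)."""
    remaining = iter(long_form.lower())
    # `ch in remaining` consumes the iterator, which enforces left-to-right order.
    return all(ch in remaining for ch in short_form.lower() if ch.isalnum())


def initials_match(short_form: str, long_form: str) -> bool:
    """Stricter check: the SF equals the initials of the LF words."""
    initials = "".join(word[0] for word in long_form.replace("-", " ").split())
    return initials.lower() == short_form.lower()


long_form = "Bidirectional Encoder Representations from Transformers"
print(letters_in_order("BERT", long_form))  # ordered-letter check
print(initials_match("BERT", long_form))    # initials check ("from" breaks it)
print(initials_match("SF", "short form"))
```

The ordered-letter check accepts "BERT" while the initials check rejects it because of the function word "from", which hints at why "salient" letters are harder to pin down than plain word initials.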
