ABSTRACT
The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting psychological constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance improved across successive versions of the model, particularly for lesser-spoken languages, while its cost decreased. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate more cross-linguistic research with understudied languages.
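A minimal sketch of the kind of prompt-based analysis this abstract describes, assuming the OpenAI Python client; the model name and 1-5 rating scale are illustrative, not the authors' exact setup. The helpers below build a simple negativity prompt and parse the model's numeric reply:

```python
# Sketch of zero-shot sentiment rating with an LLM API, in the spirit of
# the paper's simple prompts (e.g., "is this text negative?"). The model
# name and client usage in the comment are assumptions; adapt to your provider.

def build_prompt(text: str) -> str:
    """Ask the model for a single negativity rating on a 1-5 scale."""
    return (
        "Rate the negativity of the following text on a scale from 1 "
        "(not at all negative) to 5 (extremely negative). "
        "Respond with the number only.\n\nText: " + text
    )

def parse_rating(reply: str):
    """Extract the first digit 1-5 from the model's reply, else None."""
    for ch in reply:
        if ch in "12345":
            return int(ch)
    return None

# Wiring this to an API (requires a key; model name is hypothetical):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": build_prompt("I hate waiting.")}],
# )
# rating = parse_rating(resp.choices[0].message.content)

print(parse_rating("3"))           # parses a bare number
print(parse_rating("Rating: 4."))  # tolerates extra wording
```

Parsing defensively matters in practice: chat models sometimes wrap the requested number in extra words, and a `None` return flags replies that need manual review.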
Subject(s)
Multilingualism, Humans, Language, Machine Learning, Natural Language Processing, Emotions, Social Media
ABSTRACT
Works of fiction play a crucial role in the production of cultural stereotypes. Concerning gender, a widely held presumption is that many such works ascribe agency to men and passivity to women. However, large-scale diachronic analyses of this notion have been lacking. This paper provides an assessment of agency attributions in 87,531 fiction works written between 1850 and 2010. It introduces a syntax-based approach for extracting networks of character interactions. Agency is then formalized as a dyadic property: Does a character primarily serve as an agent acting upon the other character or as a recipient acted upon by the other character? Findings indicate that female characters are more likely to be passive in cross-gender relationships than their male counterparts. This difference, the gender agency gap, has declined since the 19th century but persists into the 21st. Male authors are especially likely to attribute less agency to female characters. Moreover, certain kinds of actions, especially physical and villainous ones, have more pronounced gender disparities.
Subject(s)
Writing, Female, Male, Humans, History, 19th Century, History, 20th Century, History, 21st Century, Literature, Gender Identity
ABSTRACT
Online reviews significantly impact consumers' decision-making process and firms' economic outcomes and are widely seen as crucial to the success of online markets. Firms, therefore, have a strong incentive to manipulate ratings using fake reviews. This presents a problem that academic researchers have tried to solve for over two decades and on which platforms expend a large amount of resources. Nevertheless, the prevalence of fake reviews is arguably higher than ever. To combat this, we collect a dataset of reviews for thousands of Amazon products and develop a general and highly accurate method for detecting fake reviews. A unique difference between previous datasets and ours is that we directly observe which sellers buy fake reviews. Thus, while prior research has trained models using laboratory-generated reviews or proxies for fake reviews, we are able to train a model using actual fake reviews. We show that products that buy fake reviews are highly clustered in the product reviewer network. Therefore, features constructed from this network are highly predictive of which products buy fake reviews. We show that our network-based approach is also successful at detecting fake review buyers even without ground truth data, as unsupervised clustering methods can accurately identify fake review buyers by identifying clusters of products that are closely connected in the network. While text or metadata can be manipulated to evade detection, network-based features are more costly to manipulate because these features result directly from the inherent limitations of buying reviews from online review marketplaces, making our detection approach more robust to manipulation.
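The clustering intuition can be sketched with toy data: products that buy fake reviews share reviewers, so projecting the bipartite product-reviewer graph onto products exposes tightly connected groups. The data and projection below are illustrative, not the authors' actual pipeline.

```python
# Sketch of the paper's core intuition: fake-review buyers cluster in the
# product-reviewer network. We project the bipartite graph onto products by
# counting shared reviewers (toy data; not the authors' feature set).
from collections import defaultdict
from itertools import combinations

# product -> set of reviewer ids; fake-review products A and B share many
# reviewers, while organic products C and D share none.
reviews = {
    "A": {"r1", "r2", "r3", "r4"},
    "B": {"r1", "r2", "r3", "r5"},
    "C": {"r6", "r7"},
    "D": {"r8", "r9"},
}

def project_products(reviews):
    """Weighted product-product edges: weight = number of shared reviewers."""
    edges = defaultdict(int)
    for p, q in combinations(sorted(reviews), 2):
        shared = len(reviews[p] & reviews[q])
        if shared:
            edges[(p, q)] = shared
    return dict(edges)

edges = project_products(reviews)
print(edges)  # only the A-B pair is connected, with weight 3
```

In the projected graph, clusters of heavily connected products are candidates for unsupervised detection, which is why the features are costly to evade: a seller cannot easily control which other products its purchased reviewers also review.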
Subject(s)
Commerce, Text Messaging, Consumer Behavior, Motivation
ABSTRACT
Using publicly available data from 299 preregistered replications from the social sciences, we found that the language used to describe a study can predict its replicability above and beyond a large set of controls related to the article characteristics, study design and results, author information, and replication effort. To understand why, we analyzed the textual differences between replicable and nonreplicable studies. Our findings suggest that the language in replicable studies is transparent and confident, written in a detailed and complex manner, and generally exhibits markers of truthful communication, possibly demonstrating the researchers' confidence in the study. Nonreplicable studies, however, are vaguely written and have markers of persuasion techniques, such as the use of positivity and clout. Thus, our findings allude to the possibility that authors of nonreplicable studies are more likely to make an effort, through their writing, to persuade readers of their (possibly weaker) results.
Subject(s)
Language, Social Sciences, Humans, Reproducibility of Results, Writing
ABSTRACT
PURPOSE: Worldwide clinical knowledge is expanding rapidly, but physicians have sparse time to review scientific literature. Large language models (eg, Chat Generative Pretrained Transformer [ChatGPT]) might help summarize and prioritize research articles to review. However, large language models sometimes "hallucinate" incorrect information. METHODS: We evaluated ChatGPT's ability to summarize 140 peer-reviewed abstracts from 14 journals. Physicians rated the quality, accuracy, and bias of the ChatGPT summaries. We also compared human ratings of relevance to various areas of medicine to ChatGPT relevance ratings. RESULTS: ChatGPT produced summaries that were 70% shorter (mean abstract length of 2,438 characters decreased to 739 characters). Summaries were nevertheless rated as high quality (median score 90, interquartile range [IQR] 87.0-92.5; scale 0-100), high accuracy (median 92.5, IQR 89.0-95.0), and low bias (median 0, IQR 0-7.5). Serious inaccuracies and hallucinations were uncommon. Classification of the relevance of entire journals to various fields of medicine closely mirrored physician classifications (nonlinear standard error of the regression [SER] 8.6 on a scale of 0-100). However, relevance classification for individual articles was much more modest (SER 22.3). CONCLUSIONS: Summaries generated by ChatGPT were 70% shorter than mean abstract length and were characterized by high quality, high accuracy, and low bias. Conversely, ChatGPT had modest ability to classify the relevance of articles to medical specialties. We suggest that ChatGPT can help family physicians accelerate review of the scientific literature and have developed software (pyJournalWatch) to support this application. Life-critical medical decisions should remain based on full, critical, and thoughtful evaluation of the full text of research articles in context with clinical guidelines.
Subject(s)
Medicine, Humans, Physicians, Family
ABSTRACT
BACKGROUND: Laboratory data can provide great value to support research aimed at reducing the incidence, prolonging survival, and enhancing outcomes of cancer. Data is characterized by the information it carries and the format it holds. Data captured in Alberta's biomarker laboratory repository is free text, cluttered, and rogue. Such a data format limits its utility and prohibits broader adoption and research development. Text analysis for information extraction from unstructured data can change this and lead to more complete analyses. Previous work on extracting relevant information from free-text, unstructured data employed Natural Language Processing (NLP), Machine Learning (ML), rule-based Information Extraction (IE) methods, or a hybrid combination of them. METHODS: In our study, text analysis was performed on Alberta Precision Laboratories data, which consisted of 95,854 entries from the Southern Alberta Dataset (SAD) and 6944 entries from the Northern Alberta Dataset (NAD). The data covers all of Alberta and is completely population-based. Our proposed framework is built around rule-based IE methods. It incorporates syntax and lexical analyses to achieve deterministic extraction of data from biomarker laboratory data (i.e., Epidermal Growth Factor Receptor (EGFR) test results). Lexical analysis comprises data cleaning and pre-processing, conversion of Rich Text Format text into readable plain-text format, and normalization and tokenization of text. The framework then passes the text into the syntax analysis stage, which includes the rule-based method of extracting relevant data. Rule-based patterns of the test result are identified, and a context-free grammar then generates the rules of information extraction. Finally, the results are linked with the Alberta Cancer Registry to support real-world cancer research studies.
RESULTS: Of the original 5512 entries in the SAD dataset and 5017 entries in the NAD dataset which were filtered for EGFR, the framework yielded 5129 and 3388 extracted EGFR test results from the SAD and NAD datasets, respectively. An accuracy of 97.5% was achieved on a random sample of 362 tests. CONCLUSIONS: We presented a text analysis framework to extract specific information from unstructured clinical data. Our proposed framework has shown that it can successfully extract relevant information from EGFR test results.
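As a hedged illustration of the framework's lexical and syntax stages, the sketch below normalizes a free-text report and applies rule patterns to label an EGFR result. The report phrasing and patterns are invented for illustration; the actual framework uses a context-free grammar with far more extensive rules.

```python
# Toy rule-based information extraction from free-text biomarker reports,
# mimicking the lexical (normalization) and syntax (rule matching) stages.
# Patterns and phrasing are invented, not the Alberta rule set.
import re

def normalize(text: str) -> str:
    """Lexical stage: collapse whitespace and lowercase the report."""
    return re.sub(r"\s+", " ", text).strip().lower()

# Syntax stage: ordered rule patterns for an EGFR test result.
PATTERNS = [
    (re.compile(r"egfr.*?\b(?:mutation detected|positive)\b"), "positive"),
    (re.compile(r"egfr.*?\b(?:no mutation|not detected|negative)\b"), "negative"),
]

def extract_egfr(report: str):
    """Return 'positive'/'negative' if a rule matches, else None."""
    text = normalize(report)
    for pattern, label in PATTERNS:
        if pattern.search(text):
            return label
    return None

print(extract_egfr("EGFR exon 19:  Mutation detected."))  # positive
print(extract_egfr("EGFR: no mutation identified"))       # negative
```

Returning `None` for unmatched reports is deliberate: in a deterministic pipeline, unrecognized phrasings should be surfaced for manual review and new rules rather than guessed at.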
Subject(s)
Carcinoma, Non-Small-Cell Lung, Lung Neoplasms, Humans, Carcinoma, Non-Small-Cell Lung/diagnosis, Carcinoma, Non-Small-Cell Lung/genetics, Laboratories, NAD, Lung Neoplasms/diagnosis, Lung Neoplasms/genetics, Mutation, Natural Language Processing, ErbB Receptors, Biomarkers, Electronic Health Records
ABSTRACT
Climate change is a common challenge faced by all humanity. Promoting emission and carbon reduction on agricultural land is a top priority for addressing climate change and realizing sustainable development. Based on data from 296 prefecture-level cities in China from 2011 to 2021, this study utilizes machine-learning and text-analysis methods to construct an indicator of government climate-risk attention (GCRA). It combines a two-way fixed-effects model to investigate how GCRA affects agricultural-land carbon emissions (ALCE) and carbon intensity (ALCI) and the mechanisms of this impact. The results indicate that (1) GCRA substantially reduces ALCE and ALCI, and the conclusions are robust to a battery of tests. Furthermore, (2) mechanism analysis reveals that GCRA reduces ALCE and lowers ALCI primarily through three mechanisms: strengthening environmental regulation, promoting agricultural green-technology innovation, and upgrading agricultural-land mechanization. Additionally, (3) heterogeneity analysis suggests that the carbon-emission reduction effect of GCRA is more significant in the east, in arid and humid climate zones, and in non-grain-producing regions. Finally, (4) spatial-spillover analysis and quantile regression results demonstrate that GCRA also significantly inhibits the carbon emissions and carbon intensity of nearby agricultural land, with the inhibition effect becoming more pronounced at higher levels of government attention. These findings help promote emission reduction and carbon sequestration on agricultural land and provide a reference for developing countries coping with climate change.
Subject(s)
Agriculture, Climate Change, China, Carbon/analysis, Government
ABSTRACT
BACKGROUND: Effective communication between patients and healthcare providers in the emergency department (ED) is challenging due to the dynamic nature of the ED environment. This study aimed to trial a chat service enabling patients in the ED and their family members to ask questions freely, exploring the service's feasibility and user experience. OBJECTIVES: To identify the types of needs and inquiries from patients and family members in the ED that could be addressed through the chat service and to assess the user experience of the service. METHODS: We enrolled patients and family members aged over 19 years in the ED, providing the chat service for up to 4 h per ED visit. Trained research nurses followed specific guidelines to respond to messages from the participants. After participation, participants were required to complete a survey. Those who agreed also participated in interviews to provide insights on their experiences with the ED chat service. RESULTS: A total of 40 participants (20 patients and 20 family members) sent 305 messages (72 by patients and 233 by family members), with patients sending an average of 3.6 messages and family members 11.7. Research nurses resolved 41.4% of patient inquiries and 70.9% of family member inquiries without further healthcare provider involvement. High usability was reported, with positive feedback on communication with healthcare workers, information accessibility, and emotional support. CONCLUSIONS: The ED chat service was found to be feasible and led to positive user experiences for both patients and their family members.
Subject(s)
Emergency Service, Hospital, Family, Humans, Male, Female, Adult, Family/psychology, Middle Aged, Communication, Aged, Patient Satisfaction, Surveys and Questionnaires, Young Adult
ABSTRACT
Metacognitive frameworks such as processing fluency suggest people often respond more favorably to simple, common language than to complex, technical language. Information that is simple and nontechnical is easier to process than complex information, which in turn leads to more engagement with targets. In two studies covering 12 field samples (total n = 1,064,533), we establish and replicate this simpler-is-better phenomenon by demonstrating that people engage more with nontechnical language when giving their time and attention (e.g., simple online language tends to receive more social engagements). However, people respond to complex language when giving their money (e.g., complex language within charitable giving campaigns and grant abstracts tends to receive more money). This evidence suggests people engage with the heuristic of complex language differently depending on whether the target involves time or money. These results underscore language as a lens into social and psychological processes and the value of computational methods for measuring text patterns at scale.
Subject(s)
Comprehension, Data Mining, Electronic Data Processing, Psychological Tests/standards, Female, Humans, Language, Male, Recognition, Psychology
ABSTRACT
The global coronavirus disease 2019 (COVID-19) pandemic has necessitated the establishment of new medical care systems worldwide. Medical staff treating COVID-19 patients perform their care duties in highly challenging and psychologically demanding situations, raising concerns about their impact on patient safety. Therefore, this study aimed to investigate and characterize incident reports related to COVID-19 patients to clarify the impact of COVID-19 on patient safety. The study included data from 557 patients admitted to the Critical Care Center of a tertiary-care teaching hospital in Osaka, Japan, from April 2020 to March 2021. The patients were divided into two groups: COVID-19 (n = 106) and non-COVID-19 (n = 451) and compared based on various characteristics, incident reporting rates, and the content of incident reports. The findings indicated a significantly higher rate of patients with incident reports in the COVID-19 group compared to the non-COVID-19 group (49.1% vs. 24.4%, P < 0.001). In addition, quantitative text analysis revealed that the topic ratio, consisting of "respiration," "circuit," "settings," "connection," "nursing," "ventilator," "control," "tape," "Oxylog®," and "artificial nose" was significantly higher in the incident reports of the COVID-19 group (P = 0.003). In conclusion, COVID-19 patients are more susceptible to adverse incidents and may face a higher risk of patient safety issues. The characteristic topics in incident reports involving COVID-19 patients primarily revolved around ventilator-related issues. In the future, the methodology used in the current study may be utilized to identify incident characteristics and implement appropriate countermeasures in the event of unknown patient safety issues.
Subject(s)
COVID-19, Humans, Japan/epidemiology, COVID-19/epidemiology, Risk Management, Critical Care, Hospitals, Teaching
ABSTRACT
This study explores the potential of using large language models to assist content analysis by conducting a case study to identify adverse events (AEs) in social media posts. The case study compares ChatGPT's performance with human annotators' in detecting AEs associated with delta-8-tetrahydrocannabinol, a cannabis-derived product. Using the identical instructions given to human annotators, ChatGPT closely approximated human results, with a high degree of agreement noted: 94.4% (9436/10,000) for any AE detection (Fleiss κ=0.95) and 99.3% (9931/10,000) for serious AEs (κ=0.96). These findings suggest that ChatGPT has the potential to replicate human annotation accurately and efficiently. The study recognizes possible limitations, including concerns about the generalizability due to ChatGPT's training data, and prompts further research with different models, data sources, and content analysis tasks. The study highlights the promise of large language models for enhancing the efficiency of biomedical research.
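Agreement figures like those reported can be reproduced with a few lines of code. The sketch below computes raw percent agreement and Cohen's kappa for two raters (the study reports Fleiss' kappa, which generalizes the idea to more than two raters); the labels are toy data, not the study's annotations.

```python
# Raw agreement and Cohen's kappa between two annotators on binary labels
# (1 = adverse event, 0 = none). Toy data for illustration only.
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which the two raters agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement: (observed - expected) / (1 - expected)."""
    n = len(a)
    po = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (po - pe) / (1 - pe)

human   = [1, 1, 0, 0, 1, 0, 0, 0]
chatgpt = [1, 1, 0, 0, 1, 0, 0, 1]
print(percent_agreement(human, chatgpt))  # 0.875 (7 of 8 items)
print(cohens_kappa(human, chatgpt))       # 0.75
```

Kappa is the more informative number when one class dominates, which is typical for adverse event detection: raw agreement can be high even for a rater that always predicts "no event".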
Subject(s)
Social Media, Humans, Social Media/statistics & numerical data, Dronabinol/adverse effects, Natural Language Processing
ABSTRACT
Recent history has shown both the benefits and risks of information sharing among firms. Information is shared to facilitate mutual business objectives. However, information sharing can also introduce security-related concerns that could expose a firm to a breach of privacy, with significant economic, reputational, and safety implications. It is imperative for organizations to leverage available information to assess the security implications of information sharing when evaluating current and potential information-sharing partnerships. The "fine print" or privacy policies of firms can provide a signal of security across a wide variety of firms being considered for new and continued information-sharing partnerships. In this article, we develop a methodology to gauge and benchmark information security policies in the partner-selection process that can help direct risk-based investments in information-sharing security. The methodology collects and interprets firm privacy policies, evaluates characteristics of those policies by leveraging natural language processing and benchmarking metrics, and examines how those characteristics relate to one another in information-sharing partnership situations. We demonstrate the methodology on 500 high-revenue firms. The methodology and managerial insights will be of interest to risk managers, information security professionals, and individuals forming information-sharing agreements across industries.
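As an illustration of the kind of NLP metric one might compute over policy text for benchmarking, the sketch below derives average sentence length and vocabulary richness from a toy policy snippet. These specific metrics are examples chosen for illustration, not necessarily the ones used in the article.

```python
# Simple NLP metrics over privacy-policy text, usable for benchmarking
# firms against one another. The metrics and snippet are illustrative.
import re

def policy_metrics(text: str) -> dict:
    """Average sentence length and type-token ratio of a policy text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-zA-Z']+", text.lower())
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "type_token_ratio": len(set(words)) / len(words),
    }

policy = ("We collect your data. We may share your data with partners. "
          "You can opt out at any time.")
m = policy_metrics(policy)
print(m["avg_sentence_length"])  # 18 words over 3 sentences = 6.0
```

Computed across a portfolio of candidate partners, even simple metrics like these can be ranked and benchmarked; a real pipeline would add domain-specific signals such as mentions of encryption, retention, and third-party sharing.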
ABSTRACT
Green manufacturing, widely recognized as a crucial avenue for companies to achieve sustainable competitive advantages, exerts significant spillover effects on both environmental protection and social responsibility. Accordingly, it can significantly improve corporate Environmental, Social, and Governance (ESG) practices and enhance overall ESG performance. This study focuses on Chinese A-share listed firms between 2009 and 2022, characterizing their ESG performance. Employing quasi-natural experimental methods, this research evaluates the causal relationships and specific channels through which green manufacturing enhances corporate ESG performance. The findings demonstrate that green manufacturing significantly empowers companies to strengthen their ESG performance. Channel analysis indicates that enhanced green innovation capabilities and the reduction of financing constraints constitute two critical channels through which green manufacturing facilitates ESG performance improvement. Further analysis indicates that all three subsystems in the green manufacturing system significantly empower companies to enhance ESG performance to varying degrees, with the most significant empowerment effect observed in the environmental dimension of ESG performance. Additional analysis indicates that the enhancement of corporate ESG performance yields dual benefits, reflected in both economic gains and reputational enhancements.
ABSTRACT
For the longest time, the gold standard in preparing spoken language corpora for text analysis in psychology was human transcription. However, this standard comes at considerable cost and creates barriers to quantitative spoken language analysis that recent advances in speech-to-text technology could address. The current study quantifies the accuracy of AI-generated transcripts compared to human-corrected transcripts across younger (n = 100) and older (n = 92) adults and two spoken language tasks. Further, it evaluates the validity of Linguistic Inquiry and Word Count (LIWC) features extracted from these two kinds of transcripts, as well as from transcripts specifically prepared for LIWC analyses via tagging. We find that, overall, AI-generated transcripts are highly accurate, with a word error rate of 2.50% to 3.36%, albeit slightly less accurate for younger than for older adults. LIWC features extracted from either kind of transcript are highly correlated, while the tagging procedure significantly alters filler word categories. Based on these results, automatic speech-to-text appears ready for psychological language research using spoken language tasks in relatively quiet environments, unless filler words are of interest to researchers.
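The word error rate (WER) metric used to compare AI-generated against human-corrected transcripts is word-level edit distance (substitutions, insertions, and deletions) divided by the reference length. A minimal implementation can be sketched as:

```python
# Word error rate via dynamic-programming edit distance over words.
# A WER of 0.025-0.034 corresponds to roughly 2.5-3.4 errors per 100 words.

def wer(reference: str, hypothesis: str) -> float:
    """Edit distance between word sequences, normalized by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word ("the" -> "a") in a six-word reference:
print(wer("the cat sat on the mat", "the cat sat on a mat"))  # 1/6
```

In practice, transcripts are usually lowercased and stripped of punctuation before scoring so that formatting differences are not counted as word errors.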
Subject(s)
Speech, Humans, Aged, Adult, Male, Young Adult, Female, Middle Aged, Language, Psycholinguistics/methods, Artificial Intelligence, Linguistics, Adolescent, Aged, 80 and over
ABSTRACT
Recent approaches to text analysis from social media and other corpora rely on word lists to detect topics, measure meaning, or select relevant documents. These lists are often generated by applying computational lexicon expansion methods to small, manually curated sets of seed words. Despite the wide use of this approach, we still lack an exhaustive comparative analysis of the performance of lexicon expansion methods and how they can be improved with additional linguistic data. In this work, we present LEXpander, a method for lexicon expansion that leverages novel data on colexification, i.e., semantic networks connecting words with multiple meanings according to shared senses. We evaluate LEXpander in a benchmark including widely used methods for lexicon expansion based on word embedding models and synonym networks. We find that LEXpander outperforms existing approaches in terms of both the precision and the precision-recall trade-off of generated word lists in a variety of tests. Our benchmark includes several linguistic categories, such as words relating to the financial domain or to the concept of friendship, and sentiment variables in English and German. We also show that the expanded word lists constitute a high-performing text analysis method in applications to various English corpora. In this way, LEXpander offers a systematic, automated solution for expanding short lists of words into exhaustive and accurate word lists that can closely approximate word lists generated by experts in psychology and linguistics.
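The evaluation the benchmark describes can be sketched as comparing an automatically expanded word list against an expert-curated gold list using precision, recall, and F1. The toy word lists below are illustrative, not taken from the LEXpander benchmark.

```python
# Scoring an expanded lexicon against an expert gold list. Precision measures
# how many expanded words are correct; recall measures gold-list coverage.

def precision_recall_f1(expanded, gold):
    expanded, gold = set(expanded), set(gold)
    tp = len(expanded & gold)
    precision = tp / len(expanded) if expanded else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = ["friend", "pal", "buddy", "companion"]          # expert-curated list
expanded = ["friend", "pal", "buddy", "mate", "ally"]   # seeds + expansion
p, r, f1 = precision_recall_f1(expanded, gold)
print(p, r)  # 3 of 5 expanded words are in gold; 3 of 4 gold words recovered
```

The precision-recall trade-off the abstract mentions arises because expanding more aggressively (more candidate words) tends to raise recall while lowering precision; F1 summarizes that balance in one number.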
Subject(s)
Linguistics, Social Media, Humans, Semantics
ABSTRACT
Generative artificial intelligence (generative AI), a class of artificial intelligence systems, is not currently the technology of choice for text analysis, but prior work suggests it may have some utility for assessing dynamics like emotion. The current work builds upon this empirical foundation to consider how analytic thinking scores from a large language model chatbot, ChatGPT, were linked to analytic thinking scores from dictionary-based tools like Linguistic Inquiry and Word Count (LIWC). Using over 16,000 texts from four samples, tested against three prompts and two large language models (GPT-3.5, GPT-4), the evidence suggests there were small associations between ChatGPT and LIWC analytic thinking scores (meta-analytic effect sizes: .058 < rs < .304; ps < .001). When given the formula to calculate the LIWC analytic thinking index, ChatGPT performed incorrect mathematical operations in 22% of cases, suggesting basic word and number processing may be unreliable with large language models. Researchers should be cautious when using AI for text analysis.
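For context on the arithmetic ChatGPT was asked to perform: the categorical-dynamic index (CDI) commonly cited as underlying LIWC's analytic thinking score is a fixed linear combination of eight LIWC category percentages (LIWC then rescales the result to a 0-100 percentile, omitted here). The sketch below assumes that published formulation, and the category values are invented.

```python
# Categorical-dynamic index from LIWC category percentages, per the commonly
# cited formulation (30 + articles + prepositions, minus six other categories).
# The final percentile rescaling used by LIWC is omitted; inputs are toy values.

def cdi(article, prep, ppron, ipron, auxverb, conj, adverb, negate):
    """Categorical-dynamic index from LIWC part-of-speech percentages."""
    return (30 + article + prep
            - ppron - ipron - auxverb - conj - adverb - negate)

# Toy LIWC output for one text (percentages of total words):
score = cdi(article=8.0, prep=14.0, ppron=6.0, ipron=4.0,
            auxverb=7.0, conj=5.0, adverb=4.0, negate=1.0)
print(score)  # 30 + 8 + 14 - 6 - 4 - 7 - 5 - 4 - 1 = 25.0
```

The calculation is a plain sum and difference of eight numbers, which is why a 22% error rate when an LLM performs it is a notable reliability concern.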
Subject(s)
Artificial Intelligence, Language, Thinking, Humans, Thinking/physiology
ABSTRACT
This study examines the extent to which general and substantive accountability is integrated into the language used by key actors involved in nursing home services. In particular, we investigate the messages used by the supply side, which includes public and private organizations involved in residential care for older adults, and the demand side, which comprises organizations representing service beneficiaries. Moreover, we explore the alignment between the messages used by both sides of the accountability relationship. In the context of Spanish nursing homes, we analyzed a corpus of tweets by organizations from both sides of the accountability relationship, from one year before the outbreak of COVID-19 restrictions to after their implementation. Using text analysis techniques, we found that messages related to general and substantive accountability had a low priority both before and after the outbreak. Public organizations were slightly more likely than private organizations to employ general accountability terms, particularly in non-crisis situations, although both did so less frequently than organizations representing beneficiaries. Our analysis demonstrates a lack of convergence between the messaging on the supply and demand sides, indicating a communication breakdown between the two sides of the accountability relationship.
ABSTRACT
The article considers how population behavior affects the implementation of state anti-epidemic measures and efforts to control the pandemic. MATERIALS AND METHODS: The study methodology is based on text analysis, elastic net modeling, and the construction of regression equations. Indicators characterizing state policy measures for controlling the pandemic were analyzed using data from The Oxford COVID-19 Government Response Tracker portal. Behavioral reactions of the population were assessed through text analysis of messages in the Twitter and VKontakte social networks using Rulexicon, a sentiment dictionary of the Russian language. Mobility was analyzed on the basis of data from Google Community Mobility Reports (GCMR). The study covers data from March 12, 2020 to August 1, 2021. It is established that the most effective approach to controlling the pandemic is a combination of measures implemented at the state level by the Ministry of Health and the Ministry of Economic Development of the Russian Federation, which makes it possible to compensate for the negative effects of the quarantine regimen. In the Russian Federation, the effects of self-isolation measures, remote work arrangements for employees of enterprises, school closures, and mask wearing are mixed, and their incorrect application can contribute to virus propagation. Vaccination measures are also effective in reducing morbidity, but their effect is lagged. Approval and acceptance of anti-epidemic measures by the population significantly affect the efficiency of pandemic control. The study results can be applied in the implementation of anti-epidemic measures as a tool for preventing excessive risks of population morbidity and mortality.
Subject(s)
COVID-19, Humans, COVID-19/prevention & control, COVID-19/epidemiology, Russian Federation/epidemiology, Pandemics/prevention & control, Quarantine, SARS-CoV-2, Health Behavior
ABSTRACT
Database records contain useful information, which is readily available but, unfortunately, limited compared to the source publications. Our study reviewed the text fragments supporting associations between biological macromolecules and diseases from Open Targets to map them onto the biological level of study (DNA/RNA, proteins, metabolites). We screened records using a dictionary containing terms related to the selected levels of study, reviewed 600 hits manually, and used machine learning to classify 31,260 text fragments. Our results indicate that association studies between diseases and macromolecules conducted at the level of DNA and RNA prevail, followed by studies at the level of proteins and metabolites. We conclude that there is a clear need to translate knowledge from the DNA/RNA level into evidence at the level of proteins and metabolites. Since genes and their transcripts rarely act in the cell by themselves, more direct evidence may be of greater value for basic and applied research.
ABSTRACT
The past decade has witnessed an explosion of textual information in the biomedical field. Biomedical texts provide a basis for healthcare delivery, knowledge discovery, and decision-making. Over the same period, deep learning has achieved remarkable performance in biomedical natural language processing; however, its development has been limited by the scarcity of well-annotated datasets and by limited interpretability. To address this, researchers have considered combining domain knowledge (such as biomedical knowledge graphs) with biomedical data, which has become a promising means of introducing more information into biomedical datasets and following evidence-based medicine. This paper comprehensively reviews more than 150 recent literature studies on incorporating domain knowledge into deep learning models to facilitate typical biomedical text analysis tasks, including information extraction, text classification, and text generation. We conclude by discussing various challenges and future directions.