ABSTRACT
The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting psychological constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance improved across successive versions of the model, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate more cross-linguistic research with understudied languages.
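The accuracy figures in this abstract are correlations between model output and manual annotations. A minimal sketch of that comparison follows, using only the standard library. The rating values and the 1-7 scale are hypothetical illustrations, not data or settings from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical sentiment ratings for six texts (not data from the study)
human_ratings = [1, 2, 4, 6, 7, 3]
model_ratings = [2, 2, 5, 6, 6, 4]
r = pearson_r(human_ratings, model_ratings)
print(f"r = {r:.2f}")
```

Here `human_ratings` stands in for the manual annotations and `model_ratings` for the scores returned by the language model. A higher r means closer agreement with the annotators.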
Subjects
Multilingualism, Humans, Language, Machine Learning, Natural Language Processing, Emotions, Social Media
ABSTRACT
Humans have an outstanding ability to generalize from past experiences, which requires parsing continuously experienced events into discrete, coherent units, and relating them to similar past experiences. Time is a key element in this process; however, how temporal information is used in generalization remains unclear. Latent-cause inference provides a Bayesian framework for clustering experiences, by building a world model in which related experiences are generated by a shared cause. Here, we examine how temporal information is used in latent-cause inference, using a novel task in which participants see "microbe" stimuli and explicitly report the latent cause ("strain") they infer for each microbe. We show that humans incorporate time in their inference of latent causes, such that recently inferred latent causes are more likely to be inferred again. In particular, a "persistent" model, in which the latent cause inferred for one observation has a fixed probability of continuing to cause the next observation, explains the data significantly better than two other time-sensitive models, although extensive individual differences exist. We show that our task and this model have good psychometric properties, highlighting their potential use for quantifying individual differences in computational psychiatry or in neuroimaging studies.
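The "persistent" model can be illustrated with a small sketch of a prior over latent causes. The abstract states only that the previously inferred cause continues with a fixed probability. The fallback used here when it does not continue (a Chinese restaurant process with concentration `alpha`) is an assumption made for illustration, as are the function and parameter names.

```python
def persistent_prior(counts, previous, p_stay, alpha):
    """Prior over the latent cause of the next observation.

    counts   : number of past observations assigned to each existing cause
    previous : index of the cause inferred for the most recent observation
    p_stay   : fixed probability that the previous cause persists
    alpha    : concentration of the fallback process (assumed, not from the abstract)

    Returns one probability per existing cause, plus a final entry for a
    brand-new cause.
    """
    if not 0 <= previous < len(counts):
        raise ValueError("previous must index an existing cause")
    total = sum(counts)
    # Assumed fallback: existing causes in proportion to their counts,
    # a new cause in proportion to alpha.
    fallback = [c / (total + alpha) for c in counts] + [alpha / (total + alpha)]
    # With probability p_stay the previous cause continues; otherwise
    # the cause is drawn from the fallback distribution.
    prior = [(1 - p_stay) * p for p in fallback]
    prior[previous] += p_stay
    return prior


# Two existing "strains": the first seen 3 times, the second once and most recently
prior = persistent_prior(counts=[3, 1], previous=1, p_stay=0.5, alpha=1.0)
```

With these illustrative numbers, the most recent cause receives the largest prior probability even though it has fewer past observations. This is the recency effect the abstract describes.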
Subjects
Bayes Theorem, Humans, Adult, Male, Female, Young Adult, Psychological Models, Psychological Generalization/physiology, Time Factors, Individuality, Thinking/physiology
ABSTRACT
BACKGROUND: Diagnosing major depressive disorder (MDD) is challenging, with diagnostic manuals failing to capture the wide range of clinical symptoms that are endorsed by individuals with this condition. OBJECTIVE: This study aims to provide evidence for an extended definition of MDD symptomatology. METHODS: Symptom data were collected via a digital assessment developed for a delta study. Random forest classification with nested cross-validation was used to distinguish between individuals with MDD and those with subthreshold symptomatology of the disorder using disorder-specific symptoms and transdiagnostic symptoms. The diagnostic performance of the Patient Health Questionnaire-9 was also examined. RESULTS: A depression-specific model demonstrated good predictive performance when distinguishing between individuals with MDD (n=64) and those with subthreshold depression (n=140) (area under the receiver operating characteristic curve=0.89; sensitivity=82.4%; specificity=81.3%; accuracy=81.6%). The inclusion of transdiagnostic symptoms of psychopathology, including symptoms of depression, generalized anxiety disorder, insomnia, emotional instability, and panic disorder, significantly improved the model performance (area under the receiver operating characteristic curve=0.95; sensitivity=86.5%; specificity=90.8%; accuracy=89.5%). The Patient Health Questionnaire-9 was excellent at identifying MDD but overdiagnosed the condition (sensitivity=92.2%; specificity=54.3%; accuracy=66.2%). CONCLUSIONS: Our findings are in line with the notion that current diagnostic practices may present an overly narrow conception of mental health. Furthermore, our study provides proof-of-concept support for the clinical utility of a digital assessment to inform clinical decision-making in the evaluation of MDD.
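As a worked check, the Patient Health Questionnaire-9 figures can be reproduced from the reported group sizes (64 with MDD, 140 subthreshold). The confusion-matrix counts below are inferred from the reported percentages and are not stated in the abstract. They also assume the questionnaire was evaluated on the same 204 individuals.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of true cases detected
    specificity = tn / (tn + fp)                 # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all calls that are correct
    return sensitivity, specificity, accuracy


# Inferred counts: 59 of 64 MDD cases flagged, 76 of 140 subthreshold cases cleared
sens, spec, acc = diagnostic_metrics(tp=59, fn=5, tn=76, fp=64)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# prints "sensitivity=92.2% specificity=54.3% accuracy=66.2%"
```

The 64 false positives among 140 subthreshold individuals are what the abstract means by overdiagnosis. High sensitivity is offset by low specificity, which pulls overall accuracy down.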
ABSTRACT
BACKGROUND: Web-based assessments of mental health concerns hold great potential for earlier, more cost-effective, and more accurate diagnoses of psychiatric conditions than that achieved with traditional interview-based methods. OBJECTIVE: The aim of this study was to assess the impact of a comprehensive web-based mental health assessment on the mental health and well-being of over 2000 individuals presenting with symptoms of depression. METHODS: Individuals presenting with depressive symptoms completed a web-based assessment that screened for mood and other psychiatric conditions. After completing the assessment, the study participants received a report containing their assessment results along with personalized psychoeducation. After 6 and 12 months, participants were asked to rate the usefulness of the web-based assessment on different mental health-related outcomes and to self-report on their recent help-seeking behavior, diagnoses, medication, and lifestyle changes. In addition, general mental well-being was assessed at baseline and both follow-ups using the Warwick-Edinburgh Mental Well-being Scale (WEMWBS). RESULTS: Data from all participants who completed either the 6-month or the 12-month follow-up (N=2064) were analyzed. The majority of study participants rated the study as useful for their subjective mental well-being. This included talking more openly (1314/1939, 67.77%) and understanding one's mental health problems better (1083/1939, 55.85%). Although most participants (1477/1939, 76.17%) found their assessment results useful, only a small proportion (302/2064, 14.63%) subsequently discussed them with a mental health professional, leading to only a small number of study participants receiving a new diagnosis (110/2064, 5.33%). 
Among those who were reviewed, new mood disorder diagnoses were predicted by the digital algorithm with high sensitivity (above 70%), and nearly half of the participants with new diagnoses also had a corresponding change in medication. Furthermore, participants' subjective well-being significantly improved over 12 months (baseline WEMWBS score: mean 35.24, SD 8.11; 12-month WEMWBS score: mean 41.19, SD 10.59). Significant positive predictors of follow-up subjective well-being included talking more openly, exercising more, and having been reviewed by a psychiatrist. CONCLUSIONS: Our results suggest that completing a web-based mental health assessment and receiving personalized psychoeducation are associated with subjective mental health improvements, facilitated by increased self-awareness and subsequent use of self-help interventions. Integrating web-based mental health assessments within primary and/or secondary care services could benefit patients further and expedite earlier diagnosis and effective treatment. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/18453.
ABSTRACT
The vast personal and economic burden of mood disorders is largely caused by their under- and misdiagnosis, which is associated with ineffective treatment and worsening of outcomes. Here, we aimed to develop a diagnostic algorithm, based on an online questionnaire and blood biomarker data, to reduce the misdiagnosis of bipolar disorder (BD) as major depressive disorder (MDD). Individuals with depressive symptoms (Patient Health Questionnaire-9 score ≥5) aged 18-45 years were recruited online. After completing a purpose-built online mental health questionnaire, eligible participants provided dried blood spot samples for biomarker analysis and underwent the World Health Organization World Mental Health Composite International Diagnostic Interview via telephone, to establish their mental health diagnosis. Extreme Gradient Boosting and nested cross-validation were used to train and validate diagnostic models differentiating BD from MDD in participants who self-reported a current MDD diagnosis. Mean test area under the receiver operating characteristic curve (AUROC) for separating participants with BD diagnosed as MDD (N = 126) from those with correct MDD diagnosis (N = 187) was 0.92 (95% CI: 0.86-0.97). Core predictors included elevated mood, grandiosity, talkativeness, recklessness and risky behaviour. Additional validation in participants with no previous mood disorder diagnosis showed AUROCs of 0.89 (0.86-0.91) and 0.90 (0.87-0.91) for separating newly diagnosed BD (N = 98) from MDD (N = 112) and subclinical low mood (N = 120), respectively. Validation in participants with a previous diagnosis of BD (N = 45) demonstrated sensitivity of 0.86 (0.57-0.96). The diagnostic algorithm accurately identified patients with BD in various clinical scenarios, and could help expedite accurate clinical diagnosis and treatment of BD.
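The AUROC reported above can be read as the probability that a randomly chosen BD participant receives a higher model score than a randomly chosen MDD participant. A minimal sketch of that pairwise definition follows. The scores are hypothetical, and this is not the authors' Extreme Gradient Boosting pipeline.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of BD (not data from the study)
bd_scores = [0.9, 0.7, 0.4]
mdd_scores = [0.6, 0.3, 0.2, 0.5]
area = auroc(bd_scores, mdd_scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups. The pairwise loop is quadratic in sample size, which is fine for illustration but slower than rank-based implementations on large datasets.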
Subjects
Bipolar Disorder, Major Depressive Disorder, Algorithms, Biomarkers, Bipolar Disorder/diagnosis, Major Depressive Disorder/diagnosis, Humans, Machine Learning, Mental Health, Surveys and Questionnaires
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
Existing high-throughput methods to identify RNA-binding proteins (RBPs) are based on capture of polyadenylated RNAs and cannot recover proteins that interact with nonadenylated RNAs, including long noncoding RNA, pre-mRNAs and bacterial RNAs. We present orthogonal organic phase separation (OOPS), which does not require molecular tagging or capture of polyadenylated RNA, and apply it to recover cross-linked protein-RNA and free protein, or protein-bound RNA and free RNA, in an unbiased way. We validated OOPS in HEK293, U2OS and MCF10A human cell lines, and show that 96% of proteins recovered were bound to RNA. We show that all long RNAs can be cross-linked to proteins, and recovered 1,838 RBPs, including 926 putative novel RBPs. OOPS is approximately 100-fold more efficient than existing methods and can enable analyses of dynamic RNA-protein interactions. We also characterize dynamic changes in RNA-protein interactions in mammalian cells following nocodazole arrest, and present a bacterial RNA-interactome for Escherichia coli. OOPS is compatible with downstream proteomics and RNA sequencing, and can be applied in any organism.