1.
J Pers Soc Psychol ; 126(2): 312-331, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37676124

ABSTRACT

Traditional methods of personality assessment, and survey-based research in general, cannot make inferences about new items that have not been surveyed previously. This limits the amount of information that can be obtained from a given survey. In this article, we tackle this problem by leveraging recent advances in statistical natural language processing. Specifically, we extract "embedding" representations of questionnaire items from deep neural networks, trained on large-scale English language data. These embeddings allow us to construct a high-dimensional space of items, in which linguistically similar items are located near each other. We combine item embeddings with machine learning algorithms to extrapolate participant ratings of personality items to completely new items that have not been rated by any participants. The accuracy of our approach is on par with incentivized human judges given an identical task, indicating that it predicts ratings of new personality items as accurately as people do. Our approach is also capable of identifying psychological constructs associated with questionnaire items and can accurately cluster items into their constructs based only on their language content. Overall, our results show how representations of linguistic personality descriptors obtained from deep language models can be used to model and predict a large variety of traits, scales, and constructs. In doing so, they showcase a new scalable and cost-effective method for psychological measurement. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
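
The extrapolation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the learning algorithm, so a similarity-weighted nearest-neighbour average stands in for it here. The three-dimensional vectors, the item texts, the ratings, and the function names (`cosine`, `predict_rating`) are all hypothetical; real item embeddings would come from a language model and have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_rating(new_emb, item_embs, ratings, k=2):
    """Predict a rating for an unrated item as the similarity-weighted
    mean rating of its k most similar rated items."""
    neighbours = sorted(
        ((cosine(new_emb, emb), ratings[name]) for name, emb in item_embs.items()),
        reverse=True,
    )[:k]
    # Clip negative similarities so they cannot act as negative weights.
    weights = [max(sim, 0.0) for sim, _ in neighbours]
    total = sum(weights)
    if total == 0:
        # No positively similar neighbour: fall back to a plain mean.
        return sum(r for _, r in neighbours) / len(neighbours)
    return sum(w * r for w, (_, r) in zip(weights, neighbours)) / total


# Hypothetical toy embeddings and one participant's ratings (1-5 scale).
ITEM_EMBS = {
    "I enjoy parties": [0.9, 0.1, 0.0],
    "I like meeting new people": [0.8, 0.2, 0.1],
    "I keep my desk tidy": [0.0, 0.1, 0.9],
}
RATINGS = {
    "I enjoy parties": 4.0,
    "I like meeting new people": 5.0,
    "I keep my desk tidy": 2.0,
}

# Embedding of a new, never-rated item that is close to the first two items.
new_item = [0.85, 0.15, 0.05]
predicted = predict_rating(new_item, ITEM_EMBS, RATINGS, k=2)
print(round(predicted, 2))
```

Because the new item sits near the two sociability items in this toy space, its predicted rating falls between their ratings, and the dissimilar item is ignored.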


Subjects
Deep Learning , Humans , Personality , Personality Disorders , Personality Inventory , Language
2.
J Biomed Inform ; 139: 104269, 2023 03.
Article in English | MEDLINE | ID: mdl-36621750

ABSTRACT

Electronic health records (EHR) are collected as a routine part of healthcare delivery, and have great potential to be utilized to improve patient health outcomes. They contain multiple years of health information to be leveraged for risk prediction, disease detection, and treatment evaluation. However, they do not have a consistent, standardized format across institutions, particularly in the United States, and can present significant analytical challenges: they contain multi-scale data from heterogeneous domains and include both structured and unstructured data. Data for individual patients are collected at irregular time intervals and with varying frequencies. In addition to the analytical challenges, EHR can reflect inequity: patients belonging to different groups will have differing amounts of data in their health records. Many of these issues can contribute to biased data collection. The consequence is that the data for under-served groups may be less informative partly due to more fragmented care, which can be viewed as a type of missing data problem. For EHR data in this complex form, there is currently no framework for introducing realistic missing values. There has also been little to no work in assessing the impact of missing data in EHR. In this work, we first introduce a terminology to define three levels of EHR data and then propose a novel framework for simulating realistic missing data scenarios in EHR to adequately assess their impact on predictive modeling. We incorporate the use of a medical knowledge graph to capture dependencies between medical events to create a more realistic missing data framework. In an intensive care unit setting, we found that missing data have greater negative impact on the performance of disease prediction models in groups that tend to have less access to healthcare, or seek less healthcare. We also found that the impact of missing data on disease prediction models is stronger when using the knowledge graph framework to introduce realistic missing values as opposed to random event removal.
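
The contrast between random event removal and dependency-aware removal can be sketched as follows. This is only an illustration under stated assumptions, not the framework from the paper: the abstract does not say how the knowledge graph is traversed, so the sketch assumes that dropping one event also drops every event that depends on it. The graph `DEPENDENTS`, the event names, and both function names are hypothetical.

```python
import random

# Hypothetical dependency graph: each key maps to the events that would
# normally only be recorded if the key event had been recorded.
DEPENDENTS = {
    "diabetes_dx": ["hba1c_test", "metformin_rx"],
    "hba1c_test": [],
    "metformin_rx": [],
    "chest_xray": [],
}


def remove_random(events, n_drop, rng):
    """Baseline: drop n_drop events chosen uniformly at random."""
    drop = set(rng.sample(events, n_drop))
    return [e for e in events if e not in drop]


def remove_with_dependencies(events, seed_event, graph):
    """Drop seed_event and, transitively, every event that depends on it."""
    to_drop, stack = set(), [seed_event]
    while stack:
        event = stack.pop()
        if event not in to_drop:
            to_drop.add(event)
            stack.extend(graph.get(event, []))
    return [e for e in events if e not in to_drop]


# One hypothetical patient record (event names are unique here).
record = ["diabetes_dx", "hba1c_test", "metformin_rx", "chest_xray"]

rng = random.Random(0)  # fixed seed so the random baseline is repeatable
random_version = remove_random(record, 1, rng)
graph_version = remove_with_dependencies(record, "diabetes_dx", DEPENDENTS)

print(random_version)
print(graph_version)
```

In this toy record, the random baseline loses a single isolated event, whereas the dependency-aware removal of the diagnosis also removes the test and prescription linked to it, leaving a more fragmented record.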


Subjects
Delivery of Health Care , Electronic Health Records , Humans , United States , Intensive Care Units
3.
J Pers ; 90(3): 405-425, 2022 06.
Article in English | MEDLINE | ID: mdl-34536229

ABSTRACT

OBJECTIVE: We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment. METHOD: We applied a language-based assessment of the five factor model of personality to 6,064,267 U.S. Twitter users. We aggregated the Twitter-based personality scores to 2,041 counties and compared to political, economic, social, and health outcomes measured through surveys and by government agencies. RESULTS: There was significant personality variation across counties. Openness to experience was higher on the coasts, conscientiousness was uniformly spread, extraversion was higher in southern states, agreeableness was higher in western states, and emotional stability was highest in the south. Across 13 outcomes, language-based personality estimates replicated patterns that have been observed in individual-level and geographic studies. This includes higher Republican vote share in less agreeable counties and increased life satisfaction in more conscientious counties. CONCLUSIONS: Results suggest that regions vary in their personality and that these differences can be studied through computational linguistic analysis of social media. Furthermore, these methods may be used to explore other psychological constructs across geographies.


Subjects
Social Media , Psychological Extraversion , Humans , Language , Personality , Personality Assessment
4.
Proc Natl Acad Sci U S A ; 118(39)2021 09 28.
Article in English | MEDLINE | ID: mdl-34544875

ABSTRACT

On May 25, 2020, George Floyd, an unarmed Black American male, was killed by a White police officer. Footage of the murder was widely shared. We examined the psychological impact of Floyd's death using two population surveys that collected data before and after his death; one from Gallup (117,568 responses from n = 47,355) and one from the US Census (409,652 responses from n = 319,471). According to the Gallup data, in the week following Floyd's death, anger and sadness increased to unprecedented levels in the US population. During this period, more than a third of the US population reported these emotions. These increases were more pronounced for Black Americans, nearly half of whom reported these emotions. According to the US Census Household Pulse data, in the week following Floyd's death, depression and anxiety severity increased among Black Americans at significantly higher rates than among White Americans. Our estimates suggest that this increase corresponds to an additional 900,000 Black Americans who would have screened positive for depression, associated with a burden of roughly 2.7 million to 6.3 million mentally unhealthy days.


Subjects
Anxiety/epidemiology , Depression/epidemiology , Emotions/physiology , Homicide/psychology , Mental Health/ethnology , Police/statistics & numerical data , Racism/psychology , Adolescent , Adult , Black or African American/psychology , Anger/physiology , Anxiety/psychology , Depression/psychology , Female , Humans , Male , Middle Aged , United States/epidemiology , White People/psychology , Young Adult
5.
Proc Natl Acad Sci U S A ; 116(40): 19887-19893, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31527280

ABSTRACT

The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning. The widely used Classification and Regression Trees (CART) have played a major role in health sciences, due to their simple and intuitive explanation of predictions. Ensemble methods like gradient boosting can improve the accuracy of decision trees, but at the expense of the interpretability of the generated model. Additive models, such as those produced by gradient boosting, and full interaction models, such as CART, have been investigated largely in isolation. We show that these models exist along a spectrum, revealing previously unseen connections between these approaches. This paper introduces a rigorous formalization for the additive tree, an empirically validated learning technique for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although the additive tree is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.


Subjects
Algorithms , Decision Trees , Machine Learning , Factual Databases , Statistical Models , Programming Languages
6.
J Med Internet Res ; 17(2): e51, 2015 Feb 23.
Article in English | MEDLINE | ID: mdl-25707038

ABSTRACT

BACKGROUND: Traditional metrics of the impact of the Affordable Care Act (ACA) and health insurance marketplaces in the United States include public opinion polls and marketplace enrollment, which are published with a lag of weeks to months. In this rapidly changing environment, a real-time barometer of public opinion with a mechanism to identify emerging issues would be valuable. OBJECTIVE: We sought to evaluate Twitter's role as a real-time barometer of public sentiment on the ACA and to determine if Twitter sentiment (the positivity or negativity of tweets) could be predictive of state-level marketplace enrollment. METHODS: We retrospectively collected 977,303 ACA-related tweets in March 2014 and then tested a correlation of Twitter sentiment with marketplace enrollment by state. RESULTS: A 0.10 increase in the sentiment score was associated with an 8.7% increase in enrollment at the state level (95% CI 1.32-16.13; P=.02), a correlation that remained significant when adjusting for state Medicaid expansion (P=.02) or use of a state-based marketplace (P=.03). CONCLUSIONS: This correlation indicates Twitter's potential as a real-time monitoring strategy for future marketplace enrollment periods; marketplaces could systematically track Twitter sentiment to more rapidly identify enrollment changes and potentially emerging issues. This study reveals a novel role in the health policy landscape for Twitter as a repository of free and accessible consumer-generated opinions.


Subjects
Internet/statistics & numerical data , Patient Protection and Affordable Care Act/statistics & numerical data , Social Media/statistics & numerical data , Humans , United States
7.
J Pers Soc Psychol ; 108(6): 934-52, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25365036

ABSTRACT

Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden.


Subjects
Language , Personality Assessment , Social Media , Female , Humans , Linguistics , Male , Personality , Personality Tests , Reproducibility of Results , Self Report , Young Adult