ABSTRACT
The beginnings of words are, in some informal sense, special. This intuition is widely shared, for example, when playing word games. Less apparent is whether the intuition is substantiated empirically and what the underlying organizational principle(s) might be. Here, we answer this seemingly simple question in a quantitatively clear way. Based on arguments about the interplay between lexical storage and speech processing, we examine whether the distribution of information among different speech sounds of words is governed by a critical computational unit for online speech perception and production: syllables. By analyzing lexical databases of twelve languages, we demonstrate that there is a compelling asymmetry between syllable beginnings (onsets) versus ends (codas) in their involvement in distinguishing words stored in the lexicon. In particular, we show that the functional advantage of syllable onset reflects an asymmetrical distribution of lexical informativeness within the syllable unit but not an effect of a global decay of informativeness from the beginning to the end of a word. The converging finding across languages from a range of typological families supports the conjecture that the syllable unit, while being a critical primitive for both speech perception and production, is also a key organizational constraint for lexical storage.
Subjects
Dissent and Disputes , Intuition , Humans , Databases, Factual , Language , Speech
ABSTRACT
Spoken language production involves selecting and assembling words and syntactic structures to convey one's message. Here we probe this process by analyzing natural language productions of individuals with primary progressive aphasia (PPA) and healthy individuals. Based on prior neuropsychological observations, we hypothesize that patients who have difficulty producing complex syntax might choose semantically richer words to make their meaning clear, whereas patients with lexicosemantic deficits may choose more complex syntax. To evaluate this hypothesis, we first introduce a frequency-based method for characterizing the syntactic complexity of naturally produced utterances. We then show that lexical and syntactic complexity, as measured by their frequencies, are negatively correlated in a large (n = 79) PPA population. We then show that this syntax-lexicon trade-off is also present in the utterances of healthy speakers (n = 99) taking part in a picture description task, suggesting that it may be a general property of the process by which humans turn thoughts into speech.
Subjects
Language , Speech , Aphasia, Primary Progressive/physiopathology , Humans , Speech/physiology
ABSTRACT
INTRODUCTION: Despite the progress in gene editing platforms like CRISPR/Cas9 with the potential to transform the standard of care for haemophilia, the language used to explain and discuss gene editing is not aligned across the haemophilia community. Here, we present the objective and rationale for developing a clear, consistent, and globally aligned gene editing lexicon to address these communication gaps. METHODS: Effectively communicating complex gene editing concepts requires a clear and consistent vocabulary. Through collaboration with a diversity of haemophilia stakeholders, our main goal is to develop an accurate, informative lexicon which avoids overpromising or highly technical terminology. Using an innovative process, representatives from several patient and scientific haemophilia organizations and select biotechnology companies will develop and refine language concepts to be tested with approximately seventy participants across the United States of America, United Kingdom, and Germany. Participants will include lived experience experts (LEEs) and haematologists. The process will be overseen by the Lexicon Steering Committee of global experts from leading scientific and patient organizations in the haemophilia and gene editing fields. RESULTS: Initial feedback provided a robust foundation and rationale for building clear, consistent language around gene editing. This lexicon development framework will allow for increased understanding across the haemophilia community, including the development of valid informed consent and shared decision-making materials. CONCLUSION: Results provide important building blocks for stimuli development and highlight the need for a novel gene editing lexicon. In the next phase, language stimuli will be tested with LEEs and haematologists to better understand audience preferences and help shape the final lexicon.
ABSTRACT
BACKGROUND: The appearance of the COVID-19 virus in December 2019 quickly escalated into a global crisis, prompting the World Health Organization to recommend regional lockdowns. While effective in curbing the virus's spread, these measures triggered intense debates on social media platforms, exposing widespread public anxiety and skepticism. The spread of fake news further fueled public unrest and negative emotions, potentially undermining the effectiveness of anti-COVID-19 policies. Exploring the narratives surrounding COVID-19 on social media immediately following the lockdown announcements presents an intriguing research avenue. The purpose of this study is to examine social media discourse to identify the topics discussed and, more importantly, to analyze differences in the focus and emotions expressed by the public in two countries (the UK and India). This is done by analyzing a large corpus of tweets. METHODS: The datasets comprised COVID-19-related tweets in English, published between March 29 and April 11, 2020, by residents of the UK and India. Methods employed in the analysis include identification of latent topics and themes, assessment of the effect of tweet popularity on topic distributions, examination of the overall sentiment, and investigation of sentiment in specific topics and themes. RESULTS: Safety measures, government responses, and cooperative support are common themes in the UK and India. Personal experiences and cooperation are the most discussed topics in both countries. The impact on specific groups is given the least emphasis in the UK, whereas India places the least focus on discussions related to social media and news reports. Support, discussion of UK Prime Minister Boris Johnson, and appreciation are prominent topics among popular British tweets, whereas confirmed cases are discussed most among popular Indian tweets. Unpopular tweets in both countries pay the most attention to issues regarding lockdown.
In terms of overall sentiment, positive attitudes dominate in the UK, whereas sentiment is more neutral in India. Trust and anticipation are the most prevalent emotions in both countries. In particular, the British population felt positive about community support and volunteering, personal experiences, and government responses, while Indian people felt positive about cooperation, government responses, and coping strategies. Public health situations raise negative sentiment in both the UK and India. CONCLUSIONS: The study emphasizes the role of cultural values in crisis communication and public health policy. Individualistic societies prioritize personal freedom, requiring a balance between individual liberty and public health measures. Collectivistic societies focus on community impact, suggesting policies that could utilize community networks for public health compliance. Social media shapes public discourse during pandemics, with popular and unpopular tweets reflecting and reshaping discussions. The presence of fake news may distort topics of high public interest, necessitating authenticity confirmation by official bloggers. Understanding public concerns and popular content on social media can help authorities tailor crisis communication to improve public engagement and compliance with health measures.
Subjects
COVID-19 , Public Opinion , Social Media , Humans , India , COVID-19/epidemiology , COVID-19/prevention & control , United Kingdom , Social Media/statistics & numerical data , Quarantine/psychology , SARS-CoV-2 , Emotions
ABSTRACT
BACKGROUND: Core lexicon (CL) analysis is a time-efficient and possibly reliable measure that captures discourse production abilities. For people with aphasia, CL scores have demonstrated correlations with aphasia severity, as well as other discourse and linguistic measures. CL analysis has also been found to be clinician-friendly and clinically sensitive enough to capture longitudinal changes in aphasia. To our knowledge, CL has never been investigated in individuals with neurologically progressive disease. AIMS: As a preliminary investigation, we sought to investigate (1) whether CL scores correlate with dementia severity, (2) whether CL scores correlate with measures of discourse quality, and (3) whether CL scores correlate with other measures of lexical/semantic access. METHODS & PROCEDURES: Twelve participants with a cognitive impairment associated with dementia of the Alzheimer's type (DAT) completed several measures of language and cognitive ability and provided a language sample from the wordless picture book, Picnic. RESULTS & CONCLUSION: Results are informative, as they provide insight into characteristics of CL and provide support for potential use of CL in individuals with neurologically progressive disease. The results indicated that CL scores do correlate with dementia severity and several measures of language ability, indicating they may provide a useful measure of language abilities in DAT, but more research is needed. WHAT THIS PAPER ADDS: What is already known on the subject Core lexicon (CL) analysis is an assessment measure of discourse ability, most closely related to informativeness or productivity, used in aphasiology that is easier to use and less time-consuming than previous measures of informativeness, such as correct information units or type-token ratio (TTR). For people with aphasia, CL analysis correlates with aphasia severity, measures of informativeness, as well as other measures of discourse quality.
It has also been shown to be faster and more reliable between scorers than other informativeness measures. What this study adds Core lexicon analysis is a new simple and online method for assessing the informativeness of a discourse sample without the need to record or transcribe the language sample. CL is receiving a lot of attention in aphasia, correlating with everything from aphasia severity to measures of productivity and lexical access, as well as measures of informativeness. Unfortunately, no one has investigated CL analysis in dementia. The study demonstrates the first evidence that CL analysis may be a useful measure for determining dementia severity and language quality in people with dementia. What are the clinical implications of this work? Core lexicon analysis may provide clinicians and researchers with an easy method for assessing the discourse of people with a cognitive impairment associated with dementia of the Alzheimer's type. This will improve initial assessment, as well as improve ongoing language assessment that may provide clues into their functional ability to communicate effectively.
Subjects
Alzheimer Disease , Humans , Female , Male , Aged , Alzheimer Disease/psychology , Alzheimer Disease/diagnosis , Alzheimer Disease/complications , Aged, 80 and over , Language Tests , Aphasia/etiology , Aphasia/psychology , Aphasia/diagnosis , Semantics , Neuropsychological Tests , Middle Aged , Severity of Illness Index
ABSTRACT
Recent approaches to text analysis from social media and other corpora rely on word lists to detect topics, measure meaning, or select relevant documents. These lists are often generated by applying computational lexicon expansion methods to small, manually curated sets of seed words. Despite the wide use of this approach, we still lack an exhaustive comparative analysis of the performance of lexicon expansion methods and how they can be improved with additional linguistic data. In this work, we present LEXpander, a method for lexicon expansion that leverages novel data on colexification, i.e., semantic networks connecting words with multiple meanings according to shared senses. We evaluate LEXpander in a benchmark including widely used methods for lexicon expansion based on word embedding models and synonym networks. We find that LEXpander outperforms existing approaches in terms of both precision and the trade-off between precision and recall of generated word lists in a variety of tests. Our benchmark includes several linguistic categories, such as words relating to finance or to the concept of friendship, and sentiment variables in English and German. We also show that the expanded word lists constitute a high-performing text analysis method in application cases to various English corpora. In this way, LEXpander offers a systematic automated solution to expand short lists of words into exhaustive and accurate word lists that can closely approximate word lists generated by experts in psychology and linguistics.
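The idea of expanding a seed list over a word network and scoring the result against an expert list can be illustrated with a minimal sketch. It assumes a toy network stored as an adjacency dictionary and a single-step neighbour expansion; the graph, seed word, reference list, and function names are hypothetical and are not LEXpander's actual data or algorithm.

```python
def expand_lexicon(seeds, graph):
    """Return the seed words plus every word directly linked to a seed."""
    expanded = set(seeds)
    for word in seeds:
        expanded.update(graph.get(word, ()))
    return expanded


def precision_recall(generated, reference):
    """Precision and recall of a generated word list against a reference list."""
    generated, reference = set(generated), set(reference)
    hits = len(generated & reference)
    precision = hits / len(generated) if generated else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy network: an edge links two words that share a sense.
toy_graph = {
    "friend": ["companion", "ally"],
    "ally": ["friend", "partner"],
}
# Hypothetical expert-generated reference list.
expert_list = {"friend", "companion", "ally", "partner"}

expanded = expand_lexicon(["friend"], toy_graph)
precision, recall = precision_recall(expanded, expert_list)
```

With one expansion step the toy list is fully precise but misses "partner", which is the kind of precision-recall trade-off the benchmark measures.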
Subjects
Linguistics , Social Media , Humans , Semantics
ABSTRACT
Large-scale word association datasets are important tools in psycholinguistics and also serve as models that capture meaning when considered as semantic networks. Here, we present word association norms for Rioplatense Spanish, a variant spoken in Argentina and Uruguay. The norms were derived through a large-scale crowd-sourced continued word association task in which participants gave three associations to each cue word. Covering over 13,000 words and more than 3.6 million responses, it is currently the most extensive dataset available for Spanish. We compare the obtained dataset with previous studies in Dutch and English to investigate the role of grammatical gender, and with studies that used Iberian Spanish to test generalizability to other Spanish variants. Finally, we evaluated the validity of our data in word processing (lexical decision reaction times) and semantic (similarity judgment) tasks. Our results demonstrate that network measures such as in-degree provide a good prediction of lexical decision response times. Analyzing semantic similarity judgments showed that results replicate and extend previous findings demonstrating that semantic similarity derived using spreading activation or spectral methods outperforms word embeddings trained on text corpora.
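As an illustration of the in-degree measure mentioned above, the following minimal sketch counts, for each response word, how many distinct cues elicited it. Treating in-degree as the number of distinct cues is an assumption about the definition, and the cue-response pairs are hypothetical rather than taken from the norms.

```python
from collections import defaultdict


def in_degree(pairs):
    """Map each response word to the number of distinct cues that elicited it."""
    cues_by_response = defaultdict(set)
    for cue, response in pairs:
        cues_by_response[response].add(cue)
    return {word: len(cues) for word, cues in cues_by_response.items()}


# Hypothetical (cue, response) pairs; repeated pairs count only once.
toy_pairs = [
    ("mate", "yerba"),
    ("mate", "amigo"),
    ("asado", "amigo"),
    ("asado", "carne"),
    ("mate", "amigo"),
]
degrees = in_degree(toy_pairs)
```

Words with a higher in-degree are reached from more cues, which is the property the abstract relates to lexical decision response times.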
Subjects
Free Association , Semantics , Humans , Psycholinguistics , Reaction Time , Judgment
ABSTRACT
We present a psycholinguistic study investigating lexical effects on simplified Chinese character recognition by deaf readers. Prior research suggests that deaf readers exhibit efficient orthographic processing and decreased reliance on speech-based phonology in word recognition compared to hearing readers. In this large-scale character decision study (25 participants, each evaluating 2500 real characters and 2500 pseudo-characters), we analyzed various factors influencing character recognition accuracy and speed in deaf readers. Deaf participants demonstrated greater accuracy and faster recognition when characters were more frequent, were acquired earlier, had more strokes, displayed higher orthographic complexity, were more imageable in reference, or were less concrete in reference. Comparison with a previous study of hearing readers revealed that the facilitative effect of frequency on character decision accuracy was stronger for deaf readers than hearing readers. The effect of orthographic-phonological regularity differed significantly for the two groups, indicating that deaf readers rely more on orthographic structure and less on phonological information during character recognition. Notably, increased stroke counts (i.e., higher orthographic complexity) hindered hearing readers but facilitated recognition processes in deaf readers, suggesting that deaf readers excel at recognizing characters based on orthographic structure. The database generated from this large-scale character decision study offers a valuable resource for further research and practical applications in deaf education and literacy.
Subjects
Deafness , Reading , Humans , Male , Female , Deafness/physiopathology , Adult , Young Adult , Psycholinguistics/methods , China , Decision Making/physiology , Persons With Hearing Impairments/psychology , Language
ABSTRACT
Iconic words and signs are characterized by a perceived resemblance between aspects of their form and aspects of their meaning. For example, in English, iconic words include peep and crash, which mimic the sounds they denote, and wiggle and zigzag, which mimic motion. As a semiotic property of words and signs, iconicity has been demonstrated to play a role in word learning, language processing, and language evolution. This paper presents the results of a large-scale norming study for more than 14,000 English words conducted with over 1400 American English speakers. We demonstrate the utility of these ratings by replicating a number of existing findings showing that iconicity ratings are related to age of acquisition, sensory modality, semantic neighborhood density, structural markedness, and playfulness. We discuss possible use cases and limitations of the rating dataset, which is made publicly available.
Subjects
Language , Semantics , Humans , Language Development , Verbal Learning , Sound
ABSTRACT
Infant-directed speech (IDS) is known to be characterised by phonetic and prosodic cues along with reduced vocabulary and syntax compared to adult-directed speech (ADS). However, there is considerable variation between mothers in the degree of lexical and syntactic reduction of their IDS. The present study aims to investigate the correspondences of the inter-individual variation of maternal IDS at 6 and 18 months with infants' language development at 18 months. A total of 109 dyads of mothers and their firstborn infants participated in the study. Mothers' ID and AD storytelling based on standard picture stimuli was recorded at 6 and 18 months of their infants' age. We analysed measures of speech quantity (number of utterances and words), syntactic complexity (mean length of utterance), and lexical diversity (type-token ratio). Language growth was measured bimonthly using the Hungarian adaptation of the MacArthur-Bates CDI W&G form. The results did not reveal any association between characteristics of mothers' ID narratives and their infants' concurrent language skills at 18 months. However, we found a longitudinal link between a distinct pattern of linguistic simplification in maternal ID storytelling at 6 months and the development of expressive vocabulary in infants at 18 months. Infants whose mothers tend to reduce both the lexical and syntactic complexity of their ID narratives the most are more likely to exhibit higher language outcomes. Further research is warranted to explore the background factors and longer-term effects of this maternal strategy.
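The speech measures named above (number of utterances and words, mean length of utterance, type-token ratio) can be sketched as follows. This is a minimal illustration that assumes whitespace tokenisation, lowercasing, and mean length of utterance counted in words; the abstract does not state how the study tokenised speech or whether length was counted in words or morphemes, and the sample utterances are hypothetical.

```python
def speech_measures(utterances):
    """Compute quantity, MLU (in words) and type-token ratio for a transcript."""
    tokens_per_utterance = [u.lower().split() for u in utterances]
    tokens = [t for utterance in tokens_per_utterance for t in utterance]
    n_utterances = len(tokens_per_utterance)
    n_words = len(tokens)
    return {
        "utterances": n_utterances,
        "words": n_words,
        # Mean length of utterance: words per utterance (assumed unit).
        "mlu": n_words / n_utterances if n_utterances else 0.0,
        # Type-token ratio: distinct word forms divided by total words.
        "ttr": len(set(tokens)) / n_words if n_words else 0.0,
    }


# Hypothetical three-utterance sample.
sample = ["look at the dog", "the dog runs", "big dog"]
measures = speech_measures(sample)
```

Lower MLU and lower type-token ratio on the same picture stimuli would correspond to the syntactic and lexical simplification the study examines.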
ABSTRACT
We report on two types of developmental surface dysgraphia. One type, exhibited by 8 participants, is orthographic lexicon surface dysgraphia, which involves an impairment in the orthographic output lexicon, leading to nonword phonologically-plausible misspellings. The other type, shown by 3 participants, is disconnection surface dysgraphia. In this type, the orthographic output lexicon is disconnected from the semantic system and from the phonological input lexicon, but still contributes to spelling via support to the orthographic output buffer, resulting in mainly lexical phonologically-plausible misspellings (writing be as "bee" but not "bea"). The specific localization of the impairment in spelling, in the lexicon or in its connections, allowed us to examine the question of one or two orthographic lexicons; four participants who had a deficit in the orthographic output lexicon itself in writing had intact orthographic-input-lexicon in reading. They made surface errors in writing but not in reading the same words, supporting separate input and output orthographic lexicons.
Subjects
Agraphia , Dyslexia , Humans , Bees , Animals , Phonetics , Language , Semantics
ABSTRACT
Anomic aphasia is characterized by good comprehension and non-word repetition but poor naming. Two sub-types of deficits might be hypothesized: faulty access to preserved phonological representations or preserved access to impaired representations. Phonological errors may occur only when representations are impaired or in post-lexical deficits (conduction aphasia). We analysed the incidence of phonological naming errors of 30 individuals, 25 with anomic aphasia based on poor naming but good repetition and comprehension, and five with conduction aphasia based on poor naming and poor repetition. Individuals with anomic aphasia produced very few phonological errors compared to individuals with conduction aphasia (0-19.1% versus 42-66%). However, six individuals with anomia produced more than 11% phonological errors, suggesting two patterns of deficit: either impaired lexical representations or impaired access to them. The lack of phonological errors in most individuals with anomic aphasia suggests that access to the phonological output lexicon is semantically, not phonologically driven.
Subjects
Aphasia, Conduction , Aphasia , Humans , Anomia , Semantics , Linguistics
ABSTRACT
Trial-to-trial effects have been found in a number of studies, indicating that processing a stimulus influences responses in subsequent trials. A special case is priming effects, which have been modelled successfully with error-driven learning (Marsolek, 2008), implying that participants are continuously learning during experiments. This study investigates whether trial-to-trial learning can be detected in an unprimed lexical decision experiment. We used the Discriminative Lexicon Model (DLM; Baayen et al., 2019), a model of the mental lexicon with meaning representations from distributional semantics, which models error-driven incremental learning with the Widrow-Hoff rule. We used data from the British Lexicon Project (BLP; Keuleers et al., 2012) and simulated the lexical decision experiment with the DLM on a trial-by-trial basis for each subject individually. Then, reaction times were predicted with Generalized Additive Models (GAMs), using measures derived from the DLM simulations as predictors. We extracted measures from two simulations per subject (one with learning updates between trials and one without), and used them as input to two GAMs. Learning-based models showed better model fit than the non-learning ones for the majority of subjects. Our measures also provide insights into lexical processing and individual differences. This demonstrates the potential of the DLM to model behavioural data and leads to the conclusion that trial-to-trial learning can indeed be detected in unprimed lexical decision. Our results support the possibility that our lexical knowledge is subject to continuous changes.
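The Widrow-Hoff (delta) rule referred to above updates weights in proportion to the prediction error on each trial. The sketch below is a minimal illustration with a single scalar outcome and a hypothetical two-cue trial sequence; the DLM itself maps form vectors to semantic vectors, so this is a simplified stand-in, and the learning rate and data are assumptions.

```python
def widrow_hoff_update(weights, cues, target, rate=0.1):
    """One delta-rule step: w <- w + rate * (target - w . cues) * cues."""
    prediction = sum(w * c for w, c in zip(weights, cues))
    error = target - prediction
    return [w + rate * error * c for w, c in zip(weights, cues)]


# Hypothetical trial sequence: cue 0 predicts the outcome, cue 1 does not.
trials = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)] * 50

weights = [0.0, 0.0]
for cues, target in trials:
    # Updating after every trial is what makes the learning incremental.
    weights = widrow_hoff_update(weights, cues, target)
```

Skipping the update inside the loop gives the non-learning counterpart, which mirrors the contrast between the two simulations per subject described in the abstract.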
Subjects
Discrimination Learning , Semantics , Humans , Learning , Reaction Time/physiology , Individuality , Decision Making
ABSTRACT
BACKGROUND. Imaging reports that consistently document all disease sites with a potential to increase surgical complexity or morbidity can facilitate ovarian cancer treatment planning. OBJECTIVE. The aims of this study were to compare simple structured reports and synoptic reports from pretreatment CT examinations in patients with advanced ovarian cancer in terms of completeness of documenting involvement of clinically relevant anatomic sites as well as to evaluate physician satisfaction with synoptic reports. METHODS. This retrospective study included 205 patients (median age, 65 years) who underwent contrast-enhanced abdominopelvic CT before primary treatment of advanced ovarian cancer from June 1, 2018, to January 31, 2022. A total of 128 reports generated on or before March 31, 2020, used a simple structured report (free text organized into sections); 77 reports generated on or after April 1, 2020, used a synoptic report (a list of 45 anatomic sites relevant to ovarian cancer management, each of which was classified in terms of disease absence versus presence). Reports were reviewed for completeness of documentation of involvement of the 45 sites. For patients who underwent neoadjuvant chemotherapy based on diagnostic laparoscopy findings or underwent primary debulking surgery with suboptimal resection, the EMR was reviewed to identify surgically established sites of disease that were unresectable or challenging to resect. Gynecologic oncology surgeons were electronically surveyed. RESULTS. The mean report turnaround time was 29.8 minutes for simple structured reports versus 54.5 minutes for synoptic reports (p < .001). A mean of 17.6 of 45 sites (range, four to 43 sites) were mentioned by simple structured reports versus 44.5 of 45 sites (range, 39-45) for synoptic reports (p < .001). 
Forty-three patients had surgically established unresectable or challenging-to-resect disease; involvement of anatomic site(s) with such disease was mentioned in 37% (11/30) of simple structured reports versus 100% (13/13) of synoptic reports (p < .001). All eight surveyed gynecologic oncology surgeons completed the survey. CONCLUSION. A synoptic report improved completeness of pretreatment CT reports in patients with advanced ovarian cancer, including for established sites of unresectable or challenging-to-resect disease. CLINICAL IMPACT. The findings indicate the role of disease-specific synoptic reports in facilitating referrer communication and potentially guiding clinical decision-making.
Subjects
Genital Neoplasms, Female , Ovarian Neoplasms , Physicians , Humans , Female , Aged , Retrospective Studies , Patient Satisfaction , Ovarian Neoplasms/diagnostic imaging , Ovarian Neoplasms/surgery , Documentation , Tomography, X-Ray Computed , Personal Satisfaction
ABSTRACT
The mid-twentieth century brought a radical change in how the linguistics community formulated its major goal, moving from a largely taxonomic science to Chomsky's revolution, which conceptualized language as a higher-order cognitive function. This article reviews the paths (not always direct) that brought Lila Gleitman into contact with that revolution, her contributions to it, and the evolution in her thinking about how language is learned by every child, regardless of extreme variation in the input received. To understand how that occurs, we need to discover what must be learned by the child and what is already there to guide that learning: what must be, in Plato's terms, "recollected." The growing picture shows a learner equipped with information-processing mechanisms that extract evidence about word meanings using various evidential sources. Chief among these are the observational and linguistic-syntactic contexts in which words occur. The former is supported by a mechanism Gleitman and her collaborators call "propose but verify," and the latter by a mechanism known as "syntactic bootstrapping."
Subjects
Language , Psycholinguistics , Child , Female , Humans , Language Development , Learning , Mental Recall
ABSTRACT
BACKGROUND: Patients' rights are integral to medical ethics. This study aimed to perform sentiment analysis and opinion mining on patients' messages by a combination of lexicon-based and machine learning methods to identify positive or negative comments and to determine the different ward and staff names mentioned in patients' messages. METHODS: The level of satisfaction and observance of the rights of 250 service recipients of the hospital was evaluated through the related checklists by the evaluator. In total, 822 Persian messages, composed of 540 negative and 282 positive comments, were collected and labeled by the evaluator. The messages were pre-processed, and 2 feature vectors were then extracted from them: the term frequency-inverse document frequency (TFIDF) vector and a combination of the multifeature (MF) (a lexicon-based method) and TFIDF (MF + TFIDF) vectors. Six feature selectors and 5 classifiers were used in this study. For the evaluations, 5-fold cross-validation was used, and several metrics were reported, including area under the receiver operating characteristic curve (AUC), accuracy (ACC), F1 score, sensitivity (SEN), specificity (SPE), and Precision-Recall Curves (PRC). Message tag detection, which featured different hospital wards and identified staff names mentioned in the study patients' messages, was implemented by the lexicon-based method. RESULTS: The best classifier was Multinomial Naïve Bayes in combination with MF + TFIDF feature vector and SelectFromModel (SFM) feature selection (ACC = 0.89 ± 0.03, AUC = 0.87 ± 0.03, F1 = 0.92 ± 0.03, SEN = 0.93 ± 0.04, and SPE = 0.82 ± 0.02, PRC-AUC = 0.97). Two methods of assessment by the evaluator and artificial intelligence as well as survey systems were compared.
CONCLUSION: Our results demonstrated that the lexicon-based method, in combination with machine learning classifiers, could extract sentiments in patients' comments and classify them into positive and negative categories. We also developed an online survey system to analyze patients' satisfaction in different wards and to remove conventional assessments by the evaluator.
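The TFIDF feature vector mentioned in the methods can be sketched in a few lines. This is a minimal illustration of one common variant (relative term frequency multiplied by the natural log of inverse document frequency, without smoothing); the abstract does not state which variant or library the study used, and the tokenised messages are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dictionary per tokenised document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical pre-tokenised messages.
messages = [["nurse", "kind"], ["nurse", "rude"], ["nurse", "late"]]
vectors = tfidf(messages)
```

A term that occurs in every message ("nurse" here) receives zero weight, so the terms that distinguish messages carry the signal passed to a classifier.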
Subjects
Artificial Intelligence , Patient Satisfaction , Humans , Bayes Theorem , Machine Learning , ROC Curve
ABSTRACT
BACKGROUND: The innovative method of sentiment analysis based on an emotional lexicon shows prominent advantages in capturing emotional information, such as individual attitudes, experiences, and needs, which provides a new perspective and method for emotion recognition and management for patients with breast cancer (BC). However, at present, sentiment analysis in the field of BC is limited, and there is no emotional lexicon for this field. Therefore, it is necessary to construct an emotional lexicon that conforms to the characteristics of patients with BC so as to provide a new tool for accurate identification and analysis of the patients' emotions and a new method for their personalized emotion management. OBJECTIVE: This study aimed to construct an emotional lexicon of patients with BC. METHODS: Emotional words were obtained by merging the words in 2 general sentiment lexicons, the Chinese Linguistic Inquiry and Word Count (C-LIWC) and HowNet, and the words in text corpora acquired from patients with BC via Weibo, semistructured interviews, and expressive writing. The lexicon was constructed using manual annotation and classification under the guidance of Russell's valence-arousal space. Ekman's basic emotional categories, Lazarus' cognitive appraisal theory of emotion, and a qualitative text analysis based on the text corpora of patients with BC were combined to determine the fine-grained emotional categories of the lexicon we constructed. Precision, recall, and the F1-score were used to evaluate the lexicon's performance. RESULTS: The text corpora collected from patients in different stages of BC included 150 written materials, 17 interviews, and 6689 original posts and comments from Weibo, with a total of 1,923,593 Chinese characters. The emotional lexicon of patients with BC contained 9357 words and covered 8 fine-grained emotional categories: joy, anger, sadness, fear, disgust, surprise, somatic symptoms, and BC terminology. 
Experimental results showed that precision, recall, and the F1-score of positive emotional words were 98.42%, 99.73%, and 99.07%, respectively, and those of negative emotional words were 99.73%, 98.38%, and 99.05%, respectively, which all significantly outperformed the C-LIWC and HowNet. CONCLUSIONS: The emotional lexicon with fine-grained emotional categories conforms to the characteristics of patients with BC. Its performance related to identifying and classifying domain-specific emotional words in BC is better compared to the C-LIWC and HowNet. This lexicon not only provides a new tool for sentiment analysis in the field of BC but also provides a new perspective for recognizing the specific emotional state and needs of patients with BC and formulating tailored emotional management plans.
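The reported F1-scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short sketch below uses only the percentages given in the abstract; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1_positive = round(f1_score(98.42, 99.73), 2)  # 99.07
f1_negative = round(f1_score(99.73, 98.38), 2)  # 99.05
```

Both values agree with the F1-scores stated in the results.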
Subjects
Breast Neoplasms, Humans, Female, Sentiment Analysis, Emotions, Fear, Sadness
ABSTRACT
BACKGROUND: Positive mental health is arguably increasingly important and can be revealed, to some extent, in terms of psychological well-being (PWB). However, PWB is difficult to assess in real time on a large scale. The popularity and proliferation of social media make it possible to sense and monitor online users' PWB in a nonintrusive way, and this study tests the effectiveness of social media language expression as a predictor of PWB. OBJECTIVE: This study aims to investigate, from a psychological perspective, how well social media data predict ground-truth well-being data. METHODS: We recruited 1427 participants. Their well-being was evaluated using 6 dimensions of PWB. Their posts on social media were collected, and 6 psychological lexicons were used to extract linguistic features. A multiobjective prediction model was then built with the extracted linguistic features as input and PWB as the output. Further, the validity of the prediction model was confirmed by evaluating the model's discriminant validity, convergent validity, and criterion validity. The reliability of the model was also confirmed by evaluating the split-half reliability. RESULTS: The correlation coefficients between the PWB scores predicted by the linguistic prediction model and the actual scores of social media users were between 0.49 and 0.54 (P<.001), indicating good criterion validity. In terms of the model's structural validity, it exhibited excellent convergent validity but less than satisfactory discriminant validity. The results also suggested that our model had good split-half reliability for every dimension (ranging from 0.65 to 0.85; P<.001). CONCLUSIONS: By confirming the validity and stability of the linguistic prediction model, this study verified that social media data can predict ground-truth well-being data from the perspective of PWB.
Our study has positive implications for the use of social media to predict mental health in nonprofessional settings such as self-testing or a large-scale user study.
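The two statistics reported in this abstract, the correlation between predicted and actual PWB scores and the split-half reliability, can be sketched as below. This is an illustrative sketch under stated assumptions: the abstract does not say how the halves were formed or whether a Spearman-Brown correction was applied, so the correction and all sample values here are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability.

    Assumption: the abstract does not state that this correction was used.
    """
    return 2 * r_half / (1 + r_half)


# Hypothetical scores for one PWB dimension (not data from the study).
questionnaire = [3.0, 4.5, 2.0, 5.0, 3.5]
predicted = [3.2, 4.0, 2.5, 4.6, 3.1]
criterion_validity = pearson_r(questionnaire, predicted)

# Hypothetical predictions from two halves of each user's posts.
half_a = [3.1, 4.2, 2.4, 4.8, 3.0]
half_b = [3.3, 3.9, 2.7, 4.5, 3.4]
split_half = spearman_brown(pearson_r(half_a, half_b))
```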
Subjects
Psychological Well-Being, Social Media, Humans, Reproducibility of Results, Mental Health, Language
ABSTRACT
BACKGROUND: Patients with anomic aphasia experience difficulties in narrative processing. General discourse measures are time consuming and require specialized skills. Core lexicon analysis has been proposed as an effort-saving approach but has not been developed for Mandarin discourse. AIMS: This exploratory study aimed (1) to apply core lexicon analysis at the discourse level in Mandarin-speaking patients with anomic aphasia and (2) to verify the problems with core words among people with anomic aphasia. METHODS & PROCEDURE: The core nouns and verbs were extracted from narrative language samples from 88 healthy participants. The production of core words by 12 patients with anomic aphasia and 12 age- and education-matched controls was then calculated and compared. The correlation between the percentages and the Aphasia Quotients of the revised Western Aphasia Battery was analyzed as well. OUTCOMES & RESULTS: The core nouns and verbs were successfully extracted. Patients with anomic aphasia produced fewer core words than healthy people, and the percentages differed significantly across tasks as well as word classes. There was no correlation between core lexicon use and the severity of aphasia in patients with anomic aphasia. CONCLUSIONS & IMPLICATIONS: Core lexicon analysis may serve as a clinician-friendly way of quantifying core words produced at the discourse level in Mandarin-speaking patients with anomic aphasia. WHAT THIS PAPER ADDS: What is already known on the subject: Discourse analyses in aphasia assessment and treatment have increasingly garnered attention. Core lexicon analysis based on English AphasiaBank has been reported in recent years. It is correlated with microlinguistic and macrolinguistic measures in aphasia narratives. Nevertheless, the application based on Mandarin AphasiaBank is still under development in healthy individuals and patients with anomic aphasia.
What this paper adds to existing knowledge: A Mandarin core lexicon set was developed for different tasks. The feasibility of using core lexicon analysis to evaluate language samples from patients with anomic aphasia was preliminarily discussed, and the speech performance of patients and healthy people was compared to provide a reference for the clinical evaluation and treatment of aphasia. What are the potential or actual clinical implications of this work? The purpose of this exploratory study was to consider the potential use of core lexicon analysis to evaluate core word production in narrative discourse. Moreover, normative and aphasia data were provided for comparison to support clinical use with Mandarin-speaking patients with anomic aphasia.
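The core lexicon procedure described in this abstract (derive core words from healthy speakers, then score how many of them each speaker produces) can be sketched roughly as follows. This is a simplified illustration, not the study's method: it assumes samples are already segmented and reduced to nouns or verbs, ranks words by raw token frequency, and uses a hypothetical cut-off `top_n`; the abstract does not specify these details.

```python
from collections import Counter


def build_core_lexicon(healthy_samples, top_n):
    """Return the top_n most frequent words across healthy-speaker samples.

    healthy_samples: list of word lists, one list per speaker
    (assumed to be pre-segmented and filtered to one word class).
    """
    counts = Counter(word for sample in healthy_samples for word in sample)
    return {word for word, _ in counts.most_common(top_n)}


def core_word_percentage(sample, core_lexicon):
    """Percentage of core lexicon items that appear in one speaker's sample."""
    if not core_lexicon:
        return 0.0
    produced = core_lexicon & set(sample)
    return 100.0 * len(produced) / len(core_lexicon)


# Hypothetical English toy data standing in for Mandarin narrative samples.
healthy = [["boy", "kick", "ball"], ["boy", "ball"], ["boy"]]
core_nouns = build_core_lexicon(healthy, top_n=2)

patient_sample = ["boy", "run"]
patient_score = core_word_percentage(patient_sample, core_nouns)
```

Group comparisons would then be run on the per-speaker percentages, separately for each task and word class.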
Subjects
Anomia, Aphasia, Humans, Anomia/diagnosis, Aphasia/diagnosis, Aphasia/therapy, Language, Speech, Language Tests
ABSTRACT
The American College of Radiology (ACR) Ovarian-Adnexal Reporting and Data System (O-RADS) lexicon and risk assessment tool for ultrasound (US) provides a framework for characterization of ovarian and adnexal pathology with the ultimate goal of harmonizing reporting and patient management strategies. Since the first O-RADS US publication in 2018, multiple validation studies have shown O-RADS US to have excellent diagnostic accuracy, with the majority of these studies using O-RADS 4 as the optimal cut-off for detecting ovarian cancer. Most of the existing validation studies include a dedicated training phase and confirm that O-RADS US categories and lexicon descriptors are associated with high inter-reader agreement, regardless of radiologist training level or practice experience. O-RADS US has inter-reader agreement similar to that of the Gynecologic Imaging Reporting and Data System (GIRADS), Assessment of Different Neoplasias in the adnexa (ADNEX), and International Ovarian Tumor Analysis Group (IOTA) simple rules. System descriptors have been shown to correlate with expected malignancy rates, and the O-RADS US risk stratification system has been shown to perform in the expected range of malignancy risk per category. Future directions will focus on clarifying governing concepts and lexicon terminology as well as further refining risk stratification categories based on data from published validation studies.
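As a rough illustration of what using O-RADS 4 as a cut-off means in a validation study, the sketch below treats any lesion scored O-RADS 4 or higher as test-positive and computes sensitivity and specificity against the reference diagnosis. The function name and the sample data are hypothetical and are not drawn from any published study.

```python
def sensitivity_specificity(categories, malignant, cutoff=4):
    """Sensitivity and specificity when category >= cutoff counts as positive.

    categories: O-RADS US category assigned to each lesion
    malignant:  reference-standard outcome for each lesion (True = malignant)
    """
    pairs = list(zip(categories, malignant))
    tp = sum(1 for c, m in pairs if c >= cutoff and m)
    fn = sum(1 for c, m in pairs if c < cutoff and m)
    tn = sum(1 for c, m in pairs if c < cutoff and not m)
    fp = sum(1 for c, m in pairs if c >= cutoff and not m)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical lesions, for illustration only.
orads = [2, 3, 4, 5, 4, 2]
truth = [False, False, True, True, False, False]
sens, spec = sensitivity_specificity(orads, truth)
```

Raising the cut-off to 5 in the same function would trade sensitivity for specificity, which is the comparison that cut-off analyses typically report.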