Search | VHL Regional Portal (BVS)

Changes in the length of speeches in the plays of William Shakespeare and his contemporaries: A mixed models approach.

Colyvas, Kim; Egan, Gabriel; Craig, Hugh.

PLoS One ; 18(4): e0282716, 2023.

Article in English | MEDLINE | ID: mdl-37083841

ABSTRACT

Since 2007 a number of investigators have compiled statistics on the length in words of speeches in plays by William Shakespeare and his contemporaries, focusing on a change to shorter speeches around 1600. In this article we take account of several potentially confounding factors in the variation of speech lengths in these works and present a model of this variation in the period 1538-1642 through Linear Mixed Models. We confirm that the mode of speech lengths in English plays changed from nine words to four words around 1600, and that Shakespeare's plays fit this wider pattern closely. We establish for the first time: that this change is independent of authorship, dramatic genre, theatrical company, and the proportion of verse in a play's dialogue; that the chosen time span can be segmented into pre-1597 plays (with high modes), 1597-1602 plays (with mixed high and low modes), and post-1602 plays (with low modes); that some additional secondary modes are evident in speech lengths, at 16 and 24 words, suggesting that the length of a standard blank verse line (around 8 words) is an underlying unit in speech length; and that the general change to short speeches also holds true when the data is viewed through the perspective of the median and the mean. The change in speech lengths is part of a collective drift in the plays towards liveliness and verisimilitude and is evidence of a hitherto hidden constraint on the playwrights: whether or not they were aware of the fact, playwrights as a group were conforming to a structure for the distribution of speech lengths peculiar to the era they were writing in. The authors hope that the full modelling of this variation in the article will help bring this change to the attention of scholars of Shakespeare and his contemporaries.
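The three summary statistics named in this abstract (mode, median, and mean of speech length in words) can be illustrated with a minimal sketch. The speech lengths below are invented for illustration and the function name `summarize_speech_lengths` is hypothetical; this is not the authors' Linear Mixed Model, only the descriptive step that such a model would build on.

```python
from statistics import mean, median, multimode


def summarize_speech_lengths(lengths):
    """Return the mode(s), median and mean of speech lengths given in words."""
    return {
        "modes": multimode(lengths),  # most frequent length(s)
        "median": median(lengths),
        "mean": mean(lengths),
    }


# Invented speech lengths (in words) for two hypothetical plays;
# these are NOT taken from the study's data.
early_play = [9, 9, 9, 17, 8, 25, 9, 16, 40, 9]
late_play = [4, 4, 2, 9, 4, 16, 4, 3, 30, 4]

early = summarize_speech_lengths(early_play)
late = summarize_speech_lengths(late_play)

print("early:", early)
print("late:", late)
```

Because speech lengths are strongly right-skewed, the mean sits well above the mode and median in both toy samples, which is one reason to report all three.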

Subjects

Drama; Medicine in Literature; Speech; Language; Authorship; Drama/history

A Novel Clustering Methodology Based on Modularity Optimisation for Detecting Authorship Affinities in Shakespearean Era Plays.

Naeni, Leila M; Craig, Hugh; Berretta, Regina; Moscato, Pablo.

PLoS One ; 11(8): e0157988, 2016.

Article in English | MEDLINE | ID: mdl-27571416

ABSTRACT

In this study we propose a novel, unsupervised clustering methodology for analyzing large datasets. This new, efficient methodology converts the general clustering problem into the community detection problem in graphs by using the Jensen-Shannon distance, a dissimilarity measure originating in Information Theory. Moreover, we use graph theoretic concepts for the generation and analysis of proximity graphs. Our methodology is based on a newly proposed memetic algorithm (iMA-Net) for discovering clusters of data elements by maximizing the modularity function in proximity graphs of literary works. To test the effectiveness of this general methodology, we apply it to a text corpus dataset, which contains frequencies of approximately 55,114 unique words across all 168 plays written in the Shakespearean era (16th and 17th centuries), to analyze and detect clusters of similar plays. Experimental results and comparison with state-of-the-art clustering methods demonstrate the remarkable performance of our new method for identifying high quality clusters which reflect the commonalities in the literary style of the plays.
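As a rough illustration of the dissimilarity measure named in this abstract, the sketch below computes the Jensen-Shannon distance between two word-count profiles. It assumes the common definition (square root of the base-2 Jensen-Shannon divergence, so values fall between 0 and 1); the function names and the toy counts are hypothetical, and the proximity-graph construction and the iMA-Net algorithm are not reproduced here.

```python
import math


def normalize(counts):
    """Turn raw word counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(counts_a, counts_b):
    """Jensen-Shannon distance: sqrt of the base-2 JS divergence, in [0, 1]."""
    p, q = normalize(counts_a), normalize(counts_b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


# Invented counts of the same four words in two hypothetical plays.
play_a = [120, 30, 45, 5]
play_b = [100, 60, 20, 20]

print(round(js_distance(play_a, play_b), 4))
```

A pairwise distance matrix built this way is the kind of input from which a proximity graph over plays could then be derived.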

Subjects

Algorithms; Cluster Analysis

Propositional idea density in older men's written language: findings from the HIMS study using computerised analysis.

Spencer, Elizabeth; Ferguson, Alison; Craig, Hugh; Colyvas, Kim; Hankey, Graeme J; Flicker, Leon.

Clin Linguist Phon ; 29(2): 85-101, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25216374

ABSTRACT

Decline in linguistic function has been associated with decline in cognitive function in previous research. This research investigated the informativeness of written language samples of Australian men from the Health in Men's Study (HIMS) aged from 76 to 93 years using the Computerised Propositional Idea Density Rater (CPIDR 5.1). In total, 60,255 words in 1147 comments were analysed using a linear mixed model for statistical analysis. Results indicated no relationship with education level (p = 0.79). Participants for whom English was not their first learnt language showed slightly lower Propositional Idea Density (PD) scores (lower by 0.018 per word). For those whose first language was English, mean PD per word was 0.494 for comments below 60 words and 0.526 for comments above 60 words. Text length was found to have an effect (p < 0.0001). The mean PD was higher than previously reported for men and lower than previously reported for a similar cohort of Australian women.

Subjects

Diagnosis, Computer-Assisted; Language Disorders/diagnosis; Language Tests; Linguistics; Writing; Age Factors; Aged; Aged, 80 and over; Alzheimer Disease/diagnosis; Educational Status; Humans; Male; Multilingualism; Semantics; Sex Factors; Software; Stroke/diagnosis; Western Australia

An information theoretic clustering approach for unveiling authorship affinities in Shakespearean era plays and poems.

Arefin, Ahmed Shamsul; Vimieiro, Renato; Riveros, Carlos; Craig, Hugh; Moscato, Pablo.

PLoS One ; 9(10): e111445, 2014.

Article in English | MEDLINE | ID: mdl-25347727

ABSTRACT

In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed.

Subjects

Authorship/history; Drama/history; Models, Theoretical; Poetry as Topic/history; Cluster Analysis; England; History, 16th Century; History, 17th Century

Propositional idea density in women's written language over the lifespan: computerized analysis.

Ferguson, Alison; Spencer, Elizabeth; Craig, Hugh; Colyvas, Kim.

Cortex ; 55: 107-21, 2014 Jun.

Article in English | MEDLINE | ID: mdl-23834740

ABSTRACT

The informativeness of written language, as measured by Propositional Idea Density (PD), has been shown to be a sensitive predictive index of language decline with age and dementia in previous research. The present study investigated the influence of age and education on the written language of three large cohorts of women from the general community, born in 1973-78, 1946-51 and 1921-26. Written texts were obtained from the Australian Longitudinal Study on Women's Health in which participants were invited to respond to an open-ended question about their health. The informativeness of written comments of 10 words or more (90% of the total number of comments) was analyzed using the Computerized Propositional Idea Density Rater 3 (CPIDR-3). Over 2.5 million words used in 37,705 written responses from 19,512 respondents were analyzed. Based on a linear mixed model approach to statistical analysis with adjustment for several factors including number of comments per respondent and number of words per comment, a small but statistically significant effect of age was identified for the older cohort with mean age 78 years. The mean PD per word for this cohort was lower than that of the younger and mid-aged cohorts (mean age 27 and 53 years respectively), with mean reductions in PD (95% confidence interval, CI) of .006 (.003, .008) and .009 (.008, .011) respectively. This suggests that PD for this population of women was relatively more stable over the adult lifespan than has been reported previously, even in late old age. There was no statistically significant effect of education level. Computerized analyses were found to greatly facilitate the study of informativeness of this large corpus of written language. Directions for further research are discussed in relation to the need for extended investigation of the variability of the measure for potential application to the identification of acquired language pathologies.

Subjects

Aging/physiology; Language Disorders/physiopathology; Language; Adult; Age Factors; Aged; Australia; Cohort Studies; Female; Humans; Longitudinal Studies; Middle Aged; Risk Assessment; Young Adult

Language Individuation and Marker Words: Shakespeare and His Maxwell's Demon.

Marsden, John; Budden, David; Craig, Hugh; Moscato, Pablo.

PLoS One ; 8(6): e66813, 2013.

Article in English | MEDLINE | ID: mdl-23826143

ABSTRACT

BACKGROUND: Within the structural and grammatical bounds of a common language, all authors develop their own distinctive writing styles. Whether the relative occurrence of common words can be measured to produce accurate models of authorship is of particular interest. This work introduces a new score that helps to highlight such variations in word occurrence, and is applied to produce models of authorship of a large group of plays from the Shakespearean era. METHODOLOGY: A text corpus containing 55,055 unique words was generated from 168 plays from the Shakespearean era (16th and 17th centuries) of undisputed authorship. A new score, CM1, is introduced to measure variation patterns based on the frequency of occurrence of each word for the authors John Fletcher, Ben Jonson, Thomas Middleton and William Shakespeare, compared to the rest of the authors in the study (which provides a reference of relative word usage at that time). A total of 50 WEKA methods were applied for Fletcher, Jonson and Middleton, to identify those which were able to produce models yielding over 90% classification accuracy. This ensemble of WEKA methods was then applied to model Shakespearean authorship across all 168 plays, yielding a Matthews' correlation coefficient (MCC) performance of over 90%. Furthermore, the best model yielded an MCC of 99%. CONCLUSIONS: Our results suggest that different authors, while adhering to the structural and grammatical bounds of a common language, develop measurably distinct styles by the tendency to over-utilise or avoid particular common words and phrasings. Considering language and the potential of words as an abstract chaotic system with a high entropy, similarities can be drawn to the Maxwell's Demon thought experiment; authors subconsciously favour or filter certain words, modifying the probability profile in ways that could reflect their individuality and style.
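The Matthews correlation coefficient (MCC) reported in this abstract can be computed from a binary confusion matrix. The sketch below uses the standard formula; the function name and the counts are invented for illustration and do not come from the study, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 if the denominator is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when any row or column is empty
    return numerator / denominator


# Invented example: a classifier labelling plays as "by author X" or "not".
mcc = matthews_corrcoef(tp=45, tn=45, fp=5, fn=5)
print(mcc)  # 0.8
```

Unlike plain accuracy, MCC stays informative when one class (for example, plays by a single author) is much smaller than the other, since it uses all four cells of the confusion matrix.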

Subjects

Authorship; Language; Literature; Authorship/history; Databases, Factual; Famous Persons; History, 16th Century; History, 17th Century; Humans; Language/history; Literature/history; Models, Theoretical; Pattern Recognition, Automated

Language and ageing - exploring propositional density in written language - stability over time.

Spencer, Elizabeth; Craig, Hugh; Ferguson, Alison; Colyvas, Kim.

Clin Linguist Phon ; 26(9): 743-54, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22876766

ABSTRACT

This study investigated the stability of propositional density (PD) in written texts, as this aspect of language shows promise as an indicator and as a predictor of language decline with ageing. This descriptive longitudinal study analysed written texts obtained from the Australian Longitudinal Study of Women's Health in which participants were invited to respond to an open-ended question about their health. The 635 texts used for this study were taken from 127 middle-aged women who responded to this question on each of the five surveys conducted at 3-year intervals over a 16-year period. The study made use of an automated PD rater (CPIDR-3) for the analysis. PD was found to be a stable measure over time when comparing the grouped data, but there was between- and within-subject variation over time. Further research is needed to explore the valid use of this measure in research into language and ageing.

Subjects

Aging/physiology; Cognition/physiology; Language Disorders/diagnosis; Language; Linguistics; Diagnosis, Computer-Assisted/methods; Female; Health Surveys; Humans; Language Disorders/epidemiology; Language Tests/statistics & numerical data; Longitudinal Studies; Middle Aged; Surveys and Questionnaires; Women's Health/statistics & numerical data

Deletion hotspot in the argininosuccinate lyase gene: association with topoisomerase II and DNA polymerase alpha sites.

Christodoulou, John; Craig, Hugh J; Walker, David C; Weaving, Linda S; Pearson, Christopher E; McInnes, Roderick R.

Hum Mutat ; 27(11): 1065-71, 2006 Nov.

Article in English | MEDLINE | ID: mdl-16941645

ABSTRACT

Molecular analysis of argininosuccinate lyase (ASAL) deficiency has led to the identification of a deletion hotspot in the ASL gene. Six individuals with ASAL deficiency had alleles that led to a complete absence of exon 13 from the ASL mRNA; each had a partial deletion of exon 13 in the genomic DNA. In all six patients, the deletions begin 18 bp upstream of the 3' end of exon 13. In four cases, the deletions were 13 bp in length, and ended within exon 13, whereas in two other patients the deletions were 25 bp and extended into intron 13. The sequence at which these deletions begin overlaps both a putative topoisomerase II recognition site and a DNA polymerase alpha mutation/frameshift site. Moreover, the topoisomerase II cut site is situated precisely at the beginning of the deletions, which are flanked by small (2- and 3-bp) direct repeats. We note that a similar concurrence of these two putative enzyme sites can be found in a number of other deletion sites in the human genome, most notably the DeltaF508 deletion in the CFTR gene. These findings suggest that the joint presence of these two enzyme sites represents a DNA sequence context that may favor the occurrence of small deletions.

Subjects

Argininosuccinate Lyase/genetics; DNA Polymerase I/genetics; DNA Topoisomerases, Type II/genetics; Sequence Deletion; Base Sequence; Cells, Cultured; DNA Mutational Analysis; Exons; Frameshift Mutation; Genetic Linkage; Genome, Human; Genomic Instability; Haplotypes; Humans; Molecular Sequence Data; Sequence Alignment; Sequence Analysis, DNA
