ABSTRACT
The aim of this procedural method is to construct well-founded corpora of scientific literature and, hence, to track the evolution of knowledge fields by reconstructing and clustering the life-cycles of words. The method comprises:
• an original selection process for relevant keywords, which identifies relevant stems and stem n-grams by matching them against the item lists of relevant glossaries;
• several types of normalization of the temporal trajectories of raw word frequencies;
• a properly customized clustering of word life-cycles, with an extensive graphical investigation of the best candidates for the number of clusters, to unveil the important dynamics and decipher the history of a scientific field.
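The normalization step named above can be illustrated with a minimal sketch. The abstract does not say which normalizations are used, so the three below (relative frequency per time point, scaling by the word's own total, scaling by the word's own peak) are assumptions chosen for illustration. The function names, the stems, and all counts are hypothetical.

```python
from typing import Dict, List


def relative_frequencies(raw: Dict[str, List[int]], totals: List[int]) -> Dict[str, List[float]]:
    """Divide each word's raw count at a time point by the corpus size at that time point."""
    return {
        word: [count / total if total else 0.0 for count, total in zip(counts, totals)]
        for word, counts in raw.items()
    }


def scale_by_row_sum(series: List[float]) -> List[float]:
    """Rescale a trajectory so that its values sum to 1 (shape of the life-cycle only)."""
    total = sum(series)
    return [value / total if total else 0.0 for value in series]


def scale_by_row_max(series: List[float]) -> List[float]:
    """Rescale a trajectory so that its peak equals 1."""
    peak = max(series, default=0.0)
    return [value / peak if peak else 0.0 for value in series]


# Hypothetical raw counts of two stems over three time points (invented values).
raw_counts = {"bootstrap": [2, 4, 10], "regress": [10, 10, 10]}
# Hypothetical total number of tokens in the corpus at each time point.
tokens_per_period = [100, 200, 400]

rel = relative_frequencies(raw_counts, tokens_per_period)
by_sum = {word: scale_by_row_sum(series) for word, series in rel.items()}
by_max = {word: scale_by_row_max(series) for word, series in rel.items()}
```

In this sketch the stem with constant raw counts shows a declining relative trajectory because the corpus grows over time, which is why the choice of normalization matters before any clustering of life-cycles.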
ABSTRACT
This research note focuses on some of the opportunities provided by the statistical analysis of textual data, illustrating the use of lexicon-based quantitative measures on texts produced in a particular context of augmentative and alternative communication. The corpus is composed of 12 essays produced by six individuals with autism and six participants without disabilities in a control group during sessions of facilitated communication. The study raises questions that can be answered thanks to the statistical methods implemented in the text analysis framework and other procedures that may be used to identify the characteristics of texts (and their writers) and compare texts (or subcorpora). The aim is to discuss strengths, weaknesses, opportunities, and threats of the approach and to highlight its connections to qualitative approaches.
Subject(s)
Child Development Disorders, Pervasive/rehabilitation, Communication Aids for Disabled/statistics & numerical data, Writing, Adolescent, Adult, Case-Control Studies, Child, Child Development Disorders, Pervasive/diagnosis, Child Development Disorders, Pervasive/psychology, Communication Methods, Total, Female, Humans, Male, Mathematical Computing, Models, Statistical, Reference Values, Semantics, Vocabulary
ABSTRACT
Statistical and linguistic procedures were implemented to analyze a large corpus of texts written by 37 individuals with autism and 92 facilitators (without disabilities), producing written conversations by means of PCs. Such texts were compared and contrasted to identify the specific traits of the lexis of the group of individuals with autism and assess to what extent it differed from the lexis of the facilitators. The purpose of this research was to identify specific language features using statistical procedures to analyze contingency lexical tables that reported on the frequencies of words and grammatical categories in different subcorpora and among different writers. The results support the existence of lexis and distributional patterns of grammatical categories that are characteristic of the written production of individuals with autism and that are different from those of facilitators.
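As an illustration of what analyzing a lexical contingency table can involve, here is a minimal sketch of a Pearson chi-square statistic computed on a small word-by-subcorpus table. The abstract does not name the statistical procedures actually used, so the choice of test is an assumption, and the table values and the function name `chi_square` are hypothetical.

```python
from typing import List, Tuple


def chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Return the Pearson chi-square statistic and degrees of freedom of a contingency table.

    Rows are lexical items (words or grammatical categories); columns are subcorpora.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("the table contains no observations")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of lexical item and subcorpus.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Invented counts: two lexical items (rows) in two subcorpora (columns),
# e.g. texts by individuals with autism vs. texts by facilitators.
lexical_table = [
    [30, 10],
    [20, 40],
]
stat, dof = chi_square(lexical_table)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom would suggest that the distribution of the lexical items differs between the subcorpora; obtaining a p-value requires a distribution function such as the one provided by SciPy.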