Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 8 de 8
Filtrar
Más filtros










Base de datos
Intervalo de año de publicación
1.
Sci Rep ; 14(1): 7259, 2024 03 27.
Artículo en Inglés | MEDLINE | ID: mdl-38538665

RESUMEN

Languages vary in how they signal "who does what to whom". Three main strategies to indicate the participant roles of "who" and "whom" are case, verbal indexing, and rigid word order. Languages that disambiguate these roles with case tend to have either verb-final or flexible word order. Most previous studies that found these patterns used limited language samples and overlooked the causal mechanisms that could jointly explain the association between all three features. Here we analyze grammatical data from a Grambank sample of 1705 languages with phylogenetic causal graph methods. Our results corroborate the claims that verb-final word order generally gives rise to case and, strikingly, establish that case tends to lead to the development of flexible word order. The combination of novel statistical methods and the Grambank database provides a model for the rigorous testing of causal claims about the factors that shape patterns of linguistic diversity.


Asunto(s)
Lenguaje , Lingüística , Humanos , Filogenia , Evolución Biológica , Manejo de Datos , Inhibidores de Proteínas Quinasas
2.
Sci Adv ; 9(33): eadf7704, 2023 08 18.
Artículo en Inglés | MEDLINE | ID: mdl-37585533

RESUMEN

Many recent proposals claim that languages adapt to their environments. The linguistic niche hypothesis claims that languages with numerous native speakers and substantial proportions of nonnative speakers (societies of strangers) tend to lose grammatical distinctions. In contrast, languages in small, isolated communities should maintain or expand their grammatical markers. Here, we test these claims using a global dataset of grammatical structures, Grambank. We model the impact of the number of native speakers, the proportion of nonnative speakers, the number of linguistic neighbors, and the status of a language on grammatical complexity while controlling for spatial and phylogenetic autocorrelation. We deconstruct "grammatical complexity" into two separate dimensions: how much morphology a language has ("fusion") and the amount of information obligatorily encoded in the grammar ("informativity"). We find several instances of weak positive associations but no inverse correlations between grammatical complexity and sociodemographic factors. Our findings cast doubt on the widespread claim that grammatical complexity is shaped by the sociolinguistic environment.


Asunto(s)
Lenguaje , Lingüística , Filogenia , Emociones , Adaptación Fisiológica
3.
Entropy (Basel) ; 25(3)2023 Mar 10.
Artículo en Inglés | MEDLINE | ID: mdl-36981375

RESUMEN

Research in computational textual aesthetics has shown that there are textual correlates of preference in prose texts. The present study investigates whether textual correlates of preference vary across different time periods (contemporary texts versus texts from the 19th and early 20th centuries). Preference is operationalized in different ways for the two periods, in terms of canonization for the earlier texts, and through sales figures for the contemporary texts. As potential textual correlates of preference, we measure degrees of (un)predictability in the distributions of two types of low-level observables, parts of speech and sentence length. Specifically, we calculate two entropy measures, Shannon Entropy as a global measure of unpredictability, and Approximate Entropy as a local measure of surprise (unpredictability in a specific context). Preferred texts from both periods (contemporary bestsellers and canonical earlier texts) are characterized by higher degrees of unpredictability. However, unlike canonicity in the earlier texts, sales figures in contemporary texts are reflected in global (text-level) distributions only (as measured with Shannon Entropy), while surprise in local distributions (as measured with Approximate Entropy) does not have an additional discriminating effect. Our findings thus suggest that there are both time-invariant correlates of preference, and period-specific correlates.

4.
Behav Sci (Basel) ; 13(1)2023 Jan 06.
Artículo en Inglés | MEDLINE | ID: mdl-36661624

RESUMEN

Previous research has shown that eyebrow movement during speech exhibits a systematic relationship with intonation: brow raises tend to be aligned with pitch accents, typically preceding them. The present study approaches the question of temporal alignment between brow movement and intonation from a new angle. The study makes use of footage from the Late Night Show with David Letterman, processed with 3D facial landmark detection. Pitch is modeled as a sinusoidal function whose parameters are correlated with the maximum height of the eyebrows in a brow raise. The results confirm some previous findings on audiovisual prosody but lead to new insights as well. First, the shape of the pitch signal in a region of approx. 630 ms before the brow raise is not random and tends to display a specific shape. Second, while being less informative than the post-peak pitch, the pitch signal in the pre-peak region also exhibits correlations with the magnitude of the associated brow raises. Both of these results point to early preparatory action in the speech signal, calling into question the visual-precedes-acoustic assumption. The results are interpreted as supporting a unified view of gesture/speech co-production that regards both signals as manifestations of a single communicative act.

5.
Entropy (Basel) ; 24(2)2022 Feb 15.
Artículo en Inglés | MEDLINE | ID: mdl-35205572

RESUMEN

Computational textual aesthetics aims at studying observable differences between aesthetic categories of text. We use Approximate Entropy to measure the (un)predictability in two aesthetic text categories, i.e., canonical fiction ('classics') and non-canonical fiction (with lower prestige). Approximate Entropy is determined for series derived from sentence-length values and the distribution of part-of-speech-tags in windows of texts. For comparison, we also include a sample of non-fictional texts. Moreover, we use Shannon Entropy to estimate degrees of (un)predictability due to frequency distributions in the entire text. Our results show that the Approximate Entropy values can better differentiate canonical from non-canonical texts compared with Shannon Entropy, which is not true for the classification of fictional vs. expository prose. Canonical and non-canonical texts thus differ in sequential structure, while inter-genre differences are a matter of the overall distribution of local frequencies. We conclude that canonical fictional texts exhibit a higher degree of (sequential) unpredictability compared with non-canonical texts, corresponding to the popular assumption that they are more 'demanding' and 'richer'. In using Approximate Entropy, we propose a new method for text classification in the context of computational textual aesthetics.

6.
Front Psychol ; 12: 599063, 2021.
Artículo en Inglés | MEDLINE | ID: mdl-33868078

RESUMEN

This study investigates global properties of three categories of English text: canonical fiction, non-canonical fiction, and non-fictional texts. The central hypothesis of the study is that there are systematic differences with respect to structural design features between canonical and non-canonical fiction, and between fictional and non-fictional texts. To investigate these differences, we compiled a corpus containing texts of the three categories of interest, the Jena Corpus of Expository and Fictional Prose (JEFP Corpus). Two aspects of global structure are investigated, variability and self-similar (fractal) patterns, which reflect long-range correlations along texts. We use four types of basic observations, (i) the frequency of POS-tags per sentence, (ii) sentence length, (iii) lexical diversity, and (iv) the distribution of topic probabilities in segments of texts. These basic observations are grouped into two more general categories, (a) the lower-level properties (i) and (ii), which are observed at the level of the sentence (reflecting linguistic decoding), and (b) the higher-level properties (iii) and (iv), which are observed at the textual level (reflecting comprehension/integration). The observations for each property are transformed into series, which are analyzed in terms of variance and subjected to Multi-Fractal Detrended Fluctuation Analysis (MFDFA), giving rise to three statistics: (i) the degree of fractality ( H ), (ii) the degree of multifractality ( D ), i.e., the width of the fractal spectrum, and (iii) the degree of asymmetry ( A ) of the fractal spectrum. The statistics thus obtained are compared individually across text categories and jointly fed into a classification model (Support Vector Machine). Our results show that there are in fact differences between the three text categories of interest. In general, lower-level text properties are better discriminators than higher-level text properties. Canonical fictional texts differ from non-canonical ones primarily in terms of variability in lower-level text properties. Fractality seems to be a universal feature of text, slightly more pronounced in non-fictional than in fictional texts. On the basis of our results obtained on the basis of corpus data we point out some avenues for future research leading toward a more comprehensive analysis of textual aesthetics, e.g., using experimental methodologies.

7.
Sci Data ; 7(1): 13, 2020 01 13.
Artículo en Inglés | MEDLINE | ID: mdl-31932593

RESUMEN

Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, are bringing high requirements in terms of rigorousness for preparing and curating datasets. Here we present CLICS, a Database of Cross-Linguistic Colexifications (CLICS). CLICS tackles interconnected interdisciplinary research questions about the colexification of words across semantic categories in the world's languages, and show-cases best practices for preparing data for cross-linguistic research. This is done by addressing shortcomings of an earlier version of the database, CLICS2, and by supplying an updated version with CLICS3, which massively increases the size and scope of the project. We provide tools and guidelines for this purpose and discuss insights resulting from organizing student tasks for database updates.


Asunto(s)
Bases de Datos Factuales , Lingüística , Humanos , Lenguaje
8.
F1000Res ; 9: 295, 2020.
Artículo en Inglés | MEDLINE | ID: mdl-33552475

RESUMEN

Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence, a change is needed in the way research software development and maintenance are currently motivated, incentivized, funded, structurally and infrastructurally supported, and legally treated. Failing to do so will threaten the quality and validity of research. In this paper, we identify challenges for research software sustainability in Germany and beyond, in terms of motivation, selection, research software engineering personnel, funding, infrastructure, and legal aspects. Besides researchers, we specifically address political and academic decision-makers to increase awareness of the importance and needs of sustainable research software practices. In particular, we recommend strategies and measures to create an environment for sustainable research software, with the ultimate goal to ensure that software-driven research is valid, reproducible and sustainable, and that software is recognized as a first class citizen in research. This paper is the outcome of two workshops run in Germany in 2019, at deRSE19 - the first International Conference of Research Software Engineers in Germany - and a dedicated DFG-supported follow-up workshop in Berlin.


Asunto(s)
Conocimiento , Investigadores , Programas Informáticos , Predicción , Alemania , Humanos
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA
...