Adding part-of-speech information to the SUBTLEX-US word frequencies.
Brysbaert, Marc; New, Boris; Keuleers, Emmanuel.
Affiliation
  • Brysbaert M; Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, 9000, Gent, Belgium. marc.brysbaert@ugent.be
Behav Res Methods ; 44(4): 991-7, 2012 Dec.
Article in En | MEDLINE | ID: mdl-22396136
ABSTRACT
The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts-of-speech, or PoSs) of the entries. Five new columns have been added to the SUBTLEX-US word frequency list: the dominant (most frequent) PoS for the entry, the frequency of the dominant PoS, the frequency of the dominant PoS relative to the entry's total frequency, all PoSs observed for the entry, and the respective frequencies of these PoSs. Because the current definition of lemma frequency does not seem to provide word recognition researchers with useful information (as illustrated by a comparison of the lemma frequencies and the word form frequencies from the Corpus of Contemporary American English), we have not provided a column with this variable. Instead, we hope that the full list of PoS frequencies will help researchers to collectively determine which combination of frequencies is the most informative.
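As a rough illustration of how the five columns relate to one another, the sketch below derives them from per-tag counts for a single entry. The function name, the dictionary keys, the dot-separated layout, the alphabetical tie-break, and the counts themselves are all assumptions made for this example; they are not taken from the SUBTLEX-US files or from the CLAWS tag set.

```python
from collections import Counter


def pos_columns(tag_counts):
    """Derive the five PoS columns for one entry from its per-tag counts.

    Key names and output layout are hypothetical, chosen for illustration.
    """
    # Rank tags by descending frequency; ties broken alphabetically (an assumption).
    ranked = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(tag_counts.values())
    dom_pos, dom_freq = ranked[0]
    return {
        "dominant_pos": dom_pos,                          # most frequent PoS
        "dominant_pos_freq": dom_freq,                    # its frequency
        "dominant_pos_share": round(dom_freq / total, 2), # relative to entry total
        "all_pos": ".".join(tag for tag, _ in ranked),    # every PoS observed
        "all_pos_freqs": ".".join(str(n) for _, n in ranked),
    }


# Made-up counts for illustration only; not values from SUBTLEX-US.
example = Counter({"Verb": 60, "Noun": 40})
row = pos_columns(example)
print(row)
```

Keeping the full list of tags and counts alongside the dominant PoS means other summaries, such as a sum over a chosen subset of tags, can be computed later without going back to the tagged corpus.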
Subjects

Full text: 1 Database: MEDLINE Main subject: Vocabulary / Algorithms / Language Limits: Humans Language: En Journal: Behav Res Methods Journal subject: BEHAVIORAL SCIENCES Publication year: 2012 Document type: Article Affiliation country: Belgium
