Results 1 - 4 of 4
1.
J Child Lang ; 41(1): 176-99, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23343571

ABSTRACT

Several models of language acquisition have emerged in recent years that rely on computational algorithms for simulation and evaluation. Computational models are formal and precise, and can thus provide mathematically well-motivated insights into the process of language acquisition. Such models are amenable to robust computational evaluation, using technology that was developed for Information Retrieval and Computational Linguistics. In this article we advocate the use of such technology for the evaluation of formal models of language acquisition. We focus on the Traceback Method, proposed in several recent studies as a model of early language acquisition, explaining some of the phenomena associated with children's ability to generalize previously heard utterances and generate novel ones. We present a rigorous computational evaluation that reveals some flaws in the method, and suggest directions for improving it.


Subjects
Language Development , Linguistics/methods , Algorithms , Child , Child Language , Humans , Models, Statistical
2.
Lang Resour Eval ; 47(4): 973-1005, 2013 Dec 01.
Article in English | MEDLINE | ID: mdl-25419199

ABSTRACT

We present a corpus of transcribed spoken Hebrew that reflects spoken interactions between children and adults. The corpus is an integral part of the CHILDES database, which distributes similar corpora for over 25 languages. We introduce a dedicated transcription scheme for the spoken Hebrew data that is sensitive to both the phonology and the standard orthography of the language. We also introduce a morphological analyzer that was specifically developed for this corpus. The analyzer adequately covers the entire corpus, producing detailed correct analyses for all tokens. Evaluation on a new corpus reveals high coverage as well. Finally, we describe a morphological disambiguation module that selects the correct analysis of each token in context. The result is a high-quality morphologically-annotated CHILDES corpus of Hebrew, along with a set of tools that can be applied to new corpora.

3.
J Child Lang ; 37(3): 705-29, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20334720

ABSTRACT

Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes.


Subjects
Databases, Factual , Linguistics , Speech Recognition Software , Adult , Algorithms , Automation , Child , Child Language , Computer Simulation , Humans , Interpersonal Relations , Language , Speech , Speech Production Measurement
4.
Appl Psycholinguist ; 32(1): 93-111, 2010.
Article in English | MEDLINE | ID: mdl-36896256

ABSTRACT

We compare translations of single words, made by bilingual speakers in a laboratory setting, with contextualized translation choices of the same items, made by professional translators and extracted from parallel language corpora. The translation choices in both cases show moderate convergence, demonstrating that decontextualized translation probabilities partially reflect bilinguals' life experience regarding the conditional distributions of alternative translations. Lexical attributes of the target word differ in their ability to predict translation probability: form similarity is a stronger predictor in decontextualized translation choice, whereas word frequency and semantic salience are stronger predictors for context-embedded translation choice. These findings establish the utility of parallel language corpora as important tools in psycholinguistic investigations of bilingual language processing.
