The ParlaMint corpora of parliamentary proceedings.
Erjavec, Tomaz; Ogrodniczuk, Maciej; Osenova, Petya; Ljubesic, Nikola; Simov, Kiril; Pancur, Andrej; Rudolf, Michal; Kopp, Matyás; Barkarson, Starkaður; Steingrímsson, Steinþór; Çöltekin, Çagri; de Does, Jesse; Depuydt, Katrien; Agnoloni, Tommaso; Venturi, Giulia; Pérez, María Calzada; de Macedo, Luciana D; Navarretta, Costanza; Luxardo, Giancarlo; Coole, Matthew; Rayson, Paul; Morkevicius, Vaidas; Krilavicius, Tomas; Darģis, Roberts; Ring, Orsolya; van Heusden, Ruben; Marx, Maarten; Fiser, Darja.
Affiliation
  • Erjavec T; Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia.
  • Ogrodniczuk M; Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland.
  • Osenova P; Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, and Sofia University "St. Kl. Ohridski", Sofia, Bulgaria.
  • Ljubesic N; Department of Knowledge Technologies, Jozef Stefan Institute and Faculty of Computer Science and Informatics, University of Ljubljana, Ljubljana, Slovenia.
  • Simov K; Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria.
  • Pancur A; Institute for Contemporary History, Ljubljana, Slovenia.
  • Rudolf M; Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland.
  • Kopp M; Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic.
  • Barkarson S; The Árni Magnússon Institute for Icelandic Studies, Reykjavík, Iceland.
  • Steingrímsson S; The Árni Magnússon Institute for Icelandic Studies, Reykjavík, Iceland.
  • Çöltekin Ç; University of Tübingen, Tübingen, Germany.
  • de Does J; Dutch Language Institute, The Hague, The Netherlands.
  • Depuydt K; Dutch Language Institute, The Hague, The Netherlands.
  • Agnoloni T; Institute of Legal Informatics and Judicial Systems CNR-IGSG, Florence, Italy.
  • Venturi G; Institute of Computational Linguistics CNR-ILC, Pisa, Italy.
  • Pérez MC; Universitat Jaume I, Castellón de la Plana, Spain.
  • de Macedo LD; Univ. Federal de Minas Gerais, Belo Horizonte, Brazil.
  • Navarretta C; University of Copenhagen, Copenhagen, Denmark.
  • Luxardo G; Univ. Paul Valéry Montpellier 3, Montpellier, France.
  • Coole M; Lancaster University, Lancaster, UK.
  • Rayson P; Lancaster University, Lancaster, UK.
  • Morkevicius V; Kaunas University of Technology, Kaunas, Lithuania.
  • Krilavicius T; Vytautas Magnus University, Kaunas, Lithuania.
  • Darģis R; University of Latvia, Riga, Latvia.
  • Ring O; Centre for Social Sciences, Budapest, Hungary.
  • van Heusden R; Universiteit van Amsterdam, Amsterdam, The Netherlands.
  • Marx M; Universiteit van Amsterdam, Amsterdam, The Netherlands.
  • Fiser D; Arts Faculty, University of Ljubljana, and Institute of Contemporary History, Ljubljana, Slovenia.
Lang Resour Eval ; 57(1): 415-448, 2023.
Article in English | MEDLINE | ID: mdl-35125984
ABSTRACT
This paper presents the ParlaMint corpora, which contain transcriptions of the sessions of 17 European national parliaments and total half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository. The complete corpora are openly available for download via the CLARIN.SI repository, and for online exploration and analysis through the NoSketch Engine and KonText concordancers and the Parlameter interface.
Full text: 1 Database: MEDLINE Language: En Journal: Lang Resour Eval Year: 2023 Document type: Article Country of affiliation: Slovenia