LaVA - Latvian Language Learner corpus

dc.contributor.authorDarģis, Roberts
dc.contributor.authorAuziņa, Ilze
dc.contributor.authorKaija, Inga
dc.contributor.authorLevāne-Petrova, Kristīne
dc.contributor.authorPokratniece, Kristīne
dc.contributor.editorCalzolari, Nicoletta
dc.contributor.editorBechet, Frederic
dc.contributor.editorBlache, Philippe
dc.contributor.editorChoukri, Khalid
dc.contributor.editorCieri, Christopher
dc.contributor.editorDeclerck, Thierry
dc.contributor.editorGoggi, Sara
dc.contributor.editorIsahara, Hitoshi
dc.contributor.editorMaegaard, Bente
dc.contributor.editorMariani, Joseph
dc.contributor.editorMazo, Helene
dc.contributor.editorOdijk, Jan
dc.contributor.editorPiperidis, Stelios
dc.contributor.institutionRīga Stradiņš University
dc.date.accessioned2023-01-04T11:25:01Z
dc.date.available2023-01-04T11:25:01Z
dc.date.issued2022
dc.descriptionFunding Information: The work reported in this paper is part of the project Development of Learner Corpus of Latvian: methods, tools and applications (Project No. lzp-2018/1-0527), which has been implemented at the Institute of Mathematics and Computer Science, University of Latvia (IMCS UL) since September 2018. The project is financed by the Latvian Council of Science. This work is also part of the Latvian State Research Programme Letonika - Fostering a Latvian and European Society project Research on Modern Latvian Language and Development of Language Technology (No. VPP-LETONIKA-2021/1-0006) and has received financial support from the Latvian Language Agency through grant agreement No. 4.6/2019-029. Publisher Copyright: © European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0.
dc.description.abstractThis paper presents the Latvian Language Learner Corpus (LaVA), developed at the Institute of Mathematics and Computer Science, University of Latvia. The LaVA corpus contains 1015 essays (190k tokens and 790k characters, excluding whitespace) written by foreigners studying at Latvian higher education institutions who are learning Latvian as a foreign language in their first or second semester and are reaching the A1 (possibly A2) Latvian language proficiency level. The corpus has morphological and error annotations. The paper also provides an error analysis and statistics of the LaVA corpus. The corpus is publicly available at: http://www.korpuss.lv/id/LaVA.en
dc.description.statusPeer reviewed
dc.format.extent5
dc.format.extent1866703
dc.identifier.citationDarģis, R, Auziņa, I, Kaija, I, Levāne-Petrova, K & Pokratniece, K 2022, LaVA - Latvian Language Learner corpus. in N Calzolari, F Bechet, P Blache, K Choukri, C Cieri, T Declerck, S Goggi, H Isahara, B Maegaard, J Mariani, H Mazo, J Odijk & S Piperidis (eds), 13th Language Resources and Evaluation Conference, LREC 2022 : Proceedings. European Language Resources Association (ELRA), pp. 727-731, 13th International Conference on Language Resources and Evaluation, LREC 2022, Marseille, France, 20/06/22. < https://aclanthology.org/2022.lrec-1.77 >
dc.identifier.citationconference
dc.identifier.isbn9791095546726
dc.identifier.otherMendeley: abb50f8f-0711-34e0-97fa-973e21cdeb28
dc.identifier.urihttps://dspace.rsu.lv/jspui/handle/123456789/10007
dc.identifier.urlhttp://www.scopus.com/inward/record.url?scp=85144354957&partnerID=8YFLogxK
dc.identifier.urlhttps://aclanthology.org/2022.lrec-1.77
dc.language.isoeng
dc.publisherEuropean Language Resources Association (ELRA)
dc.relation.ispartof13th Language Resources and Evaluation Conference, LREC 2022
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectacquisition
dc.subjectannotated
dc.subjectLatvian
dc.subjectlearner corpus
dc.subject5.3 Educational sciences
dc.subject6.2 Languages and Literature
dc.subject3.1. Articles or chapters in proceedings/scientific books indexed in Web of Science and/or Scopus database
dc.subjectLanguage and Linguistics
dc.subjectLibrary and Information Sciences
dc.subjectLinguistics and Language
dc.subjectEducation
dc.titleLaVA - Latvian Language Learner corpus
dc.type/dk/atira/pure/researchoutput/researchoutputtypes/contributiontobookanthology/conference

Files

Original bundle
Name:
LaVA_Latvian_Language_Learner_corpus.pdf
Size:
1.78 MB
Format:
Adobe Portable Document Format