Finnish OpenSubtitles 2017, source
Description
The Finnish OpenSubtitles 2017 source material corpus is available for download.
The corpus contains Finnish subtitles for movies and TV series from http://www.opensubtitles.org/. The corpus is a derivative of the [OPUS OpenSubtitles2018](http://opus.nlpl.eu/OpenSubtitles2018.php) multilingual corpus. Information on how the material was processed, up to and including sentence splitting, can be found in the original publication, Lison & Tiedemann (2016). The corpus has been tokenized and annotated with morpho-syntactic analyses produced with the [Turku Dependency Parser](http://turkunlp.github.io/Finnish-dep-parser/).
P. Lison and J. Tiedemann (2016). OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016).
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Year of publication
2019
Type of data
Authors
University of Helsinki - Publisher
Tatu Huovilainen - Rights holder, Creator
User support FIN-CLARIN - Curator
Project
Other information
Fields of science
Languages
Language
Finnish
Open access
Open