The Suomi 24 Corpus 2001-2020, VRT version 1.1
Description
The corpus is available for download from Kielipankki, the Language Bank of Finland.
This collection contains two downloadable sets of Suomi24 data: "The Suomi24 Corpus 2001-2017, VRT version" and "The Suomi24 Corpus 2018-2020, VRT version".
Together, the two corpora cover all the discussion forums of the Suomi24 online social networking website from 1st January 2001 to 31st December 2020.
Updates:
2025-04-14: For version 1.1, the data was updated with two new annotation layers: names recognized with FiNER 1.6, and the language of each sentence as identified with HeLI-OTS 2.0.
Year of publication
2021
Type of data
Authors
City Digital Group - Creator
University of Helsinki - Publisher
User support FIN-CLARIN - Curator
Project
Other information
Fields of science
Languages
Language
Finnish
Open access
Restricted access