Clotho dataset
Description
Clotho is a novel audio captioning dataset consisting of 4981 audio samples, each with five captions (24 905 captions in total). Audio samples are 15 to 30 s long, and captions are 8 to 20 words long.
Year of publication
2021
Type of data
Authors
Konstantinos Drosos - Creator
Samuel Lipping - Creator
Tuomas Virtanen - Creator
Zenodo - Publisher
Project
Other information
Fields of science
Computer and information sciences
Language
English
Open access
Open