Clotho dataset

Description

Clotho is a novel audio captioning dataset consisting of 4981 audio samples, each with five captions, for a total of 24 905 captions. Audio samples are 15 to 30 s long, and captions are 8 to 20 words long.

Year of publication

2021

Type of data

Authors

Konstantinos Drosos - Creator

Samuel Lipping - Creator

Tuomas Virtanen - Creator

Zenodo - Publisher

Project

Other information

Fields of science

Computer and information sciences

Language

English

Open access

Open

License

Other

Keywords

Computer and information sciences

Subject headings

Related to this research data