Clotho Analysis Set

Description

This dataset is derived from the evaluation subset of the Clotho dataset (https://zenodo.org/doi/10.5281/zenodo.3490683). It is designed to analyze the behavior of captioning systems under specific perturbations, in order to identify open challenges in automated audio captioning. The original audio clips are transformed with audio_degrader. The following transformations are applied:
Microphone response simulation
Mixup with another clip from the dataset (ratios of -6 dB, -3 dB, and 0 dB)
Additive noise from DESED (ratios of -12 dB, -6 dB, and 0 dB)
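
The mixup and additive-noise transformations both amount to adding a second signal at a fixed level relative to the original clip. The sketch below illustrates that idea with NumPy. It is not the audio_degrader implementation. It assumes the stated ratio is the RMS level of the original clip relative to the added signal, in dB, which the description does not confirm. The function names (`rms`, `interference_gain`, `mix_at_ratio`) are hypothetical.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def interference_gain(clip, interference, ratio_db):
    """Gain to apply to `interference` so that
    20*log10(rms(clip) / rms(gain * interference)) equals `ratio_db`.

    Assumption: the ratio is clip level over interference level, in dB.
    """
    interference_rms = rms(interference)
    if interference_rms == 0.0:
        return 0.0  # silent interference: nothing to scale
    return rms(clip) / (interference_rms * 10.0 ** (ratio_db / 20.0))


def mix_at_ratio(clip, interference, ratio_db):
    """Add `interference` (another clip or a noise recording) to `clip`
    at the requested level ratio. Both inputs are truncated to the
    shorter length; no clipping protection is applied."""
    n = min(len(clip), len(interference))
    clip = np.asarray(clip[:n], dtype=np.float64)
    interference = np.asarray(interference[:n], dtype=np.float64)
    gain = interference_gain(clip, interference, ratio_db)
    return clip + gain * interference


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal(44100)          # stand-in for a Clotho clip
    noise = 0.1 * rng.standard_normal(44100)   # stand-in for a DESED noise clip
    for ratio_db in (-12.0, -6.0, 0.0):
        mixed = mix_at_ratio(clip, noise, ratio_db)
        print(ratio_db, mixed.shape)
```

Under this convention, a ratio of 0 dB places the added signal at the same RMS level as the original clip, and negative ratios make the added signal louder than the original.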
Year of publication

2022

Type of data

Authors

Huang Xie - Creator

Konstantinos Drossos - Creator

Samuel Lipping - Creator

Tuomas Virtanen - Creator

Unknown organization

Felix Gontier - Creator

Romain Serizel - Creator

Zenodo - Publisher

Project

Other information

Fields of science

Computer and information sciences

Language

English

Open access

Open

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Keywords

Computer and information sciences

Subject headings

Related to this research data