MACS - Multi-Annotator Captioned Soundscapes

Description

This dataset contains audio captions and corresponding audio tags for 3930 audio files from the TAU Urban Acoustic Scenes 2019 development dataset (airport, public square, and park). The files were annotated using a web-based tool. Each file was annotated by multiple annotators, who each provided tags and a one-sentence description of the audio content. The data also includes annotator competence estimated using MACE (Multi-Annotator Competence Estimation).

The annotation procedure, processing and analysis of the data are presented in the following papers:

Irene Martin-Morato, Annamaria Mesaros. What is the ground truth? Reliability of multi-annotator data for audio tagging, 29th European Signal Processing Conference, EUSIPCO 2021

Irene Martin-Morato, Annamaria Mesaros. Diversity and bias in audio captioning datasets, submitted to DCASE 2021 Workshop (to be updated with arxiv link)

Data is provided as two files:

MACS.yaml - containing the complete annotations in the following format:

    - filename: file1.wav
      annotations:
        - annotator_id: ann_1
          sentence: caption text
          tags:
            - tag1
            - tag2
        - annotator_id: ann_2
          sentence: caption text
          tags:
            - tag1

MACS_competence.csv - containing the estimated annotator competence; for each annotator_id in the yaml file, competence is a number between 0 (considered as annotating at random) and 1, stored in the following format:

    id [tab] competence

The audio files can be downloaded from https://zenodo.org/record/2589280 and are covered by their own license.
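As an illustration of how the two files relate, the following minimal sketch joins annotations with annotator competence and sums competence per tag for each audio file. It is not the authors' method, only one possible use of the data. The in-memory sample values, the function names `read_competence` and `weighted_tag_scores`, and the idea of competence-weighted tag scores are all assumptions made for this example. The sketch also assumes the competence file may or may not start with a header row, so rows whose second column is not a number are skipped.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample mirroring the structure that PyYAML's
# yaml.safe_load would return for MACS.yaml. Values are made up.
annotations = [
    {
        "filename": "file1.wav",
        "annotations": [
            {"annotator_id": "ann_1", "sentence": "caption text",
             "tags": ["tag1", "tag2"]},
            {"annotator_id": "ann_2", "sentence": "caption text",
             "tags": ["tag1"]},
        ],
    }
]

# Hypothetical contents of MACS_competence.csv: id [tab] competence.
competence_text = "ann_1\t0.75\nann_2\t0.25\n"


def read_competence(stream):
    """Parse tab-separated (id, competence) rows into a dict.

    Rows that are too short or whose competence is not numeric
    (for example a header row) are skipped.
    """
    competence = {}
    for row in csv.reader(stream, delimiter="\t"):
        if len(row) < 2:
            continue
        try:
            competence[row[0]] = float(row[1])
        except ValueError:
            continue
    return competence


def weighted_tag_scores(entries, competence):
    """Sum annotator competence per tag for each audio file.

    Annotators missing from the competence table contribute 0.
    """
    scores = {}
    for entry in entries:
        totals = defaultdict(float)
        for ann in entry["annotations"]:
            weight = competence.get(ann["annotator_id"], 0.0)
            for tag in ann.get("tags") or []:
                totals[tag] += weight
        scores[entry["filename"]] = dict(totals)
    return scores


competence = read_competence(io.StringIO(competence_text))
scores = weighted_tag_scores(annotations, competence)
print(scores)
```

To run this on the real files, the sample list could be replaced with the result of loading MACS.yaml through a YAML parser, and `io.StringIO(competence_text)` with an open handle to MACS_competence.csv.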

Year of publication

2021

Type of data

Authors

Annamaria Mesaros - Creator

Irene Martin Morato - Creator

Zenodo - Publisher

Project

Other information

Fields of science

Computer and information sciences

Language

English

Open access

Open

License

Other

Keywords

Computer and information sciences

Subject headings


Related to this research data