Dataset of Audiovisual Speech for AR Telepresence Studies (Speech Recordings)

Description

Dataset of speech recordings made in the anechoic chamber "Lampio" at Aalto University, with 21 participants ("P1" - "P21"). Four parts are included:

1) Conversations: Ten different scripted three-part conversations ("C1" - "C10"). Each participant is in two of them. All three parts are played by all three participants ("S1" - "S3"). See assignment_conversations.xlsx.
2) Harvard_Sets: Sets 25 and 36 of the Harvard sentence lists.
3) Sentence 1 from List 25 at five different voice levels (from "barely not whispering" to "screaming as loud as you can").
4) Native_Language: List 25 translated into the native languages of 12 of the participants (French, Finnish, Hebrew, Hindi, Spanish (Mexico), Spanish (Chile), Catalan, Latvian, Italian, Polish, Romanian, German).

Each file contains data from three receivers:

Ch 1: GRAS 40 HF 1" low-noise measurement microphone, 1.5 m away from the subject
Ch 2: RØDE NT1 large-diaphragm condenser microphone, 2 m away from the subject
Ch 3: DPA 4060, attached to the subject's clothes

Calibration.wav: Calibration data for Ch 1 (recorded using a B&K 4231 calibrator, 1 kHz, 94 dB).

Accompanying video data can be obtained by personal request from nils.meyer-kahlen@aalto.fi.
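As an illustration of how the calibration recording might be used, the following is a minimal sketch that converts Ch 1 samples to a sound pressure level. It assumes that the 94 dB calibrator level is dB SPL re 20 µPa, and that the calibration tone and the speech recording were captured with the same gain. The function names, the 48 kHz sample rate, and the synthetic signals are hypothetical and are not part of the dataset; loading the WAV files is left out because the record does not state their sample rate or bit depth.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (0 dB SPL)


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def calibration_factor(cal_samples, cal_level_db=94.0):
    """Pascals per digital unit, derived from a calibrator tone.

    Assumes cal_level_db is the calibrator level in dB SPL re 20 uPa.
    """
    target_rms_pa = P_REF * 10 ** (cal_level_db / 20)
    return target_rms_pa / rms(cal_samples)


def spl_db(samples, factor):
    """Sound pressure level in dB SPL of samples scaled by factor."""
    return 20 * math.log10(rms(samples) * factor / P_REF)


def sine(amplitude, freq_hz, fs, seconds):
    """Synthetic sine tone, used here as a stand-in for real recordings."""
    n = int(fs * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]


fs = 48000  # assumed sample rate, not stated in the dataset description
cal_tone = sine(0.1, 1000.0, fs, 1.0)      # stand-in for Calibration.wav
speech_like = sine(0.01, 1000.0, fs, 1.0)  # stand-in for Ch 1, 20 dB below the tone

factor = calibration_factor(cal_tone)
level = spl_db(speech_like, factor)
print(round(level, 2))  # 74.0
```

With real data, cal_tone would be replaced by the samples of Calibration.wav and speech_like by the first channel of a recording. The factor applies to Ch 1 only, since the calibration data was recorded for that microphone.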

Year of publication

2025

Type of data

Authors

Aalto University

Department of Information and Communications Engineering

Anja Hofmann - Creator

Nils Meyer-Kahlen - Creator

Tapio Lokki - Creator

Friedrich-Alexander-Universität Erlangen-Nürnberg

Sebastian Schlecht - Creator

Zenodo - Publisher

Project

Other information

Fields of science

Electronic, automation and communications engineering, electronics

Language

Open access

Restricted access

License

Other

Keywords

Subject headings

Temporal coverage
