Predictive Processing Approach to Modelling Prosodic Hierarchy for Speech Synthesis

Description of the granted funding

The project addresses one of the outstanding issues in speech technology: synthesizing prosodically coherent and cohesive conversational speech that appropriately incorporates long-distance contextual and situational dependencies. The objective is to deliver a novel speech synthesis platform that explicitly uses encoded prosodic information as a source of conversation dynamics, helping to maintain context-dependent cohesion and coherence in human-machine interaction and synthesized dialogues. Reciprocally, the system, trained on a large data set of conversational speech, will be reinterpreted as a complex statistical model and will contribute to our theoretical understanding of the features and wide-ranging interdependencies that shape conversational prosody. The design of the system will be informed by our expertise in prosodic analysis and deep learning.

Starting year

2023

End year

2027

Granted funding

Juraj Simko

University of Helsinki

499 819 €

Funder

Research Council of Finland

Funding instrument

Academy projects

Call

Academy Project Funding 2022

Other information

Funding decision number

357262

Fields of science

Languages

Research fields

Phonetics

Identified topics

languages, linguistics, speech