Congratulations to the papers “Let’s face it” and “Gesticulator”

winners of best paper awards at IVA 2020 and ICMI 2020

Published Nov 23, 2020

EECS doctoral students have received awards at prestigious conferences. The research behind the papers focuses on a machine-learning method for generating meaningful gestures for a virtual avatar in a computer game or a cartoon (“Gesticulator”), and on methods for generating head and face motions for virtual agents engaged in a conversation with a conversational partner (“Let’s face it”).

We have spoken to the first authors of each paper, doctoral students Patrik Jonell and Taras Kucherenko, about their research and what winning these awards means to them.

The contributing team

A team consisting of Professor Gustav Eje Henter, Professor Jonas Beskow, Professor Hedvig Kjellström, Associate Professor Iolanda Leite, doctoral student Sanne van Waveren and postdoc Simon Alexanderson contributed to the papers in different ways (more below on their different roles).

Taras Kucherenko: The ICMI 2020 Best Paper Award, for the paper "Gesticulator: A framework for semantically-aware speech-driven gesture generation"

Tell us about your paper and your research in general.

Authors' contributions

“My research is on machine learning models for non-verbal behaviour generation, such as hand gestures and facial expressions. I mainly focus on hand gesture generation. In this paper we developed a new machine learning model for automatic gesture generation. While previous models for this task used only one modality of speech, either audio or text, we use both modalities together to generate corresponding gestures. Such a model could be used to generate gestures for a virtual avatar in a computer game or a cartoon.”

How does it feel to have received this award?

“It feels like my life is too good to be true! I am quite overwhelmed and still struggle to digest what has just happened, especially since I was the second author on another paper, which also received the Best Paper Award at IVA just a week before. It seems that I am in a Truman show and somebody presses the button "Even More Recognition" every week ... I think this Best Paper Award is my biggest professional achievement so far.”

What do you believe made your paper the winning paper?

“To be honest, that's still a mystery to me! I think it's because I was the first to do something that most people in my field wanted to do: to add semantic information to an end-to-end co-speech gesture generation model.”

In what way is your area of research interesting for a non-researcher and for the future?

“Human communication is to a large extent non-verbal. While talking, people spontaneously gesticulate, which plays a key role in conveying information. If we want interaction with social agents (such as robots or virtual avatars) to be natural and smooth, we need to enable them to gesticulate as well. "Gesticulator" is the first step toward generating meaningful gestures by a machine-learning method.”

Patrik Jonell: The IVA 2020 Best Paper Award, for the paper "Let’s face it: Probabilistic multi-modal interlocutor-aware generation of facial gestures in dyadic settings"

What does winning this award mean to you?
“I feel very honoured by the fact that the IVA (Intelligent Virtual Agents) community appreciated and recognised our work! We worked quite hard on this publication, so it feels really good that we got this award.”

Tell us a little about your paper.
“In our paper we describe a method for generating head and face motions for a virtual agent engaged in a conversation with a conversational partner. Specifically, we trained a machine learning model which is able to take head movements, facial expressions, and speech from the conversational partner into account when generating the head and face motions for the agent. We showed through a series of experiments that our method seems to be able to successfully use this information to create more appropriate non-verbal behaviours for the agent.”

What is your area of research and why is it relevant for society?
“My area of research is about developing machine learning methods for social agents (for example virtual agents or social robots) to adapt their non-verbal behaviour to the conversational partner. Most social agents today do not take into consideration how the user reacts or behaves, but instead only the words they say. This is quite far off from how we as humans communicate with each other. If I completely ignored how my friend reacts when I talk to them, I wouldn't be a very nice person to talk to! Bringing these dynamic behaviours into social agents has been shown to, for example, increase trust in the agent and improve the interaction, which can be extremely important in health care applications or in any situation where trust is of importance. Considering that we will most likely see more and more social agents in various parts of society, being able to communicate better with them is quite important.”