The research team in the IST project ERMIS, which focused on linguistic and paralinguistic cues in human speech and finished at the end of December 2004, created a prototype able to analyse the emotional content of user input and respond to it. The team included researchers with skills ranging from engineering and computer science to psychology and human communication.
"In looking for emotional cues in language, we worked on three major inputs to the system," explains Stefanos Kollias of the National Technical University of Athens. "Linguistic analysis of speech in English and Greek, work on paralinguistic features such as intonation and emphasis, and the study of facial expressions."
In the analysis phase, the team extracted some 400 features of common speech, then selected around 20-25 as the most important in expressing emotion. These features were then fed into a neural network architecture that combined the speech, paralinguistic and facial-expression inputs. For facial expressions, some 19 features were selected as the most relevant and fed into the network in the same way.
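As an illustration of this kind of multimodal fusion, the sketch below concatenates a vector of selected speech/paralinguistic features with a vector of facial-expression features and passes them through a small feed-forward network that scores a handful of emotion classes. The feature counts, layer sizes, emotion labels and (untrained) weights are assumptions for demonstration only, not the architecture ERMIS actually used.

# Minimal sketch of multimodal emotion classification by feature fusion.
# All dimensions, labels and weights here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_SPEECH = 22      # ~20-25 speech/paralinguistic features kept after selection (assumed count)
N_FACIAL = 19      # facial-expression features, as reported by the project
EMOTIONS = ["angry", "happy", "sad", "bored", "neutral"]  # assumed label set

# Randomly initialised weights stand in for a trained network.
W1 = rng.normal(size=(N_SPEECH + N_FACIAL, 16))
b1 = np.zeros(16)
W2 = rng.normal(size=(16, len(EMOTIONS)))
b2 = np.zeros(len(EMOTIONS))

def classify(speech_feats, facial_feats):
    """Fuse the two modalities by concatenation and score each emotion class."""
    x = np.concatenate([speech_feats, facial_feats])
    h = np.tanh(x @ W1 + b1)                 # shared hidden layer
    logits = h @ W2 + b2
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax over emotion classes
    return dict(zip(EMOTIONS, probs.round(3)))

# Dummy feature vectors; real input would come from the speech/paralinguistic
# and facial-analysis front ends.
print(classify(rng.normal(size=N_SPEECH), rng.normal(size=N_FACIAL)))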
The results of this analysis were incorporated into a prototype system with several on-screen characters, each of which was capable of reacting to and reproducing the emotional content of speech and facial expressions. By interacting with their human subjects, these computer characters would attempt to make the user angry, happy, sad or even bored, sometimes with great success, says Kollias.
He emphasises, however, that the team did not focus only on extreme emotions. "We tried to develop real-life situations, with the language and facial reactions that expressed everyday emotions over a wide range. For example, feeling positive and eager to participate, or negative and less motivated."
The result of the ERMIS team’s work is what they call the "sensitive artificial listener," a computer character capable of much more realistic expression of emotion when communicating with humans. The project partners have taken these results and are now analysing them with a view to incorporating them into their own products. BT, for example, is very interested in how the results could be used within its call-centre technologies. Nokia, another partner, is investigating the possibility of building such abilities into its multimedia mobile phones. Partner Eyetronics is incorporating what has been learnt into its own 3D virtual models, in order to enhance the modelling of facial movements in virtual characters.
"Our work has shown that combined AV [audiovisual] and speech analysis is both feasible and has the potential for incorporation into working applications," says Kollias. The project results have also lead to a follow-on initiative, he says, the four-year HUMAIN (FP6) project. Kollias emphasises, however, that it is too soon to judge the full extent of interest in the project results, as they are still being presented at conferences around the world.
PLEASE MENTION IST RESULTS AS THE SOURCE OF THIS STORY AND, IF PUBLISHING ONLINE, PLEASE HYPERLINK TO: http://istresults.cordis.lu/
Contact: Tara Morris, +32-2-2861985, tmorris at gopa-cartermill.com
Emotional intelligence for computer-based characters?
Company: National Technical University of Athens
Contact Name: Professor Stefanos Kollias
Contact Email: stefanos@cs.ntua.gr
Contact Phone: +30 210 7722488