Mejora del Funcionamiento de Sistemas de Diálogo Hablado Mediante Reconocimiento del Estado Emocional de Usuarios

Authors:

  1. López-Cózar Delgado, Ramón
  2. Silovsky, Jan
  3. Griol Barres, David
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2010

Issue: 45

Pages: 191-198

Type: Article

Abstract

In this paper we propose a new technique to enhance the performance of spoken dialogue systems by means of recognising users’ emotional states. The technique employs two fusion modules that combine emotional predictions. The first applies a number of fusion methods to combine the predictions made by classifiers that deal with different types of information about each sentence uttered by the user. These predictions are the input to the second fusion module, which applies one fusion method to combine them and obtain the most likely emotional category. This category represents the final decision of our technique regarding the emotional state of the user. We have carried out experiments considering two emotional categories (‘Non-negative’ and ‘Negative’) and classifiers that deal with prosodic, acoustic, lexical and dialogue-act information. The results obtained with an emotional corpus collected at our University show that the first fusion module clearly outperforms both the individual classifiers and a baseline system. The second fusion module, which represents the novelty of our study, improves the accuracy of the first fusion module by a further 2.25% absolute.
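
The abstract does not describe the implementation, so the following Python sketch is only a hypothetical illustration of the two-stage fusion it outlines. All names and method choices here are assumptions rather than the paper's design: each classifier (prosody, acoustics, lexical items, dialogue acts) is assumed to emit a probability vector over the two emotional categories, the first fusion module is assumed to apply standard combination rules (average, product, maximum), and the second module is assumed to average the first-stage outputs before selecting the most likely category.

# Hypothetical sketch (not the paper's implementation) of two-stage
# fusion for binary emotion recognition.

from math import prod

CATEGORIES = ("Non-negative", "Negative")

def fuse_average(predictions):
    # Arithmetic mean of the probability assigned to each category.
    return [sum(p[i] for p in predictions) / len(predictions)
            for i in range(len(CATEGORIES))]

def fuse_product(predictions):
    # Product rule, renormalised so the result is again a distribution.
    raw = [prod(p[i] for p in predictions) for i in range(len(CATEGORIES))]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

def fuse_maximum(predictions):
    # Maximum rule, renormalised.
    raw = [max(p[i] for p in predictions) for i in range(len(CATEGORIES))]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

FIRST_STAGE_METHODS = (fuse_average, fuse_product, fuse_maximum)  # assumed set

def classify_emotion(classifier_outputs):
    # First fusion module: several methods combine the per-classifier vectors.
    first_stage = [method(classifier_outputs) for method in FIRST_STAGE_METHODS]
    # Second fusion module: one method (here, averaging) combines the
    # first-stage predictions and yields the most likely emotional category.
    final = fuse_average(first_stage)
    return CATEGORIES[final.index(max(final))], final

if __name__ == "__main__":
    # Made-up per-sentence probability vectors [P(Non-negative), P(Negative)]
    # from the four classifiers mentioned in the abstract.
    outputs = [
        [0.70, 0.30],  # prosody
        [0.55, 0.45],  # acoustics
        [0.40, 0.60],  # lexical items
        [0.65, 0.35],  # dialogue acts
    ]
    label, scores = classify_emotion(outputs)
    print(label, scores)

Averaging in the second stage is only one plausible choice; the abstract states merely that a single fusion method combines the first-stage predictions into the final decision.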
