Emotions, conversational systems and heterogeneous data sources

Eisman Cabeza, Eduardo Manuel

Supervised by:

Juan Luis Castro Peña (Supervisor)

Defending university: Universidad de Granada

Defense date: 18 December 2015

Examination committee:

Juan Carlos Cubero Talavera (Chair)
Héctor Pomares Cintas (Secretary)
María Isabel Navarro Jiménez (Member)
Alejandro Moreo Fernández (Member)
Pablo Carmona del Barco (Member)

Department:

Computer Science and Artificial Intelligence

Type: Thesis

Teseo: 393930

Abstract

Conversational agents are intelligent systems, usually represented by a real character or a cartoon, that can hold a natural language conversation with a human being, interact with their environment, and behave as a real person would. The main objective of this technology is to make human-computer interaction easier. It can be used in any situation where people communicate with each other, with the agent playing the role of any of them. Conversational agents are increasingly used in different application domains, from agents that simulate the behavior of real people for training in different fields to systems that act as natural language query interfaces to large information sources. In short, a wide range of areas, such as culture, entertainment, tourism, e-learning, e-commerce, or medicine, can greatly benefit from this technology.

From the users' point of view, many studies (Kaufmann and Bernstein, 2010; Zhou et al., 2012) reveal a clear preference for natural language interfaces over more traditional ones such as the keyword search of classic search engines, formal query languages, or menu-driven interaction. In addition, it has been shown that users' interest decreases exponentially as the number of mouse clicks increases, an effect that is even more pronounced on mobile devices, where traditional input interfaces are very limited. Natural language systems, by contrast, are able to ease and improve the user experience.

This thesis is not exclusively theoretical; it has a strong practical component. It provides solutions to three real problems posed by the information society, within the field of conversational agents and natural language processing, which we have faced in different projects. The first is improving the naturalness of this kind of system.
This problem is motivated by the increasing interest in developing intelligent systems that simulate the behavior of human beings. These agents must behave with a high level of realism, so they must be able to react emotionally to the events that happen in the world they live in. This highlights the need for an emotional state control system that guarantees that agents behave consistently and adapt naturally to different situations. Although some emotional state control systems exist (Yanaru et al., 1994; El-Nasr et al., 2000; Egges et al., 2003), their unmet or improvable requirements (realism, gradualness, personality, emotional filtering, temporal decay, reuse, adaptability, interpretability, non-determinism, emotional stability, memory, conduct types, agents' interaction, agent's health…) made evident the importance of developing a new system that could fully solve our problem.

The second problem is the information overload that exists on the Internet today. Some websites, despite having a good structure, organization, and design, handle such an amount of information that it is sometimes difficult to find a specific piece of information of interest. It was therefore clearly necessary to define a new, highly efficient and effective mechanism for organizing and accessing information. This is where virtual assistants have proven to be the most effective technology in recent years, because they are able to interpret and answer complex natural language questions.
Although the first systems in the literature were closer to tools that did not allow natural language questions (Lieberman, 1995), these systems gradually gained popularity (Microsoft's agents, Ball et al., 1997), adopted more advanced techniques such as case-based reasoning, neural networks, or graphs (Wexelblat and Maes, 1999), and kept evolving to include features similar to those found in today's conversational agents (Cassell et al., 2000). However, the missing and improvable features (e.g. a virtual character able to show emotions that make the interaction friendlier, or the ability to answer general-domain questions in several languages, generate dynamic answers whose content depends on certain restrictions, or guide the user through a certain process) led us to propose a methodology and a framework for designing closed-domain virtual assistants that can be integrated into any existing website and cover the features our problem required.

Finally, the third problem we have faced is access to heterogeneous data sources. More and more applications integrate information from different origins or services in order to solve a problem. However, the way of accessing that information often turns out to be complicated, unnatural, and inefficient for users. This situation arises in many e-commerce portals. To solve this problem, we needed a natural language system able to integrate, transparently for the user, the knowledge held in several independent data sources, each of which could have a different format and structure. It had to be interactive, since users' requests are sometimes vague and need to be completed step by step with additional information, which can readily be provided in a dialog with the system.
Likewise, it was essential that it could handle fuzzy concepts such as "cheap" or "recent" and temporal queries that involve relative dates. Finally, the system had to be able to advise users according to their preferences.

Most existing systems focus on a very specific part of the problem. Some, like NaLIR (Li and Jagadish, 2014b), have interesting features as natural language interfaces: they support complex SQL queries, but they use just one knowledge source and lack conversational capabilities. Others, like ORAKEL (Cimiano et al., 2008), focus on minimizing the effort of adapting the system to a given domain. Systems like Natural Language Assistant (Chai et al., 2002), on the other hand, do include a dialog manager, but they neither support general-domain queries nor work with different types of data sources at the same time. A commercial system like IKEA's Anna is much more complete on the conversational side, but not as advanced in its interaction with the database (she does not handle fuzzy or temporal queries), and she does not seem to include a full emotional model. These and other problems, such as cleansing and filtering the information from the database, using taxonomies that enrich that information, or including an advising module that gives users personalized advice, are the ones we try to solve with the design of a new system for accessing heterogeneous data sources.

The main objective of this thesis is therefore to provide solutions to a series of interesting open problems in the domain of conversational systems:

• To build an emotion modeling system that is easy to interpret and modify and that improves the naturalness of conversational agents. Our hypothesis is that fuzzy rule-based systems would provide a good solution to this problem (Eisman et al., 2009).
• To provide immediate and precise access to large amounts of dynamic information related to a given domain. We believe that the use of semantic structures such as ontologies, together with rules and restrictions applied during the decision-making process, would make it possible to organize and exploit all the existing knowledge in the domain (Eisman et al., 2012).
• To combine different heterogeneous data sources so that users can access information in an integrated, transparent, effective, efficient, and pleasant way. We consider that multi-agent systems able to combine expert agents and decision agents would solve this problem in a modular and scalable way (Eisman et al., 2015).

Moreover, to achieve these objectives not only in theory, we set out to develop systems of commercial interest that put the acquired knowledge and the designed models into practice and can be used to solve real problems.

References

Ball, Gene, Dan Ling, David Kurlander, John Miller, David Pugh, Tim Skelly, Andy Stankosky, David Thiel, Maarten Van Dantzich, and Trace Wax (1997). Lifelike Computer Characters: the Persona project at Microsoft Research. Cambridge, MA, USA: MIT Press, pp. 191–222. isbn: 0-262-52234-9.
Cassell, Justine, Joseph Sullivan, S. Prevost, and Elizabeth Churchill (2000). Embodied conversational agents. Cambridge, MA, USA: MIT Press. isbn: 9780262032780.
Chai, Joyce, Veronika Horvath, Nicolas Nicolov, Margo Stys, Nanda Kambhatla, Wlodek Zadrozny, and Prem Melville (2002). Natural Language Assistant: A Dialog System for Online Product Recommendation. AI Magazine 23.2, p. 63. doi: 10.1609/aimag.v23i2.1641.
Cimiano, Philipp, Peter Haase, Jörg Heizmann, Matthias Mantel, and Rudi Studer (2008). Towards portable natural language interfaces to knowledge bases - The case of the ORAKEL system. Data & Knowledge Engineering 65.2. Including Special Section: 3rd XML Schema and Data Management Workshop (XSDM 2006) - Five selected and extended papers, pp. 325–354. issn: 0169-023X. doi: 10.1016/j.datak.2007.10.007.
Egges, Arjan, Sumedha Kshirsagar, and Nadia Magnenat-Thalmann (2003). A Model for Personality and Emotion Simulation. Proceedings of the 7th International Conference on Knowledge-Based Intelligent Information and Engineering Systems 2003. Oxford, UK, September 3-5. Vol. 2773, Part 1. Conference Code: 63855, pp. 453–461. issn: 0302-9743. doi: 10.1007/978-3-540-45224-9_63.
Eisman, Eduardo M., Víctor López, and Juan Luis Castro (2009). Controlling the emotional state of an embodied conversational agent with a dynamic probabilistic fuzzy rules based system. Expert Systems with Applications 36.6, pp. 9698–9708. issn: 0957-4174. doi: 10.1016/j.eswa.2009.02.015.
Eisman, Eduardo M., Víctor López, and Juan Luis Castro (2012). A framework for designing closed domain virtual assistants. Expert Systems with Applications 39.3, pp. 3135–3144. issn: 0957-4174. doi: 10.1016/j.eswa.2011.08.177.
Eisman, Eduardo M., María Navarro, and Juan Luis Castro (2015). A multi-agent conversational system with heterogeneous data sources access. Submitted (to be published).
El-Nasr, Magy Seif, John Yen, and Thomas R. Ioerger (2000). FLAME – Fuzzy Logic Adaptive Model of Emotions. Autonomous Agents and Multi-Agent Systems 3.3, pp. 219–257. doi: 10.1023/A:1010030809960.
Kaufmann, Esther and Abraham Bernstein (2010). Evaluating the usability of natural language query languages and interfaces to Semantic Web knowledge bases. Web Semantics: Science, Services and Agents on the World Wide Web 8.4. Semantic Web Challenge 2009; User Interaction in Semantic Web research, pp. 377–393. issn: 1570-8268. doi: 10.1016/j.websem.2010.06.001.
Li, Fei and H. V. Jagadish (2014b). NaLIR: An Interactive Natural Language Interface for Querying Relational Databases. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. SIGMOD '14. Snowbird, Utah, USA: ACM, pp. 709–712. isbn: 978-1-4503-2376-5. doi: 10.1145/2588555.2594519.
Lieberman, Henry (1995). Letizia: An Agent That Assists Web Browsing. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95). Ed. by Chris S. Mellish. Montreal, Quebec, Canada: Morgan Kaufmann Publishers Inc., San Mateo, CA, USA, pp. 924–929.
Wexelblat, Alan and Pattie Maes (1999). Footprints: History-Rich Tools for Information Foraging. CHI '99: Proceedings of the SIGCHI conference on Human factors in computing systems. Pittsburgh, Pennsylvania, United States: ACM, pp. 270–277. isbn: 0-201-48559-1. doi: 10.1145/302979.303060.
Yanaru, Tarao, Toyohiko Hirota, and Naoki Kimura (1994). An emotion-processing system based on fuzzy inference and its subjective observations. International Journal of Approximate Reasoning 10.1, pp. 99–122. issn: 0888-613X. doi: 10.1016/0888-613X(94)90011-6.
Zhou, Lina, Ammar S. Mohammed, and Dongsong Zhang (2012). Mobile personal information management agent: Supporting natural language interface and application integration. Information Processing & Management 48.1, pp. 23–31. issn: 0306-4573. doi: 10.1016/j.ipm.2011.08.008.