Alignment and Spoken Dialogue Systems – Influences on Trustworthiness and User Satisfaction
University of Muenster, Germany
With the increasing dissemination of spoken dialogue systems (SDS), spoken human–computer interaction is gaining importance in everyday life. Currently, SDS are, to varying degrees, able to recognize, process, and generate human-like, natural speech. They are employed in different contexts, for example as personal assistants in smartphones (like Siri® from Apple) or special devices (like Echo® from Amazon), as in-car applications, or to support persons with special needs. For communication with SDS to be successful, alignment plays a central role. Alignment describes the tendency to adapt to the linguistic characteristics of an interlocutor. Lexical alignment, for instance, comprises the adaptation of word choices. Besides its contribution to communicative success, alignment can reflect the interlocutors’ relationship. Like other computer systems or computational agents, SDS can be perceived as social agents. Therefore, the communication between users and SDS depends, inter alia, on whether an SDS is perceived as trustworthy.
This dissertation project explores the role of alignment for spoken dialogue systems and how that affects SDS’ trustworthiness and user satisfaction with the interaction. For this purpose, the dissertation focuses on lexical alignment shown by users toward an SDS (Study 1) as well as on lexical alignment shown by an SDS toward users (Study 2). Furthermore, the dissertation includes politeness behavior as a form of pragmatic alignment shown by an SDS (Study 3).
Study 1 focused on how characteristics of the conversational partner influence the amount of lexical alignment shown by the interlocutor. Specifically, the experiment compared a human with a computer partner, namely an SDS. Furthermore, we varied the elaborateness of the employed language style. Previous research on lexical alignment has often examined the phenomenon only in simple reference tasks. This study aimed to create a more natural conversational situation that allowed participants to use their own words. We used a Wizard-of-Oz scenario to simulate the respective conversational partner, which was employed in telephone interviews regarding satisfaction with student life. A total of 132 students took part in the experiment. It revealed that persons lexically aligned more to an interlocutor employing a restricted, robot-like language style, irrespective of whether they believed they were talking to a human or an SDS. However, they were more satisfied with the conversation with a human partner and with an elaborated language style.
Study 2 included two experiments that investigated the influence of lexical alignment on trustworthiness and satisfaction with the conversation. In the first experiment, we examined how users experience lexical alignment shown by an SDS when they talk to it themselves. A total of 130 students were asked to talk to an SDS, allegedly for the SDS’s improvement. Participants read out a set of ten questions and statements related to university life to a simulated SDS via telephone. In the second experiment, we focused on the perceived trustworthiness of the same SDS from the point of view of observers. A total of 135 student participants listened to the conversation between a student and a spoken dialogue system. The SDS either showed lexical alignment to the student or not. The system’s and the student’s utterances were the same as in the first experiment. However, the second experiment was conducted online. In sum, analyses showed that when SDS lexically align to users, the perceived cognitive demand decreases. Participants who merely listened to a conversation between an SDS and a peer ascribed more integrity and likeability to an aligning SDS.
With the increasing capabilities of SDS, their perception as social agents has an impact on further aspects of their design. Not only lexical alignment is important but also other facets of alignment that reflect pragmatics. A relevant case of pragmatic alignment is politeness. Following politeness theory (Brown & Levinson, 1987), politeness can be conveyed by respecting the interlocutor’s positive or negative face. Persons show polite behavior not only toward humans but also toward SDS. In Study 3, we examined the assessment of an SDS displaying either politeness or rudeness. Polite communicative behavior toward computers includes the avoidance of face threats. A total of 58 high school students participated and were asked to assess an SDS. The study revealed that when SDS employ polite behavior instead of being rude, this leads to positive perceptions by users, including trustworthiness in the form of a higher assessment of goodwill and integrity.
In sum, this dissertation sheds light on how alignment can influence important aspects of communicative success and can inform design principles for SDS. If the SDS requires precise input on the word level, it can be recommended to employ a restricted language style for the SDS in order to elicit lexical alignment. Furthermore, the choice of language style can also be exploited in human–human communication. In interview situations that aim at a particular topic, it might be useful to choose a restricted language style. On the other hand, an elaborated language style should be considered if conversational partners are supposed to use their own words and not stick to given specifications. In addition, the first study revealed that an elaborated language style enhances the likeability of the conversational partner, which holds true for both the SDS and the assumed human partner. Thus, the ideal recommendation regarding language style depends on the purpose of the conversation. Moreover, the role of alignment implemented in SDS’ linguistic behavior should be discussed. The results of the third study imply that the consideration of politeness enhances the perception of the SDS regarding pleasantness and appropriateness on the one hand and trustworthiness on the other. These results are in accord with findings in human–human interaction.
Future research should include current SDS and further alignment levels.
Searching for Equivalence: An Exploration of the Potential of Online Probing with Examples from National Identity
GESIS-Leibniz Institute for the Social Sciences, University of Mannheim
Over the last decades, a tremendous increase has occurred in cross-national data production in social science research (Harkness 2008). The large-scale provision and the widespread use of cross-national data sets constitute a huge opportunity for the research community but also pose the challenge of developing cross-nationally comparable survey items (Lynn, Japec, and Lyberg 2006). At the same time, substantive researchers are increasingly aware of the necessity to understand respondents’ cognitive processes when answering a survey question (Smith et al. 2011). Recently, the method of online probing has been developed, which implements probing techniques from cognitive interviewing in web surveys. In the traditional probing approach, interviewers obtain additional information by asking follow-up questions called probes (Beatty and Willis 2007). In contrast, online probing transfers probing questions as open-ended questions into the web context. It can reveal the cognitive processes of web survey participants and helps to assess whether respondents’ interpretations of an item differ across countries (Braun et al. 2015).
The implementation of probes within web surveys offers respondents a higher level of anonymity for their answers in comparison to the laboratory situation during cognitive interviewing (Behr and Braun 2015), which potentially reduces social desirability effects in the response process (Bethlehem and Biffignandi 2012). Online probing can easily achieve large sample sizes, which increases the generalizability of the results, enables an evaluation of the prevalence of problems or themes, and can explain the response patterns of specific subpopulations (Braun et al. 2015). Since all probes have to be programmed in advance, all respondents receive the same probe, and the procedure is highly standardized (Braun et al. 2015). When applied to cross-national data, online probing is a powerful tool to assess the comparability of questions. In contrast to traditional quantitative approaches to assessing the equivalence of items (e.g., measurement invariance tests), online probing can explain why respondents in certain countries might misunderstand a specific item or why they adopt different perspectives when providing a response (Behr et al. 2014a).
The overarching goal of this dissertation project is to explore the potential of the method of online probing vis-à-vis other relevant methods that share similar goals (cognitive interviewing and measurement invariance tests) and as an assessment tool for single-item indicators in cross-national surveys. In particular, the dissertation addresses the following research questions: 1) Does online probing arrive at results similar to those of other methods? 2) What are the strengths and weaknesses of online probing in comparison to other methods? 3) How can online probing be combined with other methods in a mixed-methods approach? 4) How useful is online probing to assess the cross-national comparability of single-item indicators? Since the dissertation’s goal is to compare the methods of online probing, cognitive interviewing, and measurement invariance tests with regard to their potential to detect problematic issues at the item level, the field of national identity has been chosen as a substantive application for the method comparisons due to the existence of potentially problematic measures in a cross-national context. This dissertation focuses on items from the 2013 International Social Survey Programme module on National Identity.
The first article of this dissertation (“Comparing Cognitive Interviewing and Online Probing: Do They Find Similar Results?”; published in Field Methods) analyzed whether online probing and cognitive interviewing arrive at similar conclusions with regard to error detection and the themes mentioned by respondents when applied to the same set of items (ISSP item battery on specific national pride). The study compares data from cognitive interviews conducted with 20 German respondents in April 2013 with a web survey conducted with 532 German respondents in September 2013. The article revealed that the two methods have complementary strengths and weaknesses. While probing answers in cognitive interviewing show indications of higher response quality, online probing can compensate through a larger sample size. The article also provides the researcher with guidance on which method is preferable in a given research situation and advocates the combination of both methods in a mixed-methods approach.
The second article of this dissertation (“Necessary but Insufficient: Why Measurement Invariance Tests Need Online Probing as a Complementary Tool”; forthcoming in Public Opinion Quarterly, “2016 AAPOR/WAPOR Janet A. Harkness Award” and “2016 QDET2 Monroe Sirken Innovative Paper Award for Young Scholars of Question Evaluation”) provides an example of a mixed-methods approach that combines online probing with quantitative measurement invariance tests. Using the concepts of constructive patriotism and nationalism as examples, this study explains how the combination of both methods can not only reveal incomparable items and countries but also explain issues related to cross-national comparability. Based on data from the 2013 ISSP and a web survey with 2,685 respondents from five countries, online probing uncovered the reasons (varying lexical scope and silent misunderstanding of a key term) for the lack of comparability that was also detected by the measurement invariance tests.
Finally, the third article showed the potential of online probing for assessing the cross-national comparability of single-item indicators, using the general national pride item as an example. Online probing provides a unique solution for deciding whether single-item indicators are equivalent because the traditional approach of measurement invariance tests presupposes multiple-indicator measures and is, therefore, inapplicable to single-item indicators. This study analyzed 2,685 probe responses from a web survey that was conducted in five countries. Online probing uncovered several potentially problematic issues and the fact that respondents in all countries associate various concepts with the general national pride item.
Therefore, the contributions of this dissertation are:
1. The insight that online probing arrives at results similar to those of cognitive interviewing and measurement invariance tests.
2. A clear understanding of the method’s strengths and weaknesses vis-à-vis cognitive interviewing and measurement invariance tests.
3. An explanation of optimal implementations of online probing in a mixed-methods approach.
4. A demonstration of the usefulness of online probing to assess the cross-national comparability of single-item indicators.
5. An assessment of the cross-national comparability of measures of national identity for substantive researchers.
Monitoring and Expressing Opinions on Social Networking Sites – Empirical Investigations based on the Spiral of Silence Theory
University of Duisburg-Essen, Germany
Social networking technologies such as Facebook are increasingly used for the exchange of information and opinions on politically and civically relevant issues. This development, on the one hand, may foster public deliberation and political learning as every user has the same chances to publicly voice their opinion on social issues and learn how other people argue about and judge a subject. On the other hand, there is a risk that (non-representative) opinion expressions in large-scale online discussions convey distorted pictures of public opinion to users who might adapt their attitudes and behaviors to this alleged opinion climate. Given these opportunities and risks, it seems plausible to ask how long-standing theoretical propositions focusing on the dynamics of public opinion do justice to these new ways of gauging and expressing opinions in mass-interpersonal contexts as provided by social media.
The spiral of silence theory (Noelle-Neumann, 1993) proposes that human beings continuously monitor their social environment to assess prevailing opinion trends. The silence hypothesis states that people are reluctant to publicly voice their viewpoint on a controversial topic when they encounter an opinion climate that opposes their personal opinion. People remain silent because of their fear of being isolated and experiencing sanctions from their environment for being deviant. Although there are initial studies on the validity of these theoretical tenets in online communication, limited empirical support has been found for the hypotheses of spiral of silence theory in online realms. The inconclusive state of knowledge prompts one to analyze the particular circumstances under which people may (or may not) be sensitive to the opinion climate when expressing their opinion through social media. Given that online networking platforms function as social spheres with an unprecedented quality, for instance, in terms of the size or the composition of the audience (including close friends, acquaintances, co-workers but also strangers at the same time), it seems conceivable that these environmental factors intervene in the silencing process, requiring new explanations for people’s communication behavior. Following this line, there seems to be a pressing need for research not only to assess the validity of the spiral of silence theory but also to theorize its potential boundaries in these new communication channels. For this purpose, this dissertation investigates (a) whether and how users monitor other people’s opinions through social networking technologies and (b) under which circumstances they are willing to contribute to these opinion climates by expressing their personal viewpoint on these platforms.
These two processes were addressed empirically by a multi-methodological approach consisting of five studies. Study 1 examined the effects of different opinion cues (available on Facebook) on people’s inferences about public opinion. Results of a two-session online experiment (N = 657) showed that individuals’ fear of isolation sharpened their attention toward user-generated comments, which, in turn, affected recipients’ public opinion perceptions in the direction advocated in the comments. The latter influenced subjects’ opinions and their willingness to participate in social media discussions. Study 2 explored the situational manifestations of people’s fear of isolation and environmental variables as factors influencing people’s outspokenness. Results from qualitative in-depth interviews (N = 12) revealed a variety of sanctions people expect from others when voicing a minority opinion and a series of factors, such as the size of or the relationship to the audience, which could exert an impact on one’s willingness to express an opinion in online realms. Study 3 further investigated the expectations of sanctions and their explanatory value regarding people’s communication behavior in different situations. Findings from an online experiment (N = 365) demonstrated that the expectation of being personally attacked can explain why people are more inclined to express a minority opinion in offline rather than in online communication settings. Drawing on results from the previous studies of this dissertation, Study 4 tested whether the publicness of social networking platforms, in terms of the size and relational diversity of the audience, affects people’s outspokenness. Results from a cross-cultural experiment (N = 312) showed that in Germany, a higher level of publicness of a controversial discussion on Facebook reduced people’s likelihood to express their viewpoint, attenuating the influence of the opinion climate. This pattern, however, was not found in Singapore.
Study 5 zoomed in on the role of the online audience and analyzed whether the relationship to the audience determines people’s likelihood to express their opinion on Facebook. Findings from a laboratory experiment (N = 119) showed that the relational closeness to the envisioned audience on Facebook does not directly affect people’s likelihood to express their opinion on a controversial topic. However, findings revealed that users’ certainty about the prevailing opinion distribution among the particular audience is a crucial predictor of people’s outspokenness.
This collection of studies extends previous research by testing the validity of the spiral of silence theory but also pointing to its potential boundaries in the context of increasingly popular communication environments: While people were found to infer overarching opinion climates from opinion cues in the form of user-generated comments on the social networking platform Facebook, their likelihood of opinion expression was largely not contingent on whether they found the opinion climate to be in agreement with their personal viewpoint. Results instead showed that (at least in a Western culture) environmental factors pertaining to the platform, such as a greater publicness in terms of a large and relationally diverse audience, attenuate the influence of the opinion climate on people’s outspokenness. From a practical point of view, results indicate a rather low general willingness of people to publicly voice their personal opinion on social networking platforms, also showing that predominantly voices of the “hard core” (i.e., those with a higher attitude certainty and topical involvement) are represented in online opinion climates. Based on this, it seems desirable to reflect upon how technologies or educators can increase the ideological representativeness and the rationality of online discourses. When it comes to increasing users’ motivation to participate in discourses on social networking services, making audiences more visible to users or rewarding users who corroborate their online opinion expressions with valid arguments appear to be promising practical approaches in light of this dissertation’s findings.