General Online Research 2019
F 2: Poster Session (Part II)
Selection Bias and Representativeness of Survey Samples: the Effectiveness of Mixing Modes and Sampling Frames
1Demetra Opinioni.net, Italy; 2University of Milano-Bicocca, Italy
Relevance & Research Question: Mixed-mode approaches are now widely used to deal with non-coverage in sample surveys. There are many examples of surveys that mix web, telephone and face-to-face (F2F) modes, often using the same sampling frame. Drawing on our previous work, we apply a mixed-mode survey design to different sampling frames (a landline phone list and online panelists). We found that telephone coverage bias may be reduced by adopting approaches that use different sampling frames. This paper studies the representativeness of samples from a mixed-mode survey design (web and landline phone) and from a telephone survey (calling mobile and landline phone numbers), comparing their estimates to the Italian population’s characteristics and to the observed values from registered voters’ records.
Methods & Data: We use data from five telephone and web surveys conducted in Italy from March 2018 to December 2018 among landline or mobile phone owners and members of an Italian online panel. We designed a mixed-mode survey (a Computer Assisted Web Interview, CAWI, survey followed by a Computer Assisted Telephone Interview, CATI, survey, using two different sampling frames) and a survey with two different sampling frames (a Computer Assisted Mobile phone Interview, CAMI, survey followed by a CATI survey). To study the representativeness of the samples, we compare the voting behaviour estimated from the two survey designs to the observed voting behaviour in the national elections of March 2018. We also compare the employment status and education of all respondents to a “gold standard”.
Results: Results from our previous work are confirmed. We find that mixing both modes and sampling frames is more effective in reducing selection bias than mixing sampling frames only. In our analysis, the CAWI-CATI samples perform better than the CAMI-CATI ones in estimating the voting behaviour and employment status of the Italian population. Both CAWI-CATI and CAMI-CATI respondents are more educated than the general population.
Added Value: Our poster contributes to expanding the knowledge on mixing modes and sampling frames to reduce bias. The main value of this work is the large number of public opinion surveys we added to those conducted two years ago, which makes the findings more robust.
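A comparison of survey estimates with a benchmark, as described above, is often summarised as an average absolute error across categories. The sketch below is only an illustration of that idea: the abstract does not state which bias measure was used, and the category names and percentages are invented placeholders, not results from the study.

```python
def mean_absolute_error(estimates, benchmark):
    """Average absolute gap (in percentage points) between survey
    estimates and benchmark values over the same categories."""
    if estimates.keys() != benchmark.keys():
        raise ValueError("estimates and benchmark must cover the same categories")
    gaps = [abs(estimates[k] - benchmark[k]) for k in benchmark]
    return sum(gaps) / len(gaps)


# Hypothetical numbers for illustration only (not from the poster).
benchmark = {"list_a": 35.0, "list_b": 40.0, "other": 25.0}
design_1 = {"list_a": 33.0, "list_b": 41.0, "other": 26.0}
design_2 = {"list_a": 29.0, "list_b": 44.0, "other": 27.0}

error_1 = mean_absolute_error(design_1, benchmark)
error_2 = mean_absolute_error(design_2, benchmark)
```

The design with the smaller average error is the one whose sample tracks the benchmark more closely on that variable.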
“Ok google” - The role of digital Voice Assistants in the lifeworlds of users - An empirical study on relationship types between Voice Assistants and users
1TH Koeln, Germany; 2Skopos Connect, Germany
Relevance & Research Question: Alexa, Siri and Google Assistant have found their way into users' private sphere. According to a forecast, global user numbers are expected to rise from 1,376 million (2019) to 1,831 million by 2020 (Tractica 2018). The ability to imitate and understand language creates a social interaction between humans and machines.
This study examines the question of whether there is a human-like relationship between digital Voice Assistants and users and which types can be derived from this. Differences in the relationships will be explored and central features within these will be identified.
The study is based on the concept of Media Equation (Reeves, Nass 1996), which examines influencing factors of humanization and behavior patterns towards digital media. Theories of attachment (Bowlby 1969) and social network (Granovetter 1973) can be transferred to social concepts of technology.
Methods & Data: We implemented an online community platform with a mixed-method design. The study was divided into two online survey periods of 14 days each. The explorative design contained 14 diary entries on a private blog and 11 tasks with open questions on a survey platform. A system board (Kaspersky Lab 2016) and the BFI-10 (Rammstedt et al. 2013) were part of it. The field phase was flexible, with a daily diary as its central element. Additionally, 2 video interviews were conducted. In total, data from 54 participants (54% female) were collected.
Results: The results support the theory of media equation with regard to Voice Assistants. Three different types of relationships between users and Voice Assistants could be identified. These types differ essentially in the degree of humanization of the device and its integration into the user's lifeworld. From these findings it can be concluded that, for some users, Voice Assistants represent more than just a technical device.
Added Value: The scientific contribution lies in drawing parallels and links to areas that have already been researched more intensively, such as human-machine interaction, and in expanding social concepts of technology. It remains interesting to observe how relationships with Voice Assistants develop as the technology progresses.
What Predicts the Validity of Self-Reported Paradata? Results from the German HISBUS Online Access Panel
German Centre for Higher Education Research and Science Studies (DZHW), Germany
Paradata, such as user agent strings (UAS), provide us with important client-side information about the technical conditions of web surveys. If UAS are not available, for example because of data protection issues, one may directly ask survey participants for the required data. Until now it has been unclear whether the validity of these self-reported paradata is determined by participants’ general attitudes towards surveys, their willingness to participate and, in connection with technically demanding questions, distraction while answering.
To shed light on the question of what predicts the consistency between UAS and self-reported paradata, we use the HISBUS Online Access Panel. The sample comprised 3,137 members with UAS known to us who were asked about the device used (DEV), operating system (OS) and web browser (WB). Additionally, data on general attitudes towards surveys (SAS; de Leeuw et al. 2010), survey participation evaluation (SPE; Struminskaya et al. 2015) and multitasking (MT; Zwarun/Hall 2014) were collected. The Big Five personality traits (BF; Rammstedt et al. 2013) serve as covariates in the logistic and ordinal regression analyses we applied for DEV and OS/WB, respectively. Predictors with p<.05 were included in the final models.
First of all, UAS and self-reported paradata were highly consistent (kappa: DEV=.95, OS=.95, WB=.87). Agreement regarding DEV depends on SAS subscale value (OR=1.31) and SPE (OR=0.80). The OS/WB agreement was predicted by electronic and non-electronic MT (OR=1.32; OR=0.72).
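The kappa values above are chance-corrected agreement coefficients. Assuming they are Cohen's kappa (the abstract only says "kappa"), the computation can be sketched as follows; the device labels are made-up toy data, not HISBUS records.

```python
from collections import Counter


def cohen_kappa(source_a, source_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(source_a)
    # Observed agreement: share of cases where both sources give the same label.
    observed = sum(a == b for a, b in zip(source_a, source_b)) / n
    # Expected agreement under independence, from the marginal frequencies.
    freq_a, freq_b = Counter(source_a), Counter(source_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use one single identical category
    return (observed - expected) / (1.0 - expected)


# Toy example: device derived from the UAS vs. self-reported device.
uas_device = ["desktop", "desktop", "mobile", "mobile",
              "desktop", "mobile", "desktop", "mobile"]
self_report = ["desktop", "desktop", "mobile", "desktop",
               "desktop", "mobile", "desktop", "mobile"]
kappa = cohen_kappa(uas_device, self_report)  # 0.75 for this toy data
```

Here seven of eight cases agree (observed agreement 0.875) while chance agreement is 0.5, which gives a kappa of 0.75.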
We conclude that directly asking web survey participants is a promising way to get valid information about their technical equipment if UAS data are not available. The chance to get valid DEV data is higher if surveys are generally considered valuable, but lower if the evaluation of willingness to participate is considered to be solid. The chance for valid OS/WB data is higher with electronic MT present that may indicate technical skills. Non-electronic MT seems to be rather distracting and predicts lower chances for valid OS/WB data.
Working towards understanding and enhancing Enterprise Social Network use
1Westfälische Wilhelms-Universität Münster, Germany; 2Federal Centre for Health Education (BZgA), Germany
Relevance & Research Question:
Enterprise Social Networks (ESN) are established by many organisations in order to promote consumption of and contribution to knowledge among their employees. Despite the high costs and effort of implementing an ESN, many fail as a consequence of low user participation. Users' lack of ability to translate usage intentions into specific use cases is regarded as a major reason for this outcome (Chin, Evans, & Choo, 2015). In an experimental study, we examined whether strengthening users' ability to use an ESN, by boosting self-efficacy and by using Implementation Intentions, can enhance employee participation.
Methods & Data:
The field experiment was conducted with a sample of users of inforo, an online community for health professionals run by the German Federal Centre for Health Education (BZgA). 63 participants (mean age 48.10 years; 44 women) were randomly assigned to the conditions self-efficacy and Implementation Intentions (II) in a 2x2 study design. Self-efficacy was promoted by adding supporting formulations in the invitation mail to the study and by adding an explanatory video about the ESN. In the II condition, participants were instructed to formulate Implementation Intentions in comparison to mere intentions in the non-II condition. For each participant, ESN-use two weeks after participation was used as outcome measure (objective behavioral data from a dedicated logging tool).
Manipulation of self-efficacy in this study was successful (r = .34). However, neither self-efficacy nor Implementation Intentions nor their interaction could significantly explain ESN-use. These findings indicate that either users' ability has no influence on ESN-use or it cannot compensate for other influential factors. The latter explanation is supported by the overall low user participation in the ESN, which indicates hindrances that are independent of individual-centred factors.
The findings show that enhancement of ESN-use needs to be regarded in a broader context of human-computer interaction. Besides individual-centered variables, further research should focus on organizational, technical and social factors to extend our understanding of ESN-use. Using objective online tracking data - as it was used in this study - should be continued in this field of research to create relevant information for practitioners.
Developing Podcasts that Inspire Listeners and Facilitate Learning
University of Münster, Germany
Relevance & Research Question: Teacher enthusiasm can be defined as the occurrence of distinct behavioral expressions, such as nonverbal (e.g., gestures) and verbal (e.g., tone of voice) behaviors. It has been shown that teacher enthusiasm is linked to various positive outcomes: It is linked to students’ enjoyment, interest, achievement, motivation and vitality. However, most teacher enthusiasm research is based on correlational data and therefore no causal inferences can be drawn.
Methods & Data: To overcome this limitation, a between-subjects experimental design was used to analyze the effects of teacher enthusiasm on instructional quality. Two versions of an evolutionary psychology podcast were developed: a neutral and an enthusiastic version. While the wording was kept identical between both versions, the speaker was instructed to read the podcast script either in a neutral or in an enthusiastic manner. It was hypothesized that listening to the enthusiastic version would result in more positive instructional quality ratings. University students with diverse majors listened to the podcast. To test the hypothesis, independent-samples t-tests were conducted.
Results: Overall, the results show that listening to the enthusiastic version resulted in more positive instructional quality ratings: participants who listened to the enthusiastic version of the podcast rated it as more interesting and exciting, enjoyed the listening process more, had a higher motivation to learn more about the topic, evaluated the podcast host as more trustworthy, and gave the podcast a higher overall rating.
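As a rough illustration of the test named above, the following sketch computes the Student (pooled-variance) t statistic for two independent groups. Whether the authors used the pooled or the Welch variant is not stated, and the ratings below are invented for the example.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom
    for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    if std_error == 0:
        raise ValueError("no variation in either group; t is undefined")
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical overall ratings on a 1-7 scale (not the study's data).
enthusiastic = [5, 6, 7, 6, 6]
neutral = [4, 5, 4, 5, 4]
t_value, df = independent_t(enthusiastic, neutral)
```

The resulting t value would then be compared against the t distribution with `df` degrees of freedom to obtain a p-value, a step left out here because it needs a statistics library.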
Added Value: The results demonstrate that teacher enthusiasm can be a powerful instructional tool when developing educational podcasts.
„Eggs“-plaining Differences in Market Share and Optimal Pricing – a Comparison of Online Methods
SPLENDID RESEARCH GmbH, Germany
Data science is taking over ever-larger parts of market research. A large part of that success is due to its ability to predict future sales from past user behavior. Market research could counter with its ability to predict future sales from individual preferences, making it possible to analyze potential success even before putting a product on the market and so preventing expensive flops. For this purpose, market research employs experimental methods, namely price sensitivity measurement (PSM) and choice-based conjoint analysis (CBC), but often asks directly for purchasing preferences. This poster compares willingness to pay and market share for eggs from organic, free-range and barn poultry farming in Germany, as generated by PSM, CBC and direct questions. It is based on a quantitative online study with 1,011 interviews with participants aged 18 to 69, sampled representatively for gender, age and region from members of the German online access panel of Splendid Research.
When queried directly, consumers tend to overestimate their propensity to buy organic or free-range eggs. This results in greater expected market shares for the ethically correct products. Calculating market shares based on individual utilities derived from choice-based conjoint analysis reproduces almost exactly the market shares determined by the validation question and can be considered extremely valid. The optimal prices for organic eggs determined by PSM and both CBC variants come close to the actual market prices. The optimal price for free-range eggs in PSM is much lower than the actual price at EDEKA and the price arrived at by both conjoint models. All three models indicate much higher optimal prices for barn eggs than the ones in place. The lower prices for barn eggs probably exist due to bait pricing, as inputting the current prices leads to very similar market shares.
All in all, the experimental methods of market research, and of online market research in particular, can be trusted to provide good estimates of unknown quantities if used the right way. Unreflective and simplified questions can cause large bias in the estimation of those same quantities. The poster therefore advocates the use of experimental designs and creative validation techniques in online surveys.
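One common way to turn individual conjoint utilities into market shares is the logit (share-of-preference) rule, averaged over respondents. The abstract does not say which simulation rule was applied, so the sketch below is an assumption, and the utilities are invented placeholders.

```python
from math import exp


def logit_shares(utilities):
    """Share of preference for one respondent: softmax over the total
    utility of each product in the simulated market."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {product: exp(u - top) for product, u in utilities.items()}
    total = sum(weights.values())
    return {product: w / total for product, w in weights.items()}


def market_shares(respondents):
    """Average the individual shares of preference across respondents."""
    products = list(respondents[0].keys())
    individual = [logit_shares(r) for r in respondents]
    return {p: sum(s[p] for s in individual) / len(individual) for p in products}


# Hypothetical total utilities (product plus price part-worths) per respondent.
respondents = [
    {"organic": 1.2, "free_range": 0.8, "barn": -0.5},
    {"organic": -0.3, "free_range": 0.4, "barn": 1.1},
    {"organic": 0.0, "free_range": 0.9, "barn": 0.2},
]
shares = market_shares(respondents)
```

Re-running the simulation with different price levels folded into the utilities shows how predicted shares shift with price, which is the kind of check the bait-pricing remark above refers to.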
The impact of a mobile option in a migration survey on sample composition and data quality. Results from a multilingual feasibility study.
Robert Koch Institut, Germany
Relevance & Research Question:
An increasingly common way of generating survey data is via online mobile devices. Especially in so-called hard-to-reach populations (e.g. people with a migration background), offering a smartphone survey option can increase response. However, there are methodological challenges: the use of different devices to complete a survey could lead to device effects that compromise data quality. Using a survey among migrant populations in Germany, our objective is to identify determinants of mobile device usage and to investigate differences in the resulting data quality between desktop and mobile devices.
Methods & Data:
We used data from a multilingual feasibility study that was conducted in two German federal states, using data from the residents’ registry. The target populations were persons with Turkish, Polish, Romanian, Syrian and Croatian citizenship living in Germany (AAPOR RR1: 15.9%; N = 1190); we focused only on the web-based interviews. We used logistic regression to determine factors associated with mobile device usage. Furthermore, we investigated potential device effects using several data quality indicators: missing values, straightlining in grid questions, responses to items prone to social desirability, and survey duration.
Female respondents were more likely to participate via smartphone than male respondents. More highly educated participants were more likely to participate via desktop than participants with a lower educational level. We found no significant difference in the overall item nonresponse rate between desktop and smartphone. Furthermore, there were no differences in the level of reporting on sensitive items. Participants who responded on a smartphone took significantly longer to complete the survey than respondents who participated via desktop. Our investigation also showed a significant negative device effect for smartphones on straightlining for two variables.
Implementing a survey option optimized for mobile phones could lead to a higher completion rate in typical hard-to-reach populations, e.g. people with low education. Such an option is unlikely to compromise data quality, since there are only minor differences in data quality between desktop and smartphone participants.
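Two of the indicators named above can be sketched in a few lines. The definitions used here (identical answers across all answered items of a grid, and the share of unanswered items) are common operationalisations assumed for illustration; the study's exact definitions are not given in the abstract.

```python
def is_straightlined(grid_answers):
    """True if a respondent gave the same answer to every answered item
    of a grid question (None marks a missing answer)."""
    answered = [a for a in grid_answers if a is not None]
    # Require at least two answered items, otherwise the pattern is undefined.
    return len(answered) >= 2 and len(set(answered)) == 1


def item_nonresponse_rate(answers):
    """Share of items a respondent left unanswered."""
    if not answers:
        raise ValueError("answers must not be empty")
    return sum(a is None for a in answers) / len(answers)


# Hypothetical respondent records: device used and answers to a 4-item grid.
respondents = [
    {"device": "smartphone", "grid": [3, 3, 3, 3]},
    {"device": "desktop", "grid": [2, 4, None, 1]},
    {"device": "smartphone", "grid": [5, 4, 5, 2]},
]
flags = [is_straightlined(r["grid"]) for r in respondents]
```

Comparing the share of flagged respondents or the mean nonresponse rate between the desktop and smartphone groups would then give a simple device-effect check.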
Using kinship big network data to overcome mistrust in recruiting the hard-to-reach populations: the case of Formosan endangered language survey
Academia Sinica, Taiwan
Relevance & Research Question: Hard-to-reach populations are those that are hard to access due to geographical location and/or social status. They are characterized by being vulnerable, excluded, and hidden in a society. Recruiting hard-to-reach populations has long been a major challenge for survey research. The barriers to recruiting them are varied. The most crucial challenges are how to (1) label the population for study, and (2) overcome participants' mistrust during the survey. Using the Formosan endangered language survey as an example, the research utilizes (1) household individual data to label the hard-to-reach population and (2) a kinship network database to help overcome mistrust during the survey.
Methods & Data: The survey aims to assess the language skills, including listening and speaking, of Taiwan indigenous peoples (TIPs). TIPs are an ideal example of a hard-to-reach population. The research uses household registration data of TIPs to label the potential hard-to-reach population for study and thus to facilitate sampling design and survey strategy. Once a potential individual is labelled, mistrust is the most important barrier during the survey. To overcome it, the research makes use of a complex kinship network database constructed with computational social science methods. The database enables us to identify a group of persons (parents, siblings, relatives, and friends) who are kin to the person to be surveyed. If that person refuses to participate due to mistrust, we seek assistance from those she or he may know well, in the hope of reducing mistrust. All measures and strategies mentioned were reviewed by an institutional review board (IRB).
Results: Utilizing population information from household registration helps to label potential hard-to-reach populations. More importantly, the constructed kinship network database substantially helps us contact those who may know the sampled individuals well. This measure in turn helps to reduce the mistrust of the sampled population and to increase the survey participation rate.
Added Value: The survey results on endangered Formosan languages shed light on the determinants of language use and language shift. We are thus able to propose relevant policy suggestions.
Linguistic Properties of Echo Chambers and Hate Groups on Reddit
Universität Passau, Germany
Relevance & Research Question:
There is a growing debate about Echo Chambers (EC), that is, online groups in which like-minded people share attitudinally consistent information and which might amplify societal polarization. First attempts to analyze the psychological underpinnings of EC have been made, but so far there is little evidence regarding differences in conversational style between EC and more neutral, attitudinally diverse groups. Based on psychological theories like Self-Categorization Theory, EC can be expected to exhibit a more polarizing conversational style, for example by containing more pronouns referring to in- and outgroup members. Moreover, EC share structural characteristics with Hate Groups (HG, online groups propagating hate and violence), for example their users' ideological homogeneity. Therefore, some linguistic characteristics previously found in HG, such as more negative emotion and swearing, should also occur in EC. We tested these assumptions by examining user-submitted texts from online groups of the website Reddit.
Methods & Data:
We analyzed 14,642 user submissions and 2,230,802 user comments from 14 groups: six neutral groups (e.g. /r/neutralpolitics), six EC, that is, ideologically homogeneous groups explicitly forbidding divergent opinions (e.g. /r/latestagecapitalism), and two HG already banned by Reddit for inciting violence (/r/incels, /r/physicalremoval). Linguistic properties (word type percentages) were calculated via the program LIWC and compared between EC, HG and neutral groups via MANOVA.
Compared to neutral groups, EC as well as HG displayed a more polarizing style with significantly more pronouns referring to in- and outgroup members, more plural than singular references, and more negative and less positive emotion. Additionally, this style was more pronounced in HG, which also displayed significantly more swearing and "you"-references than EC.
The results demonstrate that systematic linguistic differences between neutral and problematic online groups exist. Such differences might be used to build classification models that help platform providers and moderators to identify online groups requiring intervention. To illustrate this, we trained two logistic regression models with elastic net regularization that classify user-submissions as EC vs. neutral and HG vs. neutral based on linguistic characteristics. Both performed well with correct classification rates of 88% and 76%.
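To make the classification step more concrete, here is a minimal sketch of logistic regression with an elastic net penalty, fitted by proximal gradient descent. It is not the authors' implementation: the feature names, toy values and hyperparameters (`alpha`, `l1_ratio`, `lr`, `epochs`) are assumptions chosen for illustration, and a real analysis would use LIWC percentages and a tested library.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logit(X, y, alpha=0.1, l1_ratio=0.5, lr=0.1, epochs=2000):
    """Minimise mean log-loss + alpha * (l1_ratio * |w|_1
    + (1 - l1_ratio) / 2 * |w|_2^2); the intercept is not penalised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        preds = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        errors = [pred - target for pred, target in zip(preds, y)]
        b -= lr * sum(errors) / n
        for j in range(p):
            # Gradient of the smooth part: log-loss plus the L2 penalty.
            grad = sum(e * row[j] for e, row in zip(errors, X)) / n
            grad += alpha * (1.0 - l1_ratio) * w[j]
            # Gradient step followed by the L1 proximal (shrinkage) step.
            w[j] = soft_threshold(w[j] - lr * grad, lr * alpha * l1_ratio)
    return w, b


def predict(row, w, b):
    """Predicted class: 1 (e.g. EC) or 0 (neutral)."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) >= 0.5 else 0


# Toy standardized features per submission (hypothetical):
# column 0 = group-pronoun rate, column 1 = a feature carrying no signal.
X = [[-2, 1], [-2, -1], [-1, 1], [-1, -1], [1, 1], [1, -1], [2, 1], [2, -1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_elastic_net_logit(X, y)
predictions = [predict(row, w, b) for row in X]
```

The L1 part of the penalty keeps the weight of the uninformative second feature at zero in this toy setup, while the L2 part keeps the informative weight from growing without bound on separable data.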