Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
Date: Thursday, 10/Sep/2020
10:00 Track A: Survey Research: Advancements in Online and Mobile Web Surveys
10:00 Track B: Data Science: From Big Data to Smart Data
10:00 Track C: Politics, Public Opinion, and Communication
10:00 Track D: Digital Methods in Applied Research
10:00 Track T: GOR Thesis Award 2020
10:00 - 10:30 GOR 20 Conference Kick-off
10:30 - 11:30 A 1: Smartphones in Surveys
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

Effects of mobile assessment designs on participation and compliance: Experimental and meta-analytic evidence

David Richter1, Cornelia Wrzus2

1DIW Berlin, Germany; 2University of Heidelberg, Germany

When conducting mobile-phone-based assessments in people’s daily lives, researchers need to know how design characteristics (e.g., study duration, sampling frequency) affect selectivity and compliance, that is, who will participate in the study and how much information they will provide. We addressed the issue of selectivity in the Innovation Sample of the Socio-Economic Panel, whose members were invited to participate in an experience sampling method (ESM) study on happiness and were offered either feedback only (2015) or feedback and monetary reimbursement (2016). Participation increased from 7% in 2015 to 36% in 2016, when participants received feedback and monetary reimbursement, and compliance was much higher as well (29% in 2015 vs. 86% in 2016). Furthermore, participants differed from non-participants regarding age and gender, but negligibly regarding personality characteristics. To further examine design effects on participants’ compliance, we conducted a meta-analysis on ESM studies from 1987 to 2018, from which we coded a random subsample of 402 studies regarding sample characteristics, study design (e.g., study duration, sampling type and frequency, sensor usage), type of incentives, as well as compliance and drop out. Initial results showed that associations between design characteristics and compliance varied with sample type and type of incentive. For example, in adolescent and young adult samples, compliance was non-linearly related to the number of assessments, whereas in adult samples compliance was higher with larger numbers of assessments. Providing incentives to participants, especially monetary incentives, predicted higher compliance rates compared to providing no incentives, except in physically ill samples. This latter effect is likely attributable to high intrinsic motivation to provide information among participants dealing with chronic and other illnesses.
We thus conclude that both the study design and the incentive should be adapted to the intended sample and we offer first empirical findings to guide these decisions.

Using geofences to trigger surveys in an app

Georg-Christoph Haas1,2, Mark Trappmann1,4, Florian Keusch2, Sebastian Bähr1, Frauke Kreuter1,2,3

1Institut für Arbeitsmarkt- und Berufsforschung der Bundesagentur für Arbeit (IAB), Germany; 2University of Mannheim, Germany; 3University of Maryland, United States of America; 4University of Bamberg, Germany

Relevance & Research Question: Within the survey context, geofences can be defined as geographical spaces that trigger a survey invitation, when an individual enters, leaves, or stays within this geographical space for a prespecified amount of time. Geofences may be used to administer context specific surveys, e.g., an evaluation survey of a shopping experience in a specific retail location. While geofencing is already used in other contexts (e.g., marketing and retail), this technology seems so far underutilized in survey research. In this talk, we will share our experiences with the implementation of geofences within an app data collection study. Given the limited research on this topic, we will answer the following exploratory research questions: How well did the geofencing approach work? What are the reasons for geofencing to fail?

Methods & Data: In 2018, we invited participants of the PASS panel survey to download the IAB-SMART app. The app passively collected smartphone sensor data (e.g., geolocation, app usage and location) and administered short surveys. Overall, 687 individuals installed the app. While most in-app surveys were triggered by a predefined time schedule, one survey module was triggered by a geofence. To define geofences and trigger survey invitations our app used the Google Geofence API.

Results: Overall, the app sent 230 invitations and received 225 responses from 104 participants. However, in only 56 of the 225 responses did participants state that they had actually entered the geofenced area. Cross-validating the Google Geofence API survey triggers against our custom-built geolocation measurement in the app shows frequent mismatches between the two. Our data indicate that in many cases individuals should not have received a survey invitation because they were not actually inside the specified geofence.
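The cross-validation idea described above can be illustrated with a minimal sketch. It assumes a circular geofence (centre plus radius) and one independently logged GPS fix per trigger; the field names, coordinates, and radius are hypothetical and are not taken from the IAB-SMART app or the Google Geofence API.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(fix, fence):
    """True if the logged fix lies within the (assumed circular) fence."""
    distance = haversine_m(fix["lat"], fix["lon"], fence["lat"], fence["lon"])
    return distance <= fence["radius_m"]


def mismatched_triggers(triggers, fence):
    """Return the triggers whose independently logged fix lies outside the fence."""
    return [t for t in triggers if not inside_geofence(t["fix"], fence)]


# Hypothetical fence and two hypothetical survey triggers.
fence = {"lat": 49.4875, "lon": 8.4660, "radius_m": 200}
triggers = [
    {"id": 1, "fix": {"lat": 49.4875, "lon": 8.4660}},  # at the fence centre
    {"id": 2, "fix": {"lat": 49.4975, "lon": 8.4660}},  # roughly 1.1 km north
]
print([t["id"] for t in mismatched_triggers(triggers, fence)])
```

A real comparison would also have to account for the accuracy of each fix and for the time gap between the trigger and the nearest logged position, which this sketch ignores.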

Added Value: Existing literature about geofencing (largely consisting of YouTube videos and short blog posts) only provides a short introduction to this technology, and virtually no use of geofencing is documented in survey research. Our presentation evaluates the reliability of geofences, shares lessons learned, and discusses limitations of the geofencing approach with a broader audience.

Mobile friendly design in web survey: Increasing user convenience or additional error sources?

Jean Philippe Decieux1, Philipp Emanuel Sischka2

1University of Duisburg-Essen, Germany; 2University of Luxembourg, Luxembourg


In the early days of online surveys, questionnaires were programmed to be answered on desktop PCs or notebooks. However, with technical developments such as the growing role of mobile devices (md), studies on online survey research have detected an increase in questionnaires answered on md. Survey navigation on md differs from navigation on a PC: it takes place on a smaller screen and usually involves a touchscreen rather than a mouse and a keyboard. Because of these differences, some traditionally used web question formats are no longer convenient to answer on an md. The most common of these formats is the matrix question. To deal with this development, so-called mobile-friendly or responsive designs were developed, which change the layout of questions that are inconvenient on an md into an alternative mobile-friendly design. Matrices, for example, are separated into item-by-item questions, which are assumed to be more comfortable to answer on a mobile device.

Research question

However, from a psychometric perspective, the question of whether these changes in question format produce comparable results is too often ignored. Therefore, this paper addresses the following research question: Do different versions of responsive designs actually produce equivalent responses?

Data & Methods

Using data from the first two waves of the German Emigration and Remigration Panel Study (GERPS), we can base our analysis on more than 19,000 cases (approx. 7,000 of them using different md). As GERPS makes use of a responsive design, we are able to investigate measurement invariance between different md and desktop device groups.


Results

As the data management is still in progress and will be finished by the end of October, we will be able to present first-hand information based on fresh data. However, initial analyses reveal differences between md and desktop device versions.

Added Value

Our study is one of the first that elucidates the equivalence of responsive design options. Thus, it enhances the perspective on the existence of possible new biases and error sources due to the increased use of md within web surveys.

10:30 - 11:30 B 1: Digital Trace Data
Session Chair: Florian Keusch, University of Mannheim, Germany

Can social media data complement traditional survey data? A reflexion matrix to evaluate their relevance for the study of public opinion

Maud Reveilhac, Stephanie Steinmetz, Davide Morselli

Lausanne University, Switzerland

Relevance & research question:

Traditionally, public opinion (PO) has been measured through probability-based surveys, which are considered the gold standard for generalising findings due to their population representativeness. The turn to social media data (SMD) to gauge PO in recent years has, however, led to a discussion about the potential for augmenting or even replacing survey data. Survey and social media researchers have, therefore, increasingly explored ways in which social media and survey data are likely to yield similar conclusions. In this context, two core challenges can be identified: i) researchers have mainly emphasised the replacement of survey data by SMD rather than the complementarity between both data sources; ii) there are currently two understandings of PO, which makes complementarity of both data sources quite difficult. As a result, researchers still need more guidance on how best to complement SMD with survey data.

Methods & Data:

Whereas the recent extension of the Total Survey Error framework to SMD is an important step to account for the quality of both data sources, we would like to propose an additional step that should ideally come before the discussion and evaluation of the quality of the collected data. Building on four key challenges, we develop a reflexion matrix to provide practical guidelines for dealing with the complementarity of both data sources for the study of PO.


Results:

Our results convey two main take-home messages: i) we demonstrate that the main approach of using SMD to validate what has been found via surveys is problematic, as survey measures convey an idea of simplicity and aggregation, whereas SMD are complex and multi-dimensional; ii) we provide researchers with an orientation on how SMD can be a complementary source to survey data.

Added Value:

We argue for the necessity of developing different and complementary views of PO when conducting research with a mixed-method approach, where complementarity of the data sources is one essential criterion. In addition, we point to possible solutions from other disciplines that have so far been little considered in studies of PO.

Using Facebook & Instagram to Recruit LGBTQ for Web Survey Research

Simon Kühne

Bielefeld University, Germany

Relevance & Research Question: In many countries and contexts, survey researchers are facing decreasing response rates and increasing survey costs. Data collection is even more complex and expensive when rare or hard-to-reach populations are to be sampled and surveyed. Alternative sampling and recruiting approaches are usually needed in these cases, including non-probability and online convenience sampling. A rather novel approach to recruiting rare populations for online and mobile survey participation uses advertisements on social media. In this study, I present the fieldwork results of a survey of lesbian, gay, bisexual, trans, and queer (LGBTQ) people, for which participants were recruited via ads on Facebook and Instagram.

Methods & Data: In 2019, an ad campaign was launched on Facebook and Instagram to recruit German web survey participants self-identifying as LGBT or Q. The survey was part of a research project on LGBTQ and Rainbow Families in Germany conducted at Bielefeld University. The questionnaire covered a variety of topics including partnership, family/children, employment, health, and experience of discrimination.

Results: Over the course of 5 weeks, over 7,000 respondents participated in the web survey (completed interviews). Comparatively few screen-outs and survey break-offs were observed throughout the fieldwork period. Plausibility checks and measurement error indicators point to good data quality.

Added Value: This study provides novel insights on how to plan and conduct an ad campaign for recruiting rare and hard-to-reach populations for web survey research. First, the practical details and challenges of fieldwork and campaign management are discussed. Second, the survey data is analyzed focusing on survey error and potential data quality issues. And finally, the social media sample survey is compared to a probability-based, face-to-face sample survey of LGBTQ in Germany that was carried out simultaneously as part of the research project.

Some like it old

Clemens Rathe, François Erner

respondi, Germany & France

Relevance & Research Question:

French people have discovered a passion for recycling! Circular economy is growing, offline with “brocantes” (=flea markets) everywhere, and online with online marketplaces selling second hand products.

But French people do not use eBay; they use leboncoin, the 4th most visited website in France (with 26 million unique visitors per month, right below giant tech companies), where all types of secondhand products can be found.

Another surprise comes from young people (aged 15-25). They do love old stuff! For many categories of products, ranging from clothes to smartphones, younger consumers would rather shop second hand than new.

This research, conducted for Le Bon Coin, is aimed at explaining the attractiveness of old products to the younger generation.


Methods & Data:

This research relies on a two-step process:

1) The online behaviour (web visits and app usage) of 300 young French people was tracked for more than one year. We focused on their online shopping patterns for secondhand products.

2) These respondents were then invited to participate in an online community. We organized a 7-day community on the topics of consumption, collaborative consumption and online vintage shopping with more than 100 individuals.


Results:

This research shows that second hand goods fulfill two major aspirations of these new generations: firstly, the quest for more sustainable practices, and secondly, the desire for individual distinction. Circular economy is perceived as ethically more acceptable, and second hand products tend to be more unique or exclusive: vintage goods offer double the reward for these young consumers, who wish to appear more responsible and less conformist than their elders.

These results are based on a segmentation of young people buying old stuff online.

Added value

These findings help to understand the disruptive mindset of these new customers and will be of great interest to retailers and brands that need to quickly address shifts in the attitudes and behaviors of young consumers. From a more actionable perspective, our research also focuses on best practices for online platforms for buying, selling and trading used products, to highlight what leboncoin has achieved in France and what eBay has not.

10:30 - 11:30 C 1: News Consumption and Preferences
Session Chair: Simon Munzert, Hertie School, Germany

Data privacy and public data: New theory and empirical evidence

Nicholas Biddle

Australian National University, Australia

Relevance & Research Question:

Attitudes towards data privacy, whether with regard to one’s own data or data more broadly, have the potential to be one of the major factors in the functioning of commercial markets and the effective operation of government and public policy. However, there are legitimate concerns about data being used in ways that lead to negative outcomes for us as individuals, for groups that we identify with, or for society more broadly. Governments are continuously attempting to balance those costs and benefits, and commercial organisations need to be wary of losing their social licence with regard to data. Public attitudes towards data privacy and data governance are a key input into this balancing, and there is a growing social science literature on how these attitudes are shaped, how they vary between and within populations, and the impact they have on decision making. This paper consolidates existing literature and presents new empirical research (observational and experimental) with the aim of developing a theoretical model of data privacy. This predictive model builds on the behavioural economics literature and combines elements of risk, trust, perceived benefits, perceived costs, and behavioural biases.

Methods & Data:

The new data presented in this paper are based on a series of survey questions (experimental and attitudinal) fielded on the Life in Australia panel. Primary data collection is supported by analysis of survey data from comparable jurisdictions (primarily New Zealand, the US and Europe).


Results:

The main finding is that the model predicts behaviour with regard to data linkage consent, public health data, and the consumer data right/open banking legislative reforms. Specifically, we show the importance of trust in government, the interaction that trust has with risk preference, the complicated impact of framing, and the importance of issue salience. We also demonstrate the instability of preferences through time.

Added Value:

While primarily based on Australian data, this is the first theoretical model of data privacy that (a) is built on experimental data from a representative panel, (b) utilises behavioural insights and biases, and (c) is shown to predict actual behaviour.

What you read is who you support? Online news consumption and political preferences

Ruben Bach1, Denis Bonnay2, Christoph Kern1

1University of Mannheim, Germany; 2respondi, Université Paris Nanterre, France

Relevance & Research Question: Passively measured digital trace data (e.g., from social media platforms or web browsing logs) are increasingly used in the social sciences as a cost-efficient alternative or as a complement to surveys. In this context, augmenting survey data with information from respondents' online activities represents a particularly promising direction for combining the advantages of both worlds. However, extracting meaningful measurements from digital traces for substantive research is a challenging task. In this study, we use natural language processing (NLP) techniques to classify respondents based on the news content they consumed online and study the link between individuals’ political preferences and their online behavior.

Methods & Data: We augment survey data with web browsing logs by members of an online panel running in three countries (France, Germany and the UK). Data were collected during four weeks in May 2019, covering the European elections. Respondents answered questions about their voting behavior and political preferences in two surveys, one before and one after the election. In addition, respondents gave consent to having their online activities on their personal computers and mobile devices recorded. We extract information about news content consumed by respondents and search queries made to online search engines from the web logs. To analyze this content, we use natural language processing techniques (BERT; Bidirectional Encoder Representations from Transformers). We then model respondents’ political preferences and voting behavior using features from the content consumed online.

Results: Preliminary results indicate that the content of news consumed is a good predictor of voting behavior and political preferences and outperforms information that merely summarizes visits to news and other websites. Overall, however, records of online activities do not seem to predict voting behavior and political preferences in an almost deterministic way.

Added Value: Our results add to the debate of selective news exposure and changes in political engagement due to Internet use. Moreover, they confirm previous findings regarding the limited effect size of internet use and selective news exposure on political behaviors and preferences. Overall, it seems that online societies may not be as fragmented as some early commentators postulated.

How do news and events impact climate anxiety and how are people reacting?

Jhanidya Bermeo

Brandwatch, Germany

Relevance & Research Question: Global climate change continues to increase in visibility through observable environmental events, news, or political attention. As a result, psychological and mental health implications have begun to be considered in academic and practitioner circles. Awareness of the threat of climate change has led to an increase in “eco-anxiety” or “climate anxiety” in which individuals feel stressed, worried, nervous, or distressed about the future on Earth. This research aims to add to the growing body of literature on climate anxiety by examining American-based social media posts with the intention of understanding what type of climate-related events, news stories or political activities impact expressions of climate anxiety. Additionally, this research seeks to examine if people suggest taking any household, local, or national actions in light of climate anxiety, such as reducing consumption, switching to green products, or advocating and voting for climate friendly politicians.

Methods & Data: With access to Twitter and Reddit data through Brandwatch Consumer Research software, close to 50,000 posts about climate anxiety were collected from January 2016 through October 2019 (45 months) geolocated in the United States. An exploratory data-led approach will be taken with a sample of this data to establish themes and actions being discussed. Once these themes and actions are established, automated text analysis will be used to populate the categories against data from the full dataset such that they will continue to segment new conversation as it happens. This will allow revisiting the data in the future to become aware of any longitudinal changes in people’s opinions and behaviour.

Results: The analysis will be completed in early 2020.

Added Value: The research intends to create a framework for understanding how social data can be used to examine drivers of climate anxiety, as well as possible tipping points that create impetus for personal, social, or political action among those most worried about the impact of climate change on our future. Additionally, understanding tipping points can aid pro-climate politicians and activists to capitalise on events and stories that most trigger action.

10:30 - 11:30 D 1: GOR Best Practice Award 2020 Competition I
Session Chair: Otto Hellwig, respondi AG & DGOF, Germany
Session Chair: Alexandra Wachenfeld-Schell, GIM & DGOF, Germany

How to better uncover emotions in early-stage innovation research

Julia Görnandt1, Sofia Jorman2

1SKIM; 2Johnson & Johnson

Have you ever conducted innovation research online and found yourself in a situation where you don’t entirely trust what consumer feedback is telling you? Many of us have had to deal with overstated interest in a new product. Uncovering both rational and emotional needs is vital for new product development (NPD) strategies to accurately size the unmet need or opportunity. While traditional qual techniques can uncover emotions, the results can’t easily be scaled. Alternatively, quant research can deliver stated emotions, but lacks depth.

Together with Johnson & Johnson, we developed a new hybrid Qual-Quant-AI online research approach for early-stage NPD research. By using a voice analytics tool, we can analyze ‘how’ people communicate their needs, attitudes and interest, to better uncover emotions for more effective innovation strategies. The voice analytics AI tool we selected, audEERING, can detect emotions from voice. In an online survey with voice input a hypothetical ‘Smart Health Band’ was evaluated by consumers. We 1) analyzed the content of responses to look at what people said and 2) using the voice AI tool analyzed how they said it and which emotions were present.

The content analysis (what was said) showed high interest in the concept. However, the voice AI delivered additional, unexpected results: while still high, the implicit or subconscious interest was not as high as the content analysis had suggested. Analyzing the emotions delivered through voice showed a more realistic level of interest in the product.

Especially for new products it is essential to get accurate estimates for the product’s market potential that are not overstated. The additional voice element helped Johnson & Johnson calibrate the interest in the product to show how much of it was genuine, and in consequence make better informed choices about future strategy. In addition, the emotional analysis uncovered differences in gender reactions. These segmentation insights could prove valuable for future marketing and communications strategies.

Chilling with VR – A Case Study with H/T/P, Electrolux and Vobling. How the interplay between classical qualitative and VR generated efficiencies and effectiveness

Katrin Krüger1, Jessica Adel2

1Happy Thinking People, Germany; 2Electrolux AB Europe, Sweden

Relevance & Research Question:

The benefits of using VR prototypes for higher-cost categories in innovation and design projects are well known – logistics, greater modification flexibility, lower cost, virtual in-store and competitive choice scenarios.

With VR technology becoming more sophisticated and user-friendly, how can market research further benefit from it?

We will present a case study on a new fridge-freezer concept with Electrolux and a leading VR specialist company.

Methods & Data:

54 VR explorations across three countries - Germany, Italy, Sweden - accompanied by a total of 9 in-depth focus groups.


Results:

Across all age groups the VR part worked very well, both as a stand-alone and in combination with focus groups. The following were key benefits:

Usability: Detailed feedback was gathered – particularly in comparison to fragile 3-D renderings.

Immersive: With prototypes seeming so real, we received more spontaneous feedback with little post-rationalization.

Curiosity: A thrilling tech factor led to higher engagement.

Involvement: Fridges – a potentially lower interest category – enjoyed higher levels of excitement.

Playfulness: Tasks were treated more like games.

Flexibility & Speed: 3D renderings could be changed from one fieldwork session to the next.

Democracy: VR created equal conditions for prototype and comparison device.

Realistic Environment: Simulating a realistic shop floor atmosphere including competitive products was made possible.

Focus Groups + VR:

Higher participant focus due to the VR experience – everyone was highly engaged.

Higher attention to detail due to the amount of time spent with the prototypes during the VR experience!

There was nevertheless a strong creative dynamic present in the F2F groups.

Research moderation expertise is needed to manage “digital excitement”, that is, overexcitement that leads participants to jump from one design aspect to the next.

Added Value:

The VR approach was more cost-efficient, more environmentally friendly, offered higher flexibility and in the end enabled deeper and more valuable insights – particularly with regards to usability!

Benefits of VR continued after the research: designers were more open to implementing the design changes as they felt less "emotionally attached" to the stimulus (compared to physical models). Our results were thus received more openly and applied without hesitation.

Measuring the Incrementality of Marketing Online and Offline on Non-Experimental Data

Daniel Althaus1, Thies Jarms1, Ralf Schweitzer2

1Neustar GmbH, Germany; 2Media-Saturn Marketing GmbH, Germany

Relevance & Research Question: For large companies, it has become increasingly difficult to measure marketing effectiveness across a multitude of media types, offline and online channels, and campaigns. It has also become more important to have a consolidated view of all marketing and non-marketing activities. MediaMarktSaturn and Neustar set out to answer the question of how much incremental value is generated by 120 marketing activities of MediaMarktSaturn, many of which are happening simultaneously.

Methods & Data: The solution uses MMM (Marketing-Mix Modeling) and MTA (Multi-Touch Attribution) statistical models to analyze the data. It integrates a huge variety of data types from sales volume, spends and online tracking data to survey-based funnel KPIs, weather data and store traffic counts. To account for the complexity of the data, a hierarchical Bayesian approach is used.

Results: The effects of marketing activities on brand KPIs, web and store traffic, and online and offline sales are analyzed coherently. It can be shown that different campaign types and media touchpoints influence these KPIs in different ways, opening the opportunity to optimize for specific campaign goals. Survey-based market research results can proxy long-term brand health in the process. The measured efficiency of media types supports a balanced media mix that moves towards the use of online channels; however, it also shows that media types like out-of-home and radio still play a strong part.

Added Value: The study shows that it is possible to measure the incrementality of offline and online marketing activities in a non-experimental environment. It allows MediaMarktSaturn to compare all marketing activities on the basis of ROI and optimize its media budget accordingly. The introduction of survey-based brand KPIs provides the basis for steering multiple outcomes. Last but not least, the study supplies further evidence of the utility of hierarchical Bayesian methods in tackling imperfect and highly differentiated data.

10:30 - 11:30 T1: GOR Thesis Award 2020 Competition I
Session Chair: Frederik Funke, Dr. Funke SPSS- & R-Trainings & LimeSurvey GmbH, Germany

Using Artificial Neural Networks for Aspect-Based Sentiment Analysis of Laptop Reviews

Sonja Rebekka Weißmann

Catholic University Eichstätt-Ingolstadt, Germany


On e-commerce websites such as Amazon, customers readily comment on product highlights and flaws. This provides an important opportunity for companies to collect customer feedback. Fine-grained analyses of customer reviews can support managerial decision-making, especially in marketing. The vast amount of user-generated content necessitates the application of automated analysis techniques. One method of computationally processing unstructured text data is sentiment analysis, which examines people’s opinions, evaluations, emotions, and attitudes towards products, services, organizations, or other topics (Liu 2012, p. 1). Past studies have primarily focused on document-level or sentence-level sentiment analysis. However, for practical applications, there is a substantial need for finer-grained analyses to determine what exactly customers like or dislike – thus, for aspect-based sentiment analysis (ABSA). However, the implementation of ABSA is challenging.

The need for ABSA in marketing, the limitations of traditional sentiment analysis methods, and recent progress in the field of artificial neural networks make the latter’s application to ABSA a relevant research topic.

The objectives of this thesis are to:

- propose and evaluate a novel model architecture for ABSA that combines gated recurrent units and convolutional neural networks;

- apply the proposed model to laptop reviews to gain insight into customer requirements and satisfaction;

- discuss limitations of the proposed model, also from a market research perspective.




ABSA is divided into the subtasks of aspect term extraction and aspect sentiment classification. One neural network was trained to predict for each word of every sentence whether the word is an aspect. All aspects and their corresponding sentences were passed on to a second neural network, which predicts whether the sentiment expressed regarding the aspect is positive, negative, or neutral. Building on previous research, this thesis suggests combining two artificial neural network types, namely gated recurrent units and convolutional neural networks.

The proposed model was trained using the SemEval 2014 laptop review dataset (SemEval 2014). In addition, the thesis author manually annotated laptop reviews. The combined dataset consists of 5,165 sentences, totaling approximately 72,900 words. To evaluate model performance compared to previous research, the proposed model was trained only on the original SemEval training set and tested on the SemEval test set.




The evaluation results (Fig. 1 and Fig. 2) are promising. Without using sentiment lexica, handcrafted rules or manual feature engineering, the proposed system achieves competitive results on the benchmark dataset. It is especially effective at extracting aspects.

Model | F1 score

Proposed model | 81.63%

Filho and Pardo (2014) | 25.19%

Pontiki et al. (2014) | 35.64%

Toh and Wang (2014) | 70.41%

Chernyshevich (2014) | 74.55% †

Liu, Joty, and Meng (2015) | 74.56%

Poria, Cambria, and Gelbukh (2016) | 77.32%

Xu et al. (2018) | 77.67%

Fig. 1: Performance on aspect term extraction on the SemEval test set

†: trained on twice as much training data, use of an additional training set


To ensure comparability, only the performance of models with publicly available word embeddings is reported in Fig. 1. With domain-specific word embeddings and a set of linguistic patterns, Poria, Cambria, and Gelbukh (2016) reached an F1 score of 82.32%, which appears to be the current state of the art for this task.


Model | Accuracy | Macro F1 score

Proposed model | 68.45% | 63.92%

Pontiki et al. (2014) | 51.37% | n/a

Negi and Buitelaar (2014) | 57.03% | n/a

Wang et al. (2016) | 68.90% | n/a

Wagner et al. (2014) | 70.48% | n/a

Kiritchenko et al. (2014) | 70.48% | n/a

Tang et al. (2016) | 71.83% | 68.43%

Chen et al. (2017) | 74.49% | 71.35%

Fig. 2: Performance on aspect sentiment classification on the SemEval test set
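The two metrics in Fig. 2 can be illustrated with a minimal sketch. It assumes the common definition of macro F1 as the unweighted mean of the per-class F1 scores over the three sentiment classes; the abstract does not spell out the exact computation, and the labels below are invented toy data, not results from the thesis.

```python
LABELS = ("positive", "negative", "neutral")


def accuracy(gold, pred):
    """Share of aspects whose predicted sentiment equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (assumed definition)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Invented toy labels for six aspects (hypothetical, for illustration only).
gold = ["positive", "positive", "negative", "neutral", "neutral", "negative"]
pred = ["positive", "neutral", "negative", "neutral", "positive", "negative"]

print(f"accuracy = {accuracy(gold, pred):.4f}")
print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

Because macro F1 weights every class equally, it penalizes weak performance on a rare class more strongly than accuracy does, which is one reason the two columns in Fig. 2 can differ.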


Sentiment misclassifications can be grouped into three types of mistakes: predicting the opposite sentiment, predicting a strong sentiment instead of neutrality, and predicting neutrality instead of a strong sentiment. For marketers who interpret the model predictions, the first type of mistake would be the most severe. The third type of mistake was most common. It is arguably the least severe mistake and shows that the proposed system tends to be conservative in its predictions.
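The three error types described above can be tallied with a few lines of code. The following is a sketch only: the function names and the toy label pairs are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def error_type(gold, pred):
    """Map one (gold, predicted) sentiment pair to an error category."""
    if gold == pred:
        return "correct"
    if gold == "neutral":
        # A positive or negative sentiment was predicted for a neutral aspect.
        return "strong_instead_of_neutral"
    if pred == "neutral":
        # Neutrality was predicted for a positive or negative aspect.
        return "neutral_instead_of_strong"
    # Remaining case: positive predicted as negative or vice versa.
    return "opposite_sentiment"


def error_counts(gold_labels, pred_labels):
    """Count how often each category occurs in a set of predictions."""
    return Counter(error_type(g, p) for g, p in zip(gold_labels, pred_labels))


# Invented toy example (hypothetical labels).
gold = ["positive", "neutral", "negative", "positive", "neutral"]
pred = ["negative", "positive", "neutral", "neutral", "neutral"]
print(error_counts(gold, pred))
```

A breakdown of this kind lets a marketer see at a glance whether a model mostly errs on the cautious side (neutral instead of a strong sentiment) or makes the more costly opposite-sentiment mistakes.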

Overall, model performance is encouraging, especially because the model used only two features. This is in sharp contrast to traditional methods. Moreover, no specialized knowledge of linguistics was needed to develop the proposed system. In addition, it does not use any sentiment lexica, which is especially beneficial when considering languages other than English.

A case study in the laptop domain illustrates how and to what degree the proposed ABSA system is useful for practical purposes in market research.




The paper’s contributions are

̶ provision of a labeled dataset for ABSA, which could enhance other models;

̶ provision of refined annotation guidelines that consider marketing needs;

̶ proposal and implementation of a system that combines gated recurrent units and convolutional neural networks;

̶ performance evaluation of the system;

̶ error analyses, which can help practitioners to interpret the model output and may allow academics to improve future models;

̶ model outputs that summarize the customer opinions voiced in unlabeled and unstructured reviews;

̶ some insight into customer satisfaction and preferences (regarding the case study laptops), which might facilitate decision-making in marketing;

̶ guidance on why, how, and under what limitations to use ABSA, especially for marketing purposes.

With only words and part-of-speech tags as inputs, the proposed system achieves competitive results on the benchmark dataset. Sentiment lexica, handcrafted rules or manual feature engineering are not required. The system can be readily used to analyze English customer reviews of laptops. Given appropriate training data, the approach may also be applicable to other product categories and languages.

ABSA offers a structured representation of the most frequently mentioned positive and negative aspects in customer reviews. Moreover, it does so in a timely manner. The output can help to determine what reviewers like and dislike about a product. Given a large amount of review text, ABSA provides a detailed picture of customer satisfaction and can stimulate product improvements. It can also support marketers in inferring the reviewers’ reasons to purchase the product and the purposes for which they use it. Moreover, ABSA can complement traditional marketing research, especially as a preliminary study or by providing up-to-date information. In short, it can help companies to understand customers.




see thesis

Data Sharing for the Public Good? A Factorial Survey Experiment on Contextual Privacy Norms

Frederic Gerdon

University of Mannheim, Germany


Individual data that are collected when using smartphone applications or other digital technologies may not only be used to improve recommendations or products provided to the user. Many of these data may at the same time be employed for a public use, for instance when data collected in navigation apps are used by municipalities for urban planning. Against this background, the advancement of smart technologies opens new possibilities for the provision and maintenance of public goods, such as public health care and infrastructure development. However, such practices entail considerable ethical and social challenges, particularly with respect to privacy violations. From empirical research on privacy norms, we know that individual perceptions of which data transmissions are acceptable heavily depend on situational characteristics and their interactions. Thus, while individuals may accept data uses from which they receive an immediate personal benefit, it is unclear under which conditions using the same data for a public benefit is considered appropriate as well. To investigate this issue, the present paper draws on and advances the application of the theoretical framework of privacy as contextual integrity (Nissenbaum 2004, 2018) that conceptualizes the context dependence of privacy norms. It proposes concrete situational characteristics that impact the forming of these norms: data type, involved actors, and the terms of data transmission. These characteristics interact, meaning that in most cases no single situational feature can explain norms on its own. The question is: Do individuals hold different norms of appropriateness for private and public benefit data uses, and on which concrete situational characteristics does acceptance depend?


A factorial survey experiment (“vignette study”) was employed in a German online non-probability sample with 1,504 respondents to compare how personal normative beliefs of appropriateness of specific data transmissions are affected by concrete situational characteristics. The vignettes, i.e. situations shown to the respondents, varied along the parameters of data type, data recipient, and use for a private or public benefit. The investigated data types were health, location, and energy use data, while the recipient was either a public administration or a private company. Moreover, general privacy concerns, perceived sensitivity of the three data types, as well as trust in public and private entities were measured to account for possible moderating effects.
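The vignette dimensions named above can be crossed to enumerate the possible situations. The sketch below assumes a full factorial crossing of the three dimensions and a simple random draw per respondent; the abstract does not state how many vignettes each respondent saw or how they were allocated, so `draw_vignettes` and its parameters are hypothetical.

```python
import random
from itertools import product

DATA_TYPES = ("health", "location", "energy use")
RECIPIENTS = ("public administration", "private company")
BENEFITS = ("private benefit", "public benefit")


def vignette_universe():
    """All combinations of data type, recipient, and type of benefit."""
    return [
        {"data_type": d, "recipient": r, "benefit": b}
        for d, r, b in product(DATA_TYPES, RECIPIENTS, BENEFITS)
    ]


def draw_vignettes(n, seed=None):
    """Draw n distinct vignettes at random (assumed allocation rule)."""
    rng = random.Random(seed)
    return rng.sample(vignette_universe(), n)


universe = vignette_universe()
print(len(universe))  # 3 data types x 2 recipients x 2 benefits
for vignette in draw_vignettes(3, seed=1):
    print(vignette)
```

Under these assumptions the design has 3 x 2 x 2 = 12 distinct vignettes, which is small enough for every combination to be rated many times in a sample of this size.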


Results of linear regression analyses show that whether respondents perceive a public benefit use of their data as appropriate – compared to a personal benefit use – strongly depends on the concrete data type scenario at hand. In line with the notion of contextual integrity, considerable interactions of situational characteristics are present, i.e. the effects of use vary with data type and recipient. Particularly, public benefit uses are more accepted when public instead of private actors use the data. These findings show that the acceptability of a public benefit use is context dependent and support a contextual conceptualization of privacy norms.

Moreover, normative beliefs can partly be explained by individual characteristics. Interestingly, in two out of three data type scenarios, general privacy concerns decreased acceptance, suggesting that more general privacy sentiments are not always obliterated by concrete situational factors. Perceived sensitivity of a given data type and general privacy concerns strongly contribute to the explanation of variance in normative beliefs, i.e. they appear as major influential factors. However, interactions between a given data type and its sensitivity as well as the interaction between recipient and trust in the recipient were minuscule. Therefore, the data do not allow the inference that higher trust or higher perceived sensitivity strongly altered the impact of recipients or data types.


The present study extends the application of the contextual integrity framework to privacy norms and offers first insights into the conditions under which using data for public benefit may be deemed appropriate. It suggests that transmissions of individual data for public benefit uses should be designed cautiously, particularly as the advancement of possibly invasive technologies promises improvements in the provision of public goods. No one-size-fits-all preference for public benefit uses of individual data exists but, importantly, public actors are the preferred recipients for such data uses. This study paves the way for future investigations of data sharing for the public good and argues for further investigation of interindividual differences as drivers of normative beliefs. Furthermore, research and practice will profit from examining additional data sharing scenarios as well as behavioral implications of privacy norms for public benefit use contexts.


Nissenbaum, Helen (2004): Privacy as Contextual Integrity, Washington Law Review 79(1): 119–157.

Nissenbaum, Helen (2018): Respecting Context to Protect Privacy. Why Meaning Matters, Science and Engineering Ethics 24(3): 831–852.

11:30 - 11:40Break
11:40 - 1:00A 2: Motivation and Participation
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

Do previous survey experience and being motivated to participate by an incentive affect response quality? Evidence from the CRONOS panel

Hannah Schwarz1, Melanie Revilla1, Bella Struminskaya2

1Pompeu Fabra University (UPF), Spain; 2Utrecht University, The Netherlands

Relevance & Research Question:

As ever more surveys are being conducted, respondents recruited for a survey are more likely to already have previous survey experience. Furthermore, it becomes harder to convince individuals to participate in surveys and thus incentives are increasingly used. Both having previous survey experience and participating in surveys due to incentives have been discussed in terms of their links with response quality. In both cases, theoretical arguments exist that argue these factors could increase or decrease response quality. Empirical evidence is scarce and findings are mixed. This study thus aims to shed more light on the link of previous survey experience and participating due to incentives with response quality.

Methods & Data:

We analysed data from the probability-based CROss-National Online Survey (CRONOS) panel covering Estonia, Slovenia and Great Britain. In our regression models, we used three response quality indicators (item nonresponse, occurrence of primacy effects and nondifferentiation) as outcome variables, and indicators for having previous web survey experience and for being motivated to participate due to the incentive as the predictors of main interest.


We found that previous web survey experience had no impact on item nonresponse and occurrence of a primacy effect but reduced nondifferentiation. Being motivated to participate by the incentive did not have a significant impact on any of the three response quality indicators. Hence, overall we find little evidence that response quality is impacted by either of the two factors, previous web survey experience and participating due to the incentive, which are increasingly present in target populations these days.

Added Value:

We add to the scarce pool of empirical evidence on the link between having previous survey experience and participating in surveys due to incentives with response quality. An explicit measure of extrinsic motivation is used in contrast to previous research which often simply assumes extrinsic participation motivation to play a role if incentives are provided.

Moderators of response rates in psychological online surveys over time. A meta-analysis

Tanja Burgard1, Nadine Wedderhoff1,2, Michael Bosnjak1,2

1ZPID - Leibniz Institute for Psychology Information, Germany; 2University of Trier, Germany

Relevance & Research Question:

Response rates in surveys have been declining in various disciplines in the last decades. Online surveys have become more popular due to their fast and easy implementation, but they are especially prone to low response rates. At the same time, the increased use of the internet may also lead to a higher acceptance of online surveys over time. Thus, the overall time trend in response rates of psychological online surveys is in question. We hypothesize a decrease in response rates and examine possible moderators of this time effect, such as the invitation mode or contact protocols.

Methods & Data:

We searched PsycInfo and PubPsych with the search terms: (Online Survey or Web survey or Internet survey or email survey or electronic survey) and (response rate or nonresponse rate). This resulted in 913 hits. The abstracts of these records were screened to exclude studies not reporting results of online surveys. We excluded 84 records, resulting in 829 articles for full text screening.

So far, 318 full texts have been screened to assess whether an article reports response rates of online surveys only and whether the information on the participant flow is sufficient to compute response rates, the outcome of interest. Using the metafor package in R, mixed-effects multilevel models will be used to investigate the hypothesized time effect and the moderating effects of recruitment, invitation and contact protocols.


With information on 95 samples from 83 reports, the estimated mean response rate in this meta-analysis is 46 %. There is evidence of declining response rates over time. Personal contact to invite respondents improves the willingness to participate. The response rates are slightly higher in samples that received a pre-notification. The number of reminders has no effect on response rates.

Added Value:

The results of the meta-analysis are important to guide decisions on the conduct of online surveys in psychology. To reach the target population, personal contact and the use of a pre-notification are recommended. Further investigations, for example on the use of incentives and on the effect of respondent burden, will follow.

We’re only in it for the money: are incentives enough to compensate poor motivation?

Valentin Brunel, Blazej Palat

Sciences Po, France

Relevance & Research Question:

Attrition has been one of the main targets of survey research in panels from its beginnings (Lazarsfeld 1940, Massey and Tourangeau 2013).

This research aims at using different tools measuring motivation (closed items, paradata, open-ended questions…) to understand how initial motivation, interacting with different types of incentives, plays a part in the decision to leave the panel in the context of changing panel functioning.

Methods & Data:

The ELIPSS panel, whose panelists were equipped with dedicated internet-connected tablets for survey completion, was active from 2012 to 2019. Panelists’ motivations to join the panel were systematically measured during recruitment. Their response quality and motivation were also assessed using paradata and recurrent survey measures. As the panel’s functioning was about to change, an experiment on using unconditional differential incentives to encourage staying in the panel was designed. Three randomly selected groups of panelists were formed. The first received financial incentives of the same amount repeatedly: at t1 and four months later at t2. The second received the same financial incentive once at t2, and the third received the same financial incentive coupled with a gift at t2. We analysed the influence of those incentives on panel attrition in interaction with motivation indicators.


Initial motivation studies in the ELIPSS panel have outlined the importance of this indicator for subsequent behavior within the panel. We observed that the more interested in incentives panelists declared themselves to be, the higher their chances of leaving. Results of the experiment show a marginally significant effect of incentive type on attrition. Panelists who were given additional incentives did not seem to remain more often; quite the contrary. However, interpretation of those effects should be clearer when taking motivation into account. Certain types of incentives may have differential effects on different types of panelists.

Added Value:

The study results add to the knowledge of how the effects of initial motivation to join an online, non-commercial research panel interact with unconditional differential incentives to influence panel attrition in an exceptional context. They are thus of primary importance for panel management purposes.

Should I stay or should I go? - Why do participants remain active in market research communities?

Ruth Anna Wakenhut, Jaqueline Fürwitt, Sophie Vogt


Relevance & Research Question:

The conception and realization of medium- and long-term market research online communities confront researchers with great challenges: In particular qualitative communities are dependent on committed participants staying during the whole field time in order to gain insightful, relevant answers. Some community projects seem to fail precisely in keeping people motivated over a longer period of time.

From other studies we have already learned that monetary incentives play an important but not all-determining role. What other factors are relevant? We want to explore the reasons for commitment in research communities and how it can be positively influenced to achieve comprehensive participation throughout a longer-running community. We focus on the field time, that is, on everything happening after recruitment. We want to better understand how a good online community works from the participants' perspective.

Methods & Data:

We apply a mixed approach and conduct our research in 3 phases with increasing depth: A quantitative questionnaire with about 200 participants is followed by a short digital qualitative project (e.g. forum, associative and projective tasks) with about 40 participants and 8-12 one-hour webcam interviews. This iterative approach allows a gradual identification of participants' experiences, expectations and needs in relation to online communities. We work with participants from different recruitment sources (panel and field service providers), all of whom have already taken part in qualitative digital research. The total field time is about 2 weeks.


None yet. We will conduct the study in January-February 2020 and will be able to present the results at the conference.

Added Value:

Our industry is facing the challenge of continuing to attract people who take the time to answer our questions – even over a longer period of time. Especially in qualitative research communities we need open dialogue partners who are willing to stay a while with us. Without participants, there is no market research. Motivated participants who feel valued are likely to provide personal, authentic answers. Understanding why people stay in online communities is important to ensure that those needs are met and people continue their participation.

11:40 - 1:00B 2: Turning Unstructured (Survey) Data into Insight with Machine Learning
Session Chair: Stefan Oglesby, data IQ AG, Switzerland

Huge and extremely deep language models for verbatim coding at human level accuracy

Pascal de Buren

Caplena GmbH, Switzerland

Relevance & Research Question:

In the last 2 years, substantial improvements in many natural language processing tasks were achieved by building models with several 100s of millions of parameters and pre-training them on massive amounts of publicly available texts in an unsupervised manner [1] [2]. Among the tasks are also many classification problems, which are similar to verbatim coding, the categorization of free-text responses to open-ended questions popular in market research. We wanted to find out if these new models could set new state-of-the-art results in automated verbatim coding and thus potentially be used to accelerate or replace the tedious manual coding process.

Methods & Data:

We train a model based on [1] but adapted to the task of verbatim coding through architecture tweaks and pre-training on millions of reviewed verbatims on This model was benchmarked against simpler models [3] and widely available tools [4] as well as against human coders on real survey data. Chosen datasets were provided by two independent market research institutes in the Netherlands (Dutch) and in Brazil (Portuguese and English) with n=525 and n=5900 respectively to test the new model on small surveys and large ones alike.


Our model outperformed both our previous simpler models and standard online tools on a variety of surveys in multiple languages, with a weighted-F1 improvement on the Brazilian data from 0.37 to 0.59. We achieve new state-of-the-art results for automated verbatim coding, matching or even surpassing the intercoder agreement between human coders, with an F1 of 0.65 from our model vs 0.61 for the human intercoder agreement on the Dutch dataset.

Added Value:

Researchers can now have a tool that performs verbatim coding almost instantaneously and at a fraction of the cost with similar accuracy to full human coding. Besides the quantitative benchmark results, we also provide qualitative examples and guidance as to which surveys are well suited for automated coding and which less so.




[3] Support-Vector-Machines with Bag-of-Words-Features and Long-Short-Term-Memory Networks with Word-Embeddings


A Framework for Predicting Mentoring Needs in Digital Learning Environments

Cathleen M. Stuetzer1, Ralf Klamma2, Milos Kravcik3

1TU Dresden, Germany; 2RWTH Aachen, Germany; 3DFKI - Deutsches Forschungszentrum für Künstliche Intelligenz, Berlin, Germany

Relevance & Research Question:

Modeling online behavior is a common tool for studying social “ecologies” in digital environments. For this purpose, most of the existing models are used for analysis and prediction. More prediction tools are needed to provide evidence, especially for exploring online social behavior in order to depict the right target group within the right contexts. As a prominent use case in higher education research, the quality of learning processes shall be ensured by providing suitable instruments like mentoring for the pedagogical and social support of students. But how can we identify mentoring needs in digital learning environments? And to what extent can predictive models contribute to the quality assurance of learners’ progress?

Methods & Data:

For contributing to the research questions, we firstly extract and analyze text data from online discussion boards in a distance learning environment by using (automated) text mining procedures (e.g. by topic modelling, semantic analytics, and sentiment analytics). Based on the results, we identify suitable behavioral indicators for modeling specific mentoring occasions by using network analytic instruments. To analyze the emergence of (written) language and to explore latent patterns of sentiments within students’ discussions, GAT (discourse analytic language transcription) instruments are applied. Finally, we build a predictive model of social support in digital learning environments and compare the results with findings from neurolinguistic models.

Results & Added Value:

The contribution proposes an analytical implementation framework for predicting procedures to handle behavioral data of students in order to explore social online behavior in digital learning environments. We present a multidisciplinary model that implements theories from current research, depicts behavioral indicators, and takes systemic properties into account, so that it can be used across the board for applied research. In addition, the findings will provide insights on social behavior online to implement suitable pedagogical and social support in digital learning environments. The study is still a work in progress.

Using AI for a better Customer Understanding

Stefan Reiser1, Steffen Schmidt1, Frank Buckler2

1LINK Institut, Switzerland; 2Success Drivers, Germany

Relevance & Research Question:

Many companies are struggling with the growing amount of customer data and touchpoint-based feedback. Instead of learning from this feedback and using it to continuously improve their processes and products, many tend to waste it. Besides, traditional causal models like regression analyses do not help to truly understand why KPIs end up at a certain level. Our research objective was to develop a new process, based on AI models, that helps to...

a) structure and analyze customer feedback largely automatically.

b) better understand customers incl. hidden drivers of their behaviour.

c) help companies to take action based on their customer feedback.

Methods & Data:

Our approach was to make use of information that almost every company has: the NPS score and open-ended text feedback (=reasons for giving this NPS score per customer). The text feedback is being coded by means of supervised learning incl. the sentiment behind statements (=NLP), then the outcomes are being analyzed by means of neural network analysis.


The explorative nature of this approach (= open-ended feedback and machine learning algorithms) reveals which influence individual criteria have on the NPS. Unexpected nonlinearities and interactions can be unveiled. Hidden drivers to leverage the NPS can be uncovered. A large amount of data is reduced to the most significant aspects; we also developed a simple dashboard to illustrate these.

Added Value:

We found that this approach produces far better explanatory power than traditional methods like manual coding and linear regression - usually, the explanatory power is twice as good! Besides, these analyses may be implemented for any industry or product, and they can produce insights for historical data. Finally, when dealing with big data sets, the machine learning approaches help to be faster and more efficient.

Read my hips. How t9 address AI transcription issues

André Lang1, Stephan Müller1, Holger Lütters2, Malte Friedrich-Freksa3

1Insius, Germany; 2HTW Berlin, Germany; 3GapFish GmbH, Germany

Relevance & Research Question:

Analyzing voice content at large scale is a promising but difficult task. Various sources of voice data exist, ranging from audio streams of YouTube videos to voice responses in online surveys. While automated solutions for speech-to-text transcription exist, the question remains how to validate, quality-check and leverage the output of these services to provide insights for market research. This study focuses on methods for evaluating quality, filtering and processing of the texts returned from automated transcript solutions.

Methods & Data:

The feasibility and challenges of processing text transcripts from voice are assessed along the process steps of error recognition, error correction and content processing. As a baseline, a set of 400 voice responses, transcribed with Google’s Cloud Speech-to-Text service, is enriched with the real responses taken from the original audio. Differences, being errors in recognition, are categorized by their causes, such as unknown names or similar-sounding words. Different methods for error detection and correction are proposed, applied and tested against this gold-standard corpus and evaluated. The resulting texts are processed with an NLP concept detection method usually applied to user-generated content (UGC) in order to check how further insights such as inherent topics can be derived.
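Comparing an automated transcript with the real response can be sketched as a word-level diff. The example below uses Python's standard `difflib.SequenceMatcher`; it is an assumption for illustration, not the authors' method, and the sentence pair is invented (it borrows the pun from the talk title).

```python
from difflib import SequenceMatcher


def word_differences(reference, hypothesis):
    """List word-level differences between a reference and a transcript.

    Returns (operation, reference_words, transcript_words) tuples, where
    operation is 'replace', 'delete' or 'insert'.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, " ".join(ref[i1:i2]), " ".join(hyp[j1:j2])))
    return differences


# Invented example: what was said vs. what the service returned.
spoken = "Read my lips"
transcribed = "Read my hips"
print(word_differences(spoken, transcribed))
```

The resulting tuples could then be labeled by cause (for example, unknown name or similar-sounding word), which is the manual categorization step the abstract describes.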


Although speech-to-text services have improved substantially over the past years, handling their output remains challenging. Unsystematic noise generated from mumbling or individual pronunciation is less problematic, but systematic errors such as mis- or undetected person or brand names, due to difficult or ambiguous pronunciation, may distort results substantially. The error detection methods shown, along with detection confidence values retrieved from the transcription service, provide a first baseline for filtering, rejecting input of low quality, and further processing in order to get meaningful insights.

Added Value:

Previous studies have shown that voice responses are more intense and longer than typed ones. Having ways of evaluating and controlling the output of speech-to-text services, knowing their limitations, and checking the applicability of NLP methods for further processing is vital to build robust analytical services. This study covers the topics that have to be addressed in order to draw substantial benefits from voice as a source of (semi-)automated analytics.

11:40 - 1:00C 2: Hate Speech and Fake News
Session Chair: Pirmin Stöckle, University of Mannheim, Germany

Building Trust in Fake Sources: An Experiment

Paul C. Bauer1, Bernhard Clemm von Hohenberg2

1MZES Mannheim, Germany; 2European University Institute, Italy

Relevance and research question:

Today, social media like Facebook and WhatsApp allow anyone to produce and disseminate “news”, which makes it harder for people to decide which sources to trust. While much recent research has focussed on the items of (mis)information that people believe, less is known about what makes people trust a given source. We focus on three source characteristics: Whether the source is known (vs. unknown), on what channel people receive its content (Facebook vs. web site) and whether previous information by that source was congruent (vs. incongruent) with someone's worldview.

Methods & Data:

In a pre-registered online survey experiment with a German quota sample (n = 1980), we expose subjects to a series of news reports, manipulated in a 2x2x2 design. Through the use of HTML, we create highly realistic stimulus material that is responsive to mobile and desktop respondents. We measure whether people believe that a report is true, and whether they would share it.


We find that individuals have a higher level of belief in and a somewhat higher propensity to share news reports by sources they know. Against our expectation, the source effect on belief is larger on Facebook than on websites. Our most crucial finding concerns the impact of congruence between facts and subjects’ world view. People are more likely to believe a news report by a source that has previously given them congruent information—if the source is unknown.

Added value:

We re-evaluate older insights from the source credibility literature in a digital context, accounting for the fact that social media has changed the way sources appear to news consumers. We further provide causal evidence that explains why people tend to trust ideologically aligned sources.

Social Media and the Disruption of Democracy

Jennifer Roberton1, Matt Browne2, François Erner1

1respondi; 2Global Progress

Relevance & Research Question:

It seems nostalgic to recall that the early days of the internet inspired hopes for a more egalitarian and democratic society. Some of this promise has been fulfilled — connectivity has enabled new forms of collective mobilization and made human knowledge accessible to anyone. But we are also living with the side effects of the Internet. Among them, pervasive disinformation in the polity that is weakening the integrity of our democracies and bringing people to the streets.

Fake news, hostile disinformation campaigns and polarization of the political debate have combined to undermine the shared narrative that once bound societies together. Trust in the institutions of democracy has been eroded. Tribalism and a virulent form of populism are the hallmarks of contemporary politics. The rules of politics are being rewritten.

Conducted as part of a multi-stakeholder dialogue with the social media platforms on the renovation of democracy, our research explores the impact of social media on democratic society and the impact of democratic disruptions on the reputation of social media platforms themselves.


Methods & Data:

20-minute surveys were conducted in June and July 2019 in France, Germany and the UK (n=500 in each country, representative in terms of age and gender). All participants agreed to install software that monitors their online activity, and all had been tracked for 12 months before they participated in the research.

The analysis uses a k-means segmentation combining declarative and passive data. Declarative data include the attitude of each respondent towards “traditional” fake news. Passive data focus mainly on the types of sites where respondents find information (main news websites or user-generated content, for instance).
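As a rough illustration of this kind of segmentation, the sketch below runs a plain k-means on two hypothetical features per respondent: a declarative score (attitude towards fake news) and a passive measure (share of tracked visits to user-generated-content sites). The feature names, the values, the number of clusters, and the deterministic initialization are assumptions for illustration only, not details reported by the authors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Plain k-means; returns (labels, centroids).

    Assumption: the first k points serve as initial centroids so that
    the result is deterministic. Real analyses usually use several
    random restarts instead.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each respondent joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical respondents: (fake-news attitude score, share of UGC visits)
respondents = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.2), (9.8, 10.1)]
labels, centroids = kmeans(respondents, k=2)
print(labels)
```

Because k-means relies on Euclidean distance, declarative and passive variables would normally be put on a common scale before clustering.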


Results:

This research reveals three paradoxes of digital democracy.

1. Those who supported, and benefited from, the digital revolution the most are those who trust the GAFAs the least and think they now need to be controlled.

2. Those who trust democratic institutions the least are those who believe in Facebook's political virtues the most.

3. To believe in fake news is less a cognitive matter than a political statement.

Added value:

This research describes mechanisms by which Facebook takes advantage of fake news.

What Should We Be Allowed to Post? Citizens’ Preferences for Online Hate Speech Regulation

Simon Munzert1, Richard Traunmüller2, Andrew Guess3, Pablo Barbera4, JungHwan Yang5

1Hertie School of Governance, Germany; 2University of Frankfurt, Germany; 3Princeton University, United States of America; 4USC, United States of America; 5UIUC, United States of America

Relevance & Research Question:

In the age of social media, the questions of what one is allowed to say and how hate speech should be regulated are ever more contested. We hypothesize that content- and context-specific factors influence citizens’ perceptions of the offensiveness of online content, and also shape preferences for action that should be taken. This has implications for the legitimacy of hate speech regulation.

Methods & Data:

We present a pre-registered study to analyze citizens’ preferences for online hate speech regulation. The study is embedded in nationally representative online panels in the US and Germany (about 1,300 respondents, opt-in panels operated by YouGov). We construct vignettes in the form of social media posts that vary along key dimensions of hate speech regulation, such as sender/target characteristics (e.g., gender and ethnicity), message content, and target’s reaction (e.g., counter-aggression or blocking/reporting). Respondents are asked to judge the posts with regard to their offensiveness and the consequences the sender should face. Furthermore, the vignette task was embedded in a framing experiment, motivating it by (a) looming government regulation protecting potential victims of hate speech, (b) civil rights groups advocating against censorship online, or (c) a neutral frame.


Results:

While for half (48%) of the posts the respondents saw no need for action by the platform provider, for 11% of the posts they would have liked to see the sender banned permanently from the platform. Violent messages are evaluated substantively more critically than insulting or vilifying messages. At the individual level, we find that females are significantly more likely to regard the posts as offensive or hateful than males. With regard to the framing experiment, we find that, compared to the control group, respondents confronted with the government prime are 20 percentage points less likely to demand no action in response to offensive posts.

Added Value:

While governments around the world are acting towards regulating hate speech, little is known about what is deemed acceptable or unacceptable speech online in different parts of the population and societal contexts. We provide first evidence that could inform future debates on hate speech regulation.

11:40 - 1:00D 2: GOR Best Practice Award 2020 Competition II
Session Chair: Otto Hellwig, respondi AG & DGOF, Germany
Session Chair: Alexandra Wachenfeld-Schell, GIM & DGOF, Germany

Significant improvement of relevant KPIs with optimization of the programmatic modulation

Silke Moser1, Frank Goldberg2

1GIM Gesellschaft fuer Innovative Marktforschung mbH, Germany; 2DMI Digital Media Institute GmbH, Germany

Relevance & Research Question:

This study examines the advertising impact of a spot for a new OTC product played on digital displays at various touchpoints (digital out-of-home, DOOH). The main focus of the study was to investigate the increase in KPIs achieved with DOOH for several target groups and to determine the contribution of individual touchpoints. A central question is who sees the advertisements, when, and how receptive a person is at a given touchpoint (TP).

Methods & Data:

A combination of methods was used: online survey, geo location tracking and a follow-up survey. The samples were drawn according to representative structural data. The follow-up survey comprised n = 2 x 2,000 participants. One part (n=2000/mobility panel) was asked to track their movement for 21 days throughout the campaign period. Additionally, for this sub-sample we had information about its values and settings. After the campaign, both samples were interviewed in the online follow-up survey on various parameters of product and advertising awareness.


Results:

For the mobility panel, the combination of tracking and survey data made it possible to determine recall of the advertisement together with additional information on the actual advertising contact opportunities and frequencies. Additionally, the question about target-group specifics of individual touchpoints could be answered. The comparison with participants who were not tracked shows the high value of the mobility data, which allow location- and time-dependent programmatic modulation of the advertising. The content of each touchpoint can therefore be adjusted specifically for certain target groups and its effect can be analysed, e.g., to reduce scattering losses, which can ultimately increase recall and recognition and thus the ROI.

Added Value:

Geo-tracking enables the continuous mapping of the movement patterns of target groups. In combination with the described profile data, incl. decision-making style, it allows precise knowledge about different target groups. This leads to a significantly improved accuracy in addressing target groups and advertising effectiveness of touchpoints. It will also enable further approaches to understanding target groups and their behaviors, as it is possible to identify people who came in contact with not only specific touchpoints but also POIs, making it possible to determine the impact of contact with a brand on brand relevance and image.

Appetite for Destruction: The Case of McDonald’s Evidence-based Menu Simplification

Steffen Schmidt1, Marius Truttmann2, Philipp Fessler1, Severine Caspard2

1LINK Institut, Switzerland; 2McDonald's Suisse, Switzerland

Relevance & Research Question: The substantial increase from 8 to more than 90 products in around 40 years has increased the complexity of the product production and preparation process at McDonald’s Switzerland and consequently threatens customer-perceived product quality. The critical research challenge and insights goal was to regain efficiency by removing selected products from the menu without threatening customer growth.

Methods & Data: To ensure valid and reliable insights with precise predictive power, a set of highly advanced methods was applied. In detail, several advanced implicit association tests for brand and product assessment were employed online, together with an online discrete choice experiment of up to 5 choice tasks that simulated various consumption situations depending on the task-related behavior shown beforehand. In total, a set of five strongly behavior-related indicators (e.g., implicit brand-product fit, purchase potential) was derived. Finally, based on those indicators, a single-value indicator was calculated for each investigated product to determine its overall customer-related consumption value. This value was then used to derive a review list identifying the products that have no substantial impact on customers’ future behavior. In total, 2,056 McDonald’s customers aged 15 to 79 years who had visited a McDonald’s restaurant at least once in the last 12 months were interviewed.

Results: Based on the derived product review list, selected products at the bottom of the list were removed from the market. Beforehand, a Switzerland-wide field experiment in more than 30 restaurants was conducted to check actual behavior. As predicted, no decrease in customers’ visit frequency or average menu spending could be identified. Subsequently, the respective products were removed from all restaurants.

Added Value: The innovative method combination was able to identify customers’ product value, and thus customers’ behavior in the real world. Previously, other country organizations of McDonald’s had tried just a single method (e.g., menu choice-based conjoint) but failed. Currently, an increasing number of other country organizations are applying this approach to strengthen market performance.

11:40 - 1:00T2: GOR Thesis Award 2020 Competition II
Session Chair: Frederik Funke, Dr. Funke SPSS- & R-Trainings & LimeSurvey GmbH, Germany

The Digital Architectures of Social Media: Platforms and Participation in Contemporary Politics

Michael Joseph Bossetta

Lund University, Sweden

Social media platforms (SMPs) influence the communication of virtually all stakeholders in democratic politics. Politicians and parties campaign through SMPs, media organizations use them to distribute political news, and many citizens read, share, and debate political issues across multiple social media accounts. When assessing the political implications of these practices, scholars have traditionally focused on the commonalities of SMPs, rather than their differences. The implications of this oversight are both theoretical and methodological. Theoretically, scholars lack an overarching conceptual framework to inform cross-platform research designs. As a result, the operationalization of social media variables across platforms is often inconsistent and incomparable, limiting the attribution of platform-specific effects.

This dissertation therefore contributes to the study of digital media and politics by developing a cross-platform theory of platforms’ digital architectures. Digital architectures are defined as the collective suite of technical protocols that enable, constrain, and shape user behavior in a virtual space. The dissertation’s central argument is that the digital architectures of SMPs mediate how users enact political processes through them. Focusing on politicians’ campaigning and citizens’ political participation during elections, I show how political communication processes manifest differently across platforms in ways that can be attributed to their digital architectures. Moreover, I demonstrate how both politicians and citizens manipulate the digital architectures of platforms to further their political agendas during elections.

To mount these arguments, the dissertation adopts a highly conceptual, exploratory, and interdisciplinary approach. Its main theoretical contribution, the digital architectures framework, brings together fragments from literatures spanning archeology, design theory, media studies, political communication, political science, social movements, and software engineering. Methodologically, the study combines qualitative and quantitative methods to address the research questions of four individual research articles (Chapters 4-7). These studies have been published in Information, Communication & Society; Journalism & Mass Communication Quarterly; Language and Politics; and a book chapter in Social Media and European Politics (Palgrave). The main empirical cases included in the dissertation are the 2015 British General Election, the 2016 Brexit Referendum, and the 2016 U.S. Presidential Election.

The structure of the dissertation is as follows. Chapter 1 introduces the dissertation’s overarching research questions and design. Chapter 2 situates the digital architectures framework within the existing literature by critiquing existing theoretical approaches to social media and political participation. Chapter 3 outlines the main challenges in studying participation on social media, as well as summarizes the dissertation’s methodological approach. Chapter 4 then presents the digital architectures framework through a systematic, cross-platform comparison of Facebook, Twitter, Instagram, and Snapchat. In this chapter, I illustrate how the digital architectures of these SMPs shaped how American politicians used them for political campaigning in the 2016 U.S. election.

Shifting focus from politics to citizens, Chapter 5 examines how the digital architectures of social media influence citizens’ political participation. Chapter 5 characterizes the various styles and degrees of political participation through SMPs, and it shows how the architectures of Twitter and Facebook lead to different manifestations of online participation in the context of European politics.

Building on Chapter 5’s conceptual work, Chapters 6 and 7 use digital trace data to empirically investigate citizens’ participation on Twitter and Facebook, respectively. Chapter 6 offers a new theory of online political participation by conceptualizing it as a process, rather than as an activity. Chapter 6 develops a typology of political participation and applies it to citizens’ use of Twitter in the 2015 British General Election. We find that a small number of highly active citizens dominate the political discussion on Twitter, and these citizens tend to promote right-wing, nationalist positions.

Chapter 7 finds similar patterns in citizens’ political participation on Facebook during the 2016 Brexit referendum. Using metadata to chart the commenting patterns of citizens across media and political Facebook pages, Chapter 7 reveals that Leave supporters were much more active in political commentary than Remain supporters. However, this phenomenon is, again, due to a small number of active citizens promoting right-wing, nationalist positions. Few citizens commented on both media and campaign Facebook pages during the referendum, but those who did commented on the media first. This finding, together with the observation that political commentary overwhelmingly took place on media pages, supports the notion that the mainstream media maintain their agenda-setting role on SMPs.

Lastly, Chapter 8 argues that different digital architectures afford varying degrees of publicness, which in turn affects how political participation is actualized across platforms. This chapter, and the thesis, concludes with a discussion of why the digital architectures of social media are critical to consider when assessing social media’s impact on democracy.

Optimizing measurement in Internet-based research: Response scales and sensor data

Tim Kuhlmann

Universität Siegen, Germany

The present PhD thesis lies within the area of Internet-based research; specifically, it is concerned with Internet-based assessment via questionnaires and smartphones. The thesis investigates the influence of response scales and objective sensor data from smartphones on the data gathering process and data quality. Internet-based questionnaires and tests are becoming increasingly common in psychology and other social sciences (Krantz & Reips, 2017; Wolfe, 2017). It is therefore important, for researchers and practitioners alike, to base decisions about their research design and data gathering process on solid empirical advice.

The first research article compared two types of response scales, visual analogue scales (VASs) and Likert-type scales, with regard to non-response. Participants of an eHealth intervention were randomly allocated to answer an extensive questionnaire with either VASs or Likert-type scales as response options to otherwise identical items. A sample of 446 participants with a mean age of 52.4 years (SD=12.1) took part. Results showed lower SDs for items answered via VAS. They also indicated a positive effect of VASs with regard to lowering dropout of participants, OR=.75, p=.04.

The second research article investigated the validity of measurement, again comparing VASs and Likert-type scales. The response scales of three personality scales were varied in a within-design. A sample of 879 participants filled in the Internet-based questionnaire, answering the personality scales twice in a counterbalanced design. Results of Bayesian hierarchical regressions largely indicated measurement equivalence between the two response scale versions, with some evidence for better measurement quality with VASs for one of the personality scales, Excitement Seeking, B10=1318.95, ΔR2=.025.

The third research article investigated the validity of objective sensor data in an experience sampling design. The association of subjective well-being with smartphone tilt was investigated in two separate samples implementing different software to gather data. In both samples measurements consisted of cross-sectional questionnaires and a longitudinal period of three weeks, with measurements twice per day. Results provided evidence for the validity of smartphone tilt as an indicator of subjective well-being, t(3392)=-3.9, p<.001. In addition to the analysis of tilt and well-being, potential biases and problems when implementing objective data are discussed, specifically when different software implementations and operating systems are involved.

In conclusion, the PhD thesis offers valuable insights on Internet-based assessment. The VAS’s position as a superior response scale was strengthened. Its advantages over more traditional Likert-type scales, e.g., offering better distributional properties and more valid information, were confirmed and no disadvantages emerged. Smartphone sensor data were shown to provide a way to validate self-report measurement, if potentially important caveats related to differences in data are identified and addressed.

1:00 - 1:20Break
1:20 - 2:20P 1.1: Poster I

Reproducible and dynamic meta-analyses with PsychOpen CAMA

Tanja Burgard, Robert Studtrucker, Michael Bosnjak

ZPID - Leibniz Institute for Psychology Information, Germany

Relevance & Research Question:

A problem observed by Lakens, Hilgard, & Staaks (2016) is that many published meta-analyses remain static and are not reproducible. The reproducibility of meta-analyses is crucial for several reasons. First, to enable the research community to update meta-analyses in case of new evidence. Second, to give other researchers the opportunity to use subsets of meta-analytic data. Third, to enable the application of new statistical procedures and test the effects these have on the results of a meta-analysis.

We plan to set up an infrastructure for the dynamic curation and analysis of meta-analyses in psychology. A CAMA (Community Augmented Meta-Analysis) serves as an open repository for meta-analytic data, provides basic analysis tools, makes meta-analytic data accessible and can be used and augmented by the scientific community as a dynamic resource (Tsuji, Bergmann, & Cristia, 2014).

Methods & Data:

We created templates to standardize the data structure and variable naming in meta-analyses. This is crucial for the planned CAMA to enable interoperability of data and analysis scripts. Using these templates, we standardized data of meta-analyses from two different psychological domains (cognitive development and survey methodology) and replicated basic meta-analytic outputs with the standardized data sets and analysis scripts.


Results:

We succeeded in standardizing various meta-analyses in a common format using our templates and in replicating the results of these meta-analyses with the standardized data sets. We tested analysis scripts for various meta-analytic outputs for the planned CAMA, such as funnel plots, forest plots, power plots, and meta-regression.

Added Value:

Interoperability and standardization are important requirements for an efficient use of open data in general (Braunschweig, Eberius, Thiele, & Lehner, 2012). The templates and analysis scripts presented moreover serve as the basis for the development of PsychOpen CAMA, a tool for the research community to collect data and conduct meta-analyses in psychology collaboratively.

Survey Attitude Scale (SAS) Revised: A Randomized Controlled Trial Among Higher Education Graduates in Germany

Thorsten Euler, Ulrike Schwabe, Nadin Kastirke, Isabelle Fiedler, Swetlana Sudheimer

German Centre for Higher Education Research and Science Studies, Germany

Relevance & Research Question:

Empirical evidence indicates that general attitudes towards surveys predict willingness to participate in (online) surveys (de Leeuw et al. 2017; Jungermann/Stocké 2017; Stocké 2006). The nine-item short form of the Survey Attitude Scale (SAS) as proposed by de Leeuw et al. (2010, 2019) differentiates between three dimensions: (i) survey enjoyment, (ii) survey value, and (iii) survey burden. Previous analyses of different datasets have shown that two of the dimensions in particular, survey value and survey burden, do not perform satisfactorily with respect to internal consistency and factor loadings in different samples (Fiedler et al. 2019). Referring to de Leeuw et al. (2019), we therefore investigate whether the SAS can be further improved by reformulating single items and adding new ones from existing literature (Stocké 2014; Rogelberg et al. 2001; Stocké/Langfeldt 2003).

Methods & Data:

Consequently, we implemented the proposed German version of the SAS, adopted from the GESIS Online Panel (Struminskaya et al. 2015), in a recently conducted online survey of German higher education graduates (October - December 2019, n = 1,378). Furthermore, we realised a survey experiment with a split-half design, aiming to improve the SAS by varying the wording of four items and adding one supplemental item per dimension. To compare both scales, we use confirmatory factor analysis (CFA) and measures of internal consistency within both groups.


Results:

Comparing CFA results, our empirical findings indicate that the latent structure of the SAS is reproducible in the experimental as well as in the control group. Factor loadings as well as reliability scores support the theoretical structure adequately. However, we do find evidence that changes in the wording of the items (harmonizing the use of terms and avoiding mention of the survey mode) can partially improve the internal validity of the scale.

Added Value:

Overall, the standardized short SAS is a promising instrument for survey researchers. By intensively validating the proposed instrument in an experimental setting, we contribute to the existing literature. Since de Leeuw et al. (2019) also report shortcomings of the scale, we show possibilities for further improvement.

„Magic methods“, bigger data and AI - Do they endanger quality criteria in online surveys?

Stephanie Gaaw, Cathleen M. Stuetzer, Stephanie Hartmann, Johannes Winter

Technical University Dresden, Germany

Relevance & Research Question: Quality criteria in the field of (online) surveys have existed for quite a long time and are therefore viewed as well established. With the emergence of new methodological approaches such as new sampling procedures, it is questionable, though, whether those criteria are still up to date and how current research achieves a methodologically suitable reconstruction of quality. Therefore, this contribution deals with the current state of the art in meeting quality criteria of online surveys in times of big data, self-learning algorithms and AI.

Methods & Data: On the basis of a narrative literature review, the current state of research will be presented. Findings from both academic and applied research are brought together and translated into recommendations for action. Current (academic) contributions were analysed for challenges and potentials related to quality assurance procedures in the area of online research. In addition, scientific standards and current codes of conduct for the industry were elaborated and examined for their adaptability and scalability for market, opinion and social research.

Results: The results are currently being processed. However, there are still general quality criteria such as objectivity, reliability and validity. The construct „representativeness“, though, is still under discussion, and it is not yet clear whether the gold standard of a representative survey works online without manipulation procedures. Therefore, a new view on quality criteria in online contexts seems essential in order to give orientation for the successful implementation of new methods of online research in the future.

Added Value: The aim is to make a sustainable contribution in the field of quality assessment for both academic and applied online research, especially online surveys. A particular benefit for applied research is to address problems such as survey fatigue and acceptance issues.

Semi-automation of qualitative content analysis based on online research

Annette Hoxtell

HWTK University of Applied Sciences, Germany

Relevance & Research Question:

According to the GRIT report 2019, market researchers consider research and analysis automation a crucial opportunity for their industry. Although qualitative studies are harder to automate than quantitative ones, the automation of qualitative content analysis, a major qualitative evaluation method, is already partially feasible and expected to be further developed.

Main research question: How can qualitative content analysis be (semi-)automated?

Sub-questions: What would an automated qualitative research process as a whole look like? How does it advance online research?

Methods & Data: Semi-automation of qualitative content analysis and of the research process as a whole is conceptualized based on the hermeneutic method, which is applied to a non-automated study carried out by the author using case-study methodology. Automation approaches already in use are identified through a systematic literature review.

Results: Currently, qualitative content analysis and the qualitative research process as a whole can only be semi-automated since they depend on continuous human-machine interaction. Full automation seems feasible with the advance of artificial intelligence. It would be based on online and mobile technologies.

Added Value: This poster highlights a roadmap for the automation of qualitative research comprising qualitative content analysis, an increasingly important topic in qualitative social research, and especially in market research.

Assessing the Reliability and Validity of a Four-Dimensional Measure of Socially Desirable Responding

Rebekka Kluge1, Maximilian Etzel1, Joseph Walter Sakshaug2, Henning Silber1

1GESIS Leibniz Institute for the Social Sciences, Germany; 2Institute for Employment Research (IAB), Germany

Relevance & Research Question: Socially desirable responding (SDR), understood as the tendency of respondents to present themselves in surveys in the best possible light, is often treated as a one- or two-dimensional construct. The two short scales, Egoistic (E-SDR) and Moralistic Socially Desirable Responding (M-SDR), understand SDR as a four-dimensional construct. This understanding represents the most comprehensive conceptualization of SDR. Nevertheless, these short scales have not yet been applied and validated in a general population study. Such an application is important to measure and control for social desirability bias in general population surveys. Therefore, we test the reliability and validity of both short scales empirically to provide a practical measure of the four dimensions of SDR in self-administered surveys.

Methods & Data: The items of the source versions of the E-SDR and M-SDR were translated into German using the team approach. To avoid measuring a response behavior rather than social desirability bias, we balanced negatively and positively formulated items. The scales together comprise 20 items. We integrated these 20 items into a questionnaire within a mixed-mode mail- and web-based survey conducted in the city of Mannheim, Germany (N~1000 participants). The sample was selected via Simple Random Sampling (SRS).

We assess the reliability and validity of E-SDR and M-SDR by using different analytical methods. To test the reliability, we aim to compute Cronbach’s alpha, the test-retest stability for the two short scales, and the item-total correlation. To investigate the validity, we will test the construct validity by confirmatory factor analysis (CFA). For measuring discriminant and convergent validity, we correlate the two short scales with the Big Five traits Extraversion, Agreeableness, Conscientiousness, Emotional Stability, and Openness.
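For readers unfamiliar with the reliability measures named above, the following minimal sketch computes Cronbach's alpha and a corrected item-total correlation from raw item scores. The data layout (one list of scores per item, respondents in the same order) and the example values are assumptions for illustration; the sketch is not the authors' analysis code.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    # Total score per respondent across all items.
    totals = [sum(resp) for resp in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(items, index):
    """Correlation of one item with the sum of all remaining items."""
    rest = [col for i, col in enumerate(items) if i != index]
    rest_totals = [sum(resp) for resp in zip(*rest)]
    return pearson(items[index], rest_totals)


# Hypothetical scores of four respondents on three items
scores = [[1, 2, 3, 4], [2, 2, 3, 5], [1, 3, 3, 4]]
print(round(cronbach_alpha(scores), 3))
print(round(corrected_item_total(scores, 0), 3))
```

Negatively formulated items would have to be reverse-coded before either function is applied.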

Results: The field period will be from November 2019 to December 2019, and the first results will be available in February 2020.

Added Value: Based on our findings, we can evaluate the four-dimensional measurement of SDR with E-SDR and M-SDR short-scales in self-administered population surveys. If the measurement turns out to be reliable and valid, it can be used in future general population surveys to control for SDR.

1:20 - 2:20P 1.2: Poster II

Associations in Probability-Based and Nonprobability Online Panels: Evidence on Bivariate and Multivariate Analyses

Carina Cornesse, Tobias Rettig, Annelies Blom

University of Mannheim, Germany

Relevance & Research Question:

A number of studies have shown that probability-based surveys lead to more accurate univariate estimates than nonprobability surveys. However, some researchers claim that, while they do not produce accurate univariate estimates, nonprobability surveys are “fit for purpose” regarding bivariate and multivariate analyses. In this study, we therefore assess to what extent bivariate and multivariate survey estimates from probability-based and nonprobability online panels lead to accurate conclusions.

Methods & Data:

We answer our research question using data from a large-scale comparison study in which three waves of data collection were commissioned in parallel to two academic probability-based online panels and eight commercial nonprobability online panels in Germany. For each of the online panels, we calculate bivariate associations and multivariate models and compare the results to gold-standard benchmarks, examining whether the strength, direction, and statistical significance of the coefficients accurately reflect the expected outcomes.


Results:

Regarding key substantive political science variables (voter turnout and voting for the main German conservative party (CDU)), we find that the probability-based online panels in our study generally lead to more accurate associations than the nonprobability online panels. Unlike the probability-based online panels, the nonprobability online panels produce a number of significant associations that are contrary to expected outcomes (e.g., that older people are significantly less likely to vote for the main conservative party). Furthermore, while the two probability-based online panels in our study produce similar findings, there is a lot of variability in the results from the nonprobability online panels and none of them consistently outperform the others.

Added Value:

While a number of studies have assessed the accuracy of univariate estimates in probability-based and nonprobability online panels, our study is one of the few that examine bivariate associations and multivariate models. Our preliminary results do not support the claim that nonprobability surveys are fit for the purpose of bivariate and multivariate analyses.

Semiautomatic dictionary-based classification of environment tweets by topic

Michela Cameletti2, Stephan Schlosser1, Daniele Toninelli2, Silvia Fabris2

1University of Göttingen, Germany; 2University of Bergamo, Italy

Relevance & Research Question:

In the era of social media, the abundance of digital data makes it possible to pursue many types of research in a wide range of fields. Such data has several advantages: reduced collection costs, short retrieval times and almost real-time outputs. At the same time, this data is unstructured and unclassified in terms of content. This study aims to develop an efficient way to filter tweets related to a specific topic and to analyze their sentiment.

Methods & Data:

We developed a semiautomatic unsupervised dictionary-based method to filter tweets related to a specific topic (environment, in our study). Starting from the tweets sent by a selection of Official Social Accounts linked with this topic, we identify a list of keywords, bigrams and trigrams in order to set up a topic-oriented dictionary. We test the performance of our method by applying the dictionary to more than 54 million tweets posted in Great Britain between January and May 2019. Since the analyzed tweets are geolocated as a result of the data collection method, we also analyze the spatial variability of the sentiment for this topic across the country's sub-areas.
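As a rough illustration of dictionary-based filtering with keywords, bigrams and trigrams, the following minimal Python sketch checks whether any 1- to 3-gram of a tweet appears in a topic dictionary. The dictionary entries, the example tweets, the tokenization rule and the function names are all hypothetical; the abstract does not describe the authors' actual implementation.

```python
import re

def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def matches_topic(text, dictionary, max_n=3):
    """True if any 1- to max_n-gram of the text is in the dictionary."""
    # Simple lowercase tokenization (an assumption, not the authors' rule)
    tokens = re.findall(r"[a-z0-9#']+", text.lower())
    for n in range(1, max_n + 1):
        if any(gram in dictionary for gram in ngrams(tokens, n)):
            return True
    return False

# Hypothetical dictionary: one keyword, one bigram, one trigram
topic_dictionary = {"recycling", "climate change", "single use plastic"}

# Hypothetical tweets
tweets = [
    "Climate change is the defining issue of our time",
    "Great match last night!",
    "We must ban single use plastic now",
]

# Keep only the tweets that match the topic dictionary
selected = [t for t in tweets if matches_topic(t, topic_dictionary)]
```

In this toy example the first and third tweets are retained. Sentiment scoring and spatial analysis would be separate steps applied to the selected tweets.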


Results:

All the performance indexes considered indicate that our semiautomatic dictionary-based approach is able to filter tweets linked to the topic of interest. Despite the short time window considered, we observe a growing inclination towards the environment in every area of Great Britain. Nevertheless, the spatial analysis found a lack of spatial correlation (probably because the environment is a broad topic that is also strongly affected by local factors).

Added Value:

Our method is able to build (and to periodically update) a dictionary useful for selecting tweets about a specific topic. Starting from this, we classify the selected tweets and apply a spatial sentiment analysis. Focusing on the environment, our method of setting up a dictionary and of selecting tweets by topic led to interesting results. Thus, it could be reused in the future as a starting point for a wide variety of analyses, also on other topics and for other social phenomena.

What is the measurement quality of questions on environmental attitudes and supernatural beliefs in the GESIS Panel?

Hannah Schwarz1, Wiebke Weber1, Isabella Minderop2, Bernd Weiß2

1Pompeu Fabra University (UPF), Spain; 2GESIS Leibniz Institute for the Social Sciences, Germany

Relevance & Research Question:

The measurement quality of survey questions, defined as the product of validity and reliability, indicates how well a latent concept is measured by a question. Measurement quality also needs to be estimated in order to correct for measurement error. Multitrait-Multimethod (MTMM) experiments allow us to do this. Our research aims to determine the measurement quality resulting from variations in formal characteristics such as number of scale points and partial versus full labelling of scale points, for the given questions in web mode.
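The definition above can be made concrete with a small Python sketch. It assumes that reliability and validity are expressed as squared coefficients between 0 and 1, and it shows the standard correction for attenuation of an observed correlation under the simplifying assumption that the two questions share no method variance. The function names and numbers are illustrative only and are not taken from the study.

```python
import math

def measurement_quality(reliability, validity):
    """Quality as the product of reliability and validity (both in [0, 1])."""
    return reliability * validity

def corrected_correlation(observed_r, quality_x, quality_y):
    """Disattenuate an observed correlation using the two qualities.

    Assumes no shared method variance between the two questions.
    """
    return observed_r / math.sqrt(quality_x * quality_y)

# Illustrative values, not results from the GESIS Panel experiments
quality = measurement_quality(reliability=0.8, validity=0.9)
corrected = corrected_correlation(observed_r=0.36, quality_x=0.72, quality_y=0.72)
```

With these illustrative inputs, the quality is about 0.72 and an observed correlation of 0.36 corresponds to a corrected correlation of about 0.5.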

Methods & Data:

We conducted two MTMM experiments on the mixed-mode (majority web) GESIS Panel, one dealing with environmental attitudes and the other with supernatural beliefs. We estimate the quality of three different response scales for each of the experiments by means of structural equation modelling.


Results:

We do not have results yet. Based on evidence from face-to-face surveys, we would expect that, in both cases, a continuous scale with fixed reference points will lead to the highest measurement quality among the three, that a partially labelled 11-point scale will result in the second highest measurement quality and that a fully labelled 7-point scale will yield the lowest measurement quality.

Added Value:

A considerable body of research exists on MTMM experiments in more traditional modes, especially face-to-face. However, only a few MTMM experiments in web mode have been conducted and analyzed so far.

Open Lab: a web application for conducting and sharing online-experiments

Yury Shevchenko1, Felix Henninger2

1University of Konstanz, Germany; 2University of Koblenz-Landau, Germany

Relevance & Research Question:

Online experiments have become a popular way of collecting data in the social sciences. However, the high technical hurdles of setting up a server can prevent researchers from starting an online study. On the other hand, proprietary software restricts the researcher’s freedom to customize or share the code. We present Open Lab, a server-side application that makes online data collection simple and flexible. Open Lab is not dedicated to one particular study, but is a hub where online studies can be easily carried out.

Methods & Data:

Available online, the application offers a fast, secure and transparent way to deploy a study. It takes care of uploading experiment scripts, changing test parameters, managing the participants’ database and aggregating the study results. Open Lab is integrated with the lab.js experiment builder, which enables the creation of new studies from scratch or the use of templates. The lab.js study can be directly uploaded to Open Lab and is ready to run. Integration with the Open Science Framework allows researchers to automatically store the collected data in an OSF project.


Results:

At the conference, we will present the main features of the web application together with results of empirical studies conducted with Open Lab.

Added Value:

Open Lab enables interdisciplinary projects in which behavioral scientists work together and participants not only play the role of passive subjects, but also learn about the science, talk to a researcher or even propose and implement new versions of the task.

Using Nonprobability Web Surveys As Informative Priors in Bayesian Inference

Joseph Sakshaug1,2,3

1Institute for Employment Research, Germany; 2Ludwig Maximilian University of Munich, Germany; 3University of Mannheim, Germany

Relevance & Research Question: Survey data collection costs have risen to a point where many survey researchers and polling companies are abandoning large, expensive probability-based samples in favour of less expensive nonprobability samples. The empirical literature suggests this strategy may be suboptimal for multiple reasons, among them that probability samples tend to outperform nonprobability samples on accuracy when assessed against population benchmarks. However, nonprobability samples are often preferred due to convenience and cost effectiveness.

Methods & Data: Instead of forgoing probability sampling entirely, we propose a method of combining both probability and nonprobability samples in a way that exploits their strengths to overcome their weaknesses within a Bayesian inferential framework. By using simulated data, we evaluate supplementing inferences based on small probability samples with prior distributions derived from nonprobability data. The method is also illustrated with actual probability and nonprobability survey data.

Results: We demonstrate that informative priors based on nonprobability data can lead to reductions in variances and mean-squared errors for linear model coefficients.
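The general idea of an informative prior reducing posterior variance can be sketched with a textbook normal-normal conjugate update for a single coefficient. This is a simplified illustration under stated assumptions (a normal prior centred on a nonprobability-sample estimate, a normal likelihood with known sampling variance from a small probability sample). It is not the model used in the study, and all numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, data_mean, data_var):
    """Conjugate normal-normal update with known variances.

    Returns the posterior mean and variance of a single coefficient.
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / data_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * data_mean)
    return post_mean, post_var

# Hypothetical inputs: the prior comes from a nonprobability sample,
# the data estimate and its sampling variance from a small probability sample
post_mean, post_var = normal_posterior(
    prior_mean=0.5, prior_var=0.04,
    data_mean=0.3, data_var=0.01,
)
```

Here the posterior variance (0.008) is smaller than the sampling variance of the probability-sample estimate alone (0.01), while the posterior mean (0.34) is pulled towards the prior. Whether the mean-squared error also improves depends on how biased the prior is.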

Added Value: A summary of these findings, their implications for survey practice, and possible research extensions will be provided in conclusion.

1:20 - 2:20P 1.3: Poster III

A Theoretical Model for Trust in Digital Information Systems

Meinald T. Thielsch, Sarah M. Meeßen, Guido Hertel

University of Münster, Germany

Relevance & Research Question: Trust has been an important topic in technology research and adoption of web-based services. Users’ trust is essential not only for acceptance of new systems but also for continued adoption. As a first step of a systematic approach to design trusted information systems and enable trustful interaction between users and information systems, we develop a new comprehensive model of trust in information systems (IS) based on existing literature. This model includes both precursors (e.g., perceived trustworthiness) and consequences of trust in IS (e.g., actual use, forgetting of information stored in the IS for cognitive relief).

Methods & Data: Based on extant literature, we differentiate experienced trust in IS from perceived trustworthiness of an IS, intentions to use an IS, and actual usage of an IS. Moreover, existing experiences with an IS as well as perceived risk and more general contextual factors are considered as moderating factors. Finally, the resulting model is reflected and refined in light of empirical findings on precursors and outcomes of trust in IS.

Results: Our new comprehensive model not only structures the existing research, but also provides a number of concrete propositions. Perceived IS trustworthiness should affect trust particularly when users already have experience with the IS. Experienced trust in IS should strengthen users’ intention to adopt an IS, but this effect should be counteracted by perceived contextual risks. Trust should be influenced by the users’ personality, e.g., their general disposition to trust. The translation of users’ intentions to use an IS into actual behavior should be affected by enabling (e.g., control) and inhibiting (e.g., existing work routines) contextual factors. Finally, the model contains feedback loops describing how actual use of an IS influences the perceived trustworthiness of an IS and, subsequently, the users’ experienced trust. Evidence from two experimental studies supports the validity of our model.

Added Value: We provide a new theoretical model that outlines a structured process chain of trust in IS. In addition to the theoretical contribution, this systematic approach provides various practical implications for the design of IS as well as for the implementation and maintenance process.

Factors Influencing the Perception of Relevant Competencies in the Digitalized Working World

Swetlana Franken, Malte Wattenberg

Bielefeld University of Applied Sciences, Germany

Relevance & Research Question: Many companies see the lack of skilled workers as a central obstacle to the digital transformation. It is well-known that diverse workforces lead to more balanced decisions and more innovation. Nevertheless, women, for example, are still underrepresented in STEM-professions. The following research question arises: Are there any differences in the perception of relevant competencies for the digitalized working world according to gender, age, employment status and migration background?

Methods & Data: Following preliminary literature research and qualitative expert interviews [n=6], a quantitative study was conducted from November to December 2018. Participants [n=515] were recruited among students and companies using faculty email lists, paper forms and social media. Participants were asked to assess a total of 14 competencies, knowledge resources and behaviours in their relevance for the digitalized working world on a 6-point scale. Associations were assessed using Pearson's chi-square test and Cramér’s V. Means were compared using t-tests and Levene's test.

Results: Respondents consider openness to change (5.50), IT and media skills (5.40) and learning ability (5.36) to be the most relevant. Analytical skills (4.79) and empirical knowledge (4.56) are less in demand.

Men rate innovation competence (χ²=10.895, p=.028, V=0.146), decision-making ability (χ²=13.801, p=.017, V=0.164) and ability to think in context (χ²=14.228, p=.014, V=0.167) slightly higher than women. No correlation can be found regarding respondents’ migration background. Among company representatives, eight competencies are rated significantly higher than by students, especially communicative competence (+0.91) and interdisciplinary thinking and acting (+0.74). Moreover, it is noticeable that older participants (generation X, born 1964-1979) consider all competencies to be more important than younger ones (generation Z, 1996-2009), apart from IT and media competence. The items openness to change (T-Test p=.004, Levene p=.004), self-organisation (T-Test p<.001, Levene p=.020) and problem-solving competence (T-Test p=.011, Levene p=.019) show significant correlation between age and assessment.
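For readers unfamiliar with the effect size reported above, the following Python sketch computes Pearson's chi-square and Cramér's V from a contingency table, using V = sqrt(χ² / (n · (k − 1))), where k is the smaller of the number of rows and columns. The table is hypothetical and no continuity correction is applied; the study's raw counts are not given in the abstract.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and total n for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V: chi-square scaled by n and the smaller table dimension."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 2x2 table (rows: two groups, columns: low vs. high rating)
example = [[10, 20], [20, 10]]
chi2_value, n_total = chi_square(example)
v_value = cramers_v(example)
```

If gender is treated as a two-category variable, k − 1 = 1 and V reduces to sqrt(χ²/n). Taking χ² = 10.895 and, as an assumption, the full sample of n = 515 gives roughly 0.145, close to the reported 0.146; the item-level n is not stated in the abstract.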

Added Value: First, the results reveal a ranking of the competencies needed for the digital transition, which companies and educational institutions should address. Second, differences between the employee groups were found that have to be considered going forward, be it in education or research.

How to regionalize survey data with microgeographic data

Barbara Wawrzyniak, Julia Kroth

Infas360 GmbH, Germany

Relevance & Research Question: Using an online access panel of 10,000 households to define target groups precisely and to predict them area-wide and microgeographically for all regions and addresses.

Methods & Data: The basis for the estimation is an online survey with 10,000 participants for whom we have postal addresses. The addresses are crucial for matching the microgeographic data from infas 360. The matching process is called geocoding. Geocoding validates and locates addresses and integrates the individual geo key to which the microgeographical data are also attached. This database contains nearly 400 microgeographic variables for each of the 22 million unique addresses, e.g. building typology, building use, number of private households, average monthly net income, number of foreigners per block, unemployment rate per city district. By combining the survey data and the microgeographic data, we first localize, describe and analyze the responding households geographically, socio-demographically and according to other relevant characteristics. In a second step, we estimate the target group for the 40 million households in Germany using a multi-level model. We demonstrate the procedure by means of an example, namely the prediction of the number of dogs per city district.

Results: Among other things, the analysis in this example shows that dog owners live in houses with large gardens, mostly located at the edge of town. They usually live in regions with high purchasing power and a low migration share. This information is used to predict the number of dogs for all addresses with private households. In addition to reporting the quality criteria for the estimation model, we compare the results with official data from the Statistical Office Berlin at the city-district level. The comparison shows that the estimated data reflect the official data very well.

Added Value: Linking the online survey data with micro-geographical data from infas 360 enables a comprehensive description and segmentation of target groups as well as their address-specific transfer to the overall market as a prediction with astonishing precision. The use of an online access panel is cost efficient, flexible in time – and it is regionalizable.

"Like me": The impact of following prime ministerial candidates on social networks on perceived public agendas

Dana Weimann Saks, Vered Elishar Malka, Yaron Ariel, Ruth Avidar

Academic College of Emek Yezreel, Israel

Relevance & Research Question: Agenda-setting research has been performed routinely for more than four decades, in both traditional and online media. Recent years have seen a growing political use of social media messaging, especially during election campaigns. The current study analyzes the effects of prime ministerial candidates' online messages on their followers' perceived agenda, as a function of the following patterns and voting intentions that these followers reported.

Methods & Data: To answer these questions, a representative sample of Israeli voters (n=1600) answered a detailed questionnaire. Questions covered voting intentions, patterns of following prime ministerial candidates' accounts on social networking sites, and the followers' perceived agenda.

Results: 43% reported that they follow a social network account of at least one prime ministerial candidate. Of these, 88% follow candidates through Facebook, 19% through Twitter and 18% through Instagram. The prominence of issues on the agenda differed significantly between those who reported following candidates and those who did not. Among the voters of the ruling party, 80% followed the party candidate exclusively, and 18% followed the party candidate in addition to the candidate of the leading opposition party. Among the leading opposition party's voters, 48% followed the opposition candidate exclusively, and 47% also followed the ruling party candidate. The prominence of the issues on the agenda differed significantly between the exclusive followers of each of the two candidates.

Added Value: The present study indicates that social networking sites have a substantial impact on followers' perceived agenda during election campaigns. As social networks become more prominent than ever, it is essential to deepen our understanding of their unique role in the political arena.

Social-media based research: the influence of motivation and satisficing on empirical results

Daniela Wetzelhütter1, Dimitri Prandner2, Sebastian Martin1

1University of Applied Sciences Upper Austria, Austria; 2Johannes Kepler University Linz, Austria

Relevance & Research Question: Business decisions frequently rely on data collected through self-administered web surveys. The arguments derived from such datasets are based on non-probability convenience samples of self-selected social media users who complete online questionnaires. This kind of research focuses on informative samples, and the problems caused by the lack of means to test representativeness are pushed into the background. However, projects rooted in such approaches have the potential to further increase undesirable respondent behaviors, including but not limited to speeding and straightlining, not least because social media usage, like survey participation, depends on the user's motivation. On this basis, the present paper addresses two research questions: i) Whether and to what extent does motivation (in connection with satisficing) influence the results of social-media surveys? ii) Are different effects detectable depending on how the field was accessed?

Methods & Data: Analyses are based on three different data sets from social-media surveys. The first consists of 104 users of online forums for “gamers”; the second of 234 Facebook users who followed an invitation to evaluate the Facebook dialog with public utility companies; and the third dealt with politics and was based on a quota sample drawn from the Facebook network of students. To measure motivation and satisficing, individual indicators of motivation were included in the questionnaire, while satisficing strategies were measured by paradata (e.g. item nonresponse, speeding). Correlation analyses and linear regressions were conducted.

Results: Indicators of motivation and of satisficing are only partly correlated. Unmotivated gamers tend to click through the survey (speeding), whereas Facebook users who are interested in energy or politics but unmotivated tend to produce item nonresponse. The effect of motivation and satisficing on substantive results differs as well. Non-differentiation matters more for the gamers' results, while indicators of motivation have to be considered when the results of “unmotivated” Facebook users are interpreted.

Added Value: The presentation raises awareness of the relevance of motivation, and subsequently satisficing, among social-media survey participants. Collecting such information is important in order to interpret the results of such informative samples appropriately.

1:20 - 2:20P 1.4: Poster IV

Data quality in ambulatory assessment studies: Investigating the role of participant burden and presentation form

Charlotte Ottenstein

University of Koblenz-Landau, Germany

Relevance & Research Question: Parallel to the technical development of mobile devices and smartphones, the interest in conducting ambulatory assessment studies has rapidly grown during the last decades. Participants of those studies are usually asked to repeatedly fill out short questionnaires. Besides numerous advantages, such as a reduced recall bias and a high ecological validity, the participant burden is higher than in cross-sectional studies or classical longitudinal studies. In our study, we experimentally manipulated participant burden (questionnaire length low vs. high) to investigate whether higher participant burden leads to lower data quality (e.g., as indicated by the compliance rate and careless responding indices). Moreover, we aimed to analyze effects of participant burden on the association between state extraversion and pleasant mood. We provided the questionnaires on two different platforms (questionnaire app vs. online questionnaire with link via e-mail) to investigate differences in the usability of those two presentation forms.

Methods & Data: Data were collected via online questionnaires and a smartphone application. After an initial online questionnaire (socio-demographic measures), participants were randomly assigned to one of four experimental groups (short vs. long questionnaires x app vs. online questionnaire). The ambulatory assessment phase lasted three weeks (one prompt per day in the evening). Participants rated the extent of situational characteristics, momentary mood, state personality, daily life satisfaction, depression, anxiety, stress, and subjective burden due to study participation. At the end of the study, participants filled out a short online questionnaire about their overall impression and technical issues. We computed the required sample size for mean differences (two-way ANOVA). 245 participants were needed to detect a small to medium effect (Φ = 0.18, power > 80%).

Results: Data collection is still running. The results of the preregistered hypotheses will be presented at the conference.

Added Value: This study can add insights into the effects of participant burden on data quality in ambulatory assessment studies. The results might serve as a basis for recommendations about the design of those studies. It would be desirable to conduct future ambulatory assessment studies in a way that participants are not overly stressed by their study participation.

Guess what I am doing? Identifying Physical Activities from Accelerometer data by Machine Learning and Deep Learning

Joris Mulder, Natalia Kieruj, Pradeep Kumar, Seyit Hocuk

CentERdata - Tilburg University, The Netherlands

Relevance & Research Question:

Accelerometers or actigraphs have long been a costly investment for measuring physical activity, but nowadays they have become much more affordable. Currently, they are used in many research projects, providing highly detailed, objectively measured sensory data. Where self-report data might miss everyday life active behaviors (e.g. walking to the shop, climbing stairs) accelerometer data provides a more complete picture of physical activity. The main objective of this research is identifying specific activity patterns using machine learning techniques and the secondary objective is improving the accuracy of identifying the specific activity patterns by validating activities through time-use data and survey data.

Methods & Data:

Activity data was collected through a large-scale accelerometer study in the probability-based Dutch LISS panel, consisting of 5,000 households. 1200 respondents participated in the study and wore a GeneActiv device for 8 days and nights, measuring physical activity 24/7. In addition, a diverse group of 20 people labeled specific activity patterns by wearing the device and performing the activities. These labeled data were used to train supervised machine-learning models (i.e. support vector machine, random forest) detecting specific activity patterns. A deep learning model was trained to enhance the detection of the activities. Moreover, 450 respondents from the accelerometer study also participated in a time-use study in the LISS panel. Respondents recorded their daily activities for two days (a weekday and a weekend day) on a smartphone, using a time-use app. The labeled activities were used to validate the predicted activities.


Results:

Activity patterns of specific activities (i.e. sleeping, sitting, walking, cycling, jogging, tooth brushing) were successfully identified using machine learning. The deep learning model increased predictive power to better distinguish between specific activities. The time-use data proved to be useful to further validate certain hard to identify activities (i.e. cycling).

Added Value:

We show how machine learning and deep learning can identify specific activity types from an accelerometer signal and how to validate activities with time-use data. Gaining insight into physical activity behavior can, for instance, be useful for health and activity research.

Embedding Citizen Surveys in Effective Local Participation Strategies

Fabian Lauterbach, Marc Schaefer

wer denkt was GmbH, Germany

Relevance & Research Question: Citizen surveys as an initiating and innovative form of participation are becoming increasingly popular. They are advocated as a cost-effective and purposeful method of enhancing the public basis of the policy-making process, thus representing an appealing first step towards participation for local governments. With the rapid advance and increasing acceptance of the Internet it is now possible to reach a sufficiently large number of people from various population groups. However, in order to exploit these advantages to their full potential, it is important to gain insights into how to maximise the perceived impact and success for citizens. Ergo, how should local municipalities design and follow up on citizen surveys?

Methods & Data: We want to present key insights into citizen surveys as a participatory driving force based on more than twenty citizen surveys of various sizes and on various topics with over 12,000 participants in numerous municipalities (e.g. Alsfeld, Friedrichshafen, Konstanz, Marburg). More precisely, our main focus lies on the effective communication with the target population at the beginning of the process and the subsequent processing, visualisation and presentation of survey results.

Results: Citizen surveys can be used as an initiating process of enhancing political mobilisation and participation in the context of broader political processes, provided that rules and conditions are communicated early and clearly. Consulting citizens first but then deciding otherwise is the worst imaginable approach, and yet this still occurs regularly in practice. Key factors which contribute to the success of a survey include an objective evaluation, a thorough analysis and the usage of its results as future guidelines for policy-making.

Added Value: While citizen surveys are particularly well suited for initiating participation, it often remains unclear how citizens perceive the impact of their participation and the overall success of the survey. Although there has been extensive research and debate about the specific design, the issues of preparing citizens and following up with them in order to promote responsiveness and efficiency have, up until now, been widely neglected. Accordingly, we seek to advance knowledge on these essential, yet scarcely studied, stages of implementation.

Cognitive load in multi device web surveys - Disentangling the mobile device effect

Ellen Laupper, Lars Balzer

Swiss Federal Institute for Vocational Education and Training SFIVET, Switzerland

Relevance & Research Question: Increased survey completion time for mobile respondents completing web surveys is one of the most persistent findings. Furthermore, it is often used as a direct measure of cognitive load. However, as the measurement and interpretation of completion time faces various challenges, in our study we examined which possible sources of device differences add to cognitive load, operationalized as completion time (objective indicator) as well as perceived cognitive load (subjective indicator). Furthermore, we wanted to examine whether cognitive load was functioning as a mediator between these sources and several data quality indices as proposed in the "Model of the impact of the mode of data collection on the data collected" by Tourangeau and colleagues.

Methods & Data: An extra questionnaire was added to our institution's mobile-optimized, routinely used online course evaluation questionnaire. Key variables like distraction, multitasking, presence of others, attitude toward course evaluation in general as well as mobile device use were assessed. Additionally, paradata like device type and completion time were collected.

The sample consisted of participants of 107 mostly one-day continuing training courses for VET/PET professionals from the Italian-speaking part of Switzerland (N=1795).

Results: Consistent with previous research, we found a self-selection bias for mobile device use and more reported distractions in the mobile completion situation, as well as longer completion times and a higher perceived cognitive load. Several data quality indices like breakoff rate and item nonresponse were higher too, whereas straightlining was less frequent. In addition, we found that the key variables in our study predicted the objective and subjective indicator of cognitive load differently and to a varying degree.

Added Value: The presented study suggests that cognitive load is a multifaceted construct. Its findings add to the existing limited knowledge on the question which survey factors are related to which aspect of cognitive load and how these in turn are related to different data quality indices.

Assessing Panel Conditioning in the GESIS Panel: Comparing Novice and Experienced Respondents

Fabienne Kraemer1, Joanna Koßmann2, Michael Bosnjak2, Henning Silber1, Bella Struminskaya3, Bernd Weiß1

1GESIS Leibniz Institute for the Social Sciences, Germany; 2ZPID - Leibniz-Institute for Psychology Information, Germany; 3Utrecht University, The Netherlands

Relevance and Research Question:

Longitudinal surveys allow researchers to study stability and change over time and to make statements about causal relationships. However, panel studies also have methodological drawbacks, such as the threat of panel conditioning effects (PCE), which are defined as artificial changes over time due to repeated survey participation. Accordingly, researchers cannot differentiate “real” change in respondents’ attitudes, knowledge, and behavior from change that occurred solely as a result of prior survey participation, which may undermine the results of their analyses. Therefore, a closer analysis of the existence and magnitude of PCE is crucial.

Methods and Data:

In the present research, we will investigate the existence and magnitude of PCE within the GESIS Panel - a probability-based mixed-mode access panel, administered bimonthly to a random sample of the German-speaking population aged 18+ years. To account for panel attrition, a refreshment sample was drawn in 2016. Due to the incorporation of the refreshment sample, it is possible to conduct between-subject comparisons for the different cohorts of the panel in order to identify PCE. We expect differences between the cohorts regarding response latencies, the extent of straightlining, the prevalence of don’t know-options, and the extent of socially desirable responding. Specifically, we expect that more experienced respondents show shorter response latencies due to previous reflection and familiarity with the answering process. Secondly, experienced respondents are expected to show more satisficing (straightlining, speeding, prevalence of don’t know-options). Finally, becoming familiar with the survey process might decrease the likelihood of socially desirable responding of experienced respondents.


Results:

Since this research is a work in progress and is related to a DFG-funded project that started only in December last year, we do not have results yet but will present first results at the GOR conference in March.

Added value:

PCE can negatively affect the validity of widely used longitudinal surveys and thus undermine the results of a multitude of analyses that are based on the respective panel data. Therefore, our findings will make a further contribution to the investigation of the effects of PCE on data quality and may encourage similar analyses with similar data sets in other countries.

1:20 - 2:20P 1.5: Poster V

Indirect questioning techniques: An effective means to increase the validity of online surveys

Adrian Hoffmann, Julia Meisters, Jochen Musch

University of Duesseldorf, Germany

Relevance & Research Question: The validity of surveys on sensitive issues is threatened by the influence of social desirability bias. Even in anonymous online surveys, some respondents try to make a good impression by responding in line with social norms rather than truthfully. This results in an underestimation of the prevalence of socially undesirable attitudes and behaviors. Indirect questioning techniques such as the Crosswise Model claim to control the influence of social desirability and to thereby increase the proportion of honest answers. The lower efficiency of indirect questioning techniques requires the use of larger samples that are more easily obtained online. We empirically investigated whether indirect questioning techniques indeed lead to more valid results.

Methods & Data: In a series of experiments we surveyed several thousand participants about different sensitive attitudes and behaviors. We randomly assigned the respondents to either a conventional direct questioning condition or an indirect questioning condition using the Crosswise Model. Prevalence estimates from the different conditions were compared using the “more is better” criterion. According to this criterion, higher estimates for socially undesirable attributes are potentially less distorted by the influence of social desirability and thus more valid.
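For readers unfamiliar with the Crosswise Model, a minimal sketch of how a prevalence estimate can be obtained is given here. Respondents only report whether their answers to the sensitive question and to a non-sensitive question with known prevalence p are the same or different. The code uses the standard moment estimator with a simple plug-in standard error; the function name and the example numbers are hypothetical and do not come from the studies reported in this abstract.

```python
import math


def crosswise_estimate(n_same, n_total, p):
    """Estimate the prevalence of a sensitive attribute under the Crosswise Model.

    n_same:  number of respondents choosing "both answers are the same"
    n_total: number of respondents in the Crosswise Model condition
    p:       known prevalence of the non-sensitive attribute (must not be 0.5)
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the prevalence unidentifiable")
    lam = n_same / n_total  # observed share of "same" answers
    # P(same) = pi * p + (1 - pi) * (1 - p), solved for pi:
    pi_hat = (lam + p - 1) / (2 * p - 1)
    # Plug-in standard error of the estimate.
    se = math.sqrt(lam * (1 - lam) / n_total) / abs(2 * p - 1)
    return pi_hat, se


# Hypothetical example: 700 of 1000 respondents answer "same", p = 0.25.
estimate, standard_error = crosswise_estimate(700, 1000, 0.25)
print(round(estimate, 3), round(standard_error, 3))
```

The division by |2p - 1| in the standard error shows why the technique is less efficient than direct questioning and therefore needs larger samples.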

Results: We found that higher and thus potentially more valid prevalence estimates could be obtained in the Crosswise Model conditions compared to the conventional direct questioning conditions. This finding shows that indirect questioning techniques were indeed capable of controlling the influence of social desirability and could thus increase the validity of the prevalence estimates obtained.

Added Value: Untruthful answers to questions on sensitive topics pose a serious threat to the validity of survey results. Our studies show that even in anonymous online surveys, the proportion of honest answers can be further increased through the use of indirect questioning techniques such as the Crosswise Model. Against this background, indirect questioning techniques appear as an effective means to overcome the harmful influence of social desirability bias on the results of online surveys.

Gender Differences Regarding the Perception of Artificial Intelligence

Swetlana Franken, Nina Mauritz, Malte Wattenberg

Bielefeld University of Applied Sciences, Germany

Relevance & Research Question: Technical progress through digitalisation is constantly increasing. Currently, the most relevant and technically sophisticated technology is artificial intelligence (AI). Women are less frequently involved in research and development on AI, clearly in the minority in STEM-professions and study programmes, and less frequently in management positions. Previous AI applications have often been based on data that under-represents women and thus map our society with existing disadvantages and injustices.

So do men and women have different ideas about the role and significance of AI in the future? Do women have different requirements or wishes for AI?

Methods & Data: Following the previously conducted in-depth literature research, a combination of qualitative interview study [n=6] and quantitative online survey [n=200] is planned. The target group will consist of company representatives and students whereby the evaluation of differences and correlations will be based in particular on gender.

Results: A literature review of existing studies reveals that while more people are in favour of AI development than against it, supporters are mainly men with a high level of education and income. By their own assessment, women have a lower understanding of AI than men. Moreover, AI research and development is predominantly in the hands of men. Just under 25% of those employed in the AI sector are women; in Germany, only 16%. Old stereotypes are thus not only the basis for decisions regarding the development of AI but are also incorporated into the data basis for AI: Voice and speech recognition systems are less reliable for female voices, as is face recognition for female faces. Search engines more often present male-connoted image and text results for gender-neutral search terms. The expected results of the questionnaire will be gender-relevant aspects in the perception, evaluation, development and use of AI.

Added Value: The identification of gender-relevant differences in the perception and attitude towards AI will enable developers and researchers to be sensitised to the possible risks of AI applications in terms of prejudice and discrimination. In addition, opportunities for using AI to strengthen gender equality will be recognized.

“Weather and timing is to blame - additional influences towards data quality in social media research”

Daniela Wetzelhütter1, Sebastian Martin1, Birgit Grüb2

1University of Applied Sciences Upper Austria, Austria; 2Johannes Kepler University Linz, Austria

Relevance & Research Question: Weather is an inescapable environmental factor in human life. It significantly affects human behavior (e.g. daily activities), mood (e.g. helpfulness), well-being (ability to perform cognitive tasks) and communication (e.g. happiness of tweets). Human behavior, including communication, is also related to timing as another influencing factor, since days are determined by work-life routines (spending time at work), day-of-the-week (weekday vs. weekend) and holidays. Emotional status and mood are also affected by the time-of-the-day and day-of-the-week (e.g. blue Monday). Consequently, it cannot be ruled out that weather and timing influence online communication. Therefore, this research aims to show how the nature of social media communication data differs depending on the time of data collection and the weather conditions prevailing at that time.

Methods & Data: We analyse 321 postings published on a public utility’s official Facebook account between August 2016 and February 2018. The influence of weather and timing on the posting content (company-relation), the posting visualization, the posting length and subsequently on the stakeholders’ reactions (number), comments (yes/no) and shares (yes/no) is examined by means of multivariate analyses.

Results: Temperature and barometric pressure quite consistently influence the postings released by the company, while precipitation and humidity are more decisive for users’ replies to a post. Timing, on the other hand, shows rather unstable influences. Nevertheless, both factors influence virtual communication and data quality in social media research.

Added Value: The presentation raises awareness of the relevance of weather and timing: when investigating behavior in social media, researchers need to be aware of the effect of both.

Are you willing to donate? Relationship between perceived website design and willingness to donate

Louisa Küchler, Guido Hertel, Meinald Thielsch

Westfälische Wilhelms-Universität Münster, Germany

Relevance & Research Question:

Online fundraising is becoming increasingly important for non-profit organisations, but the factors that convince people to make a donation online have not yet been fully investigated.

Methods & Data:

In the present work, data from two studies (total N = 2525) were used to examine factors of online donation. Participants completed an online survey in which they rated the design of specific websites and indicated whether they would make a donation to the respective website.

An effect of website design (content design, usability, and aesthetics) on willingness to donate was postulated. Furthermore, research questions about demographic aspects such as age and gender as well as trust in the organization were posed. For statistical analysis, logistic regressions were performed.


Results:

The results showed that the predictors of donation differed between donation scenarios.

For the donation of one's own money, the perceived content (odds ratio, OR = 1.99) and trust in the organization (OR = 1.95) showed the strongest associations. Usability, in contrast, showed a negative association with a clearly smaller effect size (OR = 0.76). In this model, the perceived aesthetics of the website showed no significant association with the dependent variable.

When donating other people's money, the aesthetics of the website was the most important factor for the willingness to donate (OR = 1.35). The logistic regression showed that the majority of predictors had no significant relationship to the dependent variable.
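To help interpret the odds ratios reported above: in a logistic regression, an odds ratio is the exponential of the regression coefficient, and a value below 1 indicates lower odds of donating as the predictor increases. The sketch below converts odds ratios to coefficients and shows how a one-unit increase in a predictor would change a probability. The baseline probability of 0.20 and the helper names are hypothetical assumptions for illustration only; the abstract does not report baseline rates or the scale of the predictors.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient (log-odds) that corresponds to an odds ratio."""
    return math.log(odds_ratio)


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by the odds ratio."""
    odds = baseline_prob / (1 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


BASELINE = 0.20  # hypothetical baseline probability of being willing to donate

# Odds ratios as reported in the abstract for donating one's own money.
for label, reported_or in [("perceived content", 1.99),
                           ("trust in the organization", 1.95),
                           ("usability", 0.76)]:
    beta = coefficient_from_odds_ratio(reported_or)
    prob = apply_odds_ratio(BASELINE, reported_or)
    print(f"{label}: coefficient = {beta:.2f}, probability = {prob:.2f}")
```

Because the change in probability depends on the baseline, the same odds ratio corresponds to different absolute effects at different baseline rates.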

The demographic variables showed different correlations between the studies.

Added Value:

The relevance of the design of the website, but also of trust in the organization, was shown, from which the first implications for online fundraising can be derived. The effects seem to vary for different scenarios of online donation. The differences between the two scenarios can be explained by the increased relevance of the decision, which results from donating one's own money. Therefore, more factors are included and reflected in such a decision. Regarding this, further research is necessary, examining influences of other variables and establishing implications for successful digital donation generation in the healthcare sector.

2:20 - 2:30Break
2:30 - 3:20Keynote 1

Market Research Blends with AI and Analytics – “Market Research Digital Transformation”

Patricio Pagani

The Black Puma Ai, Argentine Republic

Establishing a digital connection with our customers on a superficial level is not hard. Chatbots have been around for years, yet that is not really a deep connection. Consumers in most industries expect significantly higher levels of personalisation of our products and services. But mass-personalisation requires companies to bring together and analyse huge volumes of data that the Market Research Industry is not used to analysing. So what’s going to be our role in this new world?

Client Business Intelligence (BI) and Advanced Analytics departments all around the world are trying to align and harmonise badly-structured data. And they are not calling traditional market research firms to do it. Are they using AI? Not yet, but will they?

What’s the state of the art of AI in other industries and how is that relevant to the world of MR? I invite you to brainstorm together.

Patricio Pagani is the founder of The Black Puma Ai, a company dedicated to blending the power of human and digital brains to augment organizational intelligence. Patricio is also a Digital Transformation catalyst that is helping large corporations embrace what the future holds for them.
An Angel Investor in technology startups (mobility, IoT), Patricio has a portfolio of companies he's advising. He is also a board director at Infotools, a leading provider of market research software tools & services.
A sought-after keynote speaker at various marketing forums, Patricio can be found discussing what the future may hold for the Business Intelligence and Market Research industry. Patricio served as president of the New Zealand Market Research Society for several years and is currently the ESOMAR representative for Argentina.

3:20 - 3:30Break
3:30 - 4:30A 3.1: New Technologies in Surveys
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

Effects of the Self-View Window during Videomediated Survey Interviews: An Eye-tracking Study

Shelley Feuer1, Michael F. Schober2

1U.S. Census Bureau, United States of America; 2The New School for Social Research, United States of America

Relevance & Research Question: In videomediated (Skype) survey interviews, how will the small self-view window affect people's disclosure of sensitive information and self-reported feelings of comfort during the interviews? This study replicates and expands on previous research by (a) tracking where video survey respondents look on the screen—at the interviewer, at the self-view, or elsewhere—while answering questions and (b) examining how gaze location and duration differ for sensitive vs. nonsensitive questions and for more and less socially desirable answers.

Methods & Data: In a laboratory experiment, 133 respondents answered sensitive questions (e.g. sexual behaviors) and nonsensitive questions (e.g. reading novels) taken from large scale US government and social scientific surveys over Skype, either with or without a self-view window. Respondents were randomly assigned to having a self-view or not, and interviewers were unaware of the self-view manipulation. Measures of gaze were recorded using an unobtrusive eye-tracking system.

Results: The results show that respondents who could see themselves looked more at the interviewer during question-answer sequences about sensitive (compared to nonsensitive) questions, while respondents without a self-view window did not. Respondents who looked more at the self-view window reported feeling less self-conscious and less worried about how they presented to the interviewer during the interview. Additionally, the self-view window increased disclosure for a subset of sensitive questions, specifically, total number of sex partners and frequency of alcohol use. Respondents who could see themselves reported perceiving the interviewer as more empathic, and reported having thought more about what they said (arguably reflecting increased self-awareness). For all respondents, gaze aversion—looking away from the screen entirely—was linked to sensitive (or socially undesirable) responses and self-presentation concerns.

Added Value: Together, the findings demonstrate that gaze patterns in videomediated interviews can be informative about respondents’ experience and their response processes. The promise is that findings like these can contribute to the design of new, potentially cost-saving video-based data collection interfaces. This study also provides necessary groundwork for continued investigation not only of mode effects on disclosure in surveys (as one measure of response accuracy) but also on interactive discourse more generally.

Measuring expenditure with a mobile app: How do nonprobability and probability panels compare?

Carina Cornesse1, Annette Jäckle2, Alexander Wenz1,2, Mick Couper3

1University of Mannheim, Germany; 2University of Essex, United Kingdom; 3University of Michigan, United States of America

Relevance & Research Question: So far, a number of studies have examined nonprobability and probability-based panels, but mostly only with regard to survey sample accuracy. In this presentation, we compare nonprobability and probability-based panels on a new dimension: we examine what happens when panel members are asked to use a mobile app to record their spending. We answer the following research questions: Do different types of people participate in the app study? Are there differences in how participants use the app? Do differences between samples matter for key outcomes? And do differences between samples remain after weighting?

Methods & Data: To answer our research questions, we use data from Spending Study 2, which is an app study that was implemented in May to December 2018 in two different panels in Great Britain: Understanding Society Innovation Panel, which is a probability-based panel, and Lightspeed UK, which is a nonprobability online access panel. In both panels, participants were asked to download a mobile app and use it for one month to report their spending. In our presentation, we compare the app data collected from the participants of the two panels.

Results: Our analyses show that different people participate in the app study implemented in the nonprobability and probability-based panel, both in terms of socio-demographic characteristics and with regard to digital affinity and financial behavior. Furthermore, the app study leads to different conclusions in terms of key substantive outcomes, such as the total amount and type of spending. Moreover, differences between the app study samples on substantive variables remain after weighting for socio-demographic characteristics. Only the way in which the app study participants use the app does not seem to differ between the panels.

Added Value: Our study contributes to the ongoing discussion on nonprobability and probability-based panels by adding new empirical evidence. Moreover, our study is the first to examine app study data rather than survey data. Furthermore, it covers a wide range of data quality aspects, including sample accuracy, respondent participation behavior, and weighting procedures. We thereby contribute to widening the debate to non-survey data and multi-dimensional data quality assessments.

Are respondents on the move when filling out a mobile web survey? Evidence from an app- and browser-based survey of the general population

Jessica Herzing1, Caroline Roberts1, Daniel Gatica-Perez2

1Université de Lausanne, Switzerland; 2EPFL and Idiap, Switzerland

Relevance & Research Question: Mobile devices are designed to be used while people are on the move. In the context of a mobile web survey, researchers should consider the potential consequences of respondent mobility for data quality. Being exposed to sources of distraction could result in suboptimal answers and an increased risk of breakoff. This study investigates whether there are between-device differences (web and mobile web browser vs. smartphone app) in terms of the context in which questionnaires are completed. We consider: 1) day, time and location of survey participation; 2) whether participants’ location changes during completion; and 3) whether differences in completion context are related to breakoffs and item nonresponse.

Methods & Data: We use data from an experiment embedded in a three-wave probability, general population survey conducted in Switzerland in 2019 (N=2,000). Half the sample was assigned to an app-based survey; the other half to a browser-based survey, encouraging mobile web completion. We use a combination of questionnaire data (on current location), paradata (timestamps and location indicators), and respondents’ photos of their surroundings taken at the beginning and end of the survey to gain insight into completion conditions.

Results: Our results suggest a minority of respondents was ‘on the move’ while filling out the survey questionnaires. Mobile web browser users were more likely to answer in the evening, while PC browser users responded in the late afternoon. Photographs indicate that app users tended to complete the survey at home, although the app was designed to be used on the move (using a modular design with questionnaire chunks which took less than three minutes). Furthermore, app users were unwilling to move outside to complete a different photo task.

Added value: The findings inform the design of mobile web surveys, providing insights into ways to optimise data collection protocols (e.g. by tailoring the timing of survey requests in a mixed device panel design), and to improve the onboarding procedure for smartphone app respondents. The provision of unique log-in credentials may have inhibited participant mobility and the possibility to take advantage of this key feature of mobile internet technology.

3:30 - 4:30A 3.2: Scales and Questions
Session Chair: Florian Keusch, University of Mannheim, Germany

Measuring income (in)equality: comparing questions with unipolar and bipolar scales in a probability-based online panel

Jan Karem Höhne1,2, Dagmar Krebs3, Steffen Kühnel4

1University of Mannheim, Germany; 2RECSM-Universitat Pompeu Fabra, Spain; 3University of Gießen, Germany; 4University of Göttingen, Germany

Relevance & Research Question: In social science research, questions with unipolar and bipolar scales are commonly used methods in measuring respondents’ attitudes and opinions. Compared to other rating scale characteristics, such as scale direction and length, scale polarity (unipolar and bipolar) and its effects on response behavior have been rarely addressed in previous research. To fill this gap in the survey literature, we investigate whether and to what extent fully verbalized unipolar and bipolar scales influence response behavior by analyzing observed and latent response distributions and latent thresholds of response categories.

Methods & Data: For this purpose, we conducted a survey experiment in the probability-based German Internet Panel (N = 2,427) in March 2019 and randomly assigned respondents to one of the following two groups: the first group received four questions on income (in)equality with a five-point, fully verbalized unipolar scale (i.e., agree strongly, agree somewhat, agree moderately, agree hardly, agree not at all). The second group received the same four questions on income (in)equality with a five-point, fully verbalized bipolar scale (i.e., agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, disagree strongly).

Results: The results reveal substantial differences between the two rating scales. They show significantly different response distributions and measurement non-invariance. In addition, response categories (and latent thresholds) of unipolar and bipolar scales are not equally distributed. The findings show that responses to questions with unipolar and bipolar scales differ not only on the observational level, but also on the latent level.

Added Value: Both rating scales vary with respect to their measurement properties, so that the responses obtained using each scale are not easily comparable. We therefore recommend not considering unipolar and bipolar scales as interchangeable.

Designing Grid Questions in Smartphone Surveys: A Review of Current Practice and Data Quality Implications

Gregor Čehovin, Nejc Berzelak

University of Ljubljana, Slovenia

Relevance & Research Question: Designing grid questions for smartphone surveys is challenging due to their complexity and potential increase in response burden. This paper comprehensively reviews the findings of scientific studies on several data quality and response behavior indicators for grid questions in smartphone web surveys: satisficing, missing data, social desirability, measurement quality, multitasking, response times, subjective survey evaluation, and comparability between devices. This framework is used to discuss different grid question design approaches and their data quality implications.

Methods & Data: Experimental studies investigating grids in smartphone surveys were identified using the DiKUL bibliographic harvester that includes over 135 bibliographic databases. The string “’mobile web survey’ AND (grid OR scale OR matrix or table)” returned 55 results. After full-text evaluation, 35 papers published in English between 2012 and 2018 were found eligible for extraction of findings regarding the eight groups of data quality and response behavior indicators.

Results: Grid questions tend to increase self-reported burden and satisficing behavior. The incidence of missing data increases with the number of items per page and per grid. The comparisons between smartphones and PCs yield largely mixed results. While completion times are decisively longer on smartphones, the grid format has little to no effect on response times compared to item-by-item presentation. Differences in satisficing are modest and observations about the relationship between missing data and device type are mixed. No effects of device type on socially desirable responding were detected, while differences in measurement quality are mostly limited to worse input accuracy and biased estimates on smartphones due to noncoverage and nonresponse error. The findings about differences in multitasking are also mixed.

Added Value: Data quality remains a salient issue in web surveys, as well as in the context of the visual syntax that defines the design of survey questions on different devices. This review of current practice offers insights into data quality implications of the principles for designing grid questions in smartphone web surveys and their comparability to web questionnaires on PCs. The critical elaboration of findings also provides guidance for future experimental research and usability evaluation of web questionnaires.

The effects of forced choice, soft prompt and no prompt option on data quality in web surveys - Results of a methodological experiment

Johannes Lemcke, Stefan Albrecht, Sophie Schertell, Matthias Wetzstein

Robert Koch Institut, Germany

Relevance & Research Question: In survey research, item nonresponse is regarded as an important criterion for data quality among other quality indicators (e.g. breakoff rate, straightlining) (Blasius & Thiessen, 2012). This originates from the fact that, as with the unit nonresponse rate, persons who do not answer a specific item can systematically differ from those who do. In online surveys this threat can be countered by prompting after item nonresponse. In this case, prompting means a friendly reminder displayed to the respondent, inviting them once to give an answer. If the respondent does not want to answer, it is possible to move on in the questionnaire (soft prompt). The forced choice option, however, requires a response to every item.

There is still a research gap on the effects of prompting or forced choice options in web surveys on data quality. Tourangeau (2013) also comes to the following conclusion: ‘More research is needed, especially on the potential trade-off between missing data and the quality of the responses when answers are required’.

Methods & Data: We conducted a methodological experiment using a non-probability sample recruited over a social network platform in January 2019. To test the different prompting options, we implemented three experimental groups (forced choice, soft prompt, no prompt) (total N = 1,200). Besides the item nonresponse rate, we used the following data quality indicators: breakoff rate, straightlining behavior, duration time, tendency to give socially desirable answers and self-reported interest.

Results: The results show a higher breakoff rate for specific questions where forced choice was applied. Furthermore, a higher item nonresponse rate was found for the no prompt option. Overall, only small effects were found. Final results on the different effects on the data quality will be presented at the conference.

Added Value: We found little to no evidence on the impact of prompt options on data quality. However, we found that soft prompt tends to lead to lower item nonresponse compared to no prompt.

3:30 - 4:30B 3: Smartphone and Sensors as Research Tools
Session Chair: Stefan Oglesby, data IQ AG, Switzerland

How does (work related) smartphone usage correlate with levels of exhaustion

Georg-Christoph Haas1,2, Sabine Sonnentag2, Frauke Kreuter1,2,3

1Institut für Arbeitsmarkt- und Berufsforschung der Bundesagentur für Arbeit (IAB), Germany; 2University of Mannheim, Germany; 3University of Maryland, United States of America

Relevance & Research Question: Smartphones make digital media and other digital means of communication constantly available to individuals. This constant availability may have a significant impact on individuals’ exhaustion levels. In addition, being available often brings social pressure (e.g., "telepressure") at work or at home that may lead to a further increase in exhaustion at the end of the day. On the other hand, constant connectivity may enable frequent contact with one’s social networks, which might decrease exhaustion. We examine whether employees perceive "being available" as a burden or as a resource in their daily work.

Methods & Data: We use a combination of data from a probability-based population panel from Germany (Panel Study Labour Market and Social Security -- PASS) and a research app (IAB-SMART), which passively collected smartphone data (e.g. location, app usage) and administered short daily surveys. Since app participants (N=651) were recruited from PASS, we are able to link both data sources. The PASS data provide us with sociodemographic variables (e.g. age, education, gender) and background information, which enables us to calculate population weights. From the passively collected app data, we can construct a series of predictors like daily smartphone usage and instant switches between apps. The level of exhaustion is measured by a survey question that was repeated daily for seven days every three months, i.e., we have one to 14 measures per individual. After accounting for several selection processes within the data collection, we end up with an analysis sample of 163 individuals with 693 days that we use in a multilevel regression model.

Results: Our analysis is in an early stage. Therefore, we are not able to share results at the time of submitting this abstract.

Added Value: First, we assess whether and how daily smartphone usage correlates with individuals’ levels of exhaustion. Second, our analysis shows how a combination of survey and passive data can be used to answer a substantive question. Third, we share our experience of how to engineer features from unstructured mobile phone data into valid variables that may be used in a variety of fields in general online research.

The quality of measurements in a smartphone-app to measure travel behaviour for a probability sample of people from the Netherlands

Peter Lugtig1, Danielle Mccool1,2, Barry Schouten2,1

1Utrecht University, The Netherlands; 2Statistics Netherlands, The Netherlands

Relevance & Research Question:

Smartphone apps are increasingly used to measure travel behaviour. Their advantage is that they can use the location sensors in mobile phones to keep track of where people go and when, at relatively high precision. In this presentation, we report on a large fieldwork test conducted by Statistics Netherlands and Utrecht University in November 2018 and present the quality of travel data obtained through hybrid estimation that combines passive data and a diary-style smartphone app.

Methods & Data: A random sample of about 1900 individuals from the Dutch population register was invited by letter to install an app on their smartphone for a week. The app then tracked people's location continuously for a week. Based on an algorithm, the app divided each day into “stops” and “tracks” (trips), which were fed back to respondents in a diary-style list separately for every day. Respondents were then asked to provide details on stops and trips in the diary.


Having both sensor data and survey data allows us to investigate measurement error in stops, trips and details about these in some detail. A few types of errors may occur:

1) False positives: a stop was presented to a respondent that wasn’t a stop (and by definition also a track connecting this stop to another one).

2) False negatives: stops were missing from the diary (often because a respondent forgot the phone, or GPS tracking was not working properly).
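The two error types listed above can be quantified once the diary feedback is combined with the app's detected stops. The following is a minimal sketch under stated assumptions: it assumes that respondents could mark each detected stop as correct or incorrect and could add stops the app had missed. Neither this data structure nor the function name comes from the study; they are hypothetical.

```python
def stop_error_rates(detected_confirmed, n_added_by_respondent):
    """Compute false positive and false negative shares for one respondent-day.

    detected_confirmed:    list of booleans, one per stop detected by the app;
                           True if the respondent confirmed it as a real stop
                           (hypothetical diary feedback).
    n_added_by_respondent: number of real stops the app missed and the
                           respondent added manually (hypothetical).
    """
    n_detected = len(detected_confirmed)
    n_true_positive = sum(detected_confirmed)
    n_false_positive = n_detected - n_true_positive
    n_actual_stops = n_true_positive + n_added_by_respondent

    # Share of presented stops that were not real stops.
    fp_share = n_false_positive / n_detected if n_detected else 0.0
    # Share of real stops that the app failed to present.
    fn_share = n_added_by_respondent / n_actual_stops if n_actual_stops else 0.0
    return fp_share, fn_share


# Hypothetical day: four detected stops, one rejected; one stop added by hand.
print(stop_error_rates([True, True, False, True], 1))
```

Note that this relies on respondents reporting errors accurately; stops that were missed by both the app and the respondent remain invisible in such a calculation.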

How can we identify false positives and negatives? How did respondents react to false positives, and how can we correct for this in estimates of travel behaviour?

Added Value: We will discuss each type of error, its size, and the context in which it occurred. Finally, we will discuss the overall impact of both false positives and false negatives on the statistics of interest. We conclude with a discussion of how to move forward in combining sensor and survey data for tracking studies in social science, market research and official statistics.

Data privacy concerns as a source of resistance to participate in surveys using a smartphone app

Caroline Roberts1,2, Jessica Herzing1,2, Daniel Gatica-Perez3,4

1University of Lausanne, Switzerland; 2FORS, Switzerland; 3EPFL, Switzerland; 4Idiap Research Institute, Switzerland

Relevance & Research Question: Early studies investigating willingness to participate in surveys involving smartphone data collection apps – and particularly, to consent to passive data collection – have identified concerns relating to data privacy and the security of shared personal data as an important explanatory variable. This raises important practical and theoretical challenges for survey methodologists about how best to design app-based studies in a way that fosters trust and the implications for data quality. We address the following research questions: 1) How do data privacy concerns vary among population subgroups, and as a function of internet and smartphone usage habits? 2) To what extent do expressed data privacy concerns predict stated and actual willingness to participate in an app-based survey involving passive data collection?

Methods & Data: The data were collected in an experiment embedded in a three-wave probability-based, general population election study conducted in Switzerland in 2019. At wave 1, half the sample was assigned to an app-based survey, and the other half to a browser-based survey; at wave 2, the browser-based respondents were invited to switch to the app. At wave 1, respondents in both groups were asked about their attitudes to sharing different types of data and about their data privacy and security concerns. The quantitative findings are complemented with findings from user experience research.

Results: Consistent with other studies, preliminary results show statistical differences in levels of concern about data privacy and the degree of comfort sharing different data types across subgroups (e.g. based on age, sex and response device) and confirm that privacy concerns are an important predictor of actual participation in a survey using an app.

Added Value: Given the often weak relationship between attitudes and behaviours, and the apparent paradox between privacy attitudes and actual online data sharing behaviours, the possibility to assess how data privacy concerns affect actual participation in an app-based study of the general population is of great value. We propose avenues for future research seeking to reduce public resistance to participate in smartphone surveys involving both active and passive data collection.

3:30 - 4:30C 3: Campaigning and Social Media
Session Chair: Pirmin Stöckle, University of Mannheim, Germany

Cross-Platform Social Media Campaigning: Comparing Strategic Political Messaging across Facebook and Twitter in the 2016 US Election

Michael Bossetta1, Jennifer Stromer-Galley2, Jeff Hemsley2

1Lund University, Sweden; 2Syracuse University, United States of America

Relevance & Research Question: This study is the first in the American context to compare political candidates’ social media communication across multiple social media platforms. This topic is relevant, as the large majority of digital political communication studies only focus on one social media platform. We therefore ask two research questions:

RQ1: Do political campaigns broadcast the same messages across multiple social media accounts, or does campaign messaging differ depending on the platform?

RQ2: What explains the similarity or difference in political campaigning across social media platforms?

Methods & Data: We combine three types of computational analysis – fuzzy string matching, automated content analysis, and machine learning classification – to compare the Facebook and Twitter posts of Hillary Clinton and Donald Trump during the 2016 U.S. Election.
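The fuzzy string matching step can be illustrated with a minimal sketch. It assumes Python's standard-library `difflib.SequenceMatcher` as the similarity measure and a cut-off of 0.8; the abstract names neither the matching algorithm nor a threshold, and the helper names and example posts below are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two posts."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def recycled_share(source_posts, target_posts, threshold=0.8):
    """Share of source_posts with at least one near-duplicate in target_posts.

    The threshold is an assumption chosen for illustration only.
    """
    if not source_posts:
        return 0.0
    matched = sum(
        1
        for post in source_posts
        if any(similarity(post, other) >= threshold for other in target_posts)
    )
    return matched / len(source_posts)


# Hypothetical example posts, invented for illustration only.
facebook_posts = [
    "Join us tonight for the town hall in Ohio!",
    "Our plan for affordable childcare is here.",
]
twitter_posts = [
    "Join us tonight for the town hall in Ohio",
    "Watch the interview on TV this evening.",
]

share = recycled_share(facebook_posts, twitter_posts)
```

Here `share` is the fraction of the hypothetical Facebook posts that also appear, nearly verbatim, on Twitter. Matching on a ratio rather than exact equality tolerates small edits such as trailing punctuation or shortened links.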

Results: Our results show a relatively high degree of content recycling across platforms. At the highest level, over 60% of Clinton’s Facebook posts were also present on Twitter, whereas approximately a quarter of the Trump campaign’s posts were recycled across the two platforms. We do, however, find key strategic differences relating to how this content was conveyed to the electorate. Our machine learning algorithm categorized posts by topic issues and message type, and we found the latter to be a significant predictor of platform differentiation through chi-squared tests. That is, candidates promoted the same policy issues across platforms, but the strategic intent behind their messages differed. Most notably, the Clinton campaign messaged Hispanic audiences in Spanish solely on Facebook. The Trump campaign promoted livestreams predominantly on Facebook, while reserving Twitter for broadcasting information relating to mass media interviews.

Added Value: The added value of the study is two-fold. First, while the state of the art suggests candidates use different platforms for different messaging, we find a relatively high degree of content recycling across platforms. Second, we go beyond the existing literature and uncover what explains differences in cross-platform posts. It is not the policy content of messages, but rather the strategic motivations that campaigns perceive in light of the audiences on each social media platform.

No need to constantly innovate: Interesting lessons from two election campaigns within a year

Yaron Ariel, Dana Weimann-Saks, Vered Elishar Malka

Academic College of Emek Yezreel, Israel

Relevance and research question: For more than a decade now, political candidates have been using central social networks as their leading platforms in election campaigns, constantly trying to improve their performance and enhance their influence over potential voters. In 2019 alone, Israel saw two general elections. Comparing patterns of online platforms' political usage between these two campaigns reveals some surprising changes: within a few months, innovative components that were used in the first election campaign were abandoned in the second (e.g., using live television broadcasts on candidates' Facebook accounts). Could these changes indicate a broader phenomenon? Can we detect evidence of a decline in exposure to innovative platforms already in the first campaign?

Methods and data: Four consecutive surveys were administered during the last month before the first 2019 Israeli general elections, with 520-542 respondents participating in each. The samples were recruited via an online panel and matched the actual distribution of the Israeli voter population. The questionnaire included 50 questions on voting trends, news exposure patterns, and political content exposure in traditional and new media, most of them on a Likert scale.

Results: A one-way analysis of variance was conducted to examine overall exposure to online political content. No significant difference was found across the four time points examined [F(3, 2153) = 0.271, p > 0.05]. More specifically, a Kruskal-Wallis test was performed to examine whether there was a statistically significant difference in the use of various social media applications. There were no significant differences in the consumption of Facebook, Twitter, and Telegram apps; however, a change in exposure to political content was detected on Instagram [Kruskal-Wallis H = 9.42, df = 3, p < 0.05], with a decrease in the mean rank score.

Added value: The findings of the current study suggest that politicians' attempts to be innovative in online media are not necessarily effective. Despite politicians' efforts, there is no detectable increase in exposure to new platforms as Election Day approaches.

The Sequencing Method: Analyzing Election Campaigns with Prediction Markets

Oliver Strijbis

University of Zurich, Switzerland

Relevance & Research Question:

What are the effects of campaigns on voting behavior? Despite a long tradition in research on the question we know surprisingly little about it. A main reason is that the analysis of polls of polls—the most important method for the comparative analysis of campaign effects—is confronted with formidable methodological problems. In this paper, I propose and apply an alternative method for the analysis of campaign effects on voting behavior, which is based on prediction markets.

Methods & Data:

The proposed method to analyze campaign effects in elections and direct democratic votes is based on real money online prediction markets with automatic market scoring rules. In contrast to the current use of prediction markets, I propose to let traders bet on the probabilities according to which sequences of vote shares will be the outcome of a vote. I use original data that I have collected in the context of 16 direct democratic votes in Switzerland.


Results:

I demonstrate that on average direct democratic campaigns in Switzerland have substantial mobilizing effects and account for about 5% of Yes-vote shares. Furthermore, campaign effects vary in predictable ways between ballots.

Added Value:

I illustrate this "sequencing method" with the first time–series cross–section analysis of direct democratic campaigns. Since this is the first paper to quantify the total effect of campaigns for direct democratic decisions it has a major impact on our understanding of direct democracy.

3:30 - 4:30D 3: GOR Best Practice Award 2020 Competition III
Session Chair: Alexandra Wachenfeld-Schell, GIM & DGOF, Germany
Session Chair: Otto Hellwig, respondi AG & DGOF, Germany

The presentations in this session will be in German.

Beyond the Real Voice of the Customer: Emotion measurement with Artificial Intelligence in advertising research

Malte Freksa1, Sandra Vitt2, Holger Lütters3, Dima Feller4, Kim Rogers1

1GapFish GmbH, Germany; 2RTL Mediengruppe, Germany; 3HTW Berlin, Germany; 4Pangea Labs, Germany

The important role of emotions in advertising is undisputed, and researchers are experimenting with different ways to measure emotion. According to the linguist Sendlmeier, “voice conveys our emotional state of mind in the most differentiated way”. With digital speech assistant technology and artificial intelligence, new opportunities for research arise.

In the study, voice data collection followed by a high-end automated emotion analytics process was implemented in an online research design. The goal of this approach is a) to examine whether this approach is implementable from a technological point of view (realization)

b) to critically analyze the emotion analytic outcome from a research perspective (validation).

An online-representative sample was recruited: participants were confronted with an experimental setting of online video campaigns, audio ads and Instagram ads. After completing a classic item battery for advertising research, participants had to answer questions using the microphone of the device they were using. The device-agnostic approach allows desktop, laptop, tablet and smartphone users to be included in the research design. The audio data is analyzed with two AI-based API approaches:

a) automatic transcription from voice into text (“what”)

b) emotional analysis of the tonality of the voice (“how”).

As a result, the approach yields the content and an analysis of 21 emotional facets from the audio file for each participant.

This research design was effective from a technical perspective: 859 participants (out of 3760 starters) could be analyzed including emotional profiles from each answer. The voice analytic approach shows interesting divergence from the classic answer patterns. Further research approaches should focus in detail on the validity of the emotional scores from audio interactions.

The study combined, for the first time, an automated API approach to voice data collection and analytics with automation in transcription and emotional impact measurement. The promising results clearly show the power of artificial intelligence driven research approaches, which will change the landscape of research very soon.

Monetization of customer value in the rail business: Improving yield, revenues and customer relationship at the same time is possible - the case of WESTbahn in Austria

Andreas Krämer1,2, Gerd Wilger2, Thomas Posch3

1University of Applied Sciences Europe, Germany; 2exeo Strategic Consulting AG, Germany; 3WESTbahn Management GmbH, Austria

Relevance & Research Question:

Since the market entry of WESTbahn in December 2011, Austrian rail passengers have had the choice between two railway companies on the Western route (Vienna - Salzburg). WESTbahn pursues, among others, the goal of attracting long-term customers through a very good range of rail services at very low prices (Koroschetz 2014). In this context, the question arises whether further growth in demand is realistic even if WESTbahn better meets customers' willingness to pay thanks to a differentiated ticket structure.

Methods & Data:

In order to ensure a holistic market and customer perspective, different empirical studies were conducted in 2019 and later linked together: First, a representative study focusing on travelers on the Western Line, second, a customer survey (offline) during the train journey, and third, a survey of ticket buyers on Secondary data and information of sales and revenue management systems were used to validate the survey results.


Results:

While the market study supports the hypothesis that the railways as a whole have growth potential on the Western Line, it has become apparent that price plays a central role as a determinant of future growth. To understand the opportunities for shifts in demand within the rail system, a customer segmentation that describes the affinity of rail customers for WESTbahn and OEBB is essential. In the case of the existing WESTbahn customers, there was a considerable spread in willingness to pay, which could be used to differentiate the ticket structure.

Added Value:

Since the fall of 2018, several changes have been made in WESTbahn's price and revenue management (prices have been partly raised, partly reduced, resulting in a stronger price differentiation). Customer surveys supported the project at all stages (conception, testing, implementation, monitoring). As a result, WESTbahn not only continued to grow through demand gains, but also achieved a change in the ticket mix, price levels and double-digit sales growth. At the same time, WESTbahn achieves top marks in terms of customer satisfaction and the intention to recommend.

How to identify future trends in the automotive industry at an early stage of development by relying on access panel surveys?

Patrick Schlickmann1, Jim Walker1, Heiko Rother2, Hauke Witting2

1SKOPOS GmbH & Co. KG, Germany; 2Asahi Kasei Europe GmbH, Germany

Relevance & Research Question: Due to growing environmental requirements, increasing autonomy and changes in mobility behavior, the automotive industry is facing great challenges. Suppliers are under pressure to identify and implement new developments and customer requirements at an early stage in order to keep up with the highly competitive automotive market.

One of the areas in which Asahi Kasei specializes is surfaces and acoustics for vehicle interiors. SKOPOS supports them in establishing which requirements and wishes customers will have regarding vehicle interiors of the future as a result of the changing role of car sharing and autonomous driving.

Answering these questions confronted us with the challenges of not limiting the participants in their thinking about future developments whilst simultaneously leading them towards products from the Asahi Kasei product range.

Methods & Data: We chose a standard quantitative online approach - but with a twist. We presented car drivers and those open to car sharing with a primarily quantitative questionnaire, assessing their mobility behavior and the evaluation of (future) car features regarding their own and shared cars. The twist: We also implemented open questions based on our experience with online research communities, to maximize involvement and effort leading to better and more creative output.

Results: Cleanliness inside the car, especially in car sharing, is the most important factor when it comes to the interior of a car. The overall usability of features within the interior of a car is more important than premium surfaces. Finally, participants responded to open questions with longer and more extensive answers compared to similar automotive studies.

Added Value: The holistic and individualistic approach we followed with our research has proven to be very useful for Asahi Kasei and their business problem, from the development of the questionnaire through the analysis of the collected data to the final consultation based on the results. Despite limited budget and time constraints, we were able to lay the groundwork for Asahi Kasei’s future developments of car interiors, enabling them to present the results to current and future customers. In addition, we successfully introduced them to the wonderful world of market research!

4:30 - 5:30Virtual Get Together
Date: Friday, 11/Sep/2020
10:00Track A: Survey Research: Advancements in Online and Mobile Web Surveys
10:00Track B: Data Science: From Big Data to Smart Data
10:00Track C: Politics, Public Opinion, and Communication
10:00Track D: Digital Methods in Applied Research
10:00 - 11:20A 4: Device Effects
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

Layout and Device Effects on Breakoff Rates in Smartphone Surveys: A Systematic Review and a Meta-Analysis

Mirjan Schulz1, Bernd Weiß1, Aigul Mavletova2, Mick P. Couper3

1GESIS Leibniz Institute for the Social Sciences, Germany; 2Higher School of Economics (HSE) Moscow, Russia; 3Michigan Population Studies Center (PSC), United States of America

Relevance & Research Question: Online survey participants increasingly complete questionnaires on their smartphones. However, a common finding in survey research is that survey respondents using mobile devices break off more often than participants using a computer. Previous research has revealed numerous aspects that potentially affect the breakoff rates. These aspects can be divided into two groups: layout features and survey-related conditions. Layout features are, e.g., screen-optimized designs, the need to scroll, and matrix questions. The survey-related conditions involve the invitation mode, reminders, the requirement to use a certain device, etc. So far, the literature shows heterogeneous influences of these effects on breakoff rates. This brings us to our research question: How effective are different measures of optimizing surveys for smartphones in reducing breakoff rates of smartphone respondents?

Methods & Data: To answer this question, we collected research results regarding measurement on smartphone optimization and device effects from more than 50 papers and a variety of conference presentations published between 2007 and August 2019. By conducting a systematic review and a meta-analysis, we tested which of these predictors lower the breakoff rates in mobile web surveys. We hypothesize that mobile optimized surveys are more user-friendly, which in turn increases survey enjoyment and lowers survey burden. Consequently, lowering the survey burden leads to lower breakoff rates. We aim to examine which measures are helpful to optimize surveys for mobile devices.

Results & Added Value: Based on our findings, we will present best practices from the current state of research to sustainably reduce breakoff rates in mobile web surveys. We build upon earlier findings of a meta-analysis from Mavletova and Couper (2015), add new empirical evidence, and expand their analytical framework. Our preliminary results so far show that a smartphone-optimized layout decreases breakoff rates. The final results will be available at the beginning of 2020.

Samply: A user-friendly web and smartphone application for conducting experience sampling studies

Yury Shevchenko1, Tim Kuhlmann1,2, Ulf-Dietrich Reips1

1University of Konstanz, Germany; 2University of Siegen, Germany

Relevance & Research Question:

Running an experience sampling study via smartphones is a complex undertaking. Scheduling and sending mobile notifications to participants is a tricky task because it requires the use of native mobile applications. In addition, the existing software solutions often restrict the number of possible question types. To solve these problems, we have developed a free web application that runs in any browser and can be installed on mobile phones. Using the application, researchers can create their studies, schedule notifications, and monitor users' reactions. The content of notifications is fully customizable and may include links to studies created with external survey services.

Methods & Data:

We have conducted several empirical studies to test the application and its features, such as creating different types of notification schedules and logging participants’ interactions with notifications. First pilot testing was carried out in student projects that conducted different surveys (e.g. happiness, stress, sleep quality, dreaming) with a schedule from several days up to one week. The second study was our own experience sampling survey with a university sample that was completed during one week with notifications sent seven times a day at two-hour intervals. We also plan a third study with online samples, the results of which will be presented at the conference.


Results:

In the first pilot study (8 projects, n = 63), we analyzed the response rate of the participants based on the logging of interactions with notifications. In addition, the design and functionality of the web application were improved following a usability survey with application users. In the second study (n = 23) we analyzed how the type of participant’s device (i.e., mobile phone) is related to the response rate. Additionally, we investigated the relationship between the interaction with notifications and the response rate in the experience sampling survey. In the third study, we plan to repeat the analysis for the sample recruited online.

Added Value:

Our application provides a direct and easy way to run experience sampling studies. It has an open-source code and is available at

The effect of layout and device on measurement invariance in web surveys

Ines Schaurer1, Katharina Meitinger2, David Bretschi1

1GESIS Leibniz Institute for the Social Sciences, Germany; 2Utrecht University, The Netherlands

Relevance & Research:

As the majority of online surveys nowadays are mixed-device studies of personal desktop computers (PC) and smartphones, the layout needs to be adapted to both device types. Many well-established constructs are usually presented in the matrix format. However, matrices are not recommended for use in smartphone surveys. Therefore, matrix questions are a challenge for all mixed-device studies. So far, the majority of studies that investigate the effects of layout and device on data quality have focused on indicators such as nonresponse and satisficing strategies. In our experimental study we focus on the combined effect of devices and layouts on measurement invariance.

Methods & Data:

In an experimental study we assessed the comparability of different constructs across device and layout combinations. We varied the two factors device (desktop vs. mobile device) and layout (optimized for desktop vs. optimized for smartphones vs. built-in adaptive layout), resulting in six groups of layout-device combinations. We included 5 well-established constructs with different numbers of scale points that are usually presented in a matrix format.

In October 2018 respondents from an online access panel in Germany were randomly invited to one of the six experimental groups. We applied quota sampling regarding age, sex, and education. Overall 3096 respondents finished the survey.

The experimental design allows us to examine whether the different layout settings have an impact on the perceived range of response scales and the presentation of multiple questions as one conceptual unit. We evaluate whether layout and device have an impact on mean levels and whether the latent constructs are comparable across groups by means of structural equation modelling.

Results: We find that layout and device do not impact mean levels of the constructs and we find a high level of comparability across experimental groups (scalar invariance).

Added Value:

This study provides evidence on the effect of layout choices on measurement invariance, depending on the device used. Furthermore, it offers information about comparability of results in mixed-device studies and practical guidance for designing mixed-device studies.

Measuring respondents’ same-device multitasking through paradata

Tobias Baier, Marek Fuchs

TU Darmstadt, Germany

Relevance & Research Question: As a self-administered survey mode, Web surveys allow respondents to temporarily leave the survey page and switch to another window or browser tab. This form of sequential multitasking has the potential to disrupt the response process and may reduce data quality due to respondents' distraction (Krosnick 1991). Browser data indicating that respondents have left the survey page allow respondents’ multitasking to be measured non-reactively. To investigate whether page-switching respondents produce lower data quality, one has to consider how to identify and delimit this group based on the time they do not spend on the survey page. Very short page-switching events might occur due to slips or unintentional behavior and might therefore not be harmful to the response process. Accordingly, the aim of this paper is to discuss the adequate time threshold to classify respondents as multitaskers.

Methods & Data: For the analyses reported in this paper, two Web surveys among members of a non-probability online panel (n=1,653; n=1,148) and a Web survey among university applicants (n=1,125) conducted in 2018 were used. To measure multitasking, the JavaScript tool SurveyFocus (Höhne & Schlosser 2018) was implemented. The prevalence of page-switching is computed using different time thresholds (< 2 sec, < 5 sec, < 10 sec). Item nonresponse, degree of differentiation in matrix questions and the number of characters in answers to open-ended questions serve as measures of data quality.
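How the choice of threshold changes the measured prevalence of multitasking can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes that each respondent's paradata has already been reduced to a list of absence durations in seconds, and that an absence counts only if it lasts at least the threshold. The function name and the example data are hypothetical.

```python
def multitasker_share(absences_by_respondent, min_seconds):
    """Share of respondents with at least one absence of min_seconds or longer."""
    if not absences_by_respondent:
        return 0.0
    flagged = sum(
        1
        for absences in absences_by_respondent.values()
        if any(duration >= min_seconds for duration in absences)
    )
    return flagged / len(absences_by_respondent)


# Hypothetical paradata: respondent id -> durations (seconds) spent away
# from the survey page; an empty list means the respondent never left.
absences = {
    "r1": [],
    "r2": [1.2],
    "r3": [3.0, 12.5],
    "r4": [6.0],
}

# Prevalence with no threshold (0) and with the three thresholds
# named in the abstract (2, 5 and 10 seconds).
shares = {t: multitasker_share(absences, t) for t in (0, 2, 5, 10)}
```

In this made-up example the share of respondents classified as multitaskers falls from 0.75 without a threshold to 0.25 at ten seconds, which shows why the threshold has to be fixed before data quality is compared between groups.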

Results: Preliminary analyses indicate that 15 to 33 percent of respondents multitask at least once in the survey. Previous results on all page switchers also indicate that these respondents do not produce lower data quality. However, so far we have not differentiated between respondents with short and long absences. The analyses presented in this paper will show whether these results change when different time thresholds are applied. Furthermore, we will investigate whether page-switching respondents differ in their characteristics, the device used and completion time depending on the time they spent absent.

Added Value: Paradata on page-switching provides an opportunity to measure respondents’ multitasking unobtrusively. This paper addresses the challenge to identify multitasking respondents based upon this data to investigate the relationship of multitasking and data quality.

10:00 - 11:20B 4: Digitalization Driving Methodical Innovation
Session Chair: Florian Keusch, University of Mannheim, Germany

Using Census, Social Security and Tax data to impute the complete Australian income distribution

Nicholas Biddle, Dinith Marasinghe

Australian National University, Australia

Relevance & Research Question: Economists and governments are deeply interested in the income distribution, the level of movement across the income distribution, and how observable characteristics predict someone's position on the distribution. These topics are addressed in different countries using a combination of cross-sectional surveys, panel studies, and administrative data. Australia has been well served by sample surveys on the income distribution, but these are limited for relatively small population groups or for precise points on the distribution. Australian researchers have made limited use of administrative data, not because the administrative data doesn't exist, but because of privacy and practical challenges with linking individuals and making that data available to external researchers. In this paper, we apply machine learning and standard econometric techniques to develop synthetic estimates of the Australian income distribution, validate this data against high quality survey data, use this administrative dataset to measure movement across the income distribution longitudinally, and measure ethnic disparities (by Indigeneity and ancestry).

Methods & Data: The dataset used in this paper has at its core individually linked medical, cross-sectional Census (i.e. survey), social security and tax data for 6 financial years. None of this data alone is complete for all parts of the income distribution, but combined it can generate high quality estimates. Broadly, we generate a continuous cross-sectional income estimate from Census bands in 2011, test various machine learning algorithms to predict income using observed tax and social security data in 2011, use parameter estimates from the algorithms to estimate income in the following 5 financial years (based on demographic, tax and social security data for those years), validate against survey data, and then analyse.

Results: We show that certain algorithms perform far better than others, and that we are able to generate highly accurate predictions that match survey data at the national level. We then derive new insights into income inequality in Australia.

Added Value: We outline a methodology and set of techniques for when income data needs to be combined across multiple sources, demonstrate a productive link between ML and econometric techniques, and shed new light on the Australian income distribution.

How to find potential customers on district level: Civey's innovative methodology of Small Area Estimation through Multilevel Regression with Poststratification

Janina Mütze, Charlotte Weber, Tobias Wolfram

Civey, Germany

Relevance & Research Question: Reliable market research is the basis for making the right decisions. Market researchers understand customer interests or the perception of existing products. However, the question of how and where potential customers can be reached is difficult to answer precisely. To solve this problem, Civey has developed Small Area Estimation through multilevel regression with poststratification in a live system. Thus, customers recognize potential leads even in the smallest geographical areas such as districts (“Landkreise”).

Methods & Data: The basis for this is an MRP model (Multilevel Regression with Poststratification), which Civey has implemented for real-time calculations. Data is collected online on over 25,000 websites. This way, over fifteen million opinions are collected each month. With one million verified and active users monthly, Civey has established Germany's largest open access panel.

Based on a two-stage process developed by Civey, which combines hierarchical logistic regression models and poststratification with variable selection by LASSO, real-time applications of MRP are possible to provide Small Area Estimations. In addition to the user-based information, the model also accounts for publicly available auxiliary information on district level.
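The poststratification step of MRP can be sketched in a few lines. This is a generic textbook illustration, not Civey's implementation: it assumes that a fitted multilevel model has already produced a predicted probability for every sociodemographic cell in a district, and that the population count of each cell is known from auxiliary data. All names and numbers are hypothetical.

```python
def poststratify(cell_predictions, cell_counts):
    """Population-weighted average of per-cell predictions.

    cell_predictions: cell -> predicted probability of a given answer
    cell_counts: cell -> number of people in that cell (e.g. from census data)
    """
    total = sum(cell_counts[cell] for cell in cell_predictions)
    if total == 0:
        raise ValueError("cells have no population")
    weighted = sum(
        cell_predictions[cell] * cell_counts[cell] for cell in cell_predictions
    )
    return weighted / total


# Hypothetical district with four (age group, sex) cells.
predictions = {
    ("18-39", "f"): 0.40,
    ("18-39", "m"): 0.30,
    ("40+", "f"): 0.20,
    ("40+", "m"): 0.10,
}
counts = {
    ("18-39", "f"): 1000,
    ("18-39", "m"): 1000,
    ("40+", "f"): 3000,
    ("40+", "m"): 3000,
}

district_estimate = poststratify(predictions, counts)
```

Because the weights come from the district's actual population structure rather than from the sample, the estimate stays usable even when only a few respondents live in the district, provided the model's cell predictions are reasonable.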

Results: The model can be used to predict the probability that a certain person will give a particular answer for any combination of sociodemographic information. The model "learns" based on all information available. This model-based approach enables fast valid results even in the smallest geographical areas.
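
The poststratification step described above can be sketched roughly as follows. This is a minimal illustration, not Civey's implementation: the coefficients, the cell definitions (age group by sex), the district names and the census counts are all invented for the example, and the multilevel model fitting and the LASSO variable selection are left out entirely.

```python
from math import exp

# Hypothetical coefficients standing in for a fitted multilevel logistic model.
INTERCEPT = -1.0
AGE_EFFECT = {"18-39": 0.4, "40+": -0.2}
SEX_EFFECT = {"f": 0.1, "m": -0.1}
DISTRICT_EFFECT = {"district_a": 0.3, "district_b": -0.3}


def inv_logit(x: float) -> float:
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-x))


def predict_cell(age: str, sex: str, district: str) -> float:
    """Predicted probability of a given answer for one sociodemographic cell."""
    linear = INTERCEPT + AGE_EFFECT[age] + SEX_EFFECT[sex] + DISTRICT_EFFECT[district]
    return inv_logit(linear)


def poststratify(district: str, census_counts: dict) -> float:
    """Weight cell predictions by the number of residents in each cell."""
    total = sum(census_counts.values())
    if total == 0:
        raise ValueError("census counts must not sum to zero")
    weighted = sum(
        predict_cell(age, sex, district) * n
        for (age, sex), n in census_counts.items()
    )
    return weighted / total


# Hypothetical census counts per (age group, sex) cell, reused for both districts.
census = {
    ("18-39", "f"): 1000,
    ("18-39", "m"): 1000,
    ("40+", "f"): 2000,
    ("40+", "m"): 2000,
}

estimate_a = poststratify("district_a", census)
estimate_b = poststratify("district_b", census)
print(f"district_a: {estimate_a:.3f}, district_b: {estimate_b:.3f}")
```

The district-level estimate is simply the population-weighted mean of the cell predictions, which is why a model fitted on a non-representative sample can still yield estimates for small areas, provided the cell counts are known from auxiliary data.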

Added Value: After a brief introduction to the methodology, Civey provides unique insights into their results. This includes interesting evaluations of potential customers in the automotive market, but also amusing examples to show the variety and depth of data that this innovation allows.

Platform moderated data collection: Experiences of combining data sources through a crowd science approach.

Michael Weinhardt, Isabell Stamm, Johannes Lindenau

TU Berlin, Germany

Relevance & Research Question: The central idea of crowd science is to engage a wide base of potential contributors who are not professional scientists in the process of conducting research and/or analyzing research data (e.g., Franzoni & Sauermann, 2014). Crowd science carries the potential to lift data treasures or to analyze data that is too large for a small research team, but at the same time too unstandardized for computational research methods. While such approaches have been used successfully in the natural sciences and the digital humanities, they are rare in the social sciences. Hence, we know only very little about the particular challenges of this approach and its fit to certain research questions or types of data (Scheliga et al. 2018).

Methods & Data: In this talk, we report and reflect on the crowd science approach that we used to utilize data on the social relationships among entrepreneurial groups (Ruef 2010). Starting from a core data set based on administrative data (Weinhardt and Stamm 2019), we designed a crowd science task that asks participants to research information on company websites and in news articles on predefined cases of entrepreneurs in order to enrich our overall data set. To implement this task, we set up our own crowd science platform that moderated task distribution and the collection of the researched information. In order to qualify the crowd for this task, in our case students in the social sciences across Germany, we offered a 45-minute online training on the methodology of process-generated data. After completion, participating students could engage in the research task and, by doing so, collect points and win prizes.

Results: We discuss the methodological challenges of extracting and combining information from the different sources, as well as the pragmatic challenges, which range from setting up a multi-purpose online platform to finding and motivating participants.

Added Value: These insights and reflections advance the methodological discussion on crowd science as digital method and initiate a discourse on the potentials and shortcomings of combining data sources via platform moderated data collection.

The Combination of Big Data and Online Survey Data: Displaying of Train Utilization on and its Implications

Andreas Krämer1,3, Christian Reinhold2

1University of Applied Sciences Europe, Germany; 2DB Fernverkehr AG, Germany; 3exeo Strategic Consulting AG, Germany

Relevance & Research Question:

In Germany, the utilization of trains in the long-distance traffic has risen in the last 10 years from about 44% (2008) to 55% (2018). Further demand growth is stipulated by the German government for the coming years. The goal is to double the number of passengers by 2030. While demand has so far primarily been controlled by a Revenue Management system (saver fare and super saver fare), the question arises whether controlling and smoothing demand is also possible through non-price measures.

Methods & Data:

Based on forecast data, capacity utilization for each journey is estimated. Using these data, a display system was developed (4 icons), which provides customer information on the expected utilization of a single train connection on After a concept phase, qualitative research as well as A/B testing were performed. Finally, in April 2019, the display system was introduced on all major distribution channels. Recently, two surveys have been conducted: one study focused on ticket buyers (Jan.-Oct. 2019, n>10,000), the other study surveyed visitors of who did not buy a train ticket (Oct. 2019, n=2,000).


Results:

By using a multi-source multi-method approach, there are clear and consistent indicators for several positive effects of the utilization forecast icons: first, there is a shift in demand towards less utilized trains (thus achieving the goal of demand smoothing), secondly, seat reservation quota is increased and thirdly, the information leads to a comfort improvement for the travelers. However, it can also be seen that in time windows with overall high train utilization, sometimes a loss of customers takes place.

Added Value:

On the one hand, the combination of big data, experimental design and online surveys generates the database for displaying icons (load forecast) at the same level as train connections and fares on, while on the other hand, during the period of market introduction (as of May 2019), key information can be obtained leading to a 360-degree perspective, generating deep insights into the effects for Deutsche Bahn as well as for railway customers. Furthermore, starting points for optimizing the displayed icons are identified.

10:00 - 11:20C 4: Gender and Ethnicity
Session Chair: Simon Munzert, Hertie School, Germany

Ethnic perspective in e-government use and trust in government: A test of social inequality approaches

Dennis Rosenberg

University of Haifa, Israel

Relevance & Research Question: Keywords: E-government, ethnic affiliation, social inequality, trust in government.

Studies in the field of digital government have established the existence of a two-way association between e-government use and trust in government. Yet to date, no study has examined the interactive effect of ethnic affiliation and e-government use on trust in government or the interactive effect of ethnic belonging and trust in government on e-government use. The current study investigated these effects by means of social inequality approaches outlined in Internet sociology studies.

Methods & Data: Keywords: Social survey, categorical regression.

This study used data from the 2017 Israel Social Survey. The findings were obtained from multivariate categorical (logistic and ordinal) regression models.

Results: Keywords: Ethnic minorities, trust-use interaction.

The study found that Arabs from small localities with varying levels of trust in government (except for those with the highest level) are less likely to use e-government than Israeli Jews with the same levels of trust, yet they are more likely than Israeli Jews to have some degree of trust in government. Arabs from large localities differ from Israeli Jews in terms of e-government use only when they have some degree of trust in government, but they do not differ from Israeli Jews regarding the trust itself. Except for variations in predicted probabilities, no differences were found between the two Arab groups with respect to either of the criteria.

Added Value: Keywords: Locality size, social stratification.

The results provide support for the social stratification approach and in general provide justification for treating disadvantaged minorities according to the size of their residential localities.

Gender Portrayal on Instagram

Dorian Tsolak, Simon Kuehne

Bielefeld University, Germany

Relevance & Research Question:

In the past decade, social media has been identified as an important source of digital trace data, reflecting real world behaviour in an online environment. Many researchers have analyzed social media data, often text messages, to make inferences about people's attitudes and opinions. Yet many such opinions and attitudes are not saliently expressed, but remain implicit. One example is gender role attitudes, which are hard to measure using textual data. In this regard, images posted on social media such as Instagram may be better suited to analyze the phenomenon. Existing research has shown that men and women differ in how they portray themselves when being photographed (Goffman 1979, Götz & Becker, 2019, Tortajada et al., 2013). Our study is concerned with the question of how images from social media containing gender self-portrayal can be harnessed as a measure of gender role attitudes.

Methods & Data:

We rely on about 800,000 images collected from Instagram in 2018. We present a new approach to quantify gender portrayal using automated image processing. We use a body pose detection algorithm to identify the 2-dimensional skeletons of persons within images. We then cluster these skeletons based on the similarity of their body pose.
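
The clustering step could look roughly like the sketch below. The abstract names neither the pose detection algorithm nor the clustering method, so everything here is an assumption made for illustration: skeletons are taken to be lists of 2D keypoints, they are normalized for position and size, and a small k-means with a deterministic start stands in for whatever clustering the authors used. The toy skeletons are invented.

```python
from math import sqrt


def normalize(skeleton):
    """Center keypoints on their centroid and scale to unit RMS size,
    so that position and size within the image do not affect similarity."""
    n = len(skeleton)
    cx = sum(x for x, _ in skeleton) / n
    cy = sum(y for _, y in skeleton) / n
    centered = [(x - cx, y - cy) for x, y in skeleton]
    scale = sqrt(sum(x * x + y * y for x, y in centered) / n) or 1.0
    # Flatten to one vector: [x0, y0, x1, y1, ...]
    return [c / scale for point in centered for c in point]


def distance(a, b):
    """Euclidean distance between two normalized pose vectors."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(vectors, k, iterations=20):
    """Tiny k-means; the first k vectors serve as initial centers (assumption)."""
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each pose to its nearest center.
        labels = [
            min(range(k), key=lambda j: distance(v, centers[j])) for v in vectors
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented three-keypoint skeletons: two vertical and two horizontal poses,
# each pair differing only in position and size.
skeletons = [
    [(0, 0), (0, 1), (0, 2)],
    [(0, 0), (1, 0), (2, 0)],
    [(5, 5), (5, 7), (5, 9)],
    [(10, 3), (13, 3), (16, 3)],
]
vectors = [normalize(s) for s in skeletons]
labels = kmeans(vectors, k=2)
print(labels)
```

Normalizing before clustering matters because two people striking the same pose at different distances from the camera would otherwise end up far apart in keypoint space.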


Results:

As a result we obtain a number of clusters which can be identified as gender typical poses. Examples of typical female body poses include S-shaped body poses reflecting sexual appeal, the feminine touch (touching the own body or hair) implying insecurity, or asymmetric body posture representing fragility. Typical male body poses include the upper body facing the camera square to show strength, or a view aimed into the distance signifying pensiveness.

Added Value:

The (self)-portrayal of women and men has been an active field of research across various disciplines including sociology, psychology and media studies, but has usually been analyzed by qualitative means using small, manually labeled data sets. We provide an automated approach that allows for a quantitative measurement of gender role attitudes within pictures by examining gender portrayal via body poses. Our results contribute to a better understanding of online/social media gender reproduction mechanisms.

Practicing Citizenship and Deliberation Online: The Socio-Political Dynamic of Closed Women's Groups on Facebook

Vered Elishar-Malka, Yaron Ariel, Dana Weimann-Saks

Yezreel Valley College, Israel

Relevance & Research Question: The importance of deliberative processes to democracy has been studied for a long time now. As people discuss current issues, share ideas, and try to change their minds in a friendly, open-minded environment, they become active, aware citizens. The flourishing of social networking sites has encouraged scholars to examine their potential contribution to deliberative processes, as they enable an abundance of opportunities to deliberate. The current study examined the inner dynamic of closed Israeli women's groups on Facebook to identify deliberative processes among them.

Methods & Data: A quantitative content analysis (inter-coder reliability = 0.73) was conducted on 1,070 randomly selected posts written during December 2017-January 2018, together with an analysis of the profiles of the original post contributors (including some indicators that measured the posts’ entire threads). All posts derive from a large and well-known closed Israeli women's group on Facebook (with over 100,000 members).

Results: An overwhelming majority of posts (89%) included dialogical elements. Furthermore, in most (94%) of the posts, authors' names, profile pictures, and full Facebook profiles were overt. A positive correlation was found between the level of personal exposure and the depth of discourse that followed the user's initial post (r = .214, p < .001). Although the most popular topics of the posts were health (15%), motherhood (13%), relationships with partners (12%), and sexuality (9%), many posts were dedicated to political issues. In these posts, group members freely discussed current political issues in a non-judgmental environment, opening themselves to other ideas and points of view.

Added Value: This study highlights the vital role that closed women's groups on Facebook may play in their members' lives, not only in social and psychological aspects, but also in the sense of practicing deliberative interactions, and therefore strengthening the vital sense of being empowered citizens.

10:00 - 11:20D 4: Deeper Understanding with Predictive Analytics
Session Chair: Stefan Oglesby, data IQ AG, Switzerland

Opinion Analysis using AI: Live demo

François Erner, Denis Bonnay

respondi SAS, France

Relevance & Research Question

As a way to reduce survey length, or even to replace surveys, we have been involved in passive data (web navigation) collection for a couple of years. Passive data is relevant to the description of online behaviour, but declarative data is still needed to interpret and explain behavior, unless one can directly infer individual attitudes from internet behavior. Due to recent advances in natural language processing, such automated analysis of contents and attitudes is no longer an elusive dream. As an experiment, we have thus used BERT (Google's deep learning based language model) and further proprietary deep learning techniques in order to try and analyze opinions, based on online media consumption.

More precisely, is it possible to instantly, without asking anything, combine passive data and BERT and get a deep understanding of the audience of any website? For example, is it possible to get the specific attitude of visitors to towards ecology? Our talk will be based on the presentation of a prototype of an online tool/dashboard. Its objective will be to share the promises and the challenges of this usage of AI.

Methods and Data

The data we use is based on 7,000 respondents (from France, Germany, UK) who agreed to install tracking software. For 347 days on average, we continuously collected the navigation data (URLs visited and/or apps used) for each of them. Data is analyzed via a properly trained BERT model. Real-time visualization of the results is powered by Tableau.


Results

The quality of results relies on the ability of our neural network to accurately categorize words in a consistent semantic field. Some results are pretty impressive: without having been trained on these particular fields, “Ronaldo” is associated with football and “parenting” is related to family life. But some are disappointing: “psoriasis” is associated with medicine in general (not even with dermatology only). We will discuss these results and will try to explain them.

Added Value

This is a work in progress. At this stage, the main benefit is to present and discuss concrete applications of AI in market research.

Using Google to look into the future

Raphael Kneer

Swarm Market Research AI GmbH, Germany

We were wondering: If we find out how many people have been looking for a specific thing (or basically just words) on the internet in the past, would we be able to calculate their interest in the future, too?

The use of Artificial Intelligence in combination with existing technologies has been repeatedly discussed lately. We were interested in combining AI with traditional trend research and developed Pythia, a tool which forecasts culture and consumer trends. How? By examining Google search data and other sources on new trends, evaluating and structuring them individually. The neural networks analyze huge amounts of data and are trained on search data from the past decade. The trend research tool was created to obtain insights that could be used to find new products, improve them and present them more effectively to have a positive impact on product development by using trend forecasts.

We know what you will be needing to sell and how to interact with your customer in the future.

As of today, Pythia can forecast the latest culture and consumer trends of the next 18 months with 95 percent probability in over 50 countries.

The results and experiences with cooperating companies have supported our initial goal to successfully combine AI with traditional trend research. In an early cooperation with our Co-Founder Rossmann, Pythia suggested “CBD”, "Ingwer Shots" and many more trending topics in Germany. CBD products have been strong performers in their online shop ever since. The tool also proved to be useful for enhancing polls: 10 days prior to the election of the SPD federal chairman, Pythia predicted the correct result.

Want to know what's going to happen within your business? Ask Pythia.

Old but still sexy - Predictive Analytics with Conjoint Analysis

Philipp Fessler

Link Institut, Switzerland

When we talk about predictive analytics, we should not leave aside a method that has been around for what feels like ages (i.e. at times when the term predictive analytics was not even born yet...), but whose predictive power is still one of the best that the market research toolbox has to offer: conjoint analysis. Its value can be seen simply from the fact that it is still one of the most relevant methods of price and product research and is used globally.

In contrast to what is commonly known as predictive analytics, however, conjoint is not based on existing data, but on data collected in decision-making experiments within the framework of surveys.

As an indirect method, it is free of inflation of pretensions and scale effects, and as a reflection of a real decision situation, it is also able to cover behavioural economics effects.

If we assume that there are essentially three variants of predictive analytics (predictive models, descriptive models and decision models), conjoint analysis even includes all three.

But Conjoint not only helps us to develop better products, but can also help to determine the pricing strategy and improve communication and marketing.

11:20 - 11:30Break
11:30 - 11:50GOR Award Ceremony
11:50 - 12:40Keynote 2

Studying Social Interactions and Groups Online

Milena Tsvetkova

London School of Economics and Political Science, United Kingdom

No man is an island and no online user is alone. All human activity is embedded in social context and structure and the rise of social media has made this fact more pertinent to online research. On the one hand, the size and composition of the group individuals interact in, the structure of interactions, and collective or other-based incentives affect individual perceptions, behavior, and outcomes. On the other hand, beyond individual outcomes, group outcomes such as segregation and the unequal distribution of resources matter too. However, analyzing social interactions and groups involves a new set of methodological challenges related to gathering data, reducing data heterogeneity, and addressing the non-independence of observations. In this talk, I will present recent work that uses online surveys, experiments, and digital trace data to study social perception, social interactions, and group effects in contexts as diverse as social media, wikis, online gaming, and crowdsourced contests.

Milena Tsvetkova is an Assistant Professor in the Department of Methodology at the London School of Economics and Political Science. She completed her PhD in Sociology at Cornell University in 2015. Prior to joining LSE, she was a Postdoctoral Researcher in Computational Social Science at the Oxford Internet Institute, University of Oxford. Milena’s research interests lie in the fields of computational and experimental social science. In her research, she uses large-scale web-based social interaction experiments, network analysis of online data, and agent-based modeling to investigate fundamental social phenomena such as cooperation, social contagion, segregation, and inequality. Her work has been sponsored by the US National Science Foundation and Germany’s Volkswagen Foundation, published in high-impact disciplinary and general science journals such as New Media and Society, Nature Scientific Reports, and Science Advances, and covered by The New York Times, The Guardian, and Science, among others.

12:40 - 1:00Break
1:00 - 2:00A 5.1: Recruitment and Nonresponse
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

A Systematic Review of Conceptual Approaches and Empirical Evidence on Probability and Nonprobability Sample Survey Research

Carina Cornesse1, Annelies G. Blom1, David Dutwin2, Jon A. Krosnick3, Edith D. de Leeuw4, Stéphane Legleye5, Josh Pasek6, Darren Pennay7, Benjamin Philipps7, Joseph W. Sakshaug8,1, Bella Struminskaya4, Alexander Wenz1,9

1University of Mannheim, Germany; 2NORC, University of Chicago, United States of America; 3Stanford University, United States of America; 4Utrecht University, The Netherlands; 5INSEE, France; 6University of Michigan, United States of America; 7Social Research Center, ANU, Australia; 8IAB, Germany; 9University of Essex, United Kingdom

Relevance & Research Question: There is an ongoing debate in the survey research literature about whether and when probability and nonprobability sample surveys produce accurate estimates of a larger population. Statistical theory provides a justification for confidence in probability sampling, whereas inferences based on nonprobability sampling are entirely dependent on models for validity. This presentation systematically reviews the current debate and answers the following research question: Are probability sample surveys really (still) more accurate than nonprobability sample surveys?

Methods & Data: To examine the current empirical evidence on the accuracy of probability and nonprobability sample surveys, we collected results from more than 30 published primary research studies that compared around 100 probability and nonprobability sample surveys to external benchmarks. These studies cover results from more than ten years of research into the accuracy of probability and nonprobability sample surveys from across the world. We synthesize the results from these studies, taking into account potential moderator variables.

Results: Overall, the majority of the studies in our research overview found that probability sample surveys were more accurate than nonprobability sample surveys. None of the studies found the opposite. The remaining studies led to mixed results: for example, probability sample surveys were more accurate than some but not all examined nonprobability sample surveys. In addition, the majority of the studies found that weighting did not sufficiently reduce the bias in nonprobability sample surveys. Furthermore, neither the survey mode nor the participation propensity seems to moderate the difference in accuracy between probability and nonprobability sample surveys.

Added Value: Our research overview contributes to the ongoing discussion on probability and nonprobability sample surveys by synthesizing the existing published empirical evidence on this topic. We show that common claims about the rising quality of nonprobability sample surveys for drawing inferences to the general population have little foundation in empirical evidence. Instead, we show that it is still advisable to rely on probability sample surveys when aiming for accurate results.

Introducing the German Emigration and Remigration Panel Study (GERPS): A New and Unique Register-based Push-to-Web Online Panel Covering Individual Consequences of International Migration

Jean Philippe Decieux1, Marcel Erlinghagen1, Lisa Mansfeld1, Nikola Sander2, Andreas Ette2, Nils Witte2, Jean Guedes Auditor2, Norbert Schneider2

1University of Duisburg-Essen, Germany; 2Federal Institute for Population Research, Germany


With the German Emigration and Remigration Panel Study (GERPS) we established a new and unique longitudinal data set to investigate consequences of international migration from a life course perspective. This task is challenging, as internationally mobile individuals are hard to survey for different reasons (e.g. sampling design and approach, contact strategy, panel maintenance).


GERPS is funded by the German Research Foundation (DFG) and surveys internationally mobile German citizens (recently emigrated abroad or recently re-migrated to Germany) in four consecutive waves within a push-to-web online panel design. Based on a probability sample, GERPS elucidates the individual consequences of cross-border mobility and concentrates on representative longitudinal individual data.

Research question

This paper introduces the aim, scope and design of this unique push-to-web online panel study which has the potential for analyzing the individual consequences of international migration along four key dimensions of social inequality: employment and income, well-being and life satisfaction, family and partnership as well as social integration.


We will mainly reflect on the effectiveness of our innovative study design (register-based sampling, contacting individuals all over the world and motivating them to follow a stepwise push-to-web panel approach). Up to now we have successfully conducted two waves (W1: N=12,059; W2: N=7,438) and our 3rd wave is currently in the field. Due to the information available in the population registers, in W1 we had to recruit our respondents postally, aiming to “push” them to a web survey. However, during the following waves we have been able to manage GERPS as an online-only panel.

Added Value

These results can be very helpful to international researchers in the context of surveying mobile populations or researchers aiming to implement a push-to-web survey.

Comparing the participation of Millennials and older age cohorts in the CROss-National Online Survey panel and the German Internet Panel

Melanie Revilla1, Jan K. Höhne2,1

1RECSM-Universitat Pompeu Fabra Barcelona, Spain; 2University of Mannheim, Germany

Relevance & Research Question: Millennials (born between 1982 and 2003) witnessed events during their lives that differentiate them from older age cohorts (Generation X, Boomers, and Silents). Thus, one can also expect that Millennials’ web survey participation differs from that of older cohorts. The goal of this study is to compare Millennials to older cohorts on different aspects that are related to web survey participation: participation rates, break-off rates, smartphone participation rate, survey evaluation, and data quality.

Methods & Data: We use data from two probability-based online panels covering four countries: 1) the CROss-National Online Survey (CRONOS) panel in Estonia, Slovenia, and the UK and 2) the German Internet Panel (GIP). We use descriptive and regression analyses to compare Millennials and older age cohorts regarding participation rates, break-off rates, rates of surveys completed with a smartphone, survey evaluation (using two indicators: rate of difficult surveys and rate of enjoyed/liked surveys) and data quality (using two indicators: rate of non-substantive responses and rate of selecting the first answer category).

Results: We find a significantly lower participation rate for Millennials than for older cohorts and a higher break-off rate for Millennials than for older cohorts in two countries. Smartphone participation is significantly higher for Millennials than for Generation X and Boomers in three countries. Comparing Millennials and Silents, we find that Millennials’ smartphone participation is significantly higher in two countries. There are almost no differences regarding survey evaluation and data quality across age cohorts in the descriptive analyses. However, we find some age cohort effects in the regression analyses. These results suggest that it is important to develop tailored strategies to encourage Millennials’ participation in online panels.

Added Value: While ample research exists that posits age as a potential explanatory variable for survey participation and break-off, only a small portion of this research focuses on online panels and even less considers age cohorts. This study builds on Bosch et al. (2018), testing some of their hypotheses on Millennials and older cohorts, but it also extends their research by testing new hypotheses and addressing some of their methodological limitations.

1:00 - 2:00A 5.2: Push2web and Mixed Mode
Session Chair: Otto Hellwig, respondi AG & DGOF, Germany

Push-to-web Mode Trial for the Childcare and early years survey of parents

Tom Huskinson, Galini Pantelidou

Ipsos MORI, United Kingdom

Relevance & Research Question:

The Department for Education (in England) sought to understand whether survey estimates for the Childcare and early years survey of parents (CEYSP), a random probability face-to-face survey of around 6,000 parents per year, and an Official Statistic, could be collected using a push-to-web methodology.

Methods & Data:

The face-to-face questionnaire was adapted to follow "Mobile First" principles, using cognitive and usability testing with parents. Three features of the push-to-web survey were experimentally manipulated to explore the optimal design: incentivisation (a £5 gift voucher conditional on completion, vs a tote bag enclosed in the invitation mailing, vs no incentive); provision of a leaflet in the invitation mailing (leaflet, vs no leaflet); and survey length (15 vs 20 minutes).

Survey materials were designed following the Tailored Design Method, using an invitation letter, a reminder letter, and a final reminder postcard.


Results:

The overall response rate to the push-to-web survey was 15.2%, which compares with 50.9% for the most recent face-to-face CEYSP. Of the three experimental treatments, only incentivisation had a significant impact on response: the tote bag increased the response rate by 4.4 percentage points vs no incentive, and the £5 gift voucher increased the response rate by 9.3 percentage points vs no incentive.

A comparison of the responding push-to-web sample profile against that of the most recent face-to-face survey found the push-to-web sample to be biased in certain ways. Parents responding to the push-to-web survey were more highly educated, with higher incomes and levels of employment, lived more often in couple (vs lone parent) families, and lived in less deprived areas of the country. The offer of a £5 gift voucher tended to reduce these biases, whereas the provision of the tote bag tended to exacerbate these biases.

Despite these biases, the push-to-web survey produced similar estimates to the most recent face-to-face survey for certain simple, factual questions. However, greater differences arose for questions relating to parents’ attitudes and intentions.

Added Value:

The survey contributes to our understanding of expected response rates to Government-sponsored push-to-web surveys, and the extent and nature of non-response bias in such surveys.

Using responsive survey design to implement a probability-based self-administered mixed-mode survey in Germany

Tobias Gummer, Pablo Christmann, Sascha Verhoeven, Christof Wolf

GESIS Leibniz Institute for the Social Sciences, Germany

Relevance & Research Question: Due to rising nonresponse rates and costs, self-administered modes seem a viable alternative to traditional survey modes. However, when planning such a survey in Germany, we identified a lack of evidence on effective incentive strategies and mode choice sequence. Setting up adequate pre-testing was not viable due to its costs.

Responsive survey designs (RSD) promise a solution by collecting data across multiple phases. Knowledge gained in prior phases of a survey is used to adjust the survey design in later phases to optimize outcomes and efficiency. Yet, there is a research gap on practical applications of RSD and especially on whether RSDs outperform static designs (SD) that do not adjust. We address this research gap by comparing outcomes and costs between an RSD and several SDs.

Methods & Data: We drew on a self-administered mixed-mode survey with an RSD that was conducted as part of the German EVS (N~3,200). In the first phase, incentives (5€ prepaid vs. 10€ postpaid) and mode choice sequence (sequential vs. simultaneous) were experimentally varied (2x2). In the second phase, the survey was conducted in the best performing design (5€ prepaid, simultaneous). Our probability sample was randomized across phases and experimental groups. Based on the experiments, we calculated what response rates, risk of nonresponse bias, and survey costs would have been when using SDs instead of an RSD.

Results: Our RSD helped mitigate risks of design decisions: the response rate was 10 percentage points higher and survey costs were 13 percentage points lower compared to the worst SDs. However, because the RSD included four experimental groups that varied in outcomes, it did not outperform all SDs. The RSD’s response rate was 4 percentage points lower and its costs were 2 percentage points higher compared to the best SDs.

Added Value: Our study adds to the sparse knowledge about the feasibility of running RSDs in practice. We show how RSD can be used to conduct a survey under uncertain outcome conditions. Moreover, we highlight that RSDs face an optimization problem: the learning phases should be kept as small as possible but large enough to gain insights.

The feasibility of moving postal to push-to-web: looking at the impact on response rate, non-response bias and comparability

Laura Thomas, Eileen Irvin, Joanna Barry

Ipsos MORI, United Kingdom

Relevance & Research Question:

In response to declining survey response rates and a focus on increasing inclusivity, the push-to-web mixed-mode methodology is emerging as a high-quality alternative to postal surveys. Through the NHS Adult Inpatient Survey, part of the English NHS Patient Survey Programme owned by the Care Quality Commission, we are conducting a pilot testing the feasibility of moving a postal (paper-only) survey online through push-to-web methods, and the impact on non-response bias. The pilot will provide insight about the comparability of these methods, through testing a classic postal survey approach alongside a sequential push-to-web, mixed-mode approach (involving paper and SMS reminders).

Methods & Data:

Through the NHS Adult Inpatient Survey, a sample of eligible patients were invited to take part in a non-incentivised survey. Patients were randomly assigned to one of three conditions:

1. Control group (n = 5,221) receive three paper mailings with questionnaires included, as in the current survey design.

2. Experimental group 1 (n = 3,480) receive four paper mailings (with a paper questionnaire included in the third and fourth mailings), and an SMS reminder after each mailing without a paper questionnaire.

3. Experimental group 2 (n = 3,480) receive four paper mailings (with a paper questionnaire included only in the third mailing), and an SMS reminder after each mailing without a paper questionnaire.

Analysis will review overall response rate, percentage completing online, representativeness by key demographic groups and responses to key survey questions for each group. This will provide insight into the cost implications and feasibility of maintaining trends following a move to mixed-methods.

Results:

Fieldwork is ongoing and final results will be available in January 2020. However, preliminary results are encouraging and suggest relatively similar response rates between the control and the experimental groups.

Added Value:

Although previous studies have shown the effectiveness of push-to-web approaches, this pilot provides direct comparability between a non-incentivised, multi-mode contact, push-to-web approach and a classic postal approach on a large-scale survey. The pilot will also provide insight into the feasibility of moving a paper survey online and consider the potential impact on trends and cost effectiveness.

1:00 - 2:00B 5: New Types of Data
Session Chair: Florian Keusch, University of Mannheim, Germany

Unlocking new technology – 360-degree images in market research

Evamaria Wittmann

Ipsos, Germany

Relevance and Research Question:

Using 360-degree images in research studies presents a lot of benefits for researchers, clients, as well as consumers: it allows us to present more realistic concepts and products for evaluation – and it gives respondents the ability to examine products and concepts in more detail, and in a more realistic context, which will hopefully increase respondent engagement.

Methods & Data:

In an experimental design, we compared the responses and behavior of respondents exposed to traditional images (i.e. static/front-facing) vs. 360-degree concepts (N=600 completes). We focused on engagement metrics (direct engagement and passive in-survey measures) and measured the possible impact of 360-degree images on the overall survey data.

Results:

We will show that, unsurprisingly, respondents showed a positive reaction to the new way of displaying concepts / products; in particular, we will highlight how engagement measures increased. We will also discuss the impact on data we observed, and we will present our recommendations on whether or not we believe replacing traditional images with 360-degree images would impact benchmarks or trends.

Added Value:

This research examines the impact of the new 360-degree technology on survey data and gives an outlook on how it can be adapted to serve market research needs.

A new experiment on the use of images to answer web survey questions

Oriol J. Bosch1,2, Melanie Revilla2, Daniel Qureshi3, Jan Karem Höhne3,2

1London School of Economics and Political Science, United Kingdom; 2Universitat Pompeu Fabra, Spain; 3University of Mannheim, Germany

Relevance & Research Question: Taking and uploading images may provide richer and more objective information than text-based answers to open-ended survey questions. Thus, recent research started to explore the use of images to answer web survey questions. However, very little is known yet about the use of images to answer web survey questions and its impact on four aspects: break-off, item nonresponse, completion time, and question evaluation. Besides, no research has explored the effect of adding a specific motivational message encouraging participants to upload images, nor of the device used to participate, on these four aspects. This study addresses three research questions: 1. What is the effect of answering web survey questions with images instead of text on these four aspects? 2. What is the effect of including a motivational message on these four aspects? 3. How do PCs and smartphones differ on these four aspects?

Methods & Data: We conducted a web survey experiment (N = 3,043) in Germany using an opt-in access online panel. Our target population was the general population aged 18 to 70 years living in Germany. Half of the sample was required to answer with smartphones and the other half with PCs. Within each device group, respondents were randomly assigned to 1) a control group answering open-ended questions with text, 2) a first treatment group answering open-ended questions with images, and 3) a second treatment group answering with images but prompted with a motivational message.

Results: Overall, results show higher break-off and item nonresponse rates, as well as lower question evaluation for participants answering with images. Motivational messages slightly reduce item nonresponse. Finally, participants completing the survey with a PC present lower break-off rates but higher item nonresponse.

Added Value: To our knowledge, this is the first study that experimentally investigates the impact on break-off, item nonresponse, completion time, and question evaluation of asking respondents to answer open-ended questions with images instead of text. We also go one step further by exploring 1) how motivational messages may improve respondents’ engagement with the survey and 2) the effect of the device used to answer on these four aspects.

Artificial Voices in Human Choices

Carolin Kaiser, René Schallner

Nuremberg Institute for Market Decisions, Germany

Relevance & Research Question:

Today, most currently available voice assistants speak in a non-emotional tone. However, with technology becoming more humanoid, this is about to change. From a marketing perspective, this is especially interesting, as the voice assistant’s emotional tone may affect consumers’ emotions, which play an important role while shopping. For example, happy consumers tend to seek more variety in product choice and are more likely to engage in impulse buying. Against this background, we explore how the tone of a voice assistant impacts consumers’ shopping behavior.

Methods & Data:

We develop a deep learning model to synthesize speech in German with three different emotional tones: excited, happy and uninvolved. Listening tests with two experts, 120 university students, and 224 crowd workers are performed to ensure that people perceive the synthesized emotional tone. Afterwards, we conduct lab experiments, where we ask 210 participants to interact with a prototypical voice shopping interface talking in different emotional tones and we measure their emotion and shopping behavior.

Results:

Listening tests confirmed very good quality of synthesized emotional speech. Experts recognized the emotion category with almost perfect accuracy of 98%, university students with 90%, and crowd workers without any German skills still achieved an accuracy of 71%. The lab experiment shows that the tone of voice impacts participants’ valence and arousal, which in turn impact their trust, product satisfaction, shop satisfaction and impulsiveness of buying.

Added Value:

In human-human-interaction, people often catch the emotion of other people. With the increasing use of voice assistants, the question arises whether people also catch the expressed emotion of voice assistants. Several studies manipulating voices found that the same social mechanisms prevalent in human-human-interactions also exist in human-computer-interactions. However, there is also research showing that people interact differently with computers than humans. For example, they are more likely to accept unfair offers by computers than by humans. Considering the contradicting evidence, this study aims to shed light on emotional contagion in the interaction between voice assistants and consumers. This is especially important since voice assistants may potentially reach and impact a huge number of consumers in contrast to one single human shop assistant.

1:00 - 2:00D 5: UX Research vs Market Research?
Session Chair: Florian Tress, Norstat Group, Germany

The convergence of user research and market research - The best of both worlds?!

Christian Graf1, Thorsten Wilhelm2

1UXessible GbR, Germany; 2eresult GmbH, Germany

The relationship between user research (user research as part of the user experience design) on the one hand and market research on the other hand has been repeatedly discussed lately. While one part of the community tends to emphasize differences, the other one clearly sees overlaps. We were interested in the subjective reality of professionals with market-research background and user experience research background and how members of each group see their contributions to each phase in a standard development process (early idea gathering, conceptualisation, implementation, market entry and operations). In each of those phases different questions must be answered to ensure the success of the to-be-product/service with the customers. The hypothesis was that each group does not regard its contribution at the same level to each phase, but that they complement each other depending on the phase.

As of today, we have collected 37 answers from two groups (user researchers or market researchers) with a qualitative online 10-item questionnaire with open and closed questions. Data collection and processing are ongoing.

The primary results support our assumption. The contributions of the two groups appear to be complementary, i.e. when one group sees its contribution as high, the other group regards its contribution as low, and vice-versa. This might be interpreted as both groups seeing their contributions as very distinct. Nevertheless, the other answers show that both groups share similar methods, where user research is often more qualitative and market research more quantitative, but not exclusively.

From the results, we propose a structured combination of market research methods (often quantitative) and user research methods (often qualitative) depending on the phases in the product development. Based on the findings we urge every product team to ensure an approach with mixed methods (this should be a no-brainer today) and a mixed team of heterogeneous mindsets, i.e. people coming more from market research and people from user research. The results could be interpreted in a different way too: the transition from market researcher to user researcher and vice-versa might only be a question of the mindset. This is a topic for future research.

Do Smartphone app diaries work - for researchers and participants?

Zacharias de Groote

Liveloop GmbH, Germany

Mobile diary apps are one of the latest developments in the field of research diaries. They allow participants to provide spontaneous and in-the-moment feedback with their smartphones. By utilising mobiles for digital qualitative research, diary apps represent the next step in closing the gap between participant and researcher.

By providing the opportunity to gather valuable insights on physical and digital product and service usage, smartphone diary apps are especially promising for user research (as part of UX). They make it possible to identify users’ needs, desires and usage patterns, as well as to collect installation, setup and usage feedback over the course of time.

Like other feedback channels, the response behaviour and input quality of smartphone diaries rely heavily on consumers’ motivation to participate and contribute. Engaging participants in a digital diary without the social glue of a community, with little moderator response and without reactions by community peers to their contributions can be quite challenging, especially in the long run. In fact, there appears to be little evidence on how qualitative smartphone diary studies perform as long-term projects with regard to participant engagement.

We will present first results on participant motivation and satisfaction derived from a long-term user experience smartphone diary study with a duration of more than 12 months. We observed relatively high satisfaction rates with the feedback process and the research mode across the user base, as well as low drop-out and non-response rates over the course of the study.

The results show that it is possible to engage consumers and collect insights over a longer period of time in a digital app diary model – making smartphone diaries an interesting alternative for conducting and accompanying in-home use tests and other product and service research models for market and user experience research alike.

CoCreation in Virtual Worlds for complex questions and technologies

Markus Murtinger

AIT Austrian Institute of Technology & USECON, Austria

One of the most relevant differences between User Experience (UX) research and market research for us is the creative involvement of the participants in the design of the study settings. UX research is usually the beginning of a user centered innovation approach and provides essential inputs to the future design process. CoCreation methods are an essential part of this research phase to collect sticky information, to uncover user needs & ideas and to consider these results in the further creative process.

Co-Creation can be considered to be a subset or contemporary form of Participatory Design (PD) while using tools and techniques that engender people’s creativity, which is in part motivated by a belief in the value of democracy to civic, educational, and commercial settings.

New technologies (such as virtual reality, artificial intelligence, robotics, etc.) make it possible to reduce the cost of carrying out CoCreation and, moreover, they offer easy access for a broad group of users for collaboration. The focus is particularly on virtual and augmented reality technologies for the implementation of these studies, and these technologies provide new possibilities to transform CoCreation into an engaging digital playground for serious collaboration. For example, participants could meet from any point in the world in a virtual workshop setting and work together on topics. Enhanced interaction methods combined with simulation or AI empower non-experts to work on a professional design level for resolving complex challenges. Furthermore, the results of the CoCreation process are immediately available and editable in the virtual world and could be shared on the internet for widespread user involvement.

We will present innovative approaches from ongoing research projects, show how virtual CoCreation methods could be used, and discuss what we can expect from future technologies and possibilities. On the one hand, we will introduce our H2020 Research Project SHOTPROS and the involvement of Law Enforcement Agencies in the user centered design process. On the other hand, we will show ideas and approaches with virtual reality in the domain of architecture and urban planning. In particular, VR for participatory planning offers a completely new medium to walk through virtual worlds, providing a high level of immersion and presence. This will completely change participation: Citizens no longer look at content but become part of the virtual world, which is perceived as real and enables people to interact with and feel connected to the world.

2:00 - 2:10Break
2:10 - 3:20Plenary: Online Data Collection During Times of Corona - A Data Quality Perspective
Session Chair: Bernad Batinic, JKU Linz, Austria
Additional speakers: tba.

Support for COVID-19 research through Global Surveys

Frauke Kreuter1,2, Katherine Morris3

1University of Maryland; 2University of Mannheim; 3Facebook

In this talk, Frauke Kreuter and Katherine Morris will discuss the worldwide COVID-19 Symptom Tracker Survey that Facebook launched in the spring of 2020. The presentation will cover methodological aspects of conducting world-wide rapid surveys, challenges in adapting instruments to different cultural and language contexts, and opportunities to combine the data collected in the COVID-19 Symptom Tracker Survey with other COVID-19 data resources. This presentation will focus on topics such as: (a) why this kind of survey might provide better data quality than the existing snowball samples, (b) what it takes to field a global survey, (c) how to organize data transmission, and (d) envisioned use of the Survey results.

The YouGov COVID-19 Monitor

Lydia Pauly

YouGov, United Kingdom

The YouGov COVID-19 Monitor is one of the first globally syndicated, dedicated data trackers for the pandemic, launched in February 2020. The tracker itself covers 26 countries across MENA, APAC, Europe, and the Americas. The aims of the Monitor are threefold: to provide freely accessible data to academics and health organisations; to inform the public through visualisation tools; and lastly, to provide consumer behaviour and economic recovery data to clients via our own reporting tool, Crunch.

The following presentation will cover the methodology of the YouGov COVID-19 Monitor, looking at the advantages that online research provides, and considerations for the field that this project has highlighted. In particular, the presentation will focus on the issue of data quality by demonstrating three key needs: the need for a maintained, international array of panels; the need for a regular, fast-paced fieldwork cadence via online survey methods; and the need for a centralised team with connections to local researchers in each region. The presentation will also briefly discuss the relationship between data quality and the narratives that can be drawn from international data.

3:20 - 3:30Break
3:30 - 4:30A 6.1: Panels and Data Quality
Session Chair: Bella Struminskaya, Utrecht University, Netherlands, The

Evaluating data quality in the UK probability-based online panel

Olga Maslovskaya1, Gabi Durrant1, Curtis Jessop2

1University of Southampton, United Kingdom; 2NatCen Social Research, United Kingdom

Relevance and Research Question: We live in a digital age with a high level of technology use. Surveys have also started adopting technologies for data collection. There is a move towards online data collection across the world due to falling response rates and pressure to reduce survey costs. Evidence is needed to demonstrate that the online data collection strategy will work and produce reliable data which can be confidently used for policy decisions. No research has been conducted so far to assess data quality in the UK NatCen probability-based online panel. This paper is timely and fills this gap in knowledge. This paper aims to compare data quality in the NatCen probability-based online panel and non-probability-based panels (YouGov, Populus and Panelbase). It also compares the NatCen online panel to the British Social Attitudes (BSA) probability-based survey, on the back of which the NatCen panel was created and which collects data using face-to-face interviews.

Methods and Data: The following surveys are used for the analysis: NatCen online panel, BSA Wave 18 data as well as data from YouGov, Populus and Panelbase non-probability-based online panels.

Various absolute and relative measures of differences will be used for the analysis, such as the mean average difference and the Duncan dissimilarity index, among others. This analysis will help us to investigate how sample quality might impact differences in point estimates between probability and non-probability samples.
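
As an illustration of the kind of measures named above, the following is a minimal sketch, not the authors' implementation. It assumes that the Duncan dissimilarity index is computed as half the sum of absolute differences between two category distributions, and that the "mean average difference" refers to the mean absolute difference across categories. The function names and the example proportions are hypothetical.

```python
def dissimilarity_index(p, q):
    """Duncan dissimilarity index: half the sum of absolute differences
    between two distributions over the same categories (0 = identical)."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same categories")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mean_absolute_difference(p, q):
    """Mean absolute difference between category proportions
    (an assumed reading of the 'mean average difference')."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same categories")
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


# Hypothetical proportions for one three-category survey item.
benchmark = [0.20, 0.30, 0.50]  # e.g. a face-to-face benchmark survey
panel = [0.25, 0.35, 0.40]      # e.g. an online panel

print(round(dissimilarity_index(benchmark, panel), 3))
print(round(mean_absolute_difference(benchmark, panel), 3))
```

A lower value on either measure would indicate that the panel distribution is closer to the benchmark for that item.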

Results: The preliminary results suggest that there are differences in point estimates between probability- and non-probability-based samples.

Added value: This paper compares data quality between a “gold standard” probability-based survey which collects data using face-to-face interviewing, a probability-based online panel, and non-probability-based online panels. Recommendations will be provided for future waves of data collection and new probability-based as well as non-probability-based online panels.

Building 'Public Voice', a new random sample panel in the UK

Joel Williams

Kantar, United Kingdom

Relevance & Research Question:

The purpose of this paper is to describe the building of a new random sample mixed-mode panel in the UK ('Public Voice'), focusing on its various design features and how each component influenced the final composition of the panel.

Methods & Data:

The Public Voice panel has been built via a combination of two recruitment methods: (i) face-to-face interviewing, and (ii) web/paper surveying. So far as possible, measurement methods have been unified, including the use of a self-completion section within the face-to-face interview for collecting initial opinion and (potentially) sensitive data. The same address sample frame was used for both methods. For this initial phase, the objective was to recruit to the panel c.2,400 individuals, split evenly by method.

Results:

The response rates to the two recruitment survey methods were aligned with expectations (c.40% for the interview survey, c.8% for the web/paper survey) as were the observable biases. Presenting the panel up front (an experimental manipulation) did not lower the web/paper recruitment survey response rate compared to introducing it at the end of the survey. Respondent agreement to join the panel was much higher than expected in the web/paper survey (>90%). Contact details were of generally high quality in the face-to-face and web modes but less so in the paper mode. [More results to come]

Added Value:

This paper adds to the evidence base for what works when building survey panels with a probabilistic sample base. In particular, the use of a dual-design recruitment method is novel.

Predictors of Mode Choice in a Probability-based Mixed-Mode Panel

David Bretschi, Bernd Weiß

GESIS Leibniz Institute for the Social Sciences, Germany

Relevance & Research Question: Even with a growing number of Internet users in Germany, a substantial proportion of respondents with Internet access still choose to participate in the mail mode when given a choice. We know little about the characteristics of those reluctant respondents, as most survey designs do not allow measuring potential predictors of the mode choice process before individuals make a decision. This study aims to fill this gap by investigating which personal characteristics of respondents in a mixed-mode panel are related to their willingness to respond via the web mode.

Methods & Data: We use data from multiple waves of the GESIS Panel, a probability-based mixed-mode panel in Germany (N=5,700). In October/November 2018, a web-push intervention motivated around 20 percent of 1,896 panelists previously using the mail mode to complete the survey via the web mode. We measured potential predictors of mode choice in waves before the intervention. These predictors include indicators of web-skills, web usage, attitudes to the Internet, and privacy concerns. Our study design allows us to investigate how those predictors are associated with mode choice of panelists who switched to the web and those who refused to do so.

Results: Preliminary results suggest that web-skills and web usage are important predictors of mode choice. In contrast, general privacy concerns do not seem to affect the decision to respond via the web mode, but attitudes towards the Internet do.

Added Value: This study will provide new insights into how the characteristics of respondents predict their decision to participate in web surveys. Learning more about the mode choice process and response propensities of web surveys is important to develop effective web-push methods for cross-sectional and longitudinal studies.

3:30 - 4:30A 6.2: Cognitive Processes
Session Chair: Otto Hellwig, respondi AG & DGOF, Germany

Using survey design to encourage honesty in online surveys

Steve Wigmore, Jon Puleston

Kantar, United Kingdom

Relevance & Research Question:

There can be multiple reasons why data collected in online surveys may differ from the “truth”. Surveys which do not collect data from smartphones, for example, will include bias from a skewed sample that does not reflect the modern world. The way that individual questions are asked may be subject to inherent biases, and some respondents may find the survey experience itself frustrating or confusing, which will impact their willingness to answer truthfully.

Methods & Data:

This paper will discuss key psychological motivations for respondents to answer surveys truthfully even when this requires them to make more of an effort for the same financial incentive. What drives individuals to tell the truth, and how can survey design help to reward such honesty? We will look at a number of questioning techniques that reflect real-life decision making and make it easier for respondents to answer truthfully. Conversely, we will also examine methods for validating data to reduce overclaim from aspirational respondents.

Results:

By conducting a number of research-on-research surveys on the Kantar panel we have seen the direct impact of asking questions across a range of subjects and countries to encourage honesty in data collection and also to validate or trap respondents who are prepared to answer dishonestly. We will present the results of this research and provide some key learnings which can be used directly in online questionnaires.

Added Value:

Many research companies and end-clients use the results of online research as an important part of their insight generation process or tracking studies. By using the techniques that will be presented in this paper, they can be assured that they will be collecting higher-quality and more honest responses from more engaged respondents. This is something that we would encourage anyone involved in the design of online surveys to take into consideration.

What Is Gained by Asking Retrospective Probes after an Online, Think-Aloud Cognitive Interview

William Paul Mockovak

U.S. Bureau of Labor Statistics, United States of America

Relevance & Research Question: Researchers have conducted cognitive testing online through the use of web-based probing. However, Lenzner and Neuert (2017) mention that, of several possible cognitive interviewing techniques, they applied only one technique: verbal probing. They also suggest that given the technical feasibility of creating an audio and screen recording of a web respondent’s answering process, future studies should look into whether web respondents can be motivated to perform think-aloud tasks while answering an online questionnaire. Using an online instrument to guide the process, this study demonstrated that unmoderated, think-aloud cognitive interviewing could be successfully conducted online, and that the use of retrospective probes after the think-aloud portion was completed resulted in additional insights.

Methods & Data: Think-aloud cognitive interviewing, immediately followed by the use of retrospective web-based probing, was conducted online using a commercially available online testing platform and separate software for displaying survey instructions and questions. Twenty-five participants tested 9 questions dealing with the cognitive demands of occupations. Videos lasting a maximum of 20 minutes captured screen activity and each test participant’s think-aloud narration. A trained coder used the video recordings to code the think-aloud narration and participants’ answers to the retrospective web-based probing questions.

Results: 25 cognitive interviews were successfully conducted. A total of 41 potential problems were uncovered, with 78% (32) identified in the think-aloud section, and an additional 22% (9) identified in the retrospective, web-based probing section. The types of problems identified dealt mostly with comprehension and response-selection issues. Findings agreed with results from a field test of the interviewer-administered questions, with findings from both studies used to revise the survey questions.

Added Value: A think-aloud online test proved successful at identifying problems with survey questions. Moreover, it was easier, faster, and less expensive to conduct the online think-aloud testing and retrospective web-based probing. Online and field testing yielded similar results. However, online testing had the advantage that respondent problems could be shared using videos. And online results had the additional advantage of providing clearer examples of respondent problems, which were then available for use as examples in interviewer training and manuals.

Investigating the impact of violations of the left and top means first heuristic on response behavior and data quality in a probability-based online panel

Jan Karem Höhne1,2, Ting Yan3

1University of Mannheim, Germany; 2RECSM-Universitat Pompeu Fabra, Spain; 3Westat, United States of America

Relevance & Research Question: Online surveys are an established data collection mode that uses written language to provide information. The written language is accompanied by visual elements, such as presentation forms and shapes. However, research has shown that visual elements influence response behavior because respondents sometimes use interpretive heuristics to make sense of the visual elements. One such heuristic is the “left and top means first” (LTMF) heuristic, which suggests that respondents tend to expect that a response scale consistently runs from left to right or from top to bottom.

Methods & Data: In this study, we build on the experiment on “order of the response options” by Tourangeau, Couper, and Conrad (2004) and extend it by investigating the consequences for response behavior and data quality when response scales violate the LTMF heuristic. We conducted an experiment in the probability-based German Internet Panel in July 2019 and randomly assigned respondents to one of the following two groups: the first group (n = 2,346) received options that followed in a consistent order (agree strongly, agree, it depends, disagree, disagree strongly). The second group (n = 2,341) received options that followed in an inconsistent order (it depends, agree strongly, disagree strongly, agree, disagree).

Results: The results reveal significantly different response distributions between the two experimental groups. We also found that inconsistently ordered response options significantly increase response times and decrease data quality in terms of criterion validity. These findings indicate that order discrepancies confuse respondents and increase the overall response effort in terms of response times. They also affect response distributions, reducing data quality.

Added Value: We recommend presenting response options in a consistent order and in line with the design strategies of the LTMF heuristic. Otherwise, this may affect the outcomes of survey measures and thus the conclusions that are drawn from these measures.

3:30 - 4:30A 6.3: Attrition and Response
Session Chair: Florian Keusch, University of Mannheim, Germany

Personalizing Interventions with Machine Learning to Reduce Panel Attrition

Alexander Wenz1,2, Annelies G. Blom1, Ulrich Krieger1, Marina Fikel1

1University of Mannheim, Germany; 2University of Essex, United Kingdom

Relevance & Research Question: This study compares the effectiveness of individually targeted and standardized interventions in reducing panel attrition. We propose the use of machine learning to identify sample members with high risk of attrition and to target interventions on an individual level. Attrition is a major concern in longitudinal surveys since it can affect the precision and bias of survey estimates and costs. Various efforts have been made to reduce attrition, such as using different contact protocols or incentives. Most often, these approaches have been standardized, treating all sample members in the same way. More recently, this standardization has been challenged in favor of survey designs in which features are targeted to different sample members. Our research question is: Can personalized interventions make survey operations more effective?

Methods & Data: We use data from the German Internet Panel, a probability-based online panel of the general population in Germany, which interviews respondents every two months. They receive study invitations via email and a 4€ incentive per survey completed. To evaluate the effectiveness of different interventions on attrition, we implemented an experiment in 2018 using a standardized procedure. N = 4,710 sample members were randomly allocated to one of three experimental groups, and within each group were treated in the same way: Group 1 received an additional 10€ incentive, Group 2 received an additional postcard invitation while Group 3 served as control group.

Results: Preliminary results suggest that the standardized interventions were only effective for sample members interviewed for the first time (the postcard significantly reduced the attrition rate by 3 percentage points; the incentive had no effect), but not for those in subsequent waves. In a further analysis, we conduct a counterfactual simulation investigating the effect of these interventions if 1) only people with high attrition propensities were targeted, and 2) these people received the treatment that was predicted to be most effective for them.
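
The two targeting conditions of the counterfactual simulation can be illustrated with a minimal sketch. It is not the authors' model: the function name `choose_intervention`, the 0.5 propensity threshold, and all predicted probabilities below are hypothetical, and the predictions are assumed to come from some previously fitted attrition model.

```python
def choose_intervention(predicted_attrition, threshold=0.5):
    """Pick an intervention for one panel member.

    predicted_attrition maps each option ("control", "incentive", "postcard")
    to a hypothetical predicted attrition probability under that option.
    """
    # Condition 1: only members with a high attrition propensity are targeted.
    if predicted_attrition["control"] < threshold:
        return "control"
    # Condition 2: targeted members get the option with the lowest
    # predicted attrition probability (which may still be "control").
    return min(predicted_attrition, key=predicted_attrition.get)


# Hypothetical predictions for three panel members.
low_risk = {"control": 0.10, "incentive": 0.08, "postcard": 0.09}
high_risk = {"control": 0.60, "incentive": 0.55, "postcard": 0.40}
no_gain = {"control": 0.70, "incentive": 0.75, "postcard": 0.72}

decisions = [choose_intervention(m) for m in (low_risk, high_risk, no_gain)]
print(decisions)
```

In this sketch a member below the threshold is never treated, even if a treatment would lower the predicted probability slightly; whether that trade-off is sensible depends on the cost-benefit framework used in fieldwork.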

Added Value: This study provides novel evidence on the effectiveness of using personalized interventions in reducing attrition. In 2020, we will develop prescriptive models in addition to the predictive models for actually targeting panel members during fieldwork under a cost-benefit framework.

Now, later, or never? Using response time patterns to predict panel attrition

Isabella Luise Minderop, Bernd Weiß

GESIS Leibniz Institute for the Social Sciences, Germany

Relevance & Research Question:

Keeping respondents who have a high likelihood to attrite from a panel in the sample is a central task for (online) probability panel data infrastructures. This is especially important when respondents at risk of dropping out are notably different from other respondents. Hence, it is key to identify those respondents and prevent them from dropping out. Previous research has shown that response behavior in previous waves, e.g., response or nonresponse, is a good predictor of next wave’s response. However, response behavior can be described in more detail, by, for example, taking paradata such as time until survey return into account. Until now, time until survey return has mostly been researched in cross-sectional contexts, which offer no opportunity to study panel attrition. In this innovative study, we investigate whether (a) respondents who return their survey late more often than others and (b) respondents who show changes in their response behavior over time are more likely to attrite from a panel survey.

Methods & Data:

Our study relies on data from the GESIS Panel which is a German bi-monthly probability-based mixed-mode panel (n = 5,000). The GESIS Panel includes data collected in web and mail mode. We calculated the days respondents required to return the survey from online and postal time stamps. Based on this information, we distinguish early, late and nonresponse. Further, we identify individual response patterns by combining this information across multiple waves. We calculated the relative frequency of late responses and the changes in a response pattern.
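
The derivation of the two predictors described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 7-day cutoff for a "late" return, the function names, and the example dates are hypothetical, and the share of late responses is computed here relative to the waves with a response.

```python
from datetime import date

# Hypothetical cutoff: a return after more than this many days counts as late.
LATE_AFTER_DAYS = 7


def classify(invited, returned, late_after_days=LATE_AFTER_DAYS):
    """Label one wave as 'early', 'late', or 'nonresponse'."""
    if returned is None:
        return "nonresponse"
    days_to_return = (returned - invited).days
    return "late" if days_to_return > late_after_days else "early"


def pattern_features(labels):
    """Summarize one respondent's response pattern across waves."""
    responded = [label for label in labels if label != "nonresponse"]
    # Relative frequency of late responses among waves with a response.
    late_share = (
        sum(label == "late" for label in responded) / len(responded)
        if responded
        else 0.0
    )
    # Number of wave-to-wave changes in the response label.
    changes = sum(a != b for a, b in zip(labels, labels[1:]))
    return {"late_share": late_share, "changes": changes}


# Hypothetical (invitation date, return date) pairs for one respondent.
waves = [
    (date(2020, 1, 1), date(2020, 1, 3)),
    (date(2020, 3, 1), date(2020, 3, 20)),
    (date(2020, 5, 1), None),
    (date(2020, 7, 1), date(2020, 7, 2)),
]
labels = [classify(invited, returned) for invited, returned in waves]
features = pattern_features(labels)
print(labels, features)
```

Features of this kind could then enter an attrition model as predictors alongside, or instead of, survey-evaluation measures.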


Results:

Preliminary results show that the likelihood to attrite increases by 0.16 percentage points for respondents who always return their survey late compared to those who always reply early. Further, respondents who change their response timing each wave are 0.43 percentage points more likely to attrite.

Added Value:

The time until survey return is easily available paradata. We show that the frequency of late responses as well as the changes in response time patterns predict attrition just as well as previously used models that include survey evaluation or available time, measures which might not always be available.

A unique panel for unique people. How gamification has helped us to make our online panel future-proof

Conny Ifill, Robin Setzer

Norstat Deutschland GmbH, Germany

Relevance & Research Question: For many years, online panels have been struggling with ever lower response rates and, on average, shorter membership durations. The responses to this threatening challenge are manifold. Simply put, panels either have to lower their quality standards to sustain a high recruitment volume or they have to increase the loyalty and activity rate of their members, who are then costlier to recruit. We decided to invest in the longevity of our member base by relaunching our panel in 18 European countries and introducing game mechanics to our panelists.

Methods & Data: We strictly followed a research-based process to identify the motivations and pain points of our panel members. With the help of focus groups and iterative user testing, we successively developed a panelist-centric platform that included a new visual design, new functions for the user, and game mechanics to better engage our members. An integral part of the whole project was (and still is) accompanying research. Among the KPIs we continuously monitor over time are panel composition (i.e. demographics), panel performance (e.g. churn rate, response rate) and panelist satisfaction.

Results: Our first results are very promising. We see that all target groups increased their activity and loyalty levels. To our satisfaction, hard-to-reach segments in particular (i.e. young men) experienced a significant boost. As a result, our panel has become more balanced and performs better than before.

The evaluation of this transition is ongoing, especially as we are still introducing new features and making smaller adjustments to existing functions. We are planning to share the current status of this long-term project with the audience of the conference.

Added Value: While comparability of data is a very high value in research, the dynamic nature of digitalization requires us to adapt the method from time to time. Our case shows that research methodology can evolve without compromising its quality standards. We believe that this is partly because the whole process was based on and accompanied by research.

4:30 - 5:00Virtual Farewell Drinks

Conference: GOR 20