Conference Agenda

Session A 3.2: Scales and Questions
Time: Thursday, 10/Sep/2020, 3:30 - 4:30

Session Chair: Florian Keusch, University of Mannheim, Germany

Presentations

Measuring income (in)equality: comparing questions with unipolar and bipolar scales in a probability-based online panel

Jan Karem Höhne1,2, Dagmar Krebs3, Steffen Kühnel4

1University of Mannheim, Germany; 2RECSM-Universitat Pompeu Fabra, Spain; 3University of Gießen, Germany; 4University of Göttingen, Germany

Relevance & Research Question: In social science research, questions with unipolar and bipolar scales are commonly used to measure respondents’ attitudes and opinions. Compared to other rating scale characteristics, such as scale direction and length, scale polarity (unipolar vs. bipolar) and its effects on response behavior have rarely been addressed in previous research. To fill this gap in the survey literature, we investigate whether and to what extent fully verbalized unipolar and bipolar scales influence response behavior by analyzing observed and latent response distributions and latent thresholds of response categories.

Methods & Data: For this purpose, we conducted a survey experiment in the probability-based German Internet Panel (N = 2,427) in March 2019 and randomly assigned respondents to one of the following two groups: the first group received four questions on income (in)equality with a five-point, fully verbalized unipolar scale (i.e., agree strongly, agree somewhat, agree moderately, agree hardly, agree not at all). The second group received the same four questions on income (in)equality with a five-point, fully verbalized bipolar scale (i.e., agree strongly, agree somewhat, neither agree nor disagree, disagree somewhat, disagree strongly).
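
The following is a minimal, illustrative sketch (not the authors’ actual analysis) of how observed response distributions from the two randomly assigned groups could be compared for a single income (in)equality item. The category counts are hypothetical, and a chi-square test of independence stands in for the observed-level comparison; the latent-level analysis of thresholds and measurement invariance would require ordinal measurement models not shown here.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical category counts for one income (in)equality item.
# Rows: unipolar vs. bipolar group; columns: the five response categories.
observed = np.array([
    [310, 420, 250, 150, 80],  # unipolar group (illustrative numbers)
    [290, 380, 310, 160, 77],  # bipolar group (illustrative numbers)
])

# Chi-square test of independence: do the observed response
# distributions differ between the two scale versions?
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```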

Results: The results reveal substantial differences between the two rating scales. They show significantly different response distributions and measurement non-invariance. In addition, the response categories (and latent thresholds) of unipolar and bipolar scales are not equally distributed. The findings show that responses to questions with unipolar and bipolar scales differ not only at the observed level, but also at the latent level.

Added Value: The two rating scales differ with respect to their measurement properties, so that responses obtained with one scale are not easily comparable to responses obtained with the other. We therefore recommend not treating unipolar and bipolar scales as interchangeable.



Designing Grid Questions in Smartphone Surveys: A Review of Current Practice and Data Quality Implications

Gregor Čehovin, Nejc Berzelak

University of Ljubljana, Slovenia

Relevance & Research Question: Designing grid questions for smartphone surveys is challenging due to their complexity and the potential increase in response burden. This paper comprehensively reviews findings from scientific studies on several data quality and response behavior indicators for grid questions in smartphone web surveys: satisficing, missing data, social desirability, measurement quality, multitasking, response times, subjective survey evaluation, and comparability between devices. This framework is used to discuss different grid question design approaches and their data quality implications.

Methods & Data: Experimental studies investigating grids in smartphone surveys were identified using the DiKUL bibliographic harvester, which covers over 135 bibliographic databases. The search string “‘mobile web survey’ AND (grid OR scale OR matrix OR table)” returned 55 results. After full-text evaluation, 35 papers published in English between 2012 and 2018 were found eligible for extraction of findings regarding the eight groups of data quality and response behavior indicators.

Results: Grid questions tend to increase self-reported burden and satisficing behavior. The incidence of missing data increases with the number of items per page and per grid. Comparisons between smartphones and PCs yield largely mixed results. While completion times are decidedly longer on smartphones, the grid format has little to no effect on response times compared to item-by-item presentation. Differences in satisficing are modest, and observations about the relationship between missing data and device type are mixed. No effects of device type on socially desirable responding were detected, while differences in measurement quality are mostly limited to worse input accuracy and biased estimates on smartphones due to noncoverage and nonresponse error. Findings about differences in multitasking are likewise mixed.

Added Value: Data quality remains a salient issue in web surveys, particularly with respect to the visual syntax that defines how survey questions are designed and displayed on different devices. This review of current practice offers insights into the data quality implications of design principles for grid questions in smartphone web surveys and their comparability to web questionnaires on PCs. The critical elaboration of findings also provides guidance for future experimental research and usability evaluation of web questionnaires.



The effects of forced choice, soft prompt and no prompt option on data quality in web surveys - Results of a methodological experiment

Johannes Lemcke, Stefan Albrecht, Sophie Schertell, Matthias Wetzstein

Robert Koch Institut, Germany

Relevance & Research Question: In survey research, item nonresponse is regarded as an important criterion for data quality, alongside other quality indicators such as the breakoff rate and straightlining (Blasius & Thiessen, 2012). This is because, as with unit nonresponse, persons who do not answer a specific item can systematically differ from those who do. In online surveys, this threat can be countered by prompting respondents after item nonresponse. In this context, prompting means a friendly reminder displayed to the respondent, inviting them to provide an answer. If the respondent still does not want to answer, they can move on in the questionnaire (soft prompt). The forced-choice option, however, requires a response to every item.
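
As a minimal sketch of the three conditions described above, the page-advance logic of a web questionnaire could look roughly as follows; the function and class names are hypothetical and do not refer to the survey software actually used in the experiment.

```python
from enum import Enum

class PromptMode(Enum):
    NO_PROMPT = "no_prompt"          # missing answers are silently accepted
    SOFT_PROMPT = "soft_prompt"      # remind once, then allow skipping
    FORCED_CHOICE = "forced_choice"  # block until every item is answered

def can_advance(answers: dict, mode: PromptMode, prompt_shown: bool = False):
    """Return (allowed, message) for a page-advance attempt."""
    missing = [item for item, value in answers.items() if value is None]
    if not missing:
        return True, None
    if mode is PromptMode.NO_PROMPT:
        return True, None
    if mode is PromptMode.SOFT_PROMPT:
        if prompt_shown:  # respondent declined after the reminder
            return True, None
        return False, f"You did not answer: {', '.join(missing)}. Please consider answering."
    # FORCED_CHOICE: no way forward without a response to every item.
    return False, f"An answer is required for: {', '.join(missing)}."
```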

There is still a research gap regarding the effects of prompting and forced-choice options on data quality in web surveys. Tourangeau (2013) also comes to the following conclusion: ‘More research is needed, especially on the potential trade-off between missing data and the quality of the responses when answers are required’.

Methods & Data: We conducted a methodological experiment using a non-probability sample recruited via a social network platform in January 2019. To test the different prompting options, we implemented three experimental groups (forced choice, soft prompt, no prompt; total N = 1,200). Besides the item nonresponse rate, we used the following data quality indicators: breakoff rate, straightlining behavior, completion time, tendency to give socially desirable answers, and self-reported interest.
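
As a minimal sketch of how two of these indicators could be computed from a respondent-by-item matrix: the data and column names below are hypothetical, and the operationalizations (share of unanswered items per respondent; identical answers across all answered items of a grid) are simplified illustrations rather than the authors’ exact definitions.

```python
import pandas as pd

# Hypothetical respondent-by-item matrix; NaN marks item nonresponse.
responses = pd.DataFrame({
    "q1": [1, 3, None, 5],
    "q2": [1, 3, 2, None],
    "q3": [1, 4, 2, 5],
})

# Item nonresponse rate per respondent: share of unanswered items.
item_nonresponse_rate = responses.isna().mean(axis=1)

# Straightlining: identical answers across all answered grid items.
straightlining = responses.nunique(axis=1, dropna=True) == 1

print(item_nonresponse_rate)
print(straightlining)
```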

Results: The results show a higher breakoff rate for specific questions where forced choice was applied. Furthermore, a higher item nonresponse rate was found for the no-prompt option. Overall, only small effects were found. Final results on the different effects on data quality will be presented at the conference.

Added Value: We found little to no evidence of an impact of prompt options on data quality. However, the soft prompt tends to lead to lower item nonresponse than the no-prompt option.