B07: Opportunities and Challenges of Digitalization
The Variable Harmonization Hub: A case study in Big Data and digital documentation
1GESIS - Leibniz Institute for the Social Sciences, Germany; 2Heinrich-Heine-Universität Düsseldorf
Relevance & Research Question: To adhere to the scientific method, researchers combining many datasets must document the harmonization of variables in a clear, transparent and replicable way. At GOR 2018, we presented the NRW-Innovativ projects, which filled a lacuna in communication studies by creating a harmonized dataset of media use in Germany (since 1954) from the Media-Analysis-Data; the complex process of large-scale variable harmonization; and how the GESIS software CharmStats Pro assisted in documenting the study, question and variable metadata required for full documentation and in generating recode commands and complete reporting documents. This year we will showcase a pre-release of the EU-funded Variable Harmonization Hub (Hub) website, an online repository for storing, searching and downloading digital data harmonization documentation, using the Media-Analysis-Data as a use case.
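The recode commands that documentation tools such as CharmStats Pro generate can be pictured as an explicit mapping from each source variable's codes to a harmonized target scheme. A minimal sketch in Python, with entirely hypothetical category codings (not the actual Media-Analysis-Data schemes):

```python
# Hypothetical recode tables: two survey waves code "TV use frequency"
# differently; both are mapped onto one harmonized target scheme.
# None marks codes treated as missing after harmonization.
RECODE_WAVE_A = {1: "daily", 2: "daily", 3: "weekly", 4: "rarely", 9: None}
RECODE_WAVE_B = {1: "daily", 2: "weekly", 3: "weekly", 4: "rarely", 8: None}

def harmonize(values, mapping):
    """Map source codes to harmonized target categories.

    Codes absent from the mapping also become None (missing), so the
    recode table doubles as documentation of every valid source code.
    """
    return [mapping.get(v) for v in values]

wave_a_harmonized = harmonize([1, 3, 9], RECODE_WAVE_A)
wave_b_harmonized = harmonize([2, 4], RECODE_WAVE_B)
```

Keeping the mapping as an explicit data structure, rather than ad hoc recodes buried in analysis scripts, is what makes the harmonization step transparent and replicable.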
Methods & Data: We will review the theoretical and practical influence that the digital harmonization software CharmStats Pro had on the design of the Hub, how the search and download features work, and the submission and acceptance process for publishing a project (with a DOI). The presentation will include a live demonstration of the Hub using the Media-Analysis-Data as a use case.
Results: The purpose of the Hub is to be an online, digital repository for social science harmonization documentation for long-term preservation, and to make the sometimes complex process of data harmonization both transparent and replicable.
Added Value: The methodological approach of this use case demonstrates big data digital documentation and variable harmonization for research.
Data Literacy in the Age of Insight Democratization
Norstat Deutschland GmbH, Germany
Relevance & Research Question:
The age of data democratization is characterized by ever more people having access to data, charts and dashboards. However, it is not always clear whether these people have the skills required to draw the right conclusions from them. Consequently, companies run the risk of employees drawing wrong conclusions and making bad business decisions. Our study assesses the extent of data illiteracy and aims to identify best practices for sharing data with broad and potentially unqualified audiences.
Methods & Data:
We conducted an online survey among 800 German white-collar professionals. Our study covered four areas: In the first part, we collected baseline figures for the verbal expression of statistical probabilities. We then asked respondents to evaluate different method descriptions and estimate the data quality of the corresponding studies. The third part consisted of a monadic test design to evaluate the readability of different chart layouts. In the last part, respondents could give feedback on the comprehensibility of different dashboard design patterns.
Results:
We have not yet analysed all the data, but first insights indicate that many people feel uncomfortable interpreting charts and have little knowledge of methodology and survey quality. Against this backdrop, many of them draw factually incorrect conclusions or feel uncertain about their interpretations. However, we also identified some design patterns that improved the accuracy of, and confidence in, the conclusions common users draw on their own. A thorough analysis of the mostly open-ended feedback has yet to be made, and we expect to identify further best-practice principles.
Added Value:
The democratization of insights is a trend that is unlikely to be stopped. However, we can shape it and help improve the readability of charts. In a broader sense, this may also help make research more centered on the various users of data and insights.
Using Publicly Available Data to Examine Potential Cultural Influence on Concussion Risk in American Football Players
Northern Arizona University, United States of America
Relevance & Research Question: Although concussion risk is widely understood to be a serious health threat, the rate at which American football players report potential concussion symptoms remains low. Low reporting is attributed to a "football culture" that reinforces attitudes and behaviors associated with non-reporting. We hypothesized that football programs more invested in "football culture" would have fewer concussion reports during the 2017 season. Given the difficulty of assessing cultural influences with traditional self-report measures, we examined whether "program investment and success" (RISC) could be measured using several sources of publicly available data. We hypothesized that RISC would be inversely related to concussion reports.
Methods & Data: Data were collected for all 130 NCAA Division I football programs between September 2017 and January 2018. We identified eight indicators that we theorized would reflect historical success (historic win-loss record, alumni going on to NFL), fan engagement (Twitter, Instagram, and Facebook followers), and financial investment (stadium size, coach salary, ticket price). We also collected 2017 success (win-loss record) and injury data (statfox.com).
Results: Some football program data were publicly unavailable, and multiple imputation (MI) methods were used to replace missing data. RISC indicators were standardized and found to load significantly onto a single hypothesized factor (loadings: .68 to .88). Items were combined to create a factor score called RISC, which was reliable (α = .92) and normally distributed. Although there was a small effect, RISC was not significantly associated with concussion injury (r = .15, p < .09). RISC was positively related to head and neck injury (r = .30, p < .001) and to knee injury (r = .28, p < .001).
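The scale-construction step described here (standardize the indicators, combine them into a composite, check internal consistency) can be sketched as follows. The simulated data, loadings, and sample size are hypothetical stand-ins, not the study's actual figures:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_obs, n_items) matrix of scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
# Simulate 130 programs x 8 indicators driven by one latent factor
# (purely illustrative data; loadings do not match the study's).
latent = rng.normal(size=(130, 1))
indicators = 0.8 * latent + 0.4 * rng.normal(size=(130, 8))

# Standardize each indicator, then average into a composite score.
z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0, ddof=1)
risc_score = z.mean(axis=1)
alpha = cronbach_alpha(z)
```

With indicators this strongly driven by a single latent factor, alpha comes out high, mirroring the logic behind reporting a single reliable RISC composite.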
Added Value: Results suggest that RISC may be associated with players' increased risk of head and neck injury. The relatively weaker association between RISC and concussion injury is of interest and the subject of follow-up research. We demonstrate the advantages and disadvantages of using publicly available data to provide insight into concussion risk factors attributable to larger cultural and organizational influences that are typically difficult to measure directly.