Conference Agenda

Session Overview
B02: Text Mining and NLP
Thursday, 07/Mar/2019: 10:45 - 11:45

Session Chair: Florian Keusch, Universität Mannheim & DGOF, Germany
Location: Room 158
TH Köln – University of Applied Sciences


Towards the Human-Machine-Symbiosis: Artificial Intelligence as a Support for Natural Language Clustering

Marc Egger, André Lang

Insius, Germany

Over the past years, the amount of available natural language and speech data has risen tremendously. People use social media channels to express their opinions in natural language text or interact with digital assistants like Amazon Alexa or Apple Siri via their voice. Most recently, market researchers have also proposed novel survey designs in which participants can provide answers via natural speech interaction. This opens up a novel and vast data source from which to draw rich insights from the unfiltered voice of consumers and participants. However, the volume of available natural language data, the velocity of its creation, and the required depth of insight all call for novel analysis methodologies. On the one hand, “human-powered” qualitative analysis methods offer deep insights but require tremendous manual effort and are thus hardly applicable in big data scenarios. On the other hand, automated “machine-powered” methods (e.g. using natural language processing) can be applied in big data scenarios but offer only shallow insights due to the complexity of human language.

Considering Licklider’s (1960) vision of human-machine symbiosis, we call for the construction of novel analysis methodologies in which human and machine work hand in hand to perform better in cooperation than individually. The research at hand therefore proposes a novel human-machine cooperative methodology to derive the most important topics from large text collections by semi-automated clustering of natural language concepts.

We apply our methodology to a data set of ~20,000 texts and describe a software artifact to illustrate the research process. The artifact implements the method and allows researchers to cluster automatically elicited concepts and phrases into more abstract topics. Our methodology proposes an additional “relevance-feedback loop” that utilizes an artificial neural net to suggest concepts that might be clustered next. As an evaluation of our methodology, we compare the results of human-machine clustered concepts to those that were elicited automatically.
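The relevance-feedback idea can be illustrated with a minimal sketch. This is not the authors' artifact: the abstract describes an artificial neural net, whereas this example swaps in plain cosine similarity to the cluster centroid. The toy embeddings, the function names (`suggest_next`, `cluster_with_feedback`) and the `accept` callback standing in for the human researcher are all assumptions made for illustration.

```python
import math

# Hypothetical toy embeddings; a real system would learn these from the texts.
EMBEDDINGS = {
    "price": [0.9, 0.1, 0.0],
    "cost": [0.8, 0.2, 0.0],
    "discount": [0.7, 0.3, 0.1],
    "battery": [0.1, 0.9, 0.2],
    "charging": [0.0, 0.8, 0.3],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def suggest_next(cluster, candidates, embeddings, top_k=2):
    """Rank candidates by similarity to the centroid of the current cluster."""
    center = centroid([embeddings[c] for c in cluster])
    scored = [(cosine(embeddings[c], center), c)
              for c in candidates if c not in cluster]
    scored.sort(reverse=True)
    return [c for _, c in scored[:top_k]]


def cluster_with_feedback(seed, embeddings, accept, max_rounds=10):
    """Machine suggests one concept per round; the human accepts or rejects it.

    The loop stops when nothing is left to suggest or max_rounds is reached,
    which serves as a simple stand-in for a "definition of done".
    """
    cluster, rejected = list(seed), set()
    for _ in range(max_rounds):
        pool = [c for c in embeddings if c not in cluster and c not in rejected]
        suggestions = suggest_next(cluster, pool, embeddings, top_k=1)
        if not suggestions:
            break
        concept = suggestions[0]
        if accept(concept):
            cluster.append(concept)
        else:
            rejected.add(concept)
    return cluster
```

Here a call such as `cluster_with_feedback(["price"], EMBEDDINGS, accept)` grows a topic from one seed concept, with `accept` representing the researcher's judgment on each suggestion.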

Our initial results show that the cooperation of machine and human clearly leads to more rapid insights than manual qualitative approaches, while also offering deeper insights than purely automated approaches. Furthermore, our research reveals that a linear process and a “definition of done” are necessary for human-machine cooperative scenarios.

Impact evaluation by using text mining and sentiment analysis

Cathleen M. Stuetzer, Marcel Jablonka, Stephanie Gaaw

TU Dresden, Germany

Relevance & Research Question: Web surveys in higher education are particularly important for assessing the quality of academic teaching and learning. Traditionally, mainly quantitative data is used for quality assessment. Increasingly, questions are being raised about the impact of the attitudes of the individuals involved. The analysis of open-ended text responses in web surveys therefore offers particular potential for impact evaluation. Although qualitative text mining and sentiment analysis are being introduced in other research areas, these instruments are only slowly finding their way into evaluation research. On the one hand, there is a lack of methodological expertise to deal with large numbers of text responses (e.g. via semantic analysis, linguistically supported coding, etc.). On the other hand, there is a lack of the interdisciplinary expertise needed to contextualize the results. The following contribution aims to address these issues.

Methods & Data: An annual online survey of lecturers regarding the quality of academic teaching and learning was conducted within a selected university in Germany between 2013/14 and 2017/18. Information regarding the open-ended question of what is particularly important in the teaching process was extracted using text mining methods and evaluated using sentiment analysis.
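As a rough illustration of this kind of pipeline, the sketch below counts frequent terms in open-ended responses and assigns each response a simple lexicon-based polarity score. The abstract does not name the tools or lexicons that were used, so the word lists, function names and scoring rule here are assumptions for illustration only.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; a real study would use a validated sentiment lexicon.
POSITIVE = {"good", "clear", "helpful", "motivating"}
NEGATIVE = {"poor", "unclear", "boring", "overloaded"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Polarity in [-1, 1]: (positive - negative) / (positive + negative).

    Returns 0.0 when the text contains no lexicon words.
    """
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def top_terms(responses, stopwords=frozenset(), n=3):
    """Most frequent non-stopword tokens across all responses."""
    counts = Counter(
        t for r in responses for t in tokenize(r) if t not in stopwords
    )
    return counts.most_common(n)
```

For example, `sentiment_score("Good examples but boring slides")` balances one positive against one negative word, and `top_terms` gives a first impression of which themes lecturers mention most often.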

Results: The analysis of the text data of 791 respondents (lecturers) shows their different attitudes towards the quality of teaching. This will be merged with results of the annual quantitative online survey of students (n=6,615, between 2013/14 and 2018/19) regarding the question of what is actually conveyed in teaching processes. Comprehensive results are work in progress.

Added Value: The presentation will show how this case study contributes to the field of impact evaluation and reveals methodological implications for the development of text mining and sentiment analysis in evaluation processes.


What to expect from open-ends?

Eva Wittmann, Sara Wilkinson, Cecile Carre


Relevance and Research Question:

Open-ends have been a constant in survey-based research, as they add qualitative insights and enrich quantitative data by encouraging respondents to express themselves. But research has changed fundamentally in recent years: it is becoming more global, technology is becoming increasingly diverse and increasingly embedded in our daily lives, and consumers are becoming less engaged by the idea of participating in research projects.

Yet researchers and clients still expect to gain insights via traditional open-ends. This presentation aims to examine where we currently stand with open-ends and whether technology and structural settings are changing this outcome.

Methods & Data:

To review the status quo on open-ends, we analyzed 40 projects from 4 countries, comprising approx. 15K open-ends. For additional insights into how respondents use new technology (e.g. audio or video), we ran an ad hoc project giving respondents from Ipsos Panels the option to use these new technologies in place of traditional open-ends.


Results:

Our analysis shows that differences in open-end quality can be linked to specific factors such as culture, gender, age or even the device used. Those factors accounted for differences of between 10% and 40% in response quality. Opening open-ends to new technologies (such as video or audio) suggests that these quality indicators can be improved if we can get audiences to embrace these new tools more widely.

Added Value:

This presentation will give a holistic summary of the current situation of open-ends in online research and an empirically based outlook on how technology can change the future of this type of question.
