T1: GOR Thesis Award 2020 Competition I
Using Artificial Neural Networks for Aspect-Based Sentiment Analysis of Laptop Reviews
Catholic University Eichstätt-Ingolstadt, Germany
RELEVANCE & RESEARCH QUESTION
On e-commerce websites such as Amazon, customers readily comment on product highlights and flaws. This provides an important opportunity for companies to collect customer feedback. Fine-grained analyses of customer reviews can support managerial decision-making, especially in marketing. The vast amount of user-generated content necessitates the application of automated analysis techniques. One method of computationally processing unstructured text data is sentiment analysis, which examines people’s opinions, evaluations, emotions, and attitudes towards products, services, organizations, or other topics (Liu 2012, p. 1). Past studies have primarily focused on document-level or sentence-level sentiment analysis. For practical applications, however, there is a substantial need for finer-grained analyses that determine what exactly customers like or dislike, that is, for aspect-based sentiment analysis (ABSA). Yet implementing ABSA is challenging.
The need for ABSA in marketing, the limitations of traditional sentiment analysis methods, and recent progress in the field of artificial neural networks make the latter’s application to ABSA a relevant research topic.
The objectives of this thesis are to
- propose and evaluate a novel model architecture for ABSA that combines gated recurrent units and convolutional neural networks;
- apply the proposed model to laptop reviews to gain insight into customer requirements and satisfaction;
- discuss limitations of the proposed model, also from a market research perspective.
METHODS & DATA
ABSA is divided into the subtasks of aspect term extraction and aspect sentiment classification. One neural network was trained to predict for each word of every sentence whether the word is an aspect. All aspects and their corresponding sentences were passed on to a second neural network, which predicts whether the sentiment expressed regarding the aspect is positive, negative, or neutral. Building on previous research, this thesis suggests combining two artificial neural network types, namely gated recurrent units and convolutional neural networks.
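The two-step data flow described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the two trained networks are replaced by hypothetical toy functions (`toy_tagger`, `toy_classifier`), and merging consecutive aspect words into one multi-word aspect term is an assumption.

```python
from typing import Callable, List, Tuple

Tokens = List[str]


def group_aspect_terms(tokens: Tokens, is_aspect: List[bool]) -> List[str]:
    """Merge consecutive words flagged as aspects into aspect terms (assumption)."""
    terms, current = [], []
    for token, flag in zip(tokens, is_aspect):
        if flag:
            current.append(token)
        elif current:
            terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


def absa_pipeline(
    tokens: Tokens,
    tag_words: Callable[[Tokens], List[bool]],
    classify: Callable[[Tokens, str], str],
) -> List[Tuple[str, str]]:
    """Step 1: flag each word as aspect or not. Step 2: classify sentiment per aspect."""
    flags = tag_words(tokens)
    return [(term, classify(tokens, term)) for term in group_aspect_terms(tokens, flags)]


# Hypothetical stand-ins for the two trained neural networks.
ASPECT_WORDS = {"battery", "life", "screen"}
CUES = {"great": "positive", "dim": "negative"}


def toy_tagger(tokens: Tokens) -> List[bool]:
    return [token.lower() in ASPECT_WORDS for token in tokens]


def toy_classifier(tokens: Tokens, aspect: str) -> str:
    # Look at the two words following the aspect term for a sentiment cue.
    start = tokens.index(aspect.split()[-1]) + 1
    for token in tokens[start:start + 2]:
        if token in CUES:
            return CUES[token]
    return "neutral"


sentence = "The battery life is great but the screen is dim".split()
result = absa_pipeline(sentence, toy_tagger, toy_classifier)
print(result)
```

The point of the sketch is the interface between the two subtasks: the second classifier receives each extracted aspect together with its full sentence, so one sentence can yield several aspect-level sentiment labels.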
The proposed model was trained using the SemEval 2014 laptop review dataset (SemEval 2014). In addition, the thesis author manually annotated laptop reviews. The combined dataset consists of 5,165 sentences, totaling approximately 72,900 words. To evaluate model performance compared to previous research, the proposed model was trained only on the original SemEval training set and tested on the SemEval test set.
The evaluation results (Figs. 1 and 2) are promising. Without using sentiment lexica, handcrafted rules, or manual feature engineering, the proposed system achieves competitive results on the benchmark dataset. It is especially effective at extracting aspects.
Model | F1 score
Proposed model | 81.63%
Filho and Pardo (2014) | 25.19%
Pontiki et al. (2014) | 35.64%
Toh and Wang (2014) | 70.41%
Chernyshevich (2014) | 74.55% †
Liu, Joty, and Meng (2015) | 74.56%
Poria, Cambria, and Gelbukh (2016) | 77.32%
Xu et al. (2018) | 77.67%
Fig. 1: Performance on aspect term extraction on the SemEval test set
†: trained on twice as much training data by using an additional training set
To ensure comparability, only the performance of models with publicly available word embeddings is reported in Fig. 1. With domain-specific word embeddings and a set of linguistic patterns, Poria, Cambria, and Gelbukh (2016) reached an F1 score of 82.32%, which appears to be the current state of the art for this task.
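For readers unfamiliar with the metric in Fig. 1, the following sketch shows how an F1 score for aspect term extraction can be computed. It assumes exact-match scoring of aspect terms per sentence and, as a simplification, treats the terms of a sentence as a set; the example data are invented for illustration and are not taken from the benchmark.

```python
from typing import List, Set, Tuple


def extraction_f1(
    gold: List[Set[str]], predicted: List[Set[str]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over per-sentence sets of aspect terms."""
    true_positives = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example: two sentences with gold and predicted aspect terms.
gold_terms = [{"battery life", "screen"}, {"keyboard"}]
predicted_terms = [{"battery life"}, {"keyboard", "price"}]

precision, recall, f1 = extraction_f1(gold_terms, predicted_terms)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a system is rewarded only if it both finds most annotated aspects and avoids spurious ones.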
Model | Accuracy | Macro F1 score
Proposed model | 68.45% | 63.92%
Pontiki et al. (2014) | 51.37% | n/a
Negi and Buitelaar (2014) | 57.03% | n/a
Wang et al. (2016) | 68.90% | n/a
Wagner et al. (2014) | 70.48% | n/a
Kiritchenko et al. (2014) | 70.48% | n/a
Tang et al. (2016) | 71.83% | 68.43%
Chen et al. (2017) | 74.49% | 71.35%
Fig. 2: Performance on aspect sentiment classification on the SemEval test set
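The two metrics in Fig. 2 can differ noticeably when the sentiment classes are imbalanced. The sketch below computes accuracy and the macro-averaged F1 score for the three sentiment labels; the label names and the example predictions are illustrative assumptions.

```python
from typing import Sequence, Tuple

LABELS = ("positive", "negative", "neutral")


def accuracy_and_macro_f1(
    gold: Sequence[str], predicted: Sequence[str], labels: Sequence[str] = LABELS
) -> Tuple[float, float]:
    """Return (accuracy, macro F1); macro F1 is the unweighted mean of per-class F1."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    pairs = list(zip(gold, predicted))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Invented example in which the only neutral aspect is misclassified.
gold_labels = ["positive", "positive", "positive", "negative", "neutral"]
predicted_labels = ["positive", "positive", "positive", "negative", "positive"]

accuracy, macro_f1 = accuracy_and_macro_f1(gold_labels, predicted_labels)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
```

In this toy case accuracy is 0.8, while macro F1 is markedly lower because the missed neutral class counts as much as the frequent positive class.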
Sentiment misclassifications can be grouped into three types of mistakes: predicting the opposite sentiment, predicting a strong sentiment instead of neutrality, and predicting neutrality instead of a strong sentiment. For marketers who interpret the model predictions, the first type of mistake would be the most severe. The third type of mistake was most common. It is arguably the least severe mistake and shows that the proposed system tends to be conservative in its predictions.
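The three mistake types can be tallied mechanically from gold and predicted labels. The following is a minimal sketch; the category names and the example pairs are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def error_type(gold: str, predicted: str) -> str:
    """Map one (gold, predicted) sentiment pair to a mistake type."""
    if gold == predicted:
        return "correct"
    if predicted == "neutral":
        # Third type: neutrality predicted instead of a strong sentiment.
        return "neutral_instead_of_sentiment"
    if gold == "neutral":
        # Second type: a strong sentiment predicted instead of neutrality.
        return "sentiment_instead_of_neutral"
    # First type: positive predicted for negative, or vice versa.
    return "opposite_sentiment"


def error_profile(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each mistake type occurs."""
    return Counter(error_type(gold, predicted) for gold, predicted in pairs)


# Invented (gold, predicted) pairs.
pairs = [
    ("positive", "positive"),
    ("positive", "neutral"),
    ("negative", "neutral"),
    ("neutral", "negative"),
    ("negative", "positive"),
]
profile = error_profile(pairs)
print(profile)
```

A profile dominated by the third category would indicate the conservative behavior described above.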
Overall, model performance is encouraging, especially because the model used only two features. This is in sharp contrast to traditional methods. Moreover, no specialized knowledge of linguistics was needed to develop the proposed system. In addition, it does not use any sentiment lexica, which is especially beneficial when considering languages other than English.
A case study in the laptop domain illustrates how and to what degree the proposed ABSA system is useful for practical purposes in market research.
The paper’s contributions are
- provision of a labeled dataset for ABSA, which could enhance other models;
- provision of refined annotation guidelines that consider marketing needs;
- proposal and implementation of a system that combines gated recurrent units and convolutional neural networks;
- performance evaluation of the system;
- error analyses, which can help practitioners to interpret the model output and may allow academics to improve future models;
- model outputs that summarize the customer opinions voiced in unlabeled and unstructured reviews;
- some insight into customer satisfaction and preferences (regarding the case study laptops), which might facilitate decision-making in marketing;
- guidance on why, how, and under what limitations to use ABSA, especially for marketing purposes.
With only words and part-of-speech tags as inputs, the proposed system achieves competitive results on the benchmark dataset. Sentiment lexica, handcrafted rules or manual feature engineering are not required. The system can be readily used to analyze English customer reviews of laptops. Given appropriate training data, the approach may also be applicable to other product categories and languages.
ABSA offers a structured representation of the most frequently mentioned positive and negative aspects in customer reviews. Moreover, it does so in a timely manner. The output can help to determine what reviewers like and dislike about a product. Given a large amount of review text, ABSA provides a detailed picture of customer satisfaction and can stimulate product improvements. It can also support marketers in inferring the reviewers’ reasons to purchase the product and the purposes for which they use it. Moreover, ABSA can complement traditional marketing research, especially as a preliminary study or by providing up-to-date information. In short, it can help companies to understand customers.
Data Sharing for the Public Good? A Factorial Survey Experiment on Contextual Privacy Norms
University of Mannheim, Germany
RELEVANCE & RESEARCH QUESTION:
Individual data that are collected when using smartphone applications or other digital technologies may not only be used to improve recommendations or products provided to the user. Many of these data may at the same time be employed for a public use, for instance when data collected in navigation apps are used by municipalities for urban planning. Against this background, the advancement of smart technologies opens new possibilities for the provision and maintenance of public goods, such as public health care and infrastructure development. However, such practices entail considerable ethical and social challenges, particularly with respect to privacy violations. From empirical research on privacy norms, we know that individual perceptions of which data transmissions are acceptable heavily depend on situational characteristics and their interactions. Thus, while individuals may accept data uses from which they receive an immediate personal benefit, it is unclear under which conditions using the same data for a public benefit is considered appropriate as well. To investigate this issue, the present paper draws on and advances the application of the theoretical framework of privacy as contextual integrity (Nissenbaum 2004, 2018) that conceptualizes the context dependence of privacy norms. It proposes concrete situational characteristics that impact the forming of these norms: data type, involved actors, and the terms of data transmission. These characteristics interact, meaning that in most cases no single situational feature can explain norms on its own. The question is: Do individuals hold different norms of appropriateness for private and public benefit data uses, and on which concrete situational characteristics does acceptance depend?
METHODS & DATA:
A factorial survey experiment (“vignette study”) was employed in a German online non-probability sample with 1,504 respondents to compare how personal normative beliefs of appropriateness of specific data transmissions are affected by concrete situational characteristics. The vignettes, i.e. situations shown to the respondents, varied along the parameters of data type, data recipient, and use for a private or public benefit. The investigated data types were health, location, and energy use data, while the recipient was either a public administration or a private company. Moreover, general privacy concerns, perceived sensitivity of the three data types, as well as trust in public and private entities were measured to account for possible moderating effects.
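To make the vignette design concrete, the sketch below enumerates the combinations of the three varied dimensions. It assumes a full factorial crossing (3 data types x 2 recipients x 2 uses = 12 vignettes); the vignette wording in `describe` is hypothetical and not the wording shown to respondents.

```python
from itertools import product

DATA_TYPES = ("health", "location", "energy use")
RECIPIENTS = ("public administration", "private company")
BENEFITS = ("private benefit", "public benefit")

# Full factorial crossing of the three dimensions (assumption).
vignettes = [
    {"data_type": data_type, "recipient": recipient, "benefit": benefit}
    for data_type, recipient, benefit in product(DATA_TYPES, RECIPIENTS, BENEFITS)
]


def describe(vignette: dict) -> str:
    """Render a vignette as text; the wording is purely illustrative."""
    return (
        f"{vignette['data_type'].capitalize()} data are transmitted to a "
        f"{vignette['recipient']} and used for a {vignette['benefit']}."
    )


for vignette in vignettes:
    print(describe(vignette))
```

Because every level of one dimension is combined with every level of the others, the effect of each dimension and of their interactions on the appropriateness ratings can be estimated separately.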
Results of linear regression analyses show that whether respondents perceive a public benefit use of their data as appropriate – compared to a personal benefit use – strongly depends on the concrete data type scenario at hand. In line with the notion of contextual integrity, considerable interactions of situational characteristics are present, i.e. the effects of use vary with data type and recipient. In particular, public benefit uses are more accepted when public instead of private actors use the data. These findings show that the acceptability of a public benefit use is context dependent and support a contextual conceptualization of privacy norms.
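What an interaction of situational characteristics means can be illustrated with a generic regression equation. The specification below is an assumption for illustration only, not the model reported in the study: $y_{ij}$ is the appropriateness rating of respondent $i$ for vignette $j$ within one data type scenario, $U_{ij} = 1$ for a public benefit use (0 for a personal benefit use), and $R_{ij} = 1$ for a private company as recipient (0 for a public administration).

```latex
\[
  y_{ij} = \beta_0 + \beta_1 U_{ij} + \beta_2 R_{ij}
         + \beta_3 \,(U_{ij} \times R_{ij}) + \varepsilon_{ij}
\]
```

Under this coding, the effect of a public benefit use is $\beta_1$ when the recipient is a public administration and $\beta_1 + \beta_3$ when the recipient is a private company. A negative $\beta_3$ would therefore correspond to the pattern that public benefit uses are more accepted when public actors use the data.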
Moreover, normative beliefs can partly be explained by individual characteristics. Interestingly, in two out of three data type scenarios, general privacy concerns decreased acceptance, suggesting that more general privacy sentiments are not always overridden by concrete situational factors. Perceived sensitivity of a given data type and general privacy concerns strongly contribute to the explanation of variance in normative beliefs, i.e. they appear as major influential factors. However, the interactions between a given data type and its sensitivity as well as the interaction between recipient and trust in the recipient were minuscule. Therefore, the data do not allow the inference that higher trust or higher perceived sensitivity strongly altered the impact of recipients or data types.
The present study extends the application of the contextual integrity framework for privacy norms and offers first insights into the conditions under which using data for public benefit may be deemed appropriate. It suggests that transmissions of individual data for public benefit uses should be designed cautiously, particularly as the advancement of possibly invasive technologies promises improvements in the provision of public goods. No one-size-fits-all preference for public benefit uses of individual data exists but, importantly, public actors are the preferred recipients for such data uses. This study paves the way for future investigations of data sharing for the public good and argues for further investigation of interindividual differences as drivers of normative beliefs. Furthermore, research and practice will profit from examining additional data sharing scenarios as well as the behavioral implications of privacy norms in public benefit use contexts.
Nissenbaum, Helen (2004): Privacy as Contextual Integrity, Washington Law Review 79(1): 119–157.
Nissenbaum, Helen (2018): Respecting Context to Protect Privacy. Why Meaning Matters, Science and Engineering Ethics 24(3): 831–852.