Confirming the Non-compositionality of Idioms for Sentiment Analysis


An idiom is defined as a non-compositional multiword expression, one whose meaning cannot be deduced from the definitions of its component words. This definition does not say whether an idiom's sentiment is compositional; this paper aims to determine whether the sentiment of an idiom's component words is related to the sentiment of the idiom as a whole. We use the Dictionary of Affect in Language, augmented by WordNet, to give each idiom in the Sentiment Lexicon of IDiomatic Expressions (SLIDE) a component-wise sentiment score, and we compare that score to the phrase-level sentiment label crowdsourced by the creators of SLIDE. We find no discernible relation between these two measures of idiom sentiment. This supports the hypothesis that idioms are non-compositional in sentiment as well as in meaning, and it motivates further work on handling idioms in sentiment analysis.
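The comparison described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the word scores, thresholds, and phrase labels below are made-up toy values rather than entries from the Dictionary of Affect in Language or SLIDE, the function names are hypothetical, and the WordNet augmentation step is omitted.

```python
from statistics import mean

# Hypothetical word-level scores on a 1-3 scale (toy values, NOT real DAL entries).
TOY_LEXICON = {
    "kick": 1.5, "bucket": 2.0,
    "piece": 2.0, "cake": 2.8,
    "break": 1.4, "leg": 2.0,
}

# Hypothetical phrase-level labels (toy values, NOT taken from SLIDE).
PHRASE_LABELS = {
    "kick the bucket": "negative",
    "piece of cake": "positive",
    "break a leg": "positive",
}


def component_score(idiom, lexicon):
    """Average the scores of the idiom's words that appear in the lexicon.

    Words missing from the lexicon (here, function words such as "the")
    are skipped. Returns None when no word is covered.
    """
    scores = [lexicon[w] for w in idiom.lower().split() if w in lexicon]
    return mean(scores) if scores else None


def to_label(score, low=1.8, high=2.2):
    """Map a numeric score to a coarse label using assumed thresholds."""
    if score is None:
        return None
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


def agreement(labels, lexicon):
    """Fraction of scored idioms whose component-wise label matches the phrase label."""
    pairs = [(to_label(component_score(i, lexicon)), gold) for i, gold in labels.items()]
    scored = [(pred, gold) for pred, gold in pairs if pred is not None]
    return sum(pred == gold for pred, gold in scored) / len(scored)


if __name__ == "__main__":
    for idiom, gold in PHRASE_LABELS.items():
        pred = to_label(component_score(idiom, TOY_LEXICON))
        print(f"{idiom!r}: component-wise={pred}, phrase-level={gold}")
    print("agreement:", agreement(PHRASE_LABELS, TOY_LEXICON))
```

In this toy data, "break a leg" receives a negative component-wise label while its phrase-level label is positive, which is the kind of mismatch the paper investigates at scale.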

In Proceedings of the Joint Workshop on Multiword Expressions and WordNet
Alyssa Hwang
PhD Student

I am a first-year PhD student in the Department of Computer and Information Science at the University of Pennsylvania. I am particularly interested in the intersections of Natural Language Processing, Linguistics, and Psychology, especially expanding NLU resources for nonstandard English. I am supported by the NSF Graduate Research Fellowship Program. I earned my BS in Computer Science at Columbia University, where I conducted research and wrote an undergraduate thesis with Prof. Kathleen McKeown.