Learning to Detect Stance and Represent Emojis
If you have a question about this talk, please contact Kris Cao.

In this two-part talk, I will first introduce our work on stance detection (EMNLP 2016) and then on learning emoji representations (SocialNLP@EMNLP 2016, best paper).

Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton as 'positive', 'negative' or 'neutral'. Previous work has assumed either that the target is mentioned in the text or that training data is available for every target. This paper considers the more challenging version of the task, in which targets are not always mentioned and no training data is available for the test targets. We experiment with conditional LSTM encoding, which builds a representation of the tweet that depends on the target, and demonstrate that it outperforms encoding the tweet and the target independently; performance improves further when the conditional model is augmented with bidirectional encoding (a sketch of the conditional model appears below). We evaluate our approach on the SemEval 2016 Task 6 Twitter Stance Detection corpus, where it is second only to a system trained on semi-automatically labelled tweets for the test target. When such weak supervision is added, our approach achieves state-of-the-art results.

Many current natural language processing applications for social media rely on representation learning and use pre-trained word embeddings. Several sets of pre-trained word embeddings are publicly available, but they contain few or no emoji representations, even as emoji usage in social media has increased. In this paper we release emoji2vec, pre-trained embeddings for all Unicode emojis, learned from their descriptions in the Unicode emoji standard. The resulting emoji embeddings can readily be used in downstream social natural language processing applications alongside word2vec. For the downstream task of sentiment analysis, we demonstrate that emoji embeddings learned from these short descriptions outperform a skip-gram model trained on a large collection of tweets, while avoiding the need for emojis to appear frequently in context in order to estimate a representation (a sketch of the training objective follows the conditional-encoding example below).

This talk is part of the NLIP Seminar Series.
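For readers who want a concrete picture of conditional encoding, here is a minimal sketch in PyTorch. It illustrates only the idea stated in the abstract and is not the authors' released code; the class name, dimensions, and dummy inputs are all assumptions. One LSTM reads the target, and its final state initialises a second LSTM that reads the tweet, so the tweet representation is conditioned on the target.

```python
# Minimal sketch of conditional LSTM encoding for stance detection.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One LSTM reads the target, a second reads the tweet.
        self.target_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.tweet_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)  # positive / negative / neutral

    def forward(self, target_ids, tweet_ids):
        # Encode the target and keep its final (hidden, cell) state.
        _, target_state = self.target_lstm(self.embed(target_ids))
        # Conditional encoding: initialise the tweet LSTM with the target's
        # final state, so the tweet representation depends on the target.
        _, (h_n, _) = self.tweet_lstm(self.embed(tweet_ids), target_state)
        return self.classifier(h_n[-1])

# Usage with dummy batches of padded token ids:
model = ConditionalEncoder(vocab_size=5000)
target = torch.randint(0, 5000, (2, 3))   # e.g. token ids for "Hillary Clinton"
tweet = torch.randint(0, 5000, (2, 20))
logits = model(target, tweet)             # shape: (2, 3)
```

One natural way to add the bidirectional encoding mentioned in the abstract (the exact wiring here is an assumption) is to run a second pair of LSTMs over the reversed sequences and concatenate the two final tweet states before classification.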
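The emoji2vec objective can likewise be sketched compactly. Per the abstract, each emoji's embedding is learned from the words of its Unicode description rather than from tweet contexts. The toy version below makes two stated assumptions: the summed word2vec vectors of each description are replaced by random stand-ins, and a simple logistic loss with one negative sample per emoji stands in for the full training setup.

```python
# Toy sketch of learning emoji embeddings from description vectors.
# The data and sampling scheme are illustrative assumptions.
import torch
import torch.nn.functional as F

DIM = 300                     # match the pre-trained word-embedding dimensionality
emojis = ["😂", "🔥", "❤️"]   # toy emoji "vocabulary"

# Stand-ins for the summed word2vec vectors of each emoji's Unicode
# description (e.g. "face with tears of joy"); real code would look the
# description words up in a frozen word2vec model and sum them.
desc_vecs = torch.randn(len(emojis), DIM)

# The emoji embeddings are the only trainable parameters.
emoji_vecs = torch.randn(len(emojis), DIM, requires_grad=True)
optimizer = torch.optim.Adam([emoji_vecs], lr=0.01)

for step in range(200):
    optimizer.zero_grad()
    loss = torch.tensor(0.0)
    for i in range(len(emojis)):
        pos = desc_vecs[i]                      # the emoji's own description
        neg = desc_vecs[(i + 1) % len(emojis)]  # a sampled wrong description
        # Logistic loss: push sigma(x_e . v_pos) towards 1 and
        # sigma(x_e . v_neg) towards 0.
        loss = loss - F.logsigmoid(emoji_vecs[i] @ pos) \
                    - F.logsigmoid(-(emoji_vecs[i] @ neg))
    loss.backward()
    optimizer.step()
```

Because the emoji vectors are trained against description vectors built from word2vec, they end up in the same space as the word embeddings, which is what lets them be used alongside word2vec in downstream applications, as the abstract notes.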