Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

Dr. Mark A Finlayson

First Advisor's Committee Title

Committee chair

Second Advisor's Name

Dr. Monique Ross

Second Advisor's Committee Title

Committee member

Third Advisor's Name

Dr. Armando Barreto

Third Advisor's Committee Title

Committee member

Fourth Advisor's Name

Dr. Fahad Saeed

Fourth Advisor's Committee Title

Committee member

Keywords

Natural Language Processing, Machine Learning, Emotion Detection, Animate Being's Emotion Detection

Date of Defense

3-30-2022

Abstract

Identifying emotions as expressed in text (a.k.a. text emotion recognition) has received considerable attention over the past decade. Narratives often involve a great deal of emotional expression, and so emotion recognition on narrative text is of great interest to computational approaches to narrative understanding. The meaning and impact of narratives are strongly bound up with the emotions expressed therein. Emotions may be experienced by characters in a story (which may include the narrator), by a story-external narrator, or by the reader. There have so far been two separate streams of work relevant to this observation: (1) emotion detection, and (2) detection of animate beings. These two streams have not yet been combined to attempt to identify the emotions experienced by animate beings in the text. In this dissertation, I use the two streams to construct a computational framework for detecting the emotions experienced by animate beings in a given text. In the first step, I design a high-performing approach to emotion recognition in narrative text and carefully implement and characterize the technique, exploring a design space of three noise-cancellation or dimension-reduction techniques (NMF, PCA, or LDA) under various hyper-parameter settings. My experiments indicate that NMF performed best, with an overall F1 of 0.809. In the second step, I identify and improve an emotion lexicon to be used for my animate-being emotion detection system. There have been several attempts to create an accurate and thorough emotion lexicon in English, which identifies the emotional content of words. Of the several commonly used resources, the NRC emotion lexicon has received the most attention due to its availability, size, and its choice of Plutchik’s expressive 8-class emotion model. In this work, I identify a large number of troubling entries in the NRC lexicon, where words that should in most contexts be emotionally neutral, with no affect (e.g., lesbian, stone, mountain), are associated with emotional labels that are inaccurate, nonsensical, pejorative, or, at best, highly contingent and context-dependent (e.g., lesbian labeled as DISGUST and SADNESS, stone as ANGER, or mountain as ANTICIPATION). I describe a procedure for semi-automatically correcting these problems in the NRC lexicon, which includes disambiguating POS categories and aligning NRC entries with other emotion lexicons to infer the accuracy of labels. I demonstrate via an experimental benchmark that the quality of the resource is thus improved. Joshuan Jimenez, a graduate student in the Cognac lab, assisted me with the manual part of this procedure. In the third step, to develop my animate-being emotion detection system, Joshuan Jimenez and I provide the ABBE (Animate Beings Being Emotional) corpus, a new double-annotated corpus of texts that captures emotions and who experiences them for one class of emotion experiencer, namely, animate beings in the world described by the text. Such a corpus is useful for developing systems that seek to model or understand this specific type of expressed emotion. Our corpus contains 30 chapters, comprising 134,513 words, drawn from the Corpus of English Novels, and contains 2,010 unique emotion expressions attributable to 2,227 animate beings. The emotion expressions are categorized according to Plutchik’s 8-category emotion model, and the overall inter-annotator agreement for the annotations was a Cohen’s kappa of 0.83, indicating excellent agreement.
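As a rough illustration of the agreement statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two annotators who each assign one label per item. The function name `cohens_kappa` and the toy labels are hypothetical; the abstract does not describe how agreement was computed over the ABBE annotations, so this shows only the standard formula, kappa = (p_o - p_e) / (1 - p_e), and not the dissertation's procedure.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (invented for illustration, not taken from ABBE).
annotator_1 = ["JOY", "JOY", "FEAR", "FEAR"]
annotator_2 = ["JOY", "JOY", "FEAR", "JOY"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5.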
Finally, I demonstrate an emotion detection system based on a non-neural machine learning classifier to identify the emotions expressed as being experienced by animate beings. I use Plutchik’s emotion model (JOY, SADNESS, ANGER, FEAR, SURPRISE, ANTICIPATION, TRUST, and DISGUST), as well as the Revised NRC Emotion Lexicon developed in the second step. I train my model and evaluate my results using ABBE, which was annotated for animate beings, emotions, and the connections between them in the previous step. The system achieves an overall micro F1 of 0.76 when using gold-standard animate beings, and 0.60 when relying on computed animate beings, showing that this task is more challenging than expected.
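For readers unfamiliar with the micro F1 metric reported above, here is a minimal sketch of micro-averaged F1, which pools true positives, false positives, and false negatives across all emotion classes before computing precision and recall. The function name `micro_f1`, the set-of-labels data layout, and the toy examples are assumptions made for illustration; the abstract does not specify the evaluation code used in the dissertation.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-item label sets (hypothetical helper)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of items")
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)   # labels correctly predicted
        fp += len(predicted_labels - gold_labels)   # labels predicted but not in gold
        fn += len(gold_labels - predicted_labels)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): one set of emotion labels per animate being.
gold = [{"JOY"}, {"FEAR", "SADNESS"}, set()]
predicted = [{"JOY"}, {"FEAR"}, {"ANGER"}]
score = micro_f1(gold, predicted)
```

In this toy example there are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3 and the micro F1 is 2/3.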

Identifier

FIDC010516

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).