Automatic Extraction of Narrative Structure from Long Form Text

Joshua Daniel Eisenberg, Florida International University


Automatic understanding of stories is a long-time goal of artificial intelligence and natural language processing research communities. Stories literally explain the human experience. Understanding our stories promotes the understanding of both individuals and groups of people; various cultures, societies, families, organizations, governments, and corporations, to name a few. People use stories to share information. Stories are told –by narrators– in linguistic bundles of words called narratives. My work has given computers awareness of narrative structure. Specifically, where are the boundaries of a narrative in a text. This is the task of determining where a narrative begins and ends, a non-trivial task, because people rarely tell one story at a time. People don’t specifically announce when we are starting or stopping our stories: We interrupt each other. We tell stories within stories. Before my work, computers had no awareness of narrative boundaries, essentially where stories begin and end. My programs can extract narrative boundaries from novels and short stories with an F1 of 0.65. Before this I worked on teaching computers to identify which paragraphs of text have story content, with an F1 of 0.75 (which is state of the art). Additionally, I have taught computers to identify the narrative point of view (POV; how the narrator identifies themselves) and diegesis (how involved in the story’s action is the narrator) with F1 of over 0.90 for both narrative characteristics. For the narrative POV, diegesis, and narrative level extractors I ran annotation studies, with high agreement, that allowed me to teach computational models to identify structural elements of narrative through supervised machine learning. My work has given computers the ability to find where stories begin and end in raw text. This allows for further, automatic analysis, like extraction of plot, intent, event causality, and event coreference. These tasks are impossible when the computer can’t distinguish between which stories are told in what spans of text. There are two key contributions in my work: 1) my identification of features that accurately extract elements of narrative structure and 2) the gold-standard data and reports generated from running annotation studies on identifying narrative structure.

Subject Area

Artificial intelligence|Robotics|Computer science

Recommended Citation

Eisenberg, Joshua Daniel, "Automatic Extraction of Narrative Structure from Long Form Text" (2018). ProQuest ETD Collection for FIU. AAI27736481.