Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

Mark Finlayson

First Advisor's Committee Title

Committee Chair

Second Advisor's Name

Giri Narasimhan

Second Advisor's Committee Title

Committee Member

Third Advisor's Name

Leonardo Bobadilla

Third Advisor's Committee Title

Committee Member

Fourth Advisor's Name

Monique Ross

Fourth Advisor's Committee Title

Committee Member

Fifth Advisor's Name

Selcuk Uluagac

Fifth Advisor's Committee Title

Committee Member

Keywords

Natural Language Processing, Temporal Algebra, Temporal Reasoning, Information Extraction

Date of Defense

9-19-2022

Abstract

Narratives are rich in temporal information. To capture it, natural language processing researchers developed TimeML, a markup language for annotating temporal information in text. Temporal graphs can be derived directly from TimeML annotations and reveal a partial ordering of events and times. For many purposes, however, a global order (a timeline) is more useful.

The first component of my work focused on timeline extraction from TimeML annotations. Prior approaches relied on machine learning-based systems, which have several limitations: they achieve imperfect scores, ignore subordinated relations, and cannot handle all types of temporal relations. I addressed these issues by presenting a constraint-satisfaction-based solution that achieved state-of-the-art performance.

One way to produce TimeML annotations for a text is to annotate it manually. However, manual annotations contain human errors. In the second component of my work, I built a system that detects errors in gold-standard annotations and helps users fix them. I tested the system on the TimeBank corpus and provided corrections for the entire corpus.

Another way to generate TimeML annotations is to use automatic annotation systems. In the third component of my work, I developed a novel suite of methods for evaluating automatic annotators by measuring the information lost during the automatic annotation process. I presented eight metrics and evaluated four state-of-the-art automatic annotation tools.

In the last component, I implemented a duration extraction system. This work produced a large dataset containing hundreds of thousands of possible event durations. Combining it with the timeline extraction system, I was able to extract the duration of entire narratives.

Identifier

FIDC010854

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).