Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

Ning Xie

First Advisor's Committee Title

Committee Chair

Second Advisor's Name

Sundaraja Sitharama Iyengar

Second Advisor's Committee Title

Committee Member

Third Advisor's Name

Shu-Ching Chen

Third Advisor's Committee Title

Committee Member

Fourth Advisor's Name

Jaime Leonardo Bobadilla

Fourth Advisor's Committee Title

Committee Member

Fifth Advisor's Name

Debra VanderMeer

Fifth Advisor's Committee Title

Committee Member

Keywords

Deep Learning, Natural Language Processing, Sentiment Analysis, Sequence Tagging, Paraphrase Identification, Convolutional Neural Network, Long Short Term Memory, Visualization, Text Data, Deep Neural Network

Date of Defense

11-16-2018

Abstract

As the web evolves even faster than expected, the exponential growth of data becomes overwhelming. Textual data is being generated at an ever-increasing pace via emails, documents on the web, tweets, online user reviews, blogs, and so on. As the amount of unstructured text data grows, so does the need for intelligently processing and understanding it. The focus of this dissertation is on developing learning models that automatically induce representations of human language to solve higher level language tasks.

In contrast to most conventional learning techniques, which employ shallow-structured learning architectures, deep learning is a newly developed machine learning technique that uses supervised and/or unsupervised strategies to automatically learn hierarchical representations in deep architectures. It has been employed in varied tasks such as classification and regression. Deep learning was inspired by biological observations of how the human brain processes natural signals, and it has attracted tremendous attention from both academia and industry in recent years due to its state-of-the-art performance in many research domains such as computer vision, speech recognition, and natural language processing.

This dissertation focuses on how to represent unstructured text data and how to model it with deep learning models in different natural language processing applications such as sequence tagging, sentiment analysis, and semantic similarity. Specifically, my dissertation addresses the following research topics:

  • In Chapter 3, we examine one of the fundamental problems in NLP, text classification, by leveraging contextual information [MLX18a];

  • In Chapter 4, we propose a unified framework for generating an informative map from review corpus [MLX18b];

  • Chapter 5 discusses the tagging of address queries in map search [Mok18]. This research was performed in collaboration with Microsoft; and

  • In Chapter 6, we discuss ongoing research on the neural language sentence matching problem. We are working on extending this work to a recommendation system.

Identifier

FIDC007701

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).