Document Type
Dissertation
Degree
Doctor of Philosophy (PhD)
Major/Program
Computer Science
First Advisor's Name
Tao Li
First Advisor's Committee Title
Committee chair
Second Advisor's Name
Shu-Ching Chen
Second Advisor's Committee Title
Comittee member
Third Advisor's Name
Sundaraja Sitharama Iyengar
Third Advisor's Committee Title
Comittee member
Fourth Advisor's Name
Ning Xie
Fourth Advisor's Committee Title
Comittee member
Fifth Advisor's Name
Debra VanderMeer
Fifth Advisor's Committee Title
Comittee member
Keywords
Data Mining, Textual Understanding, Domain Adaptation, Text Summarization, Deep Learning, Learn to Rank, Metric Learning
Date of Defense
10-4-2017
Abstract
More than ever, information delivery online and storage heavily rely on text. Billions of texts are produced every day in the form of documents, news, logs, search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Text understanding is a fundamental and essential task involving broad research topics, and contributes to many applications in the areas text summarization, search engine, recommendation systems, online advertising, conversational bot and so on. However, understanding text for computers is never a trivial task, especially for noisy and ambiguous text such as logs, search queries. This dissertation mainly focuses on textual understanding tasks derived from the two domains, i.e., disaster management and IT service management that mainly utilizing textual data as an information carrier.
Improving situation awareness in disaster management and alleviating human efforts involved in IT service management dictates more intelligent and efficient solutions to understand the textual data acting as the main information carrier in the two domains. From the perspective of data mining, four directions are identified: (1) Intelligently generate a storyline summarizing the evolution of a hurricane from relevant online corpus; (2) Automatically recommending resolutions according to the textual symptom description in a ticket; (3) Gradually adapting the resolution recommendation system for time correlated features derived from text; (4) Efficiently learning distributed representation for short and lousy ticket symptom descriptions and resolutions. Provided with different types of textual data, data mining techniques proposed in those four research directions successfully address our tasks to understand and extract valuable knowledge from those textual data.
My dissertation will address the research topics outlined above. Concretely, I will focus on designing and developing data mining methodologies to better understand textual information, including (1) a storyline generation method for efficient summarization of natural hurricanes based on crawled online corpus; (2) a recommendation framework for automated ticket resolution in IT service management; (3) an adaptive recommendation system on time-varying temporal correlated features derived from text; (4) a deep neural ranking model not only successfully recommending resolutions but also efficiently outputting distributed representation for ticket descriptions and resolutions.
Identifier
FIDC003998
Recommended Citation
Zhou, Wubai, "Data Mining Techniques to Understand Textual Data" (2017). FIU Electronic Theses and Dissertations. 3493.
https://digitalcommons.fiu.edu/etd/3493
Rights Statement
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).