Doctor of Philosophy (PhD)
Anomaly Detection, Sequential Data, Deep Learning, Long Short-Term Memory, Optimization
Anomaly detection has been researched in various domains, with applications in intrusion detection, fraud detection, system health management, and bioinformatics. Conventional anomaly detection methods analyze each data instance (univariate or multivariate) independently and ignore the sequential characteristics of the data. Some anomalies become apparent only when individual data instances are grouped into sequences, so analyzing instances independently cannot detect them. Current approaches have three limitations: (1) Deep learning-based algorithms are widely used for anomaly detection, but training incurs significant computational overhead because the batch size and learning rate are held constant across epochs. (2) The threshold that decides whether an event is normal or malicious is often static, which can drastically increase the false alarm rate if the threshold is set low or decrease the true alarm rate if it is set too high. (3) Real-life data is messy, and it is impossible to learn its features by training just one algorithm. Several one-class algorithms therefore need to be trained and their outputs combined into an ensemble, whose prediction accuracy can be increased by giving a proper weight to each algorithm's output. Extending state-of-the-art learning-based techniques, this dissertation provides the following solutions: (i) to address (1), we propose a hybrid, dynamic batch size and learning rate tuning algorithm that reduces the overall training time of the neural network; (ii) to address (2), we present an adaptive thresholding algorithm that reduces high false alarm rates; and (iii) to address (3), we propose a multilevel hybrid ensemble anomaly detection framework that increases the anomaly detection rate on high-dimensional datasets.
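The abstract does not specify how the adaptive threshold or the ensemble weights are computed, so the following is only a minimal illustrative sketch of the two general ideas, not the dissertation's algorithms. It assumes that each detector emits a numeric anomaly score per instance, that the threshold is the mean plus `k` standard deviations of a sliding window of recent scores, and that detector outputs are combined by a weighted average. The function names and the parameters `window`, `k`, and `weights` are hypothetical.

```python
from statistics import mean, pstdev


def adaptive_threshold(scores, window=5, k=3.0):
    """Flag scores[i] when it exceeds mean + k * std of the preceding `window` scores.

    Assumption: this rolling rule stands in for "adaptive thresholding" in general;
    it is not the method proposed in the dissertation.
    """
    flags = []
    for i, score in enumerate(scores):
        history = scores[max(0, i - window):i]
        if len(history) < 2:
            flags.append(False)  # too little history to estimate a threshold
            continue
        threshold = mean(history) + k * pstdev(history)
        flags.append(score > threshold)
    return flags


def weighted_ensemble(detector_scores, weights):
    """Combine per-detector anomaly scores into one score per instance.

    detector_scores[d][i] is the score detector d gives to instance i.
    Assumption: a plain weighted average; how the weights are chosen is left open.
    """
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, instance)) / total
        for instance in zip(*detector_scores)
    ]


if __name__ == "__main__":
    scores = [1.0, 1.1, 0.9, 1.0, 1.1, 5.0, 1.0]
    print(adaptive_threshold(scores))
    print(weighted_ensemble([[0.2, 0.9], [0.4, 0.7]], weights=[3, 1]))
```

Because the threshold follows recent scores rather than a fixed constant, a gradual drift in the score level does not by itself trigger alarms, while a sudden spike relative to recent history does.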
Previously Published In
J. Soni, N. Prabakar and H. Upadhyay, "Behavioral Analysis of System Call Sequences Using LSTM Seq-Seq, Cosine Similarity and Jaccard Similarity for Real-Time Anomaly Detection," 2019 International Conference on Computational Science and Computational Intelligence (CSCI), 2019, pp. 214-219, doi: 10.1109/CSCI49370.2019.00043.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Soni, Jayesh, "Anomaly Detection in Sequential Data: A Deep Learning-Based Approach" (2022). FIU Electronic Theses and Dissertations. 5052.
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).