Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

S. S. Iyengar

First Advisor's Committee Title

Committee Chair

Second Advisor's Name

Niki Pissinou

Second Advisor's Committee Title

Committee Member

Third Advisor's Name

Jean Andrian

Third Advisor's Committee Title

Committee Member

Fourth Advisor's Name

N. R. Sunitha

Fourth Advisor's Committee Title

Committee Member

Fifth Advisor's Name

Deng Pan

Fifth Advisor's Committee Title

Committee Member

Sixth Advisor's Name

Leonardo Bobadilla

Sixth Advisor's Committee Title

Committee Member

Keywords

Click Fraud, Online Advertisement, In-app Advertisement, Machine Learning

Date of Defense

9-30-2019

Abstract

Click fraud is the fraudulent act of clicking on pay-per-click advertisements to increase a site’s revenue, to drain revenue from the advertiser, or to inflate the popularity of content on social media platforms. In-app advertisements on mobile platforms are among the most common targets for click fraud, which makes companies hesitant to advertise their products. Ad providers are expected to catch fraudulent clicks as part of their service to advertisers, and they commonly do so using machine learning methods. However: (1) the current literature lacks research that addresses and evaluates the different techniques of click fraud detection and prevention, (2) threat models composed of active learning systems (smart attackers) can mislead the training process of the fraud detection model by polluting the training data, (3) current deep learning models have significant computational overhead, (4) training data is often imbalanced, and balancing it still results in noisy data that can train the classifier incorrectly, and (5) datasets with high dimensionality cause increased computational overhead and decreased classifier correctness; while existing feature selection techniques address this issue, they have their own performance limitations. By extending the state-of-the-art techniques in the field of machine learning, this dissertation provides the following solutions: (i) To address (1) and (2), we propose a hybrid deep-learning-based model that consists of an artificial neural network, an auto-encoder, and a semi-supervised generative adversarial network. (ii) As a solution for (3), we present Cascaded Forest and Extreme Gradient Boosting with less hyperparameter tuning. (iii) To overcome (4), we propose a row-wise data reduction method, KSMOTE, which filters out noisy data samples in both the raw data and the synthetically generated samples. (iv) For (5), we propose different column-reduction methods, such as multi-time-scale time series analysis for fraud forecasting on binary-labeled imbalanced datasets, and hybrid filter-wrapper feature selection approaches.

Identifier

FIDC007836

ORCID

0000-0001-9606-0128

Previously Published In

  • G. S. Thejas, Kianoosh G. Boroojeni, Kshitij Chandna, Isha Bhatia, S. S. Iyengar, and N. R. Sunitha. 2019. Deep Learning-based Model to Fight Against Ad Click Fraud. In Proceedings of the 2019 ACM Southeast Conference (ACM SE '19). ACM, New York, NY, USA, 176-181. DOI: https://doi.org/10.1145/3299815.3314453
  • G. S. Thejas, S. R. Joshi, S. S. Iyengar, N. R. Sunitha and P. Badrinath, "Mini-Batch Normalized Mutual Information: A Hybrid Feature Selection Method," in IEEE Access, vol. 7, pp. 116875-116885, 2019. doi: 10.1109/ACCESS.2019.2936346

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).