Document Type



Doctor of Philosophy (PhD)


Computer Science

First Advisor's Name

Bogdan Carbunar

First Advisor's Committee Title

Committee chair

Second Advisor's Name

Mark Finlayson

Second Advisor's Committee Title

Committee member

Third Advisor's Name

Naphtali Rishe

Third Advisor's Committee Title

Committee member

Fourth Advisor's Name

Leonardo Bobadilla

Fourth Advisor's Committee Title

Committee member

Fifth Advisor's Name

Wensong Wu

Fifth Advisor's Committee Title

Committee member


search rank fraud, opinion spam, crowdturfing, fake review, misinformation, de-anonymization, app store optimization

Date of Defense



Search rank fraud, i.e., the posting of large numbers of fake activities for products hosted in commercial peer-opinion services such as those provided by Google, Apple, and Amazon, seeks to give the illusion of grassroots engagement and to boost financial gains, promote malware, and even assist censorship efforts. Search rank fraud continues to be a significant problem despite years of investment from service providers and the academic community. In this thesis we envision that knowledge of the authentic capabilities, behaviors, and strategies employed by empirically validated workers will enable us to develop solutions that efficiently manage and contain search rank fraud by detecting, classifying, and neutralizing its effects. We posit that, to be effective, fraud detection and classification efforts need to involve the organizations and individuals who contribute to search rank fraud.

In this thesis we therefore engaged with professional workers to (1) collect ground truth knowledge and evaluate defenses, (2) develop fraud detection and classification solutions that adapt to rater strategy changes, and (3) attribute fraud to the organizations that posted it. More specifically, we first performed qualitative and quantitative investigations with professional workers, concerning activities they performed on Google Play. We reveal findings concerning various aspects of worker capabilities and behaviors, including novel insights into their working patterns. We confirm the existence of power workers who control many devices and user accounts, and also the emergence of organic workers, i.e., almost-regular users who occasionally promote products from the devices and accounts that they also use for personal purposes.

In a second contribution we develop RacketStore, a framework to capture detailed insights about how Google Play Store users use their devices and the apps installed on them. We use RacketStore to develop and evaluate the first solutions that disentangle organic fraud from federated fraud and from honest behaviors. Specifically, we use data collected from installations of RacketStore on 803 devices to show that features modeling user interaction with a device can distinguish devices controlled by organic workers from those of power workers and of regular users of the Google Play service.

In a third contribution we introduce a fraud de-anonymization approach to disincentivize fraud perpetrated by power workers: attributing the user accounts used to promote apps to the human workers on crowdsourcing sites who control them. We model fraud de-anonymization as a maximum likelihood estimation problem and develop a graph-based deep learning approach to predict ownership of account pairs by the same fraudster. We introduce the first cheating-resistant fraud de-anonymization validation protocol, which transforms human fraud workers into ground-truth performance evaluation oracles.

The success of the approach proposed in this thesis suggests that the next generation of fraud detection and prevention solutions will benefit from the integration of validated professional workers into the problem modeling, solution design and evaluation processes.





Rights Statement

In Copyright. URI:
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).