IIIT Hyderabad Publications
Towards Trustworthy Digital Ecosystem: From Fair Representation Learning to Fraud Detection

Author: Arvindh A (2019111010)
Date: 2024-05-08
Report no: IIIT/TH/2024/41
Advisor: Ponnurangam Kumaraguru

Abstract

Two critical challenges arise in the interconnected realm of online platforms: the need to ensure equitable representation for entities and the need to identify deceptive practices. This work addresses both by introducing CAFIN to decrease the disparity in the representations generated by GNNs, and by detecting and studying anomalies on the Google Play Store reviewer network for fraud prevention, moving towards digital ecosystems that are both fair and trustworthy.

Unsupervised representation learning on graphs is gaining traction due to the increasing abundance of unlabelled network data and the compactness, richness, and usefulness of the representations generated. In this context, the need to consider fairness and bias constraints while generating the representations has been well motivated and studied to some extent in prior works. A major limitation of most prior works in this setting is that they do not address the bias arising from connectivity patterns in graphs, such as varied node centrality, which leads to disproportionate performance across nodes. In our work, we aim to mitigate this bias due to inherent graph structure in an unsupervised setting. To this end, we propose CAFIN, a centrality-aware fairness-inducing framework that leverages the structural information of graphs to tune the representations generated by existing frameworks. We deploy it on GraphSAGE (a popular framework in this domain) and showcase its efficacy on two downstream tasks: Node Classification and Link Prediction.
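The core idea of a centrality-aware framework such as CAFIN can be illustrated by re-weighting a training objective so that structurally disadvantaged (low-centrality) nodes contribute more. The abstract does not specify CAFIN's actual formulation, so the inverse-degree-centrality weighting below, including the `alpha` knob and all function names, is an illustrative assumption rather than the thesis's loss:

```python
def degree_centrality(adj):
    """Degree centrality of each node in an adjacency-list graph:
    degree divided by (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def centrality_weights(adj, alpha=1.0):
    """Assign each node a training weight that grows as its centrality
    shrinks, so peripheral nodes are up-weighted in the objective.
    NOTE: this inverse-centrality scheme is a hypothetical stand-in
    for CAFIN's actual (unspecified here) weighting."""
    cent = degree_centrality(adj)
    max_c = max(cent.values())
    return {v: 1.0 + alpha * (max_c - c) / max_c for v, c in cent.items()}

# Toy star graph: node 0 is the hub, nodes 1-5 are leaves.
adj = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
w = centrality_weights(adj)
# The hub keeps the base weight 1.0; low-centrality leaves are up-weighted.
```

In a GraphSAGE-style pipeline, such per-node weights would multiply each node's term in the unsupervised loss, nudging the encoder to serve peripheral nodes as well as hubs.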
Empirically, CAFIN consistently reduces the performance disparity across popular datasets from various domains (with reductions ranging from 18% to 80%) while incurring only a minimal cost of fairness.

Google Play Store’s policy forbids the use of incentivized installs, ratings, and reviews to manipulate the placement of apps. However, there still exist apps that incentivize installs for other apps on the platform. To understand how install-incentivizing apps affect users, we examine their ecosystem through a socio-technical lens and perform a mixed-methods analysis of their reviews and permissions. Our dataset contains 319K reviews collected daily over five months from 60 such apps that cumulatively account for over 160.5M installs. We perform qualitative analysis of the reviews to reveal various types of dark patterns that developers incorporate in install-incentivizing apps, highlighting their normative concerns at both user and platform levels. Permissions requested by these apps corroborate our discovery of dark patterns, with over 92% of apps accessing sensitive user information. We find evidence of fraudulent reviews on install-incentivizing apps, following which we model them as an edge stream in a dynamic bipartite graph of apps and reviewers. Our proposed reconfiguration of a state-of-the-art microcluster anomaly detection algorithm yields promising preliminary results in detecting this fraud. We discover highly significant lockstep behaviors exhibited by reviews that aim to boost the overall rating of an install-incentivizing app. Upon evaluating the 50 most suspicious clusters of boosting reviews detected by the algorithm, we find (i) near-identical pairs of reviews across 94% of clusters (47 of 50), and (ii) over 35% of reviews (1,687 of 4,717) forming near-identical pairs within their cluster.
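The microcluster detection described above operates on a stream of (reviewer, app, time) edges. As a hedged sketch of how such edge-stream anomaly scoring typically works, the snippet below uses a simplified chi-squared-style burst score, with exact dictionaries standing in for the count-min sketches a production system would use; all names and the demo stream are illustrative, not the thesis's actual reconfiguration:

```python
from collections import defaultdict

def burst_score(cur, total, t):
    """Simplified chi-squared-style score: how far the current-tick count
    of an edge deviates from its historical mean. Sudden bursts of the
    same (reviewer, app) edge receive high scores."""
    if t <= 1:
        return 0.0
    mean = total / t
    return (cur - mean) ** 2 / mean

def score_stream(edges):
    """edges: iterable of (reviewer, app, tick) with non-decreasing ticks.
    Returns one anomaly score per edge arrival. Exact dicts stand in
    for the count-min sketches used at scale."""
    cur, total = defaultdict(int), defaultdict(int)
    last_tick, scores = None, []
    for u, v, t in edges:
        if t != last_tick:   # new time tick: reset the current-tick counts
            cur.clear()
            last_tick = t
        cur[(u, v)] += 1
        total[(u, v)] += 1
        scores.append(burst_score(cur[(u, v)], total[(u, v)], t))
    return scores

# An organic reviewer posting once per tick, then a burst of 10 reviews
# from one account on one app within a single tick (lockstep-style).
stream = [("r1", "A", 1), ("r1", "A", 2), ("r1", "A", 3)]
stream += [("spam", "B", 4)] * 10
scores = score_stream(stream)
# Organic edges score 0; the burst's score climbs sharply within tick 4.
```

Flagging the edges whose scores exceed a threshold surfaces candidate microclusters, which can then be inspected for near-identical review text, as done for the 50 most suspicious clusters above.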
We also discuss how fraud is intertwined with labor and poses a threat to the trust and transparency of Google Play.

Full thesis: pdf
Centre for C2S2-Precog
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.