Introduction
Online retailers are constantly battling against fraudulent activities and policy abuse. Even prominent retailers like Wayfair are not immune to these kinds of attacks. “Incident” policy abuse is a particularly costly attack. These fraudsters make false claims that an “incident” affected their order (e.g., their item was either never delivered, or arrived in a damaged state). To support their claims, fraudsters may resort to manipulating and reusing images that show product damage, or even returning the wrong item altogether. The ultimate goal of these fraudsters is to keep the original product while securing a free replacement and/or refund on their purchase.
Ultimately, Wayfair will identify and deactivate abusive accounts, but fraudsters are not easily deterred and tend to quickly abandon flagged accounts and create new ones. Detecting these individuals – referred to as account hoppers – is the focus of this blog post.
Preventing Policy Abuse
While account hoppers are a small percentage of Wayfair's total active customers, their activities amount to a disproportionately large portion of Wayfair's total incident resolution costs. Consequently, early detection and prevention of these malicious activities can lead to significant cost savings for the business.
With new accounts, there is insufficient order history and behavioral data available to detect potential policy abuse. One approach to overcome this thin data challenge is to identify possible connections between these new accounts and known fraud accounts. Shared attributes, such as names, devices, and payment methods help us make these connections, and collectively form a "knowledge graph."
To identify account hopper fraudsters, we have developed a knowledge graph that connects Wayfair customers based on shared attributes used during order placement and while browsing Wayfair's website. Knowledge graphs can be utilized in various ways for machine learning purposes. For instance, we can develop graph features and feed them into downstream models. An example of such a feature might be the shortest path between a customer and a known fraudster, or the number of neighbors of a given customer. While this approach is robust and explainable, it requires significant manual feature engineering efforts. To avoid this, we have developed a Graph Neural Network (GNN) fraud detection model that operates directly on the graph data structure.
Graph Neural Networks
Graph Neural Networks (GNNs) are a type of neural network designed to work with data that can be represented as a graph. Graphs are a mathematical structure that consists of a set of nodes (also called vertices) connected by edges (also called links or relationships). In our problem, the nodes are customer accounts, and the edges represent shared attributes that “link” accounts together.
Graphs are often dynamic and topologically complex, with multiple types of nodes and edges, and arbitrary sizes with no fixed ordering or reference point. GNNs help us apply machine learning models to graphs by transforming the embedded information into a friendlier form. In particular, GNNs aim to learn representations of the nodes in a graph that take into account the graph's topology and the nodes' features. In other words, GNNs try to capture both the local and global information of a graph by aggregating information from neighboring nodes and updating the node representation accordingly. The output of a GNN is typically a vector representation (also called an embedding) for each node in the graph.
Problems that are solved with GNNs can typically be split into three categories: node-level, edge-level, and graph-level tasks. For a further introduction to GNNs, we refer to the resources listed below [2]. GNNs have been applied to various use cases in industry, including arrival time prediction in Google Maps [3], protein folding [4], and fake news detection [5].
Account Hopper Detection
There are different types of GNNs, but most of them follow a message-passing scheme, where each node sends “messages” (i.e., information about its features) to its neighbors, which are then aggregated and used to update the node's own embedding. For our use-case, whenever we calculate the node representation of a customer, first we combine the embeddings of neighboring customers and then we update the embedding of the target customer.
This process is typically repeated for multiple rounds, allowing the node embeddings to capture increasingly complex relationships between nodes.
We evaluated three different message-passing implementations for our account hopper fraud detection problem: Graph Convolutional Layers (GCN) [6], GraphSage [7], and Graph Attention Layers (GAT) [8]. We found that GraphSage produced the best results for our task.
We model our account hopper fraud detection problem as a node-level binary classification task, where customer nodes are classified as either fraud or not. Our model consists of two GraphSAGE convolutional layers, with Dropout, ReLU non-linearity, and log-softmax output. Two GNN layers ensure that we take the 2-hop neighborhood as input for each node that we classify. We found that adding more than two layers did not provide much benefit, due to the over-smoothing problem with message passing in graphs [9]. The GNN output is a positive or negative class assignment for each node, where positive assignment indicates a probable fraud account.
We also benchmarked our GNN model against a more traditional gradient-boosted fraud detection model using the same set of features and labels. We measured a 10% relative improvement in the Precision-Recall AUC with the GNN model.
Model serving
There are several challenges for serving GNN models. Loading all training or inference data into memory may not be possible so data is often loaded in mini-batches. However, randomly sampling nodes in a graph can result in a large number of isolated nodes without edges between them. To address this issue, the creators of GraphSage introduced the neighborhood sampler [7], which not only samples a set of target nodes but also samples neighboring nodes and edges. This approach simulates the actual computational flow of GNNs and enables graph ML models to generalize to new, unseen data during training. This is a crucial step for using GNN models in real-time.
For real time serving, we would need to obtain the neighborhood for a target node and the features for each neighbor with low latency and high throughput. This is a significant engineering challenge so our initial GNN model only performs batch training and inference multiple times a day. Nonetheless, we are actively working towards real-time serving. Appropriate design, data velocity and infrastructure is essential to address these challenges.
Results
The GNN model has proved to be an effective tool for detecting fraudsters. Thousands of new fraudsters have been accurately detected resulting in millions of dollars in policy abuse savings annually. As our model continues to evolve, improvements are being made to increase the accuracy and number of detected fraudsters while minimizing any negative impact on honest customers. We are further investing in feeding account association graph features into our feature store which can be leveraged by multiple downstream ML tasks.
By leveraging the power of GNNs and continuously improving the underlying graph data structure, Wayfair aims to stay ahead of fraudsters and maintain a safe and secure online marketplace for all its customers.
Conclusion
In conclusion, the use of Graph Neural Networks for account hopper fraud detection in online retail has proved to be highly effective. With the ability to directly operate on graph data structures and capture complex relationships between customers, GNN models have enabled Wayfair to detect fraudsters and prevent fraudulent activity, resulting in significant cost savings for the business. The adoption of GNN models in industry is increasing, as they are applied to a range of problems in various domains. With the continuous improvement of GNN algorithms, we can expect to see further advancements in graph-based machine learning for fraud detection and other applications.
Resources
[1] Grigorieva, Maria. “Graph Database Neo4j”.
[2] A Gentle Introduction to Graph Neural Networks
[3] ETA Prediction with Graph Neural Networks in Google Maps
[4] Highly accurate protein structure prediction with AlphaFold
[5] Fake News Detection on Social Media using Geometric Deep Learning
[6] Semi-Supervised Classification with Graph Convolutional Networks