Payment Fraud
Introduction
This use case demonstrates a simple version of fraud detection, in the context of payment processing. We receive a large set of transactions from the organization, and need to detect fraudulent transactions among them.
Problem Description
(In this context, the organization is the party that handles payment processing.)
- Payment fraud impacts the bottom line of the organization.
- Payment fraud damages the reputation of the organization. As a result, other stakeholders such as consumers, merchants, and investors become hesitant to interact with the organization.
How is payment fraud committed?
- Most common approaches: using stolen cards, or account takeover (for example, by hacking a password).
- Other approaches: identity theft, where the fraudster acquires financial resources (such as cards) after stealing someone's identity.
Data schema
The following table describes the attributes used to train this model.
# | Column | Type | Description |
---|---|---|---|
1 | id | id | ID column |
2 | device | Categorical | Type of device used |
3 | os | Categorical | Operating system |
4 | channel | Categorical | Transaction channel |
5 | vertical | Categorical | Merchandise category (e.g. travel, electronics) |
6 | cust_past_txn_hour_01 | Continuous | Indicates whether the customer has previously transacted at the hour of the current transaction |
7 | dom_travel | Continuous | Number of domestic travels by the customer in the past 12 months |
8 | intl_travel | Continuous | Number of international travels by the customer in the past 12 months |
9 | ip_match | Categorical | Match between location from IP address, and location specified by the customer |
10 | latency_time | Continuous | Time needed for the signal to travel from the company server to the customer device and back (useful for identifying VPN usage) |
11 | avg_amt_vertical | Continuous | Avg amount for transactions (all customers) for the vertical (in the current transaction) in the recent past |
12 | max_amt_vertical | Continuous | Max amount for transactions (all customers) for the vertical (in the current transaction) in the recent past |
13 | email_domain | Categorical | Type of email domain (com, edu, etc.) |
14 | avg_amt_6m | Continuous | Avg transaction amount for the current customer in past 6 months |
15 | avg_amt_1m | Continuous | Avg transaction amount for the current customer in the past 1 month |
16 | ip3_fraud_rate | Continuous | Fraud rate (%) for given IP |
17 | amount | Continuous | Amount of the current transaction |
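As a rough illustration, the schema above can be written down as a Python mapping, which makes it easy to separate categorical from continuous attributes before any preprocessing. This is only a sketch: the `SCHEMA` name and the `columns_of_type` helper are hypothetical and are not part of Razorthink AI.

```python
# Hypothetical in-code copy of the schema table; not a Razorthink AI API.
SCHEMA = {
    "id": "id",
    "device": "categorical",
    "os": "categorical",
    "channel": "categorical",
    "vertical": "categorical",
    "cust_past_txn_hour_01": "continuous",
    "dom_travel": "continuous",
    "intl_travel": "continuous",
    "ip_match": "categorical",
    "latency_time": "continuous",
    "avg_amt_vertical": "continuous",
    "max_amt_vertical": "continuous",
    "email_domain": "categorical",
    "avg_amt_6m": "continuous",
    "avg_amt_1m": "continuous",
    "ip3_fraud_rate": "continuous",
    "amount": "continuous",
}


def columns_of_type(kind):
    """Return the column names whose declared type matches `kind`."""
    return [name for name, col_type in SCHEMA.items() if col_type == kind]


categorical = columns_of_type("categorical")
continuous = columns_of_type("continuous")
```

Keeping the types next to the names in one place means a change to the table only has to be mirrored once.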
How to set up in the platform
This is available as a sample project in Razorthink AI. Follow the steps below to run inference on this model:
- Sample inference file `payment-fraud-inference.csv` is available in the Community Space section of the Workspace. Copy this file to My Space by clicking on the copy-to-my-space button on the right. This file has the same attributes as described above.
- Open the sample project named `PAYMENT FRAUD` and run the pipeline named `Inference Pipeline`.
- The result of this inference will be saved in My Space in a file named `payment-fraud-result.csv` with a column named `fraud_predicted`.
- You can run inference with your own data (ensure that all the attributes listed above are present) by uploading your `CSV` file to My Space and updating the `fileName` attribute of `DataSource` and `DataTarget` in the run parameters of `Inference Pipeline`. You can change the run parameters in the screen that appears when you click on the run icon of the pipeline.
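Before uploading your own file, it may help to confirm locally that every attribute from the schema is present in the CSV header. The following is a minimal sketch using only the Python standard library; the `missing_columns` helper and the `REQUIRED_COLUMNS` list are hypothetical names introduced here, and the check runs outside the platform.

```python
import csv
import io

# Column names copied from the schema table; the helper below is hypothetical.
REQUIRED_COLUMNS = [
    "id", "device", "os", "channel", "vertical", "cust_past_txn_hour_01",
    "dom_travel", "intl_travel", "ip_match", "latency_time",
    "avg_amt_vertical", "max_amt_vertical", "email_domain",
    "avg_amt_6m", "avg_amt_1m", "ip3_fraud_rate", "amount",
]


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header, in schema order."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Usage with an in-memory file; locally, pass open("your-file.csv", newline="").
sample = io.StringIO("id,device,os\n1,mobile,android\n")
print(missing_columns(sample))
```

An empty list means the header contains every attribute listed in the schema. The check covers column names only, not value types.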
Concepts to know