This topic describes how to use Graph Compute to provide a solution for marketing risk control in the taxi industry.
Scenario
Marketing risk control in the taxi industry involves detecting joint brushing fraud, in which a taxi driver colludes with a passenger or with other taxi drivers to create fake orders. After a taxi order is completed, the risk control system is called to determine whether the order is a case of joint brushing fraud. To implement this scenario, you need to create a graph of the relationships among drivers, passengers, and orders, and provide real-time graph query and analysis capabilities.
Before you create the graph, you need to obtain all the mobile numbers and device IDs of the passenger and driver. The risk control system then counts the orders between the passenger and driver. If the count exceeds a threshold that you configure, the risk control system flags the passenger and driver as possibly creating fake orders.
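The threshold check described above can be sketched as follows. This is a minimal in-memory illustration, not the Graph Compute implementation. The function name `flag_suspicious_pairs`, the tuple layout of an order, and the threshold value are assumptions made for the example.

```python
from collections import Counter


def flag_suspicious_pairs(orders, threshold):
    """Return the (passenger_uid, driver_uid) pairs with more than `threshold` orders.

    `orders` is an iterable of (order_id, passenger_uid, driver_uid) tuples.
    """
    counts = Counter((passenger, driver) for _, passenger, driver in orders)
    return {pair for pair, count in counts.items() if count > threshold}


# Hypothetical sample data: p1 and d1 share three orders.
orders = [
    ("o1", "p1", "d1"),
    ("o2", "p1", "d1"),
    ("o3", "p1", "d1"),
    ("o4", "p2", "d1"),
]
suspicious = flag_suspicious_pairs(orders, threshold=2)
```

With a threshold of 2, only the pair that shares three orders is flagged. In practice the count would be taken across all historical identities of the passenger and driver, not a single UID each.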
The graph connects the following entities, in order: mobile numbers or device IDs of the passenger, unique ID (UID) of the passenger, order IDs, UID of the driver, and mobile numbers or device IDs of the driver.
Customer requirements
O&M-free: Build a real-time risk control system that does not require O&M.
Data integration: Provide the data integration capability based on the big data platform MaxCompute.
Business requirements: Build and publish a graph schema with ease.
Performance requirements: Ensure the efficiency of real-time data updates and reduce the latency for online access.
Create a graph schema
To evade fraud detection, passengers and drivers may change their mobile numbers, device IDs, and account IDs when they commit brushing fraud. After a taxi order is completed, the system obtains the account IDs, mobile numbers, and device IDs of the passenger and driver from the order. To determine whether the order is a case of joint brushing fraud, the system uses these mobile numbers and device IDs to query all historical IDs that the passenger and driver have used. Then, the system queries the orders that are related to the passenger and driver. The following queries are involved in this process:
1. Query all the mobile numbers and device IDs of the passenger or driver based on the account ID of the passenger or driver.
2. Query the UID of the passenger or driver based on the mobile numbers and device IDs of the passenger or driver.
3. Query the orders that are related to the passenger and driver based on the UIDs of the passenger and driver.
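The three queries above can be chained as in the following sketch. It uses plain Python dictionaries in place of graph tables, treats the account ID and the UID as the same value, and uses invented IDs. All of these are assumptions for illustration only.

```python
# Hypothetical lookup tables; every ID below is invented.
uid_to_identities = {
    "p1": {"mobile_138000", "imei_aaa"},
    "p2": {"mobile_138000", "imei_bbb"},  # shares a mobile number with p1
    "p3": {"mobile_137000"},
}
uid_to_orders = {"p1": {"o1", "o2"}, "p2": {"o3"}, "p3": {"o4"}}

# Reverse index: identity -> UIDs that have used it.
identity_to_uids = {}
for uid, identities in uid_to_identities.items():
    for identity in identities:
        identity_to_uids.setdefault(identity, set()).add(uid)


def related_orders(account_uid):
    """Return (all UIDs linked to the account, all orders of those UIDs)."""
    # Query 1: account ID -> mobile numbers and device IDs.
    identities = uid_to_identities.get(account_uid, set())
    # Query 2: mobile numbers and device IDs -> UIDs.
    uids = set()
    for identity in identities:
        uids |= identity_to_uids.get(identity, set())
    # Query 3: UIDs -> orders.
    orders = set()
    for uid in uids:
        orders |= uid_to_orders.get(uid, set())
    return uids, orders


uids, orders = related_orders("p1")
```

Because p1 and p2 share a mobile number, the lookup for p1 also returns the orders of p2. This is what lets the system follow a user who switches accounts.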
You can build the following suborder entity table and relational tables based on the graph:
Suborder entity table that uses a key-value (KV) index schema
Primary key (PKey): the ID of a suborder.
Value: other properties.
Relational tables that use a PKey-SKey-Value (KKV) index schema
Table used to obtain the ID of an order based on the UID of the passenger
PKey: the UID of the passenger.
Sub-key (SKey): the ID of the order.
Value: other properties.
Table used to obtain the UID of the passenger based on the ID of an order
PKey: the ID of the order.
SKey: the UID of the passenger.
Value: other properties.
Table used to obtain the identities of the passenger based on the UID of the passenger
PKey: the UID of the passenger.
SKey: the KV that consists of the mobile number, device ID, and payment account.
Value: other properties.
Table used to obtain the UID of the passenger based on the identities of the passenger
PKey: the KV that consists of the mobile number, device ID, and payment account.
SKey: the UID of the passenger.
Value: other properties.
Table used to obtain the ID of an order based on the UID of the driver
PKey: the UID of the driver.
SKey: the ID of the order.
Value: other properties.
Table used to obtain the UID of the driver based on the ID of an order
PKey: the ID of the order.
SKey: the UID of the driver.
Value: other properties.
Table used to obtain the identities of the driver based on the UID of the driver
PKey: the UID of the driver.
SKey: the KV that consists of the mobile number, device ID, and payment account.
Value: other properties.
Table used to obtain the UID of the driver based on the identities of the driver
PKey: the KV that consists of the mobile number, device ID, and payment account.
SKey: the UID of the driver.
Value: other properties.
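To make the two index schemas concrete, the following sketch models a KV table as a dictionary and a KKV table as a dictionary of dictionaries. It reflects only the PKey, SKey, and Value layout listed above, not how Graph Compute stores data. The class names, the `link` helper, and the sample IDs are hypothetical.

```python
class KVTable:
    """Entity table: PKey -> Value."""

    def __init__(self):
        self._rows = {}

    def put(self, pkey, value):
        self._rows[pkey] = value

    def get(self, pkey):
        return self._rows.get(pkey)


class KKVTable:
    """Relational table: PKey -> {SKey -> Value}."""

    def __init__(self):
        self._rows = {}

    def put(self, pkey, skey, value):
        self._rows.setdefault(pkey, {})[skey] = value

    def get(self, pkey):
        """Return a copy of all SKey -> Value pairs stored under pkey."""
        return dict(self._rows.get(pkey, {}))


# Entity table keyed by order ID.
order_table = KVTable()
order_table.put("o1", {"order_create": 1634221076929})

# One relational table per direction, as in the list above.
passenger_to_order = KKVTable()
order_to_passenger = KKVTable()


def link(passenger_uid, order_id):
    """Record the relationship in both directions."""
    passenger_to_order.put(passenger_uid, order_id, {})
    order_to_passenger.put(order_id, passenger_uid, {})


link("p1", "o1")
link("p1", "o2")
```

Each relationship appears in two tables because a KKV table can be looked up only by its PKey. Keeping one table per direction lets a query start from either end of the relationship.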
Solution
Based on Graph Compute, the solution queries risk control features within milliseconds, unifies how risk control features are developed and verified, and speeds up feature iteration by hundreds of times. This makes risk control more flexible.
The solution uses MaxCompute and Flink as the data sources of Graph Compute. You can query data in real time, aggregate the features over a duration, such as a week, month, or half a year, and add new risk control features within seconds.
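Aggregating a feature over several durations can be sketched as follows. The example counts orders in trailing windows of a week, a month, and half a year. The window lengths (7, 30, and 182 days), the function name, and the sample timestamps are assumptions, and the code runs in memory rather than against Graph Compute.

```python
from datetime import datetime, timedelta, timezone

# Assumed window lengths; the source names the durations but not their exact size.
WINDOWS = {
    "week": timedelta(days=7),
    "month": timedelta(days=30),
    "half_year": timedelta(days=182),
}


def to_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def order_count_features(order_times_ms, now):
    """Count the orders that fall in each trailing window ending at `now`."""
    now_ms = to_ms(now)
    features = {}
    for name, span in WINDOWS.items():
        cutoff_ms = to_ms(now - span)
        features[name] = sum(1 for t in order_times_ms if cutoff_ms <= t <= now_ms)
    return features


now = datetime(2021, 10, 15, tzinfo=timezone.utc)
# Hypothetical orders placed 1, 10, 100, and 300 days before `now`.
order_times = [to_ms(now - timedelta(days=d)) for d in (1, 10, 100, 300)]
features = order_count_features(order_times, now)
```

Adding a new feature in this sketch means adding one entry to `WINDOWS`, which mirrors the idea that new risk control features can be introduced without rebuilding the data.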
Benefits
The risk control system for the taxi industry is built on Graph Compute alone. This reduces the complexity of system O&M and allows the product and research and development (R&D) teams to focus on business value.
1. The system optimizes the rules for real-time risk control, improves the timeliness of the rules, and prevents asset losses caused by delays.
2. The system shortens the time required for a newly added risk control rule to take effect from days to minutes, which reduces fraud and protects normal business operations and promotions.
3. The system helps reduce R&D investment. Only one user is required to create and manage a Graph Compute instance.
4. Based on the OneID system, the new engine of Graph Compute prevents brushing fraud committed by the same person and provides a risk control middle platform that supports business globally.
Core competencies
1. Fast iteration of big data
The powerful offline data processing capability can be used to update terabytes of full data within minutes.
Real-time update pipelines support millions of update messages with low latency.
2. Intelligent O&M
The one-stop lifecycle management of Graph Compute instances allows a single user to process graph data with ease.
The powerful intelligent O&M system can automatically optimize storage based on business features and dynamically allocate resources among compute nodes and storage nodes. This makes the entire system more efficient and ensures high availability.
3. Pushdown of the computing logic
The computing logic of Gremlin is pushed down to the searcher layer to improve computing efficiency.
Scenarios
You need to identify two types of joint brushing fraud in the taxi industry: fraud committed jointly by multiple taxi drivers, and fraud committed jointly by a taxi driver and a passenger.
This section describes how to detect each type.
1. Joint brushing frauds committed by taxi drivers
Query all orders from the last day that are related to the mobile number, ID, Identifier for Vendors (IDFV), Identifier for Advertisers (IDFA), and International Mobile Equipment Identity (IMEI) of a driver.
From these orders, query the ones in which the passenger is also identified as a driver.
Example:
Mobile number of a driver: 138000
IDFA of the driver: hadsjflkhadlsgfag
IMEI of the driver: adfladfnahjahjkdf
Run the following code to implement the queries:
g.E("mobile_138000;idfa_hadsjflkhadlsgfag;imei_adfladfnahjahjkdf").by("driver_identity_union_id")
.outE().by("driver_cp_order") // The orders that are related to the device of the driver.
.inV().by("cp_order") // The details of the orders.
.filter("order_create >=1634221076929") // Keep orders created within the last day. The value is an example cutoff timestamp in milliseconds.
.filter(__.outE().by("cp_order_passenger").outE().by("passenger_user_id_identity").outE().by("driver_identity_union_id")) // Query orders in which passengers are identified as drivers.
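The logic of this traversal can be followed step by step in the in-memory sketch below. It is a Python emulation, not Graph Compute code. The dictionaries, the function names, and all sample IDs are hypothetical, and the semicolon-separated start key of the query is assumed to stand for several separate identities.

```python
DAY_MS = 24 * 60 * 60 * 1000


def last_day_cutoff_ms(now_ms):
    """Return the cutoff timestamp (epoch milliseconds) one day before now_ms."""
    return now_ms - DAY_MS


CUTOFF_MS = last_day_cutoff_ms(1634307476929)  # equals 1634221076929

# Hypothetical sample data; every ID below is invented.
identity_to_driver = {"mobile_138000": {"d1"}, "mobile_137000": {"d2"}}
driver_to_orders = {"d1": {"o1", "o2", "o3"}}
orders = {
    "o1": {"order_create": CUTOFF_MS + 1000, "passenger": "p1"},
    "o2": {"order_create": CUTOFF_MS + 2000, "passenger": "p2"},
    "o3": {"order_create": CUTOFF_MS - 1, "passenger": "p1"},  # older than one day
}
passenger_to_identities = {"p1": {"mobile_137000"}, "p2": {"mobile_136000"}}


def driver_ring_orders(driver_identities, cutoff_ms):
    """Return recent orders of the driver whose passenger is also a known driver."""
    # Driver identities -> driver IDs.
    drivers = set()
    for identity in driver_identities:
        drivers |= identity_to_driver.get(identity, set())
    hits = set()
    for driver in drivers:
        for order_id in driver_to_orders.get(driver, set()):
            order = orders[order_id]
            # Keep only orders created at or after the cutoff.
            if order["order_create"] < cutoff_ms:
                continue
            # Order -> passenger -> passenger identities -> any driver account.
            identities = passenger_to_identities.get(order["passenger"], set())
            if any(identity in identity_to_driver for identity in identities):
                hits.add(order_id)
    return hits


result = driver_ring_orders(["mobile_138000", "idfa_hadsjflkhadlsgfag"], CUTOFF_MS)
```

In the sample data, only order o1 is returned: o2 has a passenger with no driver identity, and o3 falls before the cutoff.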
2. Joint brushing frauds committed by passengers and drivers
Query all orders from the last day that are related to both the mobile number, IDFV, IDFA, and IMEI of a passenger and the mobile number, ID, IDFV, IDFA, and IMEI of the corresponding driver.
Example:
Mobile number of a passenger: 138000
IDFA of the passenger: hadsjflkhadlsgfag
IMEI of the passenger: adfladfnahjahjkdf
Mobile number of the corresponding driver: 139000
IDFA of the driver: nmyasdoykk
IMEI of the driver: pdasfhahkjnlad
Run the following code to implement the query:
g.E("mobile_138000;idfa_hadsjflkhadlsgfag;imei_adfladfnahjahjkdf").by("passenger_identity_user_id")
.outE().by("passenger_cp_order") // The IDs of the orders that are related to the device of the passenger.
.inV().by("cp_order") // The details of the orders of the passenger.
.filter("order_create >=1634221076929") // Keep orders created within the last day. The value is an example cutoff timestamp in milliseconds.
.where(outE().by("cp_order_driver").outE().by("driver_union_id_identity")
.values("identity").is(P.within("mobile_139000;idfa_nmyasdoykk;imei_pdasfhahkjnlad"))) // Keep orders whose driver matches one of the specified driver identities.
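The same check can be traced in the in-memory sketch below. It is a Python emulation of the query logic, not Graph Compute code. The dictionaries, the function name, and the sample orders are hypothetical, and the semicolon-separated identity strings in the query are assumed to stand for several separate identities.

```python
CUTOFF_MS = 1634221076929  # example cutoff timestamp taken from the query above

# Hypothetical sample data; the order and account IDs are invented.
identity_to_passenger = {
    "mobile_138000": {"p1"},
    "imei_adfladfnahjahjkdf": {"p1"},
}
passenger_to_orders = {"p1": {"o1", "o2", "o3"}}
orders = {
    "o1": {"order_create": CUTOFF_MS + 1000, "driver": "d1"},
    "o2": {"order_create": CUTOFF_MS + 2000, "driver": "d2"},
    "o3": {"order_create": CUTOFF_MS - 1, "driver": "d1"},  # older than the cutoff
}
driver_to_identities = {
    "d1": {"mobile_139000", "idfa_nmyasdoykk"},
    "d2": {"mobile_135000"},
}


def passenger_driver_orders(passenger_identities, driver_identities, cutoff_ms):
    """Return recent orders of the passenger that were served by the given driver."""
    wanted = set(driver_identities)
    # Passenger identities -> passenger UIDs.
    passengers = set()
    for identity in passenger_identities:
        passengers |= identity_to_passenger.get(identity, set())
    hits = set()
    for passenger in passengers:
        for order_id in passenger_to_orders.get(passenger, set()):
            order = orders[order_id]
            # Keep only orders created at or after the cutoff.
            if order["order_create"] < cutoff_ms:
                continue
            # Order -> driver -> driver identities; keep the order on any match.
            if driver_to_identities.get(order["driver"], set()) & wanted:
                hits.add(order_id)
    return hits


result = passenger_driver_orders(
    ["mobile_138000", "idfa_hadsjflkhadlsgfag", "imei_adfladfnahjahjkdf"],
    ["mobile_139000", "idfa_nmyasdoykk", "imei_pdasfhahkjnlad"],
    CUTOFF_MS,
)
```

The number of returned orders can then be compared with the configured threshold to decide whether the passenger and driver are flagged.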