This topic helps you get started with Graph Compute. It describes how to build a graph computing application for friend recommendation, and how to query and analyze multi-hop friend relations across millions of data records.
What is Graph Compute?
Graph Compute is a graph computing engine developed by Alibaba Cloud. The service supports and extends the online analytical processing (OLAP) feature and provides end-to-end graph computing solutions. Built on self-developed high-performance operators, Graph Compute allows you to import data and explore graph technologies efficiently. Graph Compute draws on the experience of Alibaba in industries such as e-commerce, security, and social networking, and provides graph technologies for enterprises and developers worldwide.
Why do you use Graph Compute for friend recommendation?
The data model used in social networking scenarios is a typical graph structure. Graph Compute provides a graph model for social networking businesses and expresses the various relationships in the data more directly. You can use Graph Compute to improve the development efficiency and quality of social networking applications and to reduce the overhead caused by converting and computing data relationships.
For example, graphs can model social relationships in scenarios such as searching for friends, making recommendations to new users, recommending friends, and searching chat records. Graph technologies can search for friends within a few milliseconds. Graph Compute defines users as vertices and the relationships between them as edges. This way, Graph Compute can query data by vertices and edges, and efficiently express and analyze the relationships and query results of data in complex graph structures.
How is the business logic of friend recommendation established?
This section uses a social app as an example. The app is a dating app for young people. It offers features such as virtual avatars, anonymous social networking, instant matching, co-stream parties, voice-controlled singing, and interest assessment. These features help users find friends who share similar interests and avoid awkward conversations. The app also provides a mind matching feature, which uses Graph Compute to recommend three-hop friends.
Based on the characteristics of the app, the final business logic can be described in the following figure.
Business scenario: A, B, C, and D are three-hop friends.
After three-hop friends are obtained, relation weights among friends are added, and then a sorted list of recommended friends is generated.
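The logic above can be illustrated with a minimal Python sketch that runs on an in-memory adjacency list instead of Graph Compute. The toy data, the function name, and the scoring rule (add the edge weights along each three-hop path and keep the best path per candidate) are assumptions made for illustration; the source does not specify how relation weights are combined.

```python
from collections import defaultdict

# Hypothetical toy data: EDGES[from_id] is a list of (to_id, weight) pairs.
EDGES = {
    "A": [("B", 0.9), ("E", 0.4)],
    "B": [("C", 0.8)],
    "E": [("C", 0.5)],
    "C": [("D", 0.7), ("F", 0.2)],
}


def three_hop_recommendations(edges, start, limit=10):
    """Rank users reachable in exactly three hops from `start`.

    Assumed scoring rule: sum the weights along a path and keep the
    highest-scoring path for each candidate.
    """
    best = defaultdict(float)
    for v1, w1 in edges.get(start, []):
        for v2, w2 in edges.get(v1, []):
            for v3, w3 in edges.get(v2, []):
                if v3 == start:
                    continue  # never recommend the user to themselves
                best[v3] = max(best[v3], w1 + w2 + w3)
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


recs = three_hop_recommendations(EDGES, "A")
print(recs)
```

In this toy graph, D and F are the only users three hops away from A, and D ranks first because the path through B carries the larger total weight.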
How is the graph model of friend recommendation created?
In the business scenario of graph queries, A, B, C, and D are three-hop friends. The structure of the graph model for Graph Compute is derived by extracting and analyzing this scenario. The following figure shows the structure. In the figure, each user is a vertex that carries the attributes of the user (birthday and gender), and each edge is a relationship between two users. The relation weight can be calculated by offline computing or analysis.
The user relations are expressed in typical U2U mode. In U2U mode, the system finds active users who share similar interests, and predicts the behavior trend of the current user based on the behaviors of the similar users. This way, the system can recommend friends to users with low activity.
The following table describes the methods for U2U extension.
Method | Advantage | Disadvantage |
1. User-based collaborative filtering | This method provides good user discoverability. | Side information is difficult to incorporate, such as the users that a user follows and the interests, school, age, and purchasing power of the user. In addition, sparse data may degrade the performance of this algorithm. |
2. Use vectors to express user characteristics and retrieve top k similar users | More side information is incorporated when user characteristics are explored. | Top k retrieval of similar users is not robust and requires a large amount of computing resources. |
3. Use clustered vectors to express user groups and retrieve top k similar user groups | Compared with recommending similar users, recommending similar user groups is more stable and requires fewer computing resources. | This method uses a single clustered vector to retrieve users, which may ignore user behaviors across categories. As a result, the recommended friends are only of the same type. |
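As a rough illustration of method 2 in the table, the following Python sketch retrieves the top k most similar users by brute-force cosine similarity. The vectors and function names are hypothetical, and cosine similarity is one common choice rather than a measure named by the source. The exhaustive scan over all users also hints at why this method is described as resource-intensive.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_similar(vectors, user_id, k):
    """Return the k users whose vectors are most similar to that of `user_id`."""
    target = vectors[user_id]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != user_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical user characteristic vectors.
USER_VECTORS = {
    "u1": [1.0, 0.0, 1.0],
    "u2": [1.0, 0.1, 0.9],
    "u3": [0.0, 1.0, 0.0],
    "u4": [0.5, 0.5, 0.5],
}

top = top_k_similar(USER_VECTORS, "u1", k=2)
print(top)
```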
How is a graph application created?
1. Create a cluster
Purchase a Graph Compute instance.
2. Define a graph model, and configure vertices and edge tables
Define a graph model | Configure vertices | Configure edge tables |
Example: A MaxCompute table named igraph_mock.vertex_user_demo is used as a vertex table. The vertex table contains 123,074 users. | Example: A MaxCompute table named igraph_mock.edge_relation_demo is used as an edge table. The edge table contains 399,879 relations. |
2.1 Create a graph model
(1) Create a graph.
Specify the graph name and graph description. The graph name is used in Gremlin statements to specify a graph whose configuration information you want to access. In this example, the graph name is user_relation_graph.
(2) Add vertices to create a vertex table.
(3) Add edges to a vertex.
2.2 Configure a vertex
2.3 Configure an edge
2.4 Publish the graph configurations
3. Publish indexes
3.1 Perform multiple data backflows at a time
Graph Compute provides services by using a distributed graph computing engine. Each time the graph configurations are updated and published, a data backflow is performed. Graph Compute allows you to perform multiple data backflows at a time, which simplifies schema changes for online data.
3.2 Perform data backflow on specific tables
4. Query graphs
In the preceding steps, a graph computing application is created and the required data is imported. You can now query the graph and analyze the query results.
You can query the graph interactively by using the graph exploration feature, or by using Gremlin statements.
Test examples of friend recommendation
1. Query the information about a user by user ID.
g("user_relation_graph").V("7949635553727122101").hasLabel("user")
The following results are returned:
{
"result": [
{
"data": [
{
"value": [
{
"label": "user",
"gender": "1",
"id": "7949635553727122101",
"starsign": "Aries"
}
],
"labels": [
[]
]
}
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
2. Query the information about multiple users by user ID in a single request.
g("user_relation_graph").V("2443269531561029504;4315033251719520021;6045530619721418713;-2441936916298108531;-6187501937134616998;-7902352812594818920;8829494226614398819;-788398966410862160").hasLabel("user")
The following results are returned:
{
"data": [
{
"label": "user",
"gender": "1",
"id": "-2441936916298108531",
"starsign": "Cancer"
},
{
"label": "user",
"gender": "1",
"id": "-6187501937134616998",
"starsign": "Gemini"
},
{
"label": "user",
"gender": "0",
"id": "-788398966410862160",
"starsign": "Capricorn"
},
{
"label": "user",
"gender": "0",
"id": "-7902352812594818920",
"starsign": "Aquarius"
},
{
"label": "user",
"gender": "0",
"id": "2443269531561029504",
"starsign": "Pisces"
},
{
"label": "user",
"gender": "1",
"id": "4315033251719520021",
"starsign": "Pisces"
},
{
"label": "user",
"gender": "1",
"id": "6045530619721418713",
"starsign": "Pisces"
},
{
"label": "user",
"gender": "1",
"id": "8829494226614398819",
"starsign": "Aquarius"
}
],
"error_info": [],
"trace_info": {}
}
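The two vertex queries above differ only in how many IDs are passed to V(): multiple IDs are joined with semicolons inside one string. A small Python helper, sketched below, can assemble such a query string from a list of IDs. The helper name is hypothetical, and it only builds the text of the statement; it does not escape its inputs or send anything to Graph Compute.

```python
def build_vertex_query(graph, ids, label):
    """Build a vertex lookup statement in the same shape as the examples above.

    IDs are joined with ';', which is how the multi-user example passes
    several vertex IDs in one V() call.
    """
    if not ids:
        raise ValueError("at least one vertex ID is required")
    joined = ";".join(str(vertex_id) for vertex_id in ids)
    return f'g("{graph}").V("{joined}").hasLabel("{label}")'


query = build_vertex_query(
    "user_relation_graph",
    ["2443269531561029504", "-2441936916298108531"],
    "user",
)
print(query)
```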
3. Query the one-hop friends of a user, sort them by score in descending order, and return the top 10.
g("user_relation_graph").E("7949635553727122101").hasLabel("relation").order().by("score",decr).limit(10).values("to_id")
The following results are returned:
{
"result": [
{
"data": [
"2557390182698651469",
"-5910095803510830870",
"-8777626058260080543",
"-3326503472333052856",
"-5628868613588358018",
"5693972407819734988",
"3169032466213709540",
"-6273932137952248996",
"85024782667881542",
"2490097926641478897"
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
4. For a user, query the three-hop friends whose zodiac sign is Aries.
g("user_relation_graph").E("-2441936916298108531").hasLabel("relation").outE().outE().inV().filter("starsign=\"Aries\"")
5. Query the number of friends of each zodiac sign for the user whose ID is 7949635553727122101.
g("user_relation_graph").E("7949635553727122101").hasLabel("relation").inV().groupCount().by("starsign")
The following results are returned:
{
"result": [
{
"data": [
{
"\"Aquarius\"": "7",
"\"Aries\"": "10",
"\"Cancer\"": "1",
"\"Capricorn\"": "3",
"\"Gemini\"": "3",
"\"Leo\"": "2",
"\"Libra\"": "2",
"\"Pisces\"": "9",
"\"Sagittarius\"": "7",
"\"Taurus\"": "4",
"\"Virgo\"": "2"
}
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
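In the groupCount() result above, each key is wrapped in an extra pair of escaped quotation marks and each count is returned as a string. The following Python sketch shows one way a client might normalize such a response. The sample is a shortened copy of the result above, the function name is hypothetical, and treating a non-empty error_info list as a failure is an assumption, because the source does not document the error format.

```python
import json

# Shortened copy of the response shown above (raw string keeps the backslashes).
SAMPLE = r'''
{
  "result": [
    {
      "data": [
        {"\"Aquarius\"": "7", "\"Aries\"": "10", "\"Cancer\"": "1"}
      ],
      "error_info": [],
      "trace_info": {}
    }
  ],
  "error_info": []
}
'''


def parse_group_count(raw):
    """Turn a groupCount() response into a plain {key: int} dictionary."""
    payload = json.loads(raw)
    # Assumption: a non-empty error_info list signals a failed query.
    if payload.get("error_info"):
        raise RuntimeError(f"query failed: {payload['error_info']}")
    counts = {}
    for block in payload["result"]:
        if block.get("error_info"):
            raise RuntimeError(f"query failed: {block['error_info']}")
        for row in block["data"]:
            for key, value in row.items():
                # Drop the embedded quotation marks and convert the count.
                counts[key.strip('"')] = int(value)
    return counts


counts = parse_group_count(SAMPLE)
print(counts)
```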
6. Query the total friend score for each zodiac sign for the user whose ID is 7949635553727122101.
g("user_relation_graph").withSack(supplier(normal,"0.0"),Splitter.identity,Operator.sum).E("7949635553727122101").hasLabel("relation").sack(Operator.sum).by("score").inV().group().by("starsign").by(sack().sum())
The following results are returned:
{
"result": [
{
"data": [
{
"\"Aquarius\"": "99.44964174192042",
"\"Aries\"": "171.6614835163086",
"\"Cancer\"": "15.8359302136816",
"\"Capricorn\"": "42.44029297266677",
"\"Gemini\"": "34.2056092615523",
"\"Leo\"": "16.46936742222886",
"\"Libra\"": "24.07061392479605",
"\"Pisces\"": "129.85462775218915",
"\"Sagittarius\"": "86.14746036242798",
"\"Scorpio\"": "6.33437208547264",
"\"Taurus\"": "41.80685576411946",
"\"Virgo\"": "33.572172053005"
}
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
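Read at face value, the statement in example 6 adds the score of each outgoing relation edge to a sack, moves to the friend vertex, and then sums the sack values per starsign. The Python sketch below reproduces that aggregation on hypothetical in-memory data to make the intent concrete. It is an interpretation of the query under those assumptions, not a description of how Graph Compute executes it.

```python
from collections import defaultdict

# Hypothetical data: starsign of each friend, and scored outgoing edges.
STARSIGN = {"u1": "Aries", "u2": "Pisces", "u3": "Aries"}
OUT_EDGES = {"me": [("u1", 4.0), ("u2", 2.5), ("u3", 1.5)]}


def score_by_starsign(out_edges, starsign, user_id):
    """Sum the edge scores of a user's friends, grouped by the friend's starsign."""
    totals = defaultdict(float)
    for to_id, score in out_edges.get(user_id, []):
        totals[starsign[to_id]] += score
    return dict(totals)


totals = score_by_starsign(OUT_EDGES, STARSIGN, "me")
print(totals)
```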
7. Query the three-hop friends whose zodiac sign is Aries for the user whose ID is -2441936916298108531, calculate the relation weights among friends, sort the relation weights in descending order, and then obtain the top 10 friends.
g("user_relation_graph").withSack(supplier(normal,"0.0"),Splitter.identity,Operator.sum).E("-2441936916298108531").hasLabel("relation").sack(Operator.sum).by("score").outE().sack(Operator.sum).by("score").outE().sack(Operator.sum).by("score").inV().filter("starsign=\"Aries\"").values("id").barrier().dedup().order().by(sack(),decr).limit(10)
The following results are returned:
{
"result": [
{
"data": [
"-4985651325249407669",
"-966745601209007179",
"300594519935616602",
"8616477414455953382",
"3410211067444088094",
"7361520262922301828",
"4419627442674893942",
"1684980613157243612",
"-3968869064747091877",
"7376060565223003509"
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
Advanced capabilities - high-performance full graph statistics
Business feature: You can collect global statistics on users. Building on graph queries, Graph Compute also integrates inverted index queries for this purpose.
1. In the data configuration panel of a vertex, set the Index Type parameter to Inverted INDEX, configure field attributes, and then click Submit.
2. Save the graph configurations and publish the graph.
3. Perform O&M operations on the graph.
Test examples
1. Query all the users whose zodiac sign is Aries.
g("user_relation_graph").V().hasLabel("user").indexQuery("{\"match\":{\"starsign\":\"Aries\"},\"config\":{\"seek_count_limit_per_shard\":100000,\"search_count_limit_per_shard\":100000}}")
The following results are returned:
2. Query the zodiac sign distribution of all male users. To ensure performance, only part of the data is queried.
g("user_relation_graph").V().hasLabel("user").indexQuery("{\"match\":{\"gender\":\"1\"},\"config\":{\"seek_count_limit_per_shard\":100000,\"search_count_limit_per_shard\":100000}}").groupCount().by("starsign")
The following results are returned:
{
"result": [
{
"data": [
{
"\"Aquarius\"": 825,
"\"Aries\"": 640,
"\"Cancer\"": 923,
"\"Capricorn\"": 848,
"\"Gemini\"": 773,
"\"Leo\"": 821,
"\"Libra\"": 522,
"\"Pisces\"": 2397,
"\"Sagittarius\"": 491,
"\"Scorpio\"": 501,
"\"Taurus\"": 739,
"\"Virgo\"": 520
}
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
3. Query the top 100 male users who have the most friends and whose zodiac sign is Aries.
g("user_relation_graph").V().hasLabel("user").indexQuery("{\"and\":[{\"match\":{\"gender\":\"1\"}},{\"match\":{\"starsign\":\"Aries\"}}],\"config\":{\"seek_count_limit_per_shard\":100000,\"search_count_limit_per_shard\":100000}}").limit(10000).outE().groupCount().by("from_id").unfold().order().by(select(Column.values),decr).limit(100)
The following results are returned:
{
"result": [
{
"data": [
{
"-7032647615083234229": "50"
},
{
"-5722782251601168066": "50"
},
{
"-5335242748220558153": "50"
},
{
"-8676408452255309391": "50"
},
{
"-6047928004364318541": "50"
},
{
"-5344466669668822162": "50"
},
{
"-6190401221243138849": "50"
},
{
"-4827852736929428415": "50"
},
{
"-5307302829373746633": "50"
},
{
"-5475401520922089187": "50"
},
{
"-8017336865667734219": "50"
},
{
"-8152630370117271740": "50"
},
{
"-7004109459328652310": "50"
},
{
"-7817593255334792111": "50"
},
{
"-5272574182922494022": "50"
},
{
"-8523896507246731100": "50"
},
{
"-5938745235467206212": "50"
},
{
"-8438867826624678384": "50"
},
{
"-4405449005585220943": "50"
},
{
"-9126301390881979643": "50"
},
{
"-5382489933968271711": "50"
},
{
"-8841373379965641987": "50"
},
{
"-5302202017699941647": "50"
},
{
"-6248244606214120159": "50"
},
{
"-6711633791333564554": "50"
},
{
"-7315782827574470472": "50"
},
{
"-9031277985752489187": "50"
},
{
"-9022869359443119815": "50"
},
{
"-8369470302541518920": "50"
},
{
"-6142766699127849771": "50"
},
{
"-8655738446138261193": "50"
},
{
"-8646149194767113790": "50"
},
{
"-9097183937680782346": "50"
},
{
"-4641376756135904334": "50"
},
{
"-9081172527938898024": "50"
},
{
"-7137344541799001227": "50"
},
{
"-8491934123275310192": "50"
},
{
"-5656245743850590165": "50"
},
{
"-8448876639884702547": "50"
},
{
"-8921032424157220292": "50"
},
{
"-8886790874757451152": "50"
},
{
"-4170067740591839020": "50"
},
{
"-8412401447690411889": "50"
},
{
"-6935600597565680532": "50"
},
{
"-6465292696804515998": "50"
},
{
"-4580178566813967734": "50"
},
{
"-8337980039891383510": "50"
},
{
"-6750094915823112945": "50"
},
{
"-8271130662107074869": "50"
},
{
"-4313989045148439251": "50"
},
{
"-7072971882028192758": "50"
},
{
"-6081673847669716322": "50"
},
{
"-8215720337033905961": "50"
},
{
"-4511660190549716707": "50"
},
{
"-6954636645089290492": "50"
},
{
"-5470987491723790775": "50"
},
{
"-8693909718856724970": "50"
},
{
"-9172299426900892564": "50"
},
{
"-6142070315717702174": "50"
},
{
"-8683222889153361552": "50"
},
{
"-6061245317590715601": "50"
},
{
"-8680938277196498242": "50"
},
{
"-6382251776295701372": "50"
},
{
"-7953204420754362673": "50"
},
{
"-5390210842719212579": "50"
},
{
"-7836046658486786908": "50"
},
{
"-7151222483126499509": "50"
},
{
"-7824880497332818548": "50"
},
{
"-5914268273571897378": "50"
},
{
"-6703760236487855527": "50"
},
{
"-6251439297928817838": "50"
},
{
"-7764254764054046817": "50"
},
{
"-4147078275545366756": "50"
},
{
"-8961736465346823903": "50"
},
{
"-6966728454894457236": "50"
},
{
"-7670177853018881196": "50"
},
{
"-5753132289681546447": "50"
},
{
"-7607424647216651656": "50"
},
{
"-6860348792160569972": "50"
},
{
"-7081692058360204084": "50"
},
{
"-4284841470230159060": "50"
},
{
"-7050130964435971895": "50"
},
{
"-5460635829467019773": "50"
},
{
"-7486065564917180528": "50"
},
{
"-4775647028251006118": "50"
},
{
"-7438712711008287507": "50"
},
{
"-6655259811991911369": "50"
},
{
"-8414061350601269280": "50"
},
{
"-4138667890938143127": "50"
},
{
"-7428739044579099070": "50"
},
{
"-5783820100141303990": "50"
},
{
"-7104520570549970103": "50"
},
{
"-5351128558805225544": "50"
},
{
"-3919387380195528267": "50"
},
{
"-3924714660733547258": "50"
},
{
"-4056394674602078332": "50"
},
{
"-4062831157333201368": "50"
},
{
"-4812848481660416581": "50"
},
{
"-6513421419955124633": "50"
},
{
"-7028635231904837853": "50"
}
],
"error_info": [],
"trace_info": {}
}
],
"error_info": []
}
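The indexQuery() argument in the test examples is a JSON document embedded in a Gremlin string literal, so every quotation mark must be escaped by hand. One way to avoid escaping mistakes is to generate the argument programmatically. The following Python sketch serializes the condition with json.dumps and then serializes the resulting string again to produce the escaped literal. The helper name and its parameters are hypothetical, and the sketch only builds the statement text.

```python
import json


def build_index_query(graph, label, condition, limit_per_shard=100000):
    """Build an indexQuery() statement in the same shape as the examples above."""
    body = dict(condition)
    body["config"] = {
        "seek_count_limit_per_shard": limit_per_shard,
        "search_count_limit_per_shard": limit_per_shard,
    }
    # Compact JSON first, then dump the string itself to get an escaped literal.
    compact = json.dumps(body, separators=(",", ":"))
    literal = json.dumps(compact)
    return f'g("{graph}").V().hasLabel("{label}").indexQuery({literal})'


single = build_index_query(
    "user_relation_graph", "user", {"match": {"starsign": "Aries"}}
)
combined = build_index_query(
    "user_relation_graph",
    "user",
    {"and": [{"match": {"gender": "1"}}, {"match": {"starsign": "Aries"}}]},
)
print(single)
print(combined)
```

Further steps such as groupCount() or limit() can be appended to the returned string in the same way as in the examples.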