The following content is based on a speech presented by an expert at Alibaba Cloud. This article covers the development of an enterprise-level data empowerment system that is closed-loop, accumulative, and sustainable, and introduces Umeng DataBank.
Our goal is to quickly form a closed loop that covers data at different enterprise touchpoints and then bring together the scattered data to quickly use it to empower business. In this effort, we must consider four key steps. The first is the transformation of businesses into data. For this step, we must consider whether all the touchpoints of an enterprise are authentic and connected. The second is the transformation of data into assets. For this step, we must consider whether data can be managed as assets. The third is the transformation of assets into applications. For this step, we must consider whether an enterprise's assets are effectively applied. The final step is the transformation of applications into value. For this step, we must consider how to leverage data assets to empower businesses. The ultimate purpose of all applications is to boost business growth, customer acquisition, and value production. To achieve these goals, it is essential for us to form a closed loop that accumulates data. Ultimately, the data mid-end and data energy must be sustainable.
The following figure shows how we built a closed-loop, accumulative, and sustainable enterprise-level data empowerment system. The figure shows the enterprise-oriented data bank to be released by Umeng+. Now, let's discuss how data banks and businesses work with each other. Built on cloud infrastructure such as MaxCompute, Umeng DataBank continuously helps enterprises collect data from various scenarios and touchpoints, govern and purify that data, process it with models, and turn it into various application services. Based on the connection capabilities of UMID, multiple accounts and terminals are normalized, and data can be connected across different terminals, specifically different mobile clients, servers, and client platforms. All of this helps developers gather data assets from all scenarios and touchpoints and manage applications.
Cross-terminal user operations involve two issues. First, does the company's data located on external media flow back to the company, and can that backflow data be reused? Second, are users gathered into a user pool through marketing activities, and can operations be performed efficiently for users on different terminals? In fact, in addition to marketing, enterprises have many other user touchpoints, which in China include TopBuzz (Toutiao), Weibo, and TikTok (Douyin) accounts. All user asset data must be interconnected to tap into its true value. If you are working on search recommendations, in addition to advanced model algorithms, your company must have a data foundation and collect normalized user behavior data that flows back from each touchpoint. By feeding this data into your search engine, you can make it more intelligent. For example, by incorporating data on served ads into subsequent searches, you can recommend ad content that the customer has interacted with in the past.
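The ad example above can be illustrated with a minimal re-ranking sketch. It is not taken from the speech: the function name `rerank`, the fixed `boost` value, and the sample items are all hypothetical, and a real search engine would more likely learn such weights than hard-code them.

```python
def rerank(results, interacted_ids, boost=0.2):
    """Boost search results the user previously interacted with (e.g. clicked ads).

    results: list of (item_id, base_score) pairs from the search engine.
    interacted_ids: set of item ids seen in backflow behavior data (hypothetical input).
    Returns the pairs sorted by boosted score, highest first.
    """
    rescored = [
        (item_id, score + boost if item_id in interacted_ids else score)
        for item_id, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


results = [("shoes", 0.70), ("bag", 0.65), ("hat", 0.40)]
ranked_ids = [item_id for item_id, _ in rerank(results, {"bag"})]
print(ranked_ids)  # ['bag', 'shoes', 'hat']
```

The point of the sketch is only that normalized backflow data (here, the set of interacted item ids) becomes one more input to ranking.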
Every company needs to build its own data bank. For example, in the Alibaba ecosystem, we have millions of merchants during the Double 11 Shopping Festival, and many brands and merchants have built data banks on Alibaba. Similarly, Umeng+ has been deeply engaged in data intelligence services for nine years. Drawing on its experience of serving millions of Internet enterprises, Umeng+ launched Umeng DataBank for developers and, together with Alibaba Cloud MaxCompute, formed a set of core solutions for users. Data banks are required for solving several problems. Data banks can solve the problem of data asset management and the application of data, which can be expressed in four words: collection, construction, management, and use. Businesses can be transformed into data, and data can be transformed into assets. This process involves data collection and the conversion of terminal data into data assets. Next, these assets are applied. In this process, we push various messages and use marketing to obtain new customers, which includes app push and various operational recommendations. These services all can be provided by data banks.
Umeng DataBank includes three products to help users solve three types of problems. As shown below, the first product is smart data collection (U-SDC) and the second is the customer data platform (U-CDP). These products help enterprises accumulate data assets and provide efficient services to business departments, operations teams, and marketing teams. The third product is the data open platform (U-DOP), which integrates and analyzes collected data by using business data in Umeng Cloud. U-DOP provides comprehensive insights into users and more scenario-based application data.
AI and smart engine products are essentially data production and collection products. Collection is the foundation of data quality, so the efficiency and quality of data collection are crucial. Data collection requires you to answer several questions. Do you have full control over your company's data tracking? Do you understand how data should be tracked in a given scenario? What kind of data will be generated after tracking? Are the tracking points correct and valid? Tracking is a long-term operation. You must constantly verify that tracking is healthy and that it ultimately relates to the fundamental questions you're concerned with. If data tracking is inappropriate, then all the capabilities based on it, such as AI, will not function properly.
Umeng+'s U-SDC aims to solve these problems. It makes user tracking visible, controllable, and manageable and recommends optimal solutions for user tracking. In this way, user tracking can be intelligently debugged and verified, greatly reducing the cost of tracking and collection and, in turn, fundamentally improving the data quality and ensuring the value and quality of data assets.
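To make "verifying that tracking points are correct and valid" concrete, here is a minimal sketch that checks incoming events against a tracking plan. It does not describe how U-SDC works internally; `EventSpec`, `validate_event`, the `PLAN` contents, and the event shape are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventSpec:
    """Hypothetical tracking-plan entry: required property names and their types."""
    name: str
    required: dict = field(default_factory=dict)  # property name -> expected type


def validate_event(event, plan):
    """Return a list of problems; an empty list means the event matches the plan."""
    spec = plan.get(event.get("event"))
    if spec is None:
        return [f"unknown event: {event.get('event')!r}"]
    problems = []
    props = event.get("properties", {})
    for key, expected_type in spec.required.items():
        if key not in props:
            problems.append(f"missing property: {key}")
        elif not isinstance(props[key], expected_type):
            problems.append(f"wrong type for {key}: expected {expected_type.__name__}")
    return problems


# Hypothetical plan with a single tracked event.
PLAN = {"add_to_cart": EventSpec("add_to_cart", {"sku": str, "price": float})}

good = {"event": "add_to_cart", "properties": {"sku": "A1", "price": 9.9}}
bad = {"event": "add_to_cart", "properties": {"sku": "A1", "price": "9.9"}}
print(validate_event(good, PLAN))  # []
print(validate_event(bad, PLAN))   # ['wrong type for price: expected float']
```

Running a check like this continuously, rather than once at release time, is one way to treat tracking as the long-term operation described above.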
After data is collected, the most important thing is to solve the user asset problem. First, user asset management must solve the problems of trust and normalization. Data is created at many touchpoints. Among all the requests sent to an app, many represent fraudulent traffic. So, how can we ensure that devices are trustworthy? Based on the connection capabilities of UMID, multiple accounts and multiple terminals are normalized. By connecting data across different terminals (different mobile clients, servers, and client platforms), UMID supports stream conversion and relationship insights. After normalization, we can create an automatic tag production library to ensure efficient tag production in private domains. This empowers business teams to quickly create tags, gain insights, identify target users, and ultimately take operational actions for customers.
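The idea of normalizing multiple accounts and terminals into one user can be sketched with a union-find structure. This is a generic identity-resolution technique chosen for illustration, not a description of how UMID is implemented; the class name `IdentityGraph` and the identifier strings are hypothetical.

```python
class IdentityGraph:
    """Union-find over identifiers; each connected group stands for one user."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(x, x)
        # Walk up to the root of the group.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Compress the path so later lookups are faster.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def link(self, a, b):
        """Record that a and b were observed together, e.g. a login on a device."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_user(self, a, b):
        return self.find(a) == self.find(b)


graph = IdentityGraph()
graph.link("device:ios-123", "account:alice")
graph.link("account:alice", "device:web-789")
graph.link("device:android-456", "account:bob")
print(graph.same_user("device:ios-123", "device:web-789"))      # True
print(graph.same_user("device:ios-123", "device:android-456"))  # False
```

Once identifiers are grouped this way, behavior from every terminal in a group can be attributed to a single user before tags are produced. A production system would also need the trust checks mentioned above, since linking through fraudulent devices would merge unrelated users.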
When tags are instead produced by hand, the operations team is often dissatisfied, and the technical team lacks a sense of accomplishment because their daily work generally consists of running SQL statements and other tedious and fragmented tasks. Enterprises need to think about how to handle such situations efficiently. Umeng+ hopes that Umeng DataBank will enable the production of preset private domain tags. This way, as long as the technical team does a good job in data tracking, they will not have to do much more work. All products must support operations. The platform enables quick configuration and production, empowers business teams, and allows them to preconfigure private domain tags. Configuration means production. In addition, Umeng DataBank provides a new capability, global domain tags. Private domain tags are only used for customer labeling and insight. Umeng+ will also support global domain tags to identify the interests of different users, so as to understand and label users in more dimensions. In the future, Umeng+ plans to work with other enterprises to jointly establish a tag lab in which the parties share their different data. Integrated computing can be used to improve tag performance and better serve other enterprises.
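"Configuration means production" can be illustrated with a small rule-driven tagger, in which a business user edits a rule list instead of asking engineers to write SQL. The rule format, field names, thresholds, and the `produce_tags` function are all hypothetical and are not Umeng DataBank's actual configuration schema.

```python
import operator

# Comparison operators a rule is allowed to reference.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}

# Hypothetical private domain tag rules, as a business team might configure them.
TAG_RULES = [
    {"tag": "high_value", "field": "orders_30d", "op": ">=", "value": 5},
    {"tag": "dormant", "field": "days_since_last_visit", "op": ">=", "value": 30},
]


def produce_tags(profile, rules):
    """Return the tags whose rule matches the user profile, in rule order."""
    tags = []
    for rule in rules:
        if rule["field"] not in profile:
            continue  # Skip rules whose field was never tracked for this user.
        compare = OPS[rule["op"]]
        if compare(profile[rule["field"]], rule["value"]):
            tags.append(rule["tag"])
    return tags


active_buyer = {"orders_30d": 7, "days_since_last_visit": 2}
lapsed_user = {"orders_30d": 0, "days_since_last_visit": 45}
print(produce_tags(active_buyer, TAG_RULES))  # ['high_value']
print(produce_tags(lapsed_user, TAG_RULES))   # ['dormant']
```

Adding a tag in this sketch means appending one entry to `TAG_RULES`, which is why good tracking data is the main thing the technical team still has to guarantee.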
After collecting data, Umeng+ integrates it with the customer's data. By seamlessly connecting with MaxCompute in the cloud, Umeng+ supports greater openness and data return capabilities.
Seamlessly integrating Umeng+ with the MaxCompute cloud data warehouse not only improves processing performance but also makes the system easier and more convenient to use. Umeng+ preconfigures all model packages and tables for users and interconnects data. This means the data is ready to use right away.
No matter how powerful an enterprise's containers, databases, and algorithms are, or how intelligent its applications are, it is necessary to go back to the four key steps. First, we must transform businesses into data to manage data collection and quality. Second, we must transform data into assets. This gives management a clear understanding of user data assets, the number of terminals, the number of touchpoints, the data generated each day, and the number of accumulated users. Third, we must transform assets into applications. Accumulated data should be quickly converted to applications that serve the business team. This way, the business team can better innovate with the help of technology and data and will not have to wait for resources. The most fundamental thing is to build scenarios and closed data loops that cover all touchpoints and business behaviors. Such scenarios and closed loops can accumulate data assets. This is the only way to ensure that the enterprise mid-end and data empowerment are sustainable and that the power of data grows richer and better throughout the data utilization process.