How should we build the architecture of a big data platform? This article studies the case of OpSmart Technology to elaborate on the business and data architecture of the IoT for enterprises.
Abstract: How should we design the architecture of a big data platform? Are there any good use cases for this architecture? This article studies the case of OpSmart Technology to elaborate on the business and data architecture of the Internet of Things for enterprises, as well as considerations during the technology selection process.
Based on the "Internet + big data + airport" model, OpSmart Technology provides on-the-go wireless network connectivity to 640 million users every year. As the business expanded, OpSmart Technology faced the challenge of rapidly growing data volumes. To cope with this, OpSmart Technology took the lead in building an industry-leading big data platform in 2016 with Alibaba Cloud products.
Below are some tips shared by OpSmart Technology's big data platform architect:
OpSmart Technology's business architecture is shown in the figure above. Our primary business model is to collect data through our own devices, explore value in the data, and then apply the data to our business.
On the data collection layer, we founded the first official Wi-Fi brand for airports in China, "Airport-Free-WiFi", which covers 25 hub airports and 39 hub high-speed rail stations nationwide and provides on-the-go wireless network services to 640 million people each year. We also operate the nation's largest Wi-Fi network for driving schools, which is expected to cover 1,500-plus driving schools by the end of 2017. We are the Wi-Fi provider for China's four major auto shows (Beijing, Shanghai, Guangzhou, and Chengdu), serving more than 1.2 million people. In addition, we run the Wi-Fi network for 2,000-plus gas stations and 600-plus automobile 4S (sales, spare parts, service, survey) stores across the country.
On the data application layer, we connected online and offline behavioral data for user profiling to provide more efficient and precise advertisement targeting across our supply-side platform (SSP), demand-side platform (DSP), data management platform (DMP), and real-time bidding (RTB) services. We also worked with the Ministry of Public Security to eliminate public network security threats.
OpSmart Technology's big data and advertising platforms also offer technical capabilities for enterprises to help them establish their own big data platforms and improve their operation management efficiency with a wealth of quantitative data.
We abstracted our data architecture into a number of themes, as shown in the figure. The subject in the figure can be understood as users, and the object can be understood as things. The subject and object are connected in various ways. These connections are established in time and space and are carried over computer and telecommunication networks. The subject has its own reflection in the connection network, which can be understood as a virtual identity (an avatar). The object also has its own reflection in the connection network, such as the Wikipedia description of a topic, or a commercialized product or service. These reflections are then packaged by advertisements as advertising images. All of these are object mirrors. The interaction between the subject and the object is actually the interaction between the subject image and the object image, and such interactions leave traces in both time and space.
The individual and group characteristics of the subject and object, as well as the subject-object relationships, all constitute big data. Through in-depth mining and learning, this information will give birth to powerful insights and have immeasurable value to businesses.
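One way to make the abstraction above concrete is a small record model. The following is a minimal sketch, not OpSmart Technology's actual schema: the class names (SubjectImage, ObjectImage, Interaction), their fields, and the sample values are all hypothetical. It only illustrates that each interaction links a subject image to an object image and carries a trace in time and space.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SubjectImage:
    """Virtual identity (avatar) of a user in the connection network."""
    avatar_id: str


@dataclass(frozen=True)
class ObjectImage:
    """Mirror of a thing, e.g. a product description or an advertising image."""
    object_id: str
    kind: str  # hypothetical label such as "product" or "ad"


@dataclass(frozen=True)
class Interaction:
    """One subject-object interaction with its trace in time and space."""
    subject: SubjectImage
    obj: ObjectImage
    timestamp: datetime
    location: str  # hypothetical place label, e.g. an airport code


def objects_by_subject(interactions):
    """Group the object ids each subject image interacted with, in time order."""
    grouped = defaultdict(list)
    for event in sorted(interactions, key=lambda e: e.timestamp):
        grouped[event.subject.avatar_id].append(event.obj.object_id)
    return dict(grouped)


# Made-up sample data for illustration only.
alice = SubjectImage("avatar-1")
ad = ObjectImage("ad-42", "ad")
product = ObjectImage("product-7", "product")
events = [
    Interaction(alice, product, datetime(2017, 5, 2, 9, 30), "PEK"),
    Interaction(alice, ad, datetime(2017, 5, 1, 18, 0), "PVG"),
]
profile = objects_by_subject(events)
```

Grouping interactions per subject image in this way is the simplest form of the individual characteristics mentioned above; group characteristics and subject-object relationships would be further aggregations over the same records.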
This article mainly describes the architecture and features of Alibaba Cloud's general-purpose computing engine MaxCompute.
First, let's look at some background information about big data technologies at Alibaba. As shown in the following figure, Alibaba began to establish a network of big data technologies very early, and it is safe to say that Alibaba Cloud was founded to help Alibaba solve technical problems related to big data. Currently, almost all Alibaba business units use big data technologies, which are applied both widely and deeply across Alibaba. Additionally, the whole set of big data systems at Alibaba Group is integrated.
The Alibaba Cloud Computing Platform business unit is responsible for the integration of Alibaba big data systems and for R&D related to storage and computing across the whole Alibaba Group. The following figure shows the structure of the Alibaba big data platform. The underlying layer is the unified storage platform, Apsara Distributed File System, which is responsible for storing big data. Storage is static, and computing is required to mine value from data. Therefore, the Alibaba big data platform also provides a variety of computing resources, including CPU, GPU, FPGA, and ASIC. To make better use of these computing resources, we need unified resource abstraction and efficient management. The unified resource management system in Alibaba Cloud is called Job Scheduler. Based on this resource management and scheduling system, Alibaba Cloud has developed a variety of computing engines, such as the general-purpose computing engine MaxCompute, the stream computing engine Blink, the machine learning engine PAI, and the graph computing engine Flash. In addition to these computing engines, the big data platform also provides various development environments, on which many services are implemented.
Organizations need to invest in appropriate data models to draw insights from them. This article gives an overview of data modeling methods and introduces Alibaba Cloud’s Big Data modeling practices.
The explosive growth of the Internet, smart devices, and other forms of information technology in the DT (data technology) era has seen data growing at an equally impressive rate. The challenge of the era, it seems, is how to classify, organize, and store all of this data.
In a library, we need to classify all books and arrange them on shelves to make sure we can easily access every book. Similarly, if we have massive amounts of data, we need a system or a method to keep everything in order. The process of sorting and storing data is called "data modeling".
A data model is a method by which we can organize and store data. Just as the Dewey Decimal System organizes the books in a library, a data model helps us arrange data according to service, access, and use. Linus Torvalds, the creator of Linux, alluded to the importance of data modeling when he wrote about what makes an excellent programmer: “Poor programmers care about code, and good programmers care about the data structure and the relationships between data”. Appropriate models and storage environments offer the following benefits to big data:
• Performance: Good data models can help us quickly query the required data and reduce the amount of I/O needed.
• Cost: Good data models can significantly reduce unnecessary data redundancy, reuse computing results, and reduce the storage and computing costs for the big data system.
• Efficiency: Good data models can greatly improve user experience and increase the efficiency of data utilization.
• Quality: Good data models make data statistics more consistent and reduce the possibility of computing errors.
Therefore, a big data system clearly requires high-quality data modeling methods for organizing and storing data, allowing us to reach the optimal balance of performance, cost, efficiency, and quality.
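As a rough illustration of the cost and quality points above, the sketch below compares a flat log that repeats a descriptive attribute on every row with a model that stores each attribute once and references it by key. The layout, field names, and sample rows are assumptions made for this example; the text does not prescribe a specific modeling method here.

```python
# Flat records: the device's site name is repeated on every row (made-up data).
flat_rows = [
    {"device_id": "d1", "site": "Airport A", "bytes": 120},
    {"device_id": "d1", "site": "Airport A", "bytes": 80},
    {"device_id": "d2", "site": "Station B", "bytes": 50},
]


def split_into_model(rows):
    """Split flat rows into a device lookup table and slim measurement rows."""
    devices = {}
    measurements = []
    for row in rows:
        # Each device's descriptive attribute is stored once, keyed by id.
        devices[row["device_id"]] = {"site": row["site"]}
        measurements.append({"device_id": row["device_id"], "bytes": row["bytes"]})
    return devices, measurements


def bytes_per_site(devices, measurements):
    """Aggregate traffic per site by joining measurements to the lookup table."""
    totals = {}
    for m in measurements:
        site = devices[m["device_id"]]["site"]
        totals[site] = totals.get(site, 0) + m["bytes"]
    return totals


devices, measurements = split_into_model(flat_rows)
totals = bytes_per_site(devices, measurements)
```

Because the site name lives in one place, renaming a site is a single update rather than a rewrite of every log row, which is the kind of consistency the quality point refers to.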
Learn how to utilize data to make better business decisions. Optimize Alibaba Cloud's big data products to get the most value out of your data.
This course briefly introduces the Alibaba Cloud big data product system and several products used in big data applications, such as MaxCompute, DataWorks, RDS, DRDS, QuickBI, TS, Analytic DB, OSS, and Data Integration. Students can refer to the application scenarios explained in the course and, in light of their own enterprise's business and needs, put what they have learned into practice.
This topic describes how to migrate the replica set of a user-created MongoDB database to ApsaraDB for MongoDB by using Data Transmission Service (DTS). DTS supports full data migration and incremental data migration.
To migrate all data without service interruption, you can select both full data migration and incremental data migration. You can also use the built-in commands of MongoDB to migrate user-created MongoDB databases. For more information, see Migrate user-created MongoDB databases to Alibaba Cloud by using the built-in commands of MongoDB.
For more information about data migration or synchronization solutions, see Overview.
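The way full and incremental migration combine can be sketched conceptually as follows. This is a plain-Python simulation under stated assumptions, not the DTS or MongoDB API: the functions full_migration and apply_incremental, the ("upsert" / "delete") change-log format, and the sample documents are all hypothetical.

```python
def full_migration(source_snapshot):
    """Copy every document that existed when the snapshot was taken."""
    return dict(source_snapshot)


def apply_incremental(target, change_log):
    """Replay changes recorded after the snapshot, in order."""
    for op, key, value in change_log:
        if op == "upsert":
            target[key] = value
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Made-up source data at snapshot time.
snapshot = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}

# Writes that reached the source while the full copy was running.
changes = [
    ("upsert", "u3", {"name": "Cy"}),
    ("upsert", "u1", {"name": "Anna"}),
    ("delete", "u2", None),
]

target = apply_incremental(full_migration(snapshot), changes)
```

Replaying the change log after the full copy is what lets the source keep serving writes during migration: the target converges on the source's current state once the log has been applied.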
Data Integration in DataWorks supports the MaxCompute data source. It allows you to write data from other data sources to MaxCompute or write MaxCompute data to other data sources. Data Integration reads and writes MaxCompute data by using the underlying Tunnel feature.
You can import and export data in wizard or script mode. This topic details data import and export in wizard mode.
Quickly learn how to process structured data with Python pandas through hands-on practice.