Overview
A data lake is an increasingly popular data analytics architecture that supports massive data workloads. Alibaba Cloud Data Lake Storage uses Object Storage Service (OSS) as its foundation to provide centralized storage for structured, semi-structured, and unstructured data, making it well suited to IoT, gaming, online education, and advertising businesses. OSS works with mainstream big data ecosystems, such as Hadoop, Hive, Spark, Presto, and Impala.
Solution Highlights
-
As-Is Data Storage
Ingest data from multiple sources and store it centrally, as is, in your data lake regardless of data structure, schema, or size
-
Robust Foundation
Guaranteed durability of 99.9999999999% (12 9s), 99.995% SLA, and powerful remote disaster recovery against system failures
-
High Performance Data Processing
Upload and download data in parallel with the OSS Append Object feature, which lets you read data in real time while new data is being appended and improves the efficiency of analytics workloads
-
Scalable and Elastic Architecture
Scale storage and computing resources independently according to business needs with an architecture that decouples storage from computing, lowering Total Cost of Ownership (TCO)
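The append-while-reading pattern named in the High Performance Data Processing highlight can be sketched as follows. This is an in-memory stand-in rather than the OSS SDK: the class `AppendableObject`, its methods, and the sample log lines are hypothetical names chosen for illustration. It models the rule that each append states the position (the current object length) at which it writes, so a reader can consume everything appended so far while a writer keeps adding data.

```python
class AppendableObject:
    """In-memory stand-in for an appendable object (hypothetical, not the OSS SDK)."""

    def __init__(self) -> None:
        self._data = bytearray()

    def append(self, position: int, chunk: bytes) -> int:
        """Append chunk at position and return the position for the next append."""
        # The write is accepted only if the caller's position matches the
        # current length, which keeps concurrent writers from overlapping.
        if position != len(self._data):
            raise ValueError(
                f"position {position} != current length {len(self._data)}"
            )
        self._data.extend(chunk)
        return len(self._data)

    def read(self, start: int = 0) -> bytes:
        """Return everything appended so far, beginning at byte offset start."""
        return bytes(self._data[start:])


log = AppendableObject()
pos = log.append(0, b"event-1\n")     # writer appends the first record
first_read = log.read()               # reader sees it immediately
pos = log.append(pos, b"event-2\n")   # writer continues from the returned position
tail = log.read(len(first_read))      # reader fetches only the new bytes
```

In the OSS Python SDK (`oss2`), the corresponding call is, to the best of our knowledge, `Bucket.append_object(key, position, data)`, whose result carries a `next_position` value; check the SDK reference before relying on the exact signature.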
How Alibaba Cloud Data Lake Storage Supports Education, Gaming, and Advertising
Your Challenges
Data needed in education scenarios, such as audio/video, images, system logs, and online messages, is stored in separate systems, creating data silos and increasing the cost of data analytics and O&M.
Our Solution
-
This solution uses OSS as the unified storage center for data in different formats, giving online education data sources worldwide an easy way to upload education materials, including images, videos, audio, and text files. You can build computing clusters with EMR that connect seamlessly to big data processing services, such as Spark, Hive, and Presto. You can also analyze test results based on courseware, evaluate the learning quality of each student, and provide customized guidance and precise content recommendations with intelligent algorithms. You can combine this solution with Alibaba Cloud CDN to deliver on-demand courseware to students worldwide with ultra-low latency.
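As a rough illustration of analyzing test results stored in OSS from a Spark job on EMR, consider the minimal sketch below. It assumes a cluster where Spark is already configured to read the `oss://` scheme, and that test results are stored as JSON records. The bucket name `edu-data-lake`, the path `test-results/2024/`, the column names `student_id` and `score`, and both function names are hypothetical.

```python
def oss_uri(bucket: str, key: str) -> str:
    """Build an oss:// URI for an object or prefix in a bucket."""
    return f"oss://{bucket}/{key.lstrip('/')}"


def average_score_per_student(spark, results_uri: str):
    """Return a DataFrame with the mean score per student.

    `spark` is expected to be a pyspark.sql.SparkSession supplied by the
    cluster; the column names are assumptions for this sketch.
    """
    results = spark.read.json(results_uri)
    return results.groupBy("student_id").avg("score")


# Hypothetical location of the test results in the data lake.
results_uri = oss_uri("edu-data-lake", "/test-results/2024/")
```

On a cluster you would pass the active session, for example `average_score_per_student(spark, results_uri).show()`. Because the data stays in OSS, the same prefix can be read by Hive or Presto without copying it into cluster-local storage.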
Object Storage Service
An encrypted and secure cloud storage service that can store, process, and access massive amounts of data
Message Queue for Apache Kafka
A fully-managed Apache Kafka service to help you build data pipelines for your big data analytics quickly
Your Challenges
The gaming industry demands efficient data analytics to adjust game scenarios quickly, and highly scalable storage and computing power to handle traffic peaks and upgrades.
Our Solution
-
This solution uses EMR to help you deploy clusters with different data processing platforms and systems, such as Hadoop and Hive, to meet the data analytics requirements of different gaming scenarios. You can use OSS to archive cold data in more cost-efficient storage and keep hot data in highly available instances to optimize resource utilization and performance. The combined OSS + EMR architecture can provide speed, reliability, and cost-efficiency on par with the Hadoop Distributed File System (HDFS). Because storage and computing instances are decoupled, you can scale each one independently to tune system performance, which simplifies management and O&M and lowers upgrade costs.
Object Storage Service
An encrypted and secure cloud storage service that can store, process, and access massive amounts of data
Message Queue for Apache Kafka
A fully-managed Apache Kafka service to help you quickly build data pipelines for your big data analytics
DataWorks
A secure environment for offline data development, with powerful Open APIs, to create an ecosystem for redevelopment
Your Challenges
The advertising industry faces constantly changing search traffic and content demands, so resident (always-on) computing clusters either underperform during peaks or waste resources when traffic is low.
Our Solution
-
This solution deploys data processing platforms, such as Hadoop, Hive, and Presto, on highly elastic Kubernetes clusters built on ECS. You can scale up for traffic peaks during events and promotions or scale down to save costs during low traffic. Query data in various formats is stored in OSS: cold data is archived in OSS Archive storage, and regularly accessed data is kept in OSS Standard storage for high availability. You can retrieve and manage data with configurable rules, scale the storage capacity to match business needs, and optimize storage costs easily. This solution uses Alibaba Cloud Elasticsearch for quick indexing and precise searching of website data, and Message Queue for Apache Kafka to monitor website activities and collect real-time statistics. In addition, it uses Data Lake Analytics (DLA) to process interactive queries and EMR Druid for real-time and ad hoc queries. These tasks are processed by the computing clusters (Hadoop, Hive, Presto, etc.), and the results are stored in OSS Standard storage.
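The configurable rules for moving cold data to Archive storage can be illustrated with a lifecycle rule. The sketch below only builds the rule document and does not call any Alibaba Cloud API. The element names follow our understanding of the OSS lifecycle configuration format and should be checked against the OSS API reference; the function name, the rule ID, the `query-logs/` prefix, and the 30-day threshold are hypothetical.

```python
import xml.etree.ElementTree as ET


def archive_rule_xml(rule_id: str, prefix: str, days: int) -> str:
    """Build a lifecycle document that moves objects under prefix to
    Archive storage once they are `days` days old."""
    config = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(config, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    ET.SubElement(rule, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"
    # The transition describes when and where matching objects are moved.
    transition = ET.SubElement(rule, "Transition")
    ET.SubElement(transition, "Days").text = str(days)
    ET.SubElement(transition, "StorageClass").text = "Archive"
    return ET.tostring(config, encoding="unicode")


# Hypothetical rule: archive query logs that are older than 30 days.
rule_xml = archive_rule_xml("archive-old-query-logs", "query-logs/", 30)
```

Keeping the rule as data makes it easy to review and version alongside the cluster configuration before it is applied to a bucket through the console, CLI, or SDK.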
Elastic Compute Service
Elastic and secure virtual cloud servers to cater to all your cloud hosting needs
Alibaba Cloud Elasticsearch
A cloud-based service that offers built-in integrations such as Kibana, commercial features, and Alibaba Cloud VPC, Cloud Monitor, and Resource Access Management
Data Lake Analytics
An interactive analytics service that allows you to use standard SQL syntax and BI tools to analyze your data stored in the cloud at low cost
Security and Compliance
-
CSA STAR
-
ISO 27001
-
SOC2 Type II Report
-
C5
-
MLPS 2.0
-
MTCS