Data Lake Storage

Build a data lake on Alibaba Cloud Object Storage Service (OSS) with 99.9999999999% (12 9s) durability, a 99.995% SLA, and high scalability

Overview

A data lake is an increasingly popular data analytics architecture that supports massive data workloads. Alibaba Cloud Data Lake Storage uses Object Storage Service (OSS) as its foundation to provide central storage for structured, semi-structured, and unstructured data, making it an ideal solution for IoT, gaming, online education, and advertising businesses. OSS works smoothly with mainstream big data ecosystems, such as Hadoop, Hive, Spark, Presto, and Impala.

Solution Highlights

  • As-Is Data Storage

    Ingest data from multiple sources and store it centrally in your data lake as is, regardless of data structure, schema, or size

  • Robust Foundation

    Guaranteed durability of 99.9999999999% (12 9s), 99.995% SLA, and powerful remote disaster recovery against system failures

  • High Performance Data Processing

    Upload and download data in parallel with the OSS Append Object feature, which lets you read an object in real time while new data is appended to it, improving analytics efficiency

  • Scalable and Elastic Architecture

    Adjust storage and computing resources separately according to business needs, thanks to an architecture that decouples the two, and lower your Total Cost of Ownership (TCO)
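
The append-while-reading behavior described under High Performance Data Processing can be sketched roughly as follows. This is a minimal illustration, not the service's implementation: it assumes a bucket object exposing `append_object(key, position, data)` that returns a result with a `next_position` attribute, modeled on the OSS Python SDK (`oss2`). `InMemoryBucket`, `AppendResult`, `append_chunks`, and the object key are hypothetical names used so the sketch runs without credentials or a network connection.

```python
from dataclasses import dataclass, field


@dataclass
class AppendResult:
    """Carries the one field this sketch needs from an append response."""
    next_position: int


@dataclass
class InMemoryBucket:
    """Hypothetical stand-in for an OSS bucket; keeps objects in a dict."""
    objects: dict = field(default_factory=dict)

    def append_object(self, key, position, data):
        current = self.objects.get(key, b"")
        # An append must start exactly at the current end of the object.
        if position != len(current):
            raise ValueError(f"position {position} != object length {len(current)}")
        self.objects[key] = current + data
        return AppendResult(next_position=len(self.objects[key]))

    def read(self, key):
        return self.objects.get(key, b"")


def append_chunks(bucket, key, chunks, position=0):
    """Append chunks in order, passing each next_position to the next call."""
    for chunk in chunks:
        result = bucket.append_object(key, position, chunk)
        position = result.next_position
    return position


bucket = InMemoryBucket()
pos = append_chunks(bucket, "logs/app.log", [b"event-1\n", b"event-2\n"])
snapshot = bucket.read("logs/app.log")  # a reader sees everything appended so far
pos = append_chunks(bucket, "logs/app.log", [b"event-3\n"], position=pos)
```

The writer only needs to remember the last returned position, so a reader can fetch the object between appends without coordinating with the writer.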

Learn more about Data Lake Storage

Contact Sales

How Alibaba Cloud Data Lake Storage Supports Education, Gaming, and Advertising

Education

Your Challenges

Data needed in education scenarios, such as audio and video, images, system logs, and online messages, is stored in separate systems. This creates data silos and increases the cost of data analytics and O&M.

Our Solution

  • This solution uses OSS as the unified storage center for data in different formats. Online education data sources around the world can upload teaching materials, including images, videos, audio, and text files, to a single store. You can build computing clusters with EMR that connect seamlessly to big data processing services such as Spark, Hive, and Presto. With intelligent algorithms, you can analyze test results against courseware, evaluate each student's learning quality, and provide customized guidance and precise content recommendations. You can also combine this solution with Alibaba Cloud CDN to deliver on-demand courseware to students worldwide with ultra-low latency.

Object Storage Service

An encrypted and secure cloud storage service for storing, processing, and accessing massive amounts of data

Message Queue for Apache Kafka

A fully managed Apache Kafka service to help you quickly build data pipelines for your big data analytics

Gaming

Your Challenges

The gaming industry demands efficient data analytics so that games can be adjusted promptly, along with highly scalable storage and computing power to handle traffic peaks and upgrades.

Our Solution

  • This solution uses EMR to help you deploy clusters running different data processing platforms, such as Hadoop and Hive, to meet the analytics requirements of different gaming scenarios. You can use OSS to archive cold data in more cost-efficient storage and keep hot data in highly available instances, optimizing resource utilization and performance. The combined OSS + EMR architecture provides speed, reliability, and cost-efficiency comparable to the Hadoop Distributed File System (HDFS). Because storage and computing are decoupled, each can be scaled separately to adjust system performance and capacity, which simplifies management and O&M and lowers upgrade costs.
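
To illustrate why scaling storage and computing separately can reduce cluster size, here is a rough sizing sketch. It assumes a simplified model in which every node of a coupled cluster supplies a fixed amount of both disk and vCPUs; the workload figures, node shape, and function names are hypothetical, and replication overhead and pricing are ignored.

```python
import math


def coupled_nodes(storage_tb, vcpus, tb_per_node, vcpus_per_node):
    """Nodes needed when every node supplies both storage and compute."""
    return max(math.ceil(storage_tb / tb_per_node),
               math.ceil(vcpus / vcpus_per_node))


def decoupled_compute_nodes(vcpus, vcpus_per_node):
    """Compute nodes needed when the data lives in object storage instead."""
    return math.ceil(vcpus / vcpus_per_node)


# Hypothetical workload: 400 TB of data and 64 vCPUs of steady compute.
# Hypothetical node shape: 8 TB of disk and 16 vCPUs per node.
coupled = coupled_nodes(400, 64, tb_per_node=8, vcpus_per_node=16)
decoupled = decoupled_compute_nodes(64, vcpus_per_node=16)
print(coupled, decoupled)
```

In this model the coupled cluster is sized by whichever resource is scarcer, so a storage-heavy workload pays for idle compute. With decoupled storage, the compute cluster can follow compute demand alone.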

Object Storage Service

An encrypted and secure cloud storage service for storing, processing, and accessing massive amounts of data

Message Queue for Apache Kafka

A fully managed Apache Kafka service to help you quickly build data pipelines for your big data analytics

DataWorks

A secure environment for offline data development, with powerful Open APIs, to create an ecosystem for redevelopment

Advertising

Your Challenges

The advertising industry faces constantly changing search traffic and content demands. With resident (always-on) computing clusters, this leads to either poor performance at traffic peaks or wasted resources when traffic is low.

Our Solution

  • This solution deploys data processing platforms, such as Hadoop, Hive, and Presto, on highly elastic Kubernetes clusters built from ECS instances. You can scale up for traffic peaks during events and promotions, or scale down to save costs when traffic is low. Query data in various formats is stored in OSS: cold data is archived in OSS Archive, and regularly accessed data is kept in OSS Standard storage for high availability. You can retrieve and manage data with configurable rules, scale storage capacity to match business needs, and optimize storage costs easily. This solution uses Alibaba Cloud Elasticsearch for quick indexing and precise searching of website data, and Message Queue for Apache Kafka to monitor website activities and collect real-time statistics. In addition, it uses Data Lake Analytics (DLA) for interactive queries and EMR Druid for real-time and ad hoc queries. These tasks are processed by the computing clusters (Hadoop, Hive, Presto, etc.), and the results are stored in OSS Standard storage.
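
The split between OSS Archive for cold data and OSS Standard for regularly accessed data can be pictured as a simple age-based rule. The sketch below only models that decision locally. The 180-day threshold, the object keys, and the `storage_class_for` helper are hypothetical, and in practice such transitions would typically be configured as rules on the storage service rather than computed in application code.

```python
from datetime import date

# Hypothetical threshold; real values depend on access patterns and pricing.
ARCHIVE_AFTER_DAYS = 180


def storage_class_for(last_accessed, today, archive_after_days=ARCHIVE_AFTER_DAYS):
    """Return 'Archive' for data idle past the threshold, else 'Standard'."""
    idle_days = (today - last_accessed).days
    return "Archive" if idle_days >= archive_after_days else "Standard"


today = date(2024, 7, 1)
objects = {
    "queries/2024-06-30.parquet": date(2024, 6, 30),  # accessed yesterday
    "queries/2023-01-15.parquet": date(2023, 1, 15),  # idle for over a year
}
plan = {key: storage_class_for(seen, today) for key, seen in objects.items()}
```

Keeping the threshold as a parameter makes it easy to compare how different cutoffs would divide the same data between the two storage classes.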

Elastic Compute Service

Elastic and secure virtual cloud servers to cater to all your cloud hosting needs

Alibaba Cloud Elasticsearch

A cloud-based service that offers Kibana, commercial features, and built-in integration with Alibaba Cloud VPC, Cloud Monitor, and Resource Access Management

Data Lake Analytics

An interactive analytics service that lets you use standard SQL syntax and BI tools to analyze your data stored in the cloud at low cost

Security and Compliance

We are committed to providing stable, reliable, secure, and compliant cloud computing infrastructure services across major jurisdictions around the world.
  • CSA STAR
  • ISO 27001
  • SOC2 Type II Report
  • C5
  • MLPS 2.0
  • MTCS
