This article discusses the practices and challenges of running EMR Spark on Alibaba Cloud Kubernetes.
1. Cloud-Native Challenges and Alibaba Practices
Development Trend of Big Data Technology
Challenges of Cloud-Native Development
Computing and Storage Separation
Building an HCFS file system on top of an object storage service:
- Fully compatible with existing HDFS
- Performance on par with HDFS at lower cost
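As one concrete (illustrative) way to mount object storage behind a Hadoop-compatible file system, the open-source hadoop-aliyun connector exposes OSS through `core-site.xml`; the endpoint and bucket values below are placeholders, and the EMR product's own HCFS implementation may use different configuration keys:

```xml
<!-- Illustrative core-site.xml fragment using the open-source hadoop-aliyun connector -->
<configuration>
  <property>
    <name>fs.oss.impl</name>
    <value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
  </property>
  <property>
    <name>fs.oss.endpoint</name>
    <value>oss-cn-hangzhou.aliyuncs.com</value> <!-- placeholder region endpoint -->
  </property>
  <property>
    <name>fs.oss.accessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value> <!-- placeholder credential -->
  </property>
  <property>
    <name>fs.oss.accessKeySecret</name>
    <value>YOUR_ACCESS_KEY_SECRET</value> <!-- placeholder credential -->
  </property>
</configuration>
```

With this in place, jobs can address the bucket via paths such as `oss://my-bucket/path`, keeping HDFS-style APIs while data lives in object storage.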
Shuffle, Storage, and Computing Separation
Addressing hybrid, heterogeneous node types on Alibaba Cloud Kubernetes (ACK):
- Heterogeneous node types may have no local disk
- Community discussion in [SPARK-25299] on shuffle in disaggregated architectures; supporting Spark dynamic resource allocation has become an industry consensus
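For reference, Spark's built-in dynamic resource allocation is driven by a handful of configuration properties; the executor counts below are illustrative values, not recommendations:

```properties
# Enable dynamic allocation of executors
spark.dynamicAllocation.enabled                 true
# On K8s without an external shuffle service, Spark 3.x can track shuffle
# data so executors holding shuffle files are not released prematurely
spark.dynamicAllocation.shuffleTracking.enabled true
# Illustrative bounds for scaling executors up and down
spark.dynamicAllocation.minExecutors            1
spark.dynamicAllocation.maxExecutors            50
```

A remote shuffle service goes further than shuffle tracking: once shuffle data lives outside the executor, idle executors can be reclaimed immediately.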
Cache Solution
Effectively supporting hybrid cloud across data centers and dedicated lines:
- A cache system must be supported within the container.
ACK Scheduling
Resolving scheduling performance bottlenecks:
- Performance on par with YARN
- Multi-level queue management
Others
- Staggered scheduling
- Mutual resource awareness between YARN and ACK on shared nodes
Alibaba Practices – EMR on ACK
Overall Solution Introduction
- Submission to different execution platforms via data development clusters/scheduling platforms
- Staggered scheduling, adjusted policies based on business peak and off-peak
- Cloud-native data lake architecture with powerful elastic expansion and contraction capabilities
- Hybrid scheduling on and off the cloud via dedicated lines
- ACK manages heterogeneous clusters with high flexibility
2. Spark Containerization Solution
Introduction
RSS Q&A
1. Why Do I Need the Remote Shuffle Service?
- RSS enables Spark jobs to run without attaching cloud disks to Executor Pods; attaching cloud disks is not conducive to scalability or large-scale production.
- The size of a cloud disk cannot be determined in advance: if it is too large, space is wasted; if it is too small, shuffle fails. RSS is designed specifically for storage and computing separation scenarios.
- Executors write shuffle data to the RSS system, which manages the shuffle data; an Executor can be recycled as soon as it becomes idle. [SPARK-25299]
- It supports dynamic resource allocation well, avoiding the case where long-tail tasks caused by data skew keep Executor resources from being released.
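Wiring a job to a remote shuffle service is typically done by pointing Spark at the RSS client's shuffle manager; the class name and service address below are placeholders, since the actual keys depend on the RSS client release in use:

```properties
# Illustrative only: replace with the shuffle manager class and
# endpoint documented for your RSS client version
spark.shuffle.manager            com.example.rss.RssShuffleManager
spark.rss.master.address         rss-master.default.svc:9097
# Dynamic allocation pairs naturally with RSS, since executors
# no longer hold shuffle data locally
spark.dynamicAllocation.enabled  true
```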
2. What Is the Performance, Cost, and Scalability of RSS?
- RSS is deeply optimized for shuffle and is specially designed for storage and computing separation scenarios and K8s elastic scenarios.
- In the shuffle fetch phase, the random reads of the reduce phase are converted into sequential reads, which significantly improves job stability and performance.
- You can use the disk in the original K8s cluster for deployment without adding extra cloud disks for shuffle, which is cost-effective and flexible.
Spark Shuffle
- Generates numMapper * numReducer shuffle blocks
- Sequential writes and random reads
- Spills during write
- Single copy; lost data requires stage recomputation
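The read/write pattern above can be sketched with a toy model (this is an illustration of the access pattern, not Spark's actual implementation): each mapper writes one sequential file containing a segment per reducer, so there are numMapper * numReducer blocks in total, and each reducer must perform one random read into every mapper's file.

```python
# Toy model of Spark's sort-based shuffle access pattern (illustrative only).
# Sequential write: each mapper emits ONE file of numReducer contiguous segments.
# Random read: each reducer seeks into every mapper file for its own segment.

def shuffle_write(num_mappers, num_reducers):
    """Return per-mapper shuffle files, each a list of reducer segments."""
    files = []
    for m in range(num_mappers):
        # One sequential file per mapper, one segment per reducer
        segments = [f"m{m}-r{r}" for r in range(num_reducers)]
        files.append(segments)
    return files

def shuffle_read(files, reducer_id):
    """A reducer performs one random read per mapper file."""
    return [segments[reducer_id] for segments in files]

M, R = 4, 3
files = shuffle_write(M, R)
total_blocks = sum(len(segments) for segments in files)  # numMapper * numReducer
reads_per_reducer = len(shuffle_read(files, 0))          # one seek per mapper
print(total_blocks, reads_per_reducer)                   # 12 4
```

With large numMapper and numReducer, the reduce side degenerates into many small random reads, which is exactly what RSS is designed to avoid.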
EMR Remote Shuffle Service
- Append writes and sequential reads
- No spill during write
- Two replicas; a write completes once the replica lands in memory
- Replication between copies over the internal network; no public bandwidth required
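The RSS write path above can also be sketched as a toy model (again illustrative, not the EMR implementation): mappers push records to the service, which appends all data for the same reduce partition together and replicates each chunk to two in-memory workers, so the reducer later reads one contiguous stream.

```python
# Toy model of a remote shuffle service write path (illustrative only).
# Append write: all chunks for a reduce partition are appended together.
# Two replicas: a push completes once both workers hold the chunk in memory.

class RssWorker:
    def __init__(self):
        self.partitions = {}  # reducer_id -> list of appended chunks

    def append(self, reducer_id, chunk):
        self.partitions.setdefault(reducer_id, []).append(chunk)

class RemoteShuffleService:
    def __init__(self):
        self.primary, self.replica = RssWorker(), RssWorker()

    def push(self, reducer_id, chunk):
        # Replication happens over the internal network; the write is
        # considered complete once both replicas hold the chunk in memory
        self.primary.append(reducer_id, chunk)
        self.replica.append(reducer_id, chunk)

    def sequential_read(self, reducer_id):
        # One contiguous stream per reducer: no random seeks
        return b"".join(self.primary.partitions.get(reducer_id, []))

rss = RemoteShuffleService()
for m in range(3):          # three mappers
    for r in range(2):      # two reduce partitions
        rss.push(r, f"m{m}r{r};".encode())
print(rss.sequential_read(0).decode())  # m0r0;m1r0;m2r0;
```

Compare this with the previous model: the per-block bookkeeping moves from the reducers' random reads to the service's appends, which is why shuffle read becomes sequential.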
RSS TeraSort Benchmark
Note: Taking a 10 TB TeraSort as an example, the shuffle volume is about 5.6 TB after compression. With RSS, performance for jobs of this scale improves significantly because shuffle reads become sequential reads.
Effects of Spark on ECI
Summary