Lecturer: Jianfeng, Head of Alibaba Cloud EMR Platform
Content Framework:
For example, you may copy other people's Spark code without understanding Spark's internal mechanisms, and therefore not know how to tune Spark jobs.
1. Configure Driver
2. Configure Executor
3. Configure Runtime
4. Configure DAE
Learn more: https://spark.apache.org/docs/latest/configuration.html
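The first three configuration groups above can be sketched as a set of Spark properties passed to `spark-submit` through `--conf` flags. This is only a minimal illustration, not the lecturer's setup: the property names come from the Spark configuration page linked above, but the values, the `my_job.py` application name, the helper `build_spark_submit`, and the choice of `spark.sql.shuffle.partitions` as a "runtime" setting are all assumptions made for the example. The fourth group (DAE) is not expanded here, so it is left out.

```python
import shlex

# Illustrative values only (assumptions); tune them for your own workload and cluster.
SPARK_CONF = {
    # 1. Driver
    "spark.driver.memory": "2g",
    "spark.driver.cores": "1",
    # 2. Executor
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.executor.instances": "4",
    # 3. Runtime (assumed grouping for this sketch)
    "spark.sql.shuffle.partitions": "200",
}


def build_spark_submit(app_path, conf):
    """Hypothetical helper: return a spark-submit argument list
    with one --conf flag per property, followed by the application path."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Print the command as a single shell-quoted string for inspection.
    print(shlex.join(build_spark_submit("my_job.py", SPARK_CONF)))
```

Keeping the properties in one dictionary makes it easy to see which settings belong to the driver, the executors, and the runtime, and to change them per job without touching the application code.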
Notebook + Airflow: Connects development and production scheduling seamlessly
1. All data is stored on OSS, including:
2. Even if the cluster is destroyed, it can be rebuilt and the data restored easily.
E-MapReduce: https://www.alibabacloud.com/product/emapreduce
EMR Studio (Beta): https://help.aliyun.com/document_detail/208107.html (Article in Chinese)
For a detailed product introduction and demonstration, watch the recorded session: https://developer.aliyun.com/live/247072 (Video in Chinese)