The core of Flink is a streaming execution engine that provides data distribution, data communication, and fault tolerance for distributed stream computing. On top of this engine, Flink provides higher-level APIs that you can use to develop distributed jobs.
Background information
EMR Flink is fully compatible with open source Flink. For more information, see the official documentation of open source Flink.
Scenarios
Flink is widely used for real-time big data computing. This section describes the usage scenarios of Flink from a technical perspective and an enterprise perspective.
Technologies
From a technical perspective, Flink is suitable for the following scenarios:
Real-time extract, transform, load (ETL) and data streams
A real-time ETL process or data stream delivers data from Point A to Point B. Data cleansing and integration may be required during delivery. Examples include real-time index building in a search system and the ETL process in real-time data warehousing.
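The following is a minimal sketch of the cleanse-while-delivering idea described above. It is written in plain Python and does not use the Flink API. The field names (product_id, price), the cleansing rules, and the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def cleanse(record: Dict[str, str]) -> Optional[Dict[str, object]]:
    """Return a normalized record, or None if the record is unusable.

    The rules here are assumptions for illustration: a record needs a
    non-empty product_id and a price that parses as a number.
    """
    product_id = record.get("product_id", "").strip()
    if not product_id:
        return None
    try:
        price = float(record.get("price", ""))
    except ValueError:
        return None
    return {"product_id": product_id.upper(), "price": price}


def etl(source: Iterable[Dict[str, str]]) -> Iterator[Dict[str, object]]:
    """Handle records one at a time as they arrive, not as a batch."""
    for record in source:
        cleaned = cleanse(record)
        if cleaned is not None:
            yield cleaned


# Point A: a stand-in for an unbounded source such as a message queue.
raw_events = [
    {"product_id": " sku-1 ", "price": "9.90"},
    {"product_id": "", "price": "5.00"},      # missing id: dropped
    {"product_id": "sku-2", "price": "n/a"},  # unparsable price: dropped
    {"product_id": "sku-3", "price": "12"},
]

# Point B: a stand-in for a sink such as a search index or a warehouse table.
sink: List[Dict[str, object]] = list(etl(raw_events))
```

Because etl is a generator, each record is cleansed and forwarded as soon as it is read, which mirrors how a streaming job processes data in motion instead of waiting for a complete data set.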
Real-time data analysis
Real-time data analysis is the process of extracting and integrating the required information from raw data based on your business objectives. For example, you can view the top 10 products sold per day, the average turnaround time of a warehouse, the average document click rate, and the reachability of push notifications. The results of real-time data analysis are typically presented in real-time reports or dashboards.
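As an illustration of the "top products sold per day" example above, the following sketch keeps a running total per day and product and can be queried at any time. It is plain Python, not the Flink API. The class name DailyTopProducts and the (day, product, quantity) record layout are assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Tuple

Sale = Tuple[str, str, int]  # (day, product, quantity), a hypothetical layout


class DailyTopProducts:
    """Maintains running sales totals so that a report is always up to date."""

    def __init__(self) -> None:
        self._totals: DefaultDict[str, Counter] = defaultdict(Counter)

    def add(self, sale: Sale) -> None:
        """Update the running total for one incoming sale event."""
        day, product, quantity = sale
        self._totals[day][product] += quantity

    def top(self, day: str, n: int = 10) -> List[Tuple[str, int]]:
        """Return up to n products for the day, highest total first."""
        return self._totals.get(day, Counter()).most_common(n)


report = DailyTopProducts()
report.add(("2024-01-01", "A", 3))
report.add(("2024-01-01", "B", 5))
snapshot_early = report.top("2024-01-01", 2)  # B leads at this point
report.add(("2024-01-01", "A", 4))
report.add(("2024-01-02", "C", 1))
snapshot_late = report.top("2024-01-01", 2)   # A has overtaken B
```

The two snapshots differ because the result is updated with every event. This is the property that makes a dashboard "real-time": the report reflects the data received so far rather than the output of a nightly batch job.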
Event-driven applications
An event-driven application is a system that processes or reacts to the events that it subscribes to. Event-driven applications typically depend on internal states. Examples include fraud detection systems, risk management systems, and O&M exception detection systems, all of which respond to suspicious events. For example, if your behavior triggers a risk management rule, the system captures the event and analyzes your current and previous behavior to determine whether to perform risk management.
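The reliance on internal state described above can be sketched as follows. This is plain Python, not the Flink API, and the rule itself is hypothetical: a user is flagged when the total amount of that user's events within a sliding time window exceeds a threshold. The sketch also assumes that the events of each user arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Tuple

Event = Tuple[str, float, float]  # (user_id, timestamp_seconds, amount)


class RiskDetector:
    """Keeps per-user history and evaluates a rule on every event."""

    def __init__(self, window_seconds: float = 60.0, max_total: float = 1000.0) -> None:
        self.window_seconds = window_seconds
        self.max_total = max_total
        # Internal state: recent (timestamp, amount) pairs for each user.
        self._history: DefaultDict[str, Deque[Tuple[float, float]]] = defaultdict(deque)

    def on_event(self, event: Event) -> bool:
        """Record the event and return True if the user should be flagged."""
        user_id, timestamp, amount = event
        history = self._history[user_id]
        history.append((timestamp, amount))
        # Evict previous behavior that has fallen out of the time window.
        while history and history[0][0] <= timestamp - self.window_seconds:
            history.popleft()
        return sum(a for _, a in history) > self.max_total


detector = RiskDetector(window_seconds=60.0, max_total=1000.0)
events = [
    ("u1", 0.0, 400.0),
    ("u1", 10.0, 500.0),
    ("u2", 15.0, 900.0),
    ("u1", 20.0, 200.0),   # u1 reaches 1100 within 60 seconds
    ("u1", 100.0, 300.0),  # earlier u1 events have expired
]
flags = [detector.on_event(e) for e in events]
```

The decision for each event depends on both the current event and the retained history of the same user. In a distributed engine, this per-key state is what must be partitioned across workers and protected by fault tolerance.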
Enterprises
From an enterprise perspective, Flink is suitable for the following scenarios:
Business department: real-time risk control, real-time recommendation, and real-time indexing of search engines
Data department: real-time data warehousing, real-time reports, and real-time dashboards
O&M department: real-time monitoring, real-time exception detection and alerting, and end-to-end debugging