In recent years, with the rapid development of the Internet finance industry, the business of traditional financial institutions, such as exchanges, securities companies, and banks, has become increasingly integrated with the Internet. For example, major financial institutions have launched mobile apps that let users handle financial services such as mobile payment, financial management, online lending, and the purchase of financial products. These new business models also place new requirements on the financial industry.
Solutions based on Realtime Compute for Apache Flink allow financial institutions to address the preceding challenges. We use Apache Flink to build real-time data warehouses and real-time anti-fraud systems, which helps financial institutions quickly build a real-time risk control system. The data warehouse architecture is shown below:
The data processing of real-time data warehouses involves the following key steps:
Data Generation: Typically, data comes from two sources
For data processing, we select Flink as the compute engine. Flink combines powerful data processing capabilities with low latency and high throughput, which safeguards business output. In addition, Alibaba Cloud offers Realtime Compute for Apache Flink, an all-in-one, highly available Flink service.
When designing the data architecture, you can build the ODS, DWD, and ADS layers based on the basic methodology of the data warehouse to reduce data redundancy and data storage costs and make the data structure more scalable.
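The layering can be illustrated with a minimal sketch in plain Python. It is not Flink code. The field names, sample records, and the helpers `build_dwd` and `build_ads` are hypothetical and exist only to show how raw ODS records might be cleaned into DWD detail rows and then aggregated into ADS metrics.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in the ODS layer (illustrative fields).
ods_events = [
    {"event_id": "e1", "user_id": "u1", "type": "PAY", "amount": "120.50", "ts": 1000},
    {"event_id": "e1", "user_id": "u1", "type": "PAY", "amount": "120.50", "ts": 1000},  # duplicate delivery
    {"event_id": "e2", "user_id": "u2", "type": "pay", "amount": "80", "ts": 1005},
    {"event_id": "e3", "user_id": "u1", "type": "LOAN", "amount": "bad-value", "ts": 1010},  # malformed
    {"event_id": "e4", "user_id": "u1", "type": "PAY", "amount": "30", "ts": 1020},
]


def build_dwd(ods):
    """ODS -> DWD: deduplicate, validate, and normalize the raw records."""
    seen, dwd = set(), []
    for row in ods:
        if row["event_id"] in seen:
            continue  # skip records that were already processed
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail basic validation
        seen.add(row["event_id"])
        dwd.append({
            "event_id": row["event_id"],
            "user_id": row["user_id"],
            "type": row["type"].upper(),
            "amount": amount,
            "ts": row["ts"],
        })
    return dwd


def build_ads(dwd):
    """DWD -> ADS: aggregate detail rows into per-user, per-type metrics."""
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for row in dwd:
        key = (row["user_id"], row["type"])
        totals[key]["count"] += 1
        totals[key]["amount"] += row["amount"]
    return dict(totals)


dwd_rows = build_dwd(ods_events)
ads_metrics = build_ads(dwd_rows)
print(ads_metrics)
```

Because cleaning happens once in the DWD step, any number of ADS metrics can reuse the same detail rows, which is where the reduction in redundancy comes from.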
Flink-Based Real-Time Data Warehouse
Flink is mainly used for computing the ETL and BI metrics in real-time data warehouses. It is also integrated with various upstream and downstream systems.
Based on Alibaba Cloud Realtime Compute for Apache Flink, ZhongAn Insurance has built a real-time data warehouse. Its application scenarios include marketing activities, real-time monitoring, and anti-fraud.
As the business grows rapidly, the demands on real-time computing keep increasing. It requires a platform with low latency, low resource consumption, high efficiency, and high accuracy. Beyond meeting the most basic business needs, we are also making full use of Flink's features to enrich the input and output interfaces and ensure data quality. Building on the SQL version, the ML and Scala versions of Flink will also help real-time computing show its capabilities in anti-fraud and complex business scenarios.
The new generation of monitoring systems of the Shenzhen Stock Exchange focuses on core business, such as the supervision of abnormal trading behaviors and the screening of clues to illegal behaviors. It fully supports the integrated supervision of transaction monitoring, investigation and analysis, and business research. The system follows the design principles of "security, efficiency, sustainable evolution, openness, autonomy, and controllability." It was built as a secure, efficient, flexible, easy-to-use, and highly inclusive distributed architecture.
The real-time monitoring platform is the core subsystem of the monitoring system. In terms of architecture design, core technologies, computing capabilities, high availability, and disaster recovery design, Flink is believed to represent the future trend of real-time computing technology. Flink is the best choice for the real-time monitoring platform. Compared with Storm, Flink provides a powerful state management mechanism, more user-friendly programming interfaces, and Exactly-Once semantics. Compared with Spark Streaming, Flink provides a more powerful window computing capability and can better meet the performance requirement for low latency.
Flink helps applications manage their state, automatically saves checkpoints, and provides multiple state backend implementations. When an application needs to maintain large state, it can use the RocksDB state backend, which keeps state off the JVM heap and on local disk. This greatly reduces heap memory overhead and garbage collection pressure. In case of failure, the application state can be restored from the latest checkpoint.
The powerful capabilities of Flink SQL greatly lower the barrier to developing stream computing services. Flink SQL can also meet over 80% of the development needs for the real-time statistics and real-time alerting business of the monitoring system. User-defined functions (UDF, UDAF, and UDTF) extend it to implement specific business functions, further simplifying business R&D.
Flink window computing supports event time (the business time carried in each record), as well as both full and incremental computation. Optimized internal algorithms give Flink window computing excellent performance, making it easy to compute interval indicators.
Flink's fault tolerance is based on a variant of the Chandy-Lamport distributed snapshot algorithm and delivers automatic fault handling. When a system failure occurs, jobs can be recovered from the most recent state snapshot and continue to run, which ensures Exactly-Once semantics for internal data processing. This provides a solid foundation for the monitoring system to build a distributed real-time computing platform with high availability.
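The three kinds of user-defined function differ mainly in their input and output shape. The sketch below shows that shape in plain Python rather than through the Flink API. The names `risk_level`, `WeightedAvgPrice`, and `split_symbols`, along with the thresholds and sample values, are hypothetical, and the accumulator method names are chosen for illustration only.

```python
def risk_level(amount):
    """Scalar function (UDF shape): one input value -> one output value."""
    if amount >= 10000:
        return "HIGH"
    if amount >= 1000:
        return "MEDIUM"
    return "LOW"


class WeightedAvgPrice:
    """Aggregate function (UDAF shape): many rows -> one value, via an accumulator."""

    def create_accumulator(self):
        return {"notional": 0.0, "volume": 0}

    def accumulate(self, acc, price, volume):
        acc["notional"] += price * volume
        acc["volume"] += volume

    def get_value(self, acc):
        # Return None when no rows were accumulated to avoid division by zero.
        return acc["notional"] / acc["volume"] if acc["volume"] else None


def split_symbols(basket):
    """Table function (UDTF shape): one input row -> zero or more output rows."""
    for symbol in basket.split(","):
        symbol = symbol.strip()
        if symbol:
            yield (symbol,)


vwap = WeightedAvgPrice()
acc = vwap.create_accumulator()
for price, volume in [(10.0, 100), (12.0, 300)]:
    vwap.accumulate(acc, price, volume)

print(risk_level(2500))
print(vwap.get_value(acc))
print(list(split_symbols("000001, 000002,,600000")))
```

In a real job, functions like these would be registered with the SQL engine and called from queries, so the business-specific logic stays in small, testable units.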
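As a rough illustration of the window computing described above, the following plain-Python sketch assigns records to tumbling windows by the timestamp carried in each record and compares full computation (buffer everything, compute at the end) with incremental computation (update a running aggregate per record). The window size, sample records, and function names are assumptions. The sketch also ignores watermarks and window firing, which a real Flink job relies on.

```python
from collections import defaultdict

WINDOW_MS = 1000  # assumed tumbling window size, for illustration only

# (event_time_ms, account, amount); arrival order need not match event-time order
events = [
    (100, "A", 5.0),
    (1200, "A", 7.0),
    (900, "A", 3.0),   # arrives out of order but still belongs to window [0, 1000)
    (1500, "B", 2.0),
    (400, "B", 4.0),
]


def window_start(ts):
    """Align an event timestamp to the start of its tumbling window."""
    return ts - ts % WINDOW_MS


def full_computation(evts):
    """Buffer every element per (key, window), then compute over the whole buffer."""
    buffers = defaultdict(list)
    for ts, key, amount in evts:
        buffers[(key, window_start(ts))].append(amount)
    return {k: sum(v) for k, v in buffers.items()}


def incremental_computation(evts):
    """Keep only a running aggregate per (key, window) and update it per element."""
    sums = defaultdict(float)
    for ts, key, amount in evts:
        sums[(key, window_start(ts))] += amount
    return dict(sums)


full = full_computation(events)
incremental = incremental_computation(events)
print(sorted(incremental.items()))
```

Both approaches give the same sums here. The incremental form stores one number per key and window instead of every element, which is why it tends to suit simple aggregates such as sums and counts.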
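The recovery behavior can be sketched with a toy single-process job in plain Python. This is not the barrier-based distributed snapshot protocol itself. It only illustrates the idea that snapshotting the source position together with the operator state lets a restarted job replay input without double counting. The class `CountingJob`, the `run` helper, and the sample log are hypothetical.

```python
import copy

# Replayable input log of (key, value) records, standing in for a durable source.
source = [("u1", 10), ("u2", 5), ("u1", 7), ("u2", 1), ("u1", 3)]


class CountingJob:
    """Toy job: per-key running sum plus the source offset consumed so far."""

    def __init__(self):
        self.offset = 0
        self.state = {}

    def process_next(self):
        key, value = source[self.offset]
        self.state[key] = self.state.get(key, 0) + value
        self.offset += 1

    def snapshot(self):
        # Offset and state are captured together so they stay consistent.
        return {"offset": self.offset, "state": copy.deepcopy(self.state)}

    def restore(self, checkpoint):
        self.offset = checkpoint["offset"]
        self.state = copy.deepcopy(checkpoint["state"])


def run(fail_at=None, checkpoint_every=2):
    """Run the job to completion, optionally crashing once at offset `fail_at`."""
    job = CountingJob()
    checkpoint = job.snapshot()  # initial empty checkpoint
    failed = False
    while job.offset < len(source):
        if fail_at is not None and not failed and job.offset == fail_at:
            failed = True
            job = CountingJob()      # simulate a crash: in-memory state is lost
            job.restore(checkpoint)  # resume from the latest completed checkpoint
            continue
        job.process_next()
        if job.offset % checkpoint_every == 0:
            checkpoint = job.snapshot()
    return job.state


print(run())
print(run(fail_at=3))
```

The run with a simulated failure ends with the same state as the failure-free run, because the records processed after the last checkpoint are replayed against the state from that checkpoint rather than added on top of partially updated state.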
In 2019, the Shenzhen Stock Exchange signed a cooperation agreement with the Alibaba Realtime Compute team. The next-generation monitoring system's real-time computing platform has been running securely and reliably for nearly 300 days. By the end of April 2020, the number of raw business messages was more than 5,000 per second on average, with a peak of over 1.2 million per second. It takes 100 milliseconds on average to perform statistics on key services and system alerting, which provides strong support for the real-time monitoring of the core business.