By Tang Yun
2020 will be remembered forever. When facing global cooperation challenges, the open-source community (led by the Apache Software Foundation) brought together the world's top developers and made inspiring achievements. On January 1, 2021, the Foundation published the "Apache in 2020 - By The Digits" [1] article on its official blog. This article reviews the development of the community with numbers throughout 2020:
Apache Flink is one of the 199 top-level projects of the Apache Software Foundation. According to this report, Apache Flink made great achievements in 2020 in terms of user activity, developer activity, and exposure.
A mailing list is commonly used by the Apache Software Foundation to facilitate communication between developers and users. It is generally divided into two channels for developer exchange (dev@ mailing list) and user exchange (user@ mailing list). The community's communication activity is often reflected by the mailing list activity. In 2020, Flink won first place on the user exchange channel and second place on the developer exchange channel.
Among the top 20 mailing lists, the Flink community is the only community that provides a user exchange channel in Chinese (user-zh@flink.apache.org). In addition, the activity of the Chinese user mailing list in 2020 was second only to the user mailing list in English. Since 2018, Flink has won first place in terms of mailing list activity. We are delighted to see more native Chinese speakers have their voices heard in open-source communities. This has had a profound impact on the global open-source software community.
The number of new commits over the past year is a commonly used metric for measuring a project's development activity. Each year, the Apache Software Foundation announces the top five projects with the most commits in the previous year. In 2020, Flink ranked second, behind only Apache Camel, the routing engine software. Within the big data computing and storage fields, Apache Flink is the project with the most active developers. Take a look at the annual reports from 2019 [2] and 2018 [3]. Big data projects have appeared among the five most active open-source projects every year: Flink, Hadoop, HBase, Beam, Airflow, and Spark have all been on the list. The following table shows the trend. Some projects may be missing for some years because only the top 5 projects were published.
Apache Flink is the only open-source big data-related project that has been ranked in the top 5 for three consecutive years, and its ranking is constantly rising.
Since the Top 5 list changes every year, we have also counted the number of commits [4] for the projects on the list over the past three years. The following shows the statistical chart. The number of Flink commits has increased year by year and was outstanding in 2020, further extending Flink's lead in the big data field.
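For readers who want to reproduce the counts, the command in footnote [4] can be wrapped in a short script. The following is only a minimal sketch, not the script used for this article: the function names `rev_list_args` and `count_commits` are hypothetical, and it assumes git is installed and that you have a full local clone of the project you want to measure.

```python
import subprocess
from typing import List


def rev_list_args(year: int) -> List[str]:
    """Build the `git rev-list` arguments from footnote [4] for one calendar year."""
    return [
        "git", "rev-list",
        f"--after=Jan 1 {year}",
        f"--before=Jan 1 {year + 1}",
        "--all",        # consider commits reachable from every ref
        "--no-merges",  # skip merge commits
        "--count",      # print the number of commits instead of listing them
    ]


def count_commits(repo_path: str, year: int) -> int:
    """Run the command inside a local clone and return the commit count."""
    result = subprocess.run(
        rev_list_args(year),
        cwd=repo_path,        # assumed to be a full (non-shallow) clone
        capture_output=True,
        text=True,
        check=True,           # raise if git exits with an error
    )
    return int(result.stdout.strip())
```

Calling `count_commits` with the path to a local clone once each for 2018, 2019, and 2020 gives the three yearly numbers for that project. A shallow clone would undercount, since older commits are not present locally.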
The Apache Flink community is not only highly active in development and user communication but also highly visible and frequently visited on the Internet. The Apache Software Foundation counted visits to Flink's GitHub pages in 2020, and Flink ranked second among all projects.
This indicator was not shown in the Apache Software Foundation's 2019 and 2018 "By The Digits" reports, but we found GitHub visit traffic figures in the annual reports [5][6] for fiscal year 2019 (2018.5.1 - 2019.4.30) and fiscal year 2020 (2019.5.1 - 2020.4.30):
From the middle of 2018 to 2020, Flink rose from third place to second place in total exposure and page views.
The Apache Software Foundation's "By The Digits" reports of the past three years and its annual reports for fiscal years 2019 and 2020 show how Flink has grown into one of Apache's leading projects. It ranks near the top of all open-source Apache projects in terms of user activity, development activity, and influence.
Flink Forward Asia 2020 showcased the Flink community's rapid development, its technological innovation, and the implementation of stream-batch unification in production. More enterprises, such as ByteDance, NetEase, and Zhihu, want to use Flink to build solutions on a unified stream-batch architecture.
The large number of developers and users from China have contributed the most to Flink's achievements. Maybe you, the reader of this article, are already contributing to one of Apache's top projects. In 2021, Apache Flink will continue to evolve toward stream-batch unification, the unification of offline and real-time processing, and the unification of big data and AI. More progress will be achieved throughout the year!
Real-time is the future. The Flink community is looking forward to your participation!
[1] Apache in 2020 - By The Digits
[2] Apache in 2019 - By The Digits
[3] Apache in 2018 - By The Digits
[4] Command used to count commits: git rev-list --after="Jan 1 2020" --before="Jan 1 2021" --all --no-merges --count
[5] Apache FY2019 Annual Report
[6] Apache FY2020 Annual Report