Microservices architecture is a relatively new concept that has quickly become popular and a hot research topic. However, the implementation of microservices is still loosely defined, and there is no theoretical proof of its effectiveness.
You may be surprised to learn that the idea behind microservices first appeared in an article published over fifty years ago. Since then, numerous studies have supported many of the points made in that article.
One of the fundamental concepts introduced in the article is Conway's Laws. Although initially intended to point out the flaws of distributed teams, many organizations have applied Conway's Laws to create efficient microservices architecture.
This article explores the ideas of Conway's Laws with reference to the article titled "Conway's Law under Remote Distance – Team Construction in a Distributed World", written by Mike Amundsen (author of Design RESTful API).
The most famous line in Conway's Law is:
"Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations." - Melvin Conway (1967).
In other words, the design of a system tends to mirror the communication structure of the organization that built it. The following figure illustrates this concept.
The figure depicts the existing communication structure of the organizations, which coincides with their respective product development processes. Simply put, organization structure equals system design.
Here, systems mentioned by the author are not restricted to software systems. It is also speculated that the Harvard Business Review initially rejected this article. Therefore, Conway submitted it to a programming magazine, which led to the misconception of the article being about software development. In the beginning, the author did not propose his ideas as laws and only described his findings and conclusions. When the famous book The Mythical Man-Month introduced Brooks' Law and cited some of Conway's points, Conway's ideas were popularized into the well-known Conway's Law we know today.
In his articles, Mike Amundsen summarized some core viewpoints, as stated below.
o The mode of organizational communication is expressed through system design
o A task can never be done perfectly, even with unlimited time, but there is always time to complete a task
o Homomorphism exists between linear systems and linear organizational structures
o A large system organization is easier to decompose than a smaller one
"Human beings are complex social animals."
Other fields also illustrate the tight relationship between organizational communication and system design. For a complex system, design questions always involve communication between human beings, and a good system design addresses the issues that such communication raises. Many viewpoints in the 1975 classic The Mythical Man-Month resonate with this idea.
The most memorable line from The Mythical Man-Month is:
"Adding manpower to a late software project makes it later" – Fred Brooks, (1975)
Increasing the number of programmers to keep up with a tight schedule is a common pitfall for many organizations. While it seems sensible to enlarge the workforce to increase output, this logic does not apply to the world of software development.
Why is this the case? The Mythical Man-Month provides a simple answer: communication cost grows quadratically, not linearly, as the number of people in a project or organization increases. The number of communication channels among n people is n(n-1)/2, so the complexity of the project management "algorithm" is O(n^2). The following example illustrates the idea of communication cost:
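As a minimal sketch of the n(n-1)/2 formula, the snippet below counts the pairwise communication channels for a few team sizes. The function name and the chosen team sizes are assumptions made for illustration; they do not come from The Mythical Man-Month.

```python
def communication_channels(n: int) -> int:
    """Return the number of distinct pairwise channels among n people: n(n-1)/2."""
    if n < 0:
        raise ValueError("team size must be non-negative")
    return n * (n - 1) // 2


# Illustrative team sizes (assumed, not taken from the book).
for size in (5, 15, 50, 150):
    print(f"{size:>3} people -> {communication_channels(size):>5} channels")
```

Going from 5 to 50 people multiplies the headcount by 10 but the number of channels by more than 100 (from 10 to 1,225).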
This is the main reason why internet startups have small teams. If a startup has too many employees, it will exhaust the investment from VC soon after the CEO introduces his/her idea to everyone involved.
Another interesting and relevant theory put forward by biologist Robin Dunbar in 1992 is called the "Dunbar Number". At first, Dunbar found that the brain capacity of a primate correlates with the size of its population. He then postulated some estimates on the number of relationships that a human brain can maintain. For example, a typical person would have
Don't these numbers seem related to the communication costs mentioned above? They are: our brains limit us to maintaining only that many relationships. (In a development team, the number may be even smaller.)
Communication issues lead to system design issues that affect the development efficiency of the entire system as well as the final results of product development.
"Rome was not built in a day. Address the issues that can be addressed first."
Erik Hollnagel, a leading researcher in system safety and resilience engineering, makes similar points in his book The ETTO Principle: Efficiency-Thoroughness Trade-Off.
"Problem too complicated? Ignore details.
Not enough resources? Give up features."
– Erik Hollnagel (2009)
System complexity, the number of functions, market competition, and investor expectations keep increasing, but human intelligence remains constant. No organization can be certain of finding enough talent, regardless of its capabilities and funds. For an extremely complex system, the operators will always overlook something. Erik believes that the best response to this issue is to just "let it be."
We often encounter such issues during daily development tasks. Are the requirements raised by product managers too complex? If so, ignore some minor requirements and focus on the major ones first. Do the product managers have too many requirements? If yes, give up some functions.
Reportedly, Erik was once invited by an airline carrier to provide consulting services on the stability and safety of a flight system. Erik believes that it is possible to ensure safety by two means:
For a system as complex as a flight system, some vulnerabilities are likely to be overlooked, no matter how good the testers are. Therefore, Erik recommended that the company drop the idea of building a perfect system. Instead, he recommended relative safety and correctness, where the carrier carries out continuous flight tests to identify issues and ensures that the system can recover automatically in case of a fault. The following figure shows the different interpretations of safety.
Does this sound familiar? Doesn't it mean continuous integration and agile development? Absolutely.
The same principle applies to the resilience of the distributed systems that Internet companies maintain. It is impossible to identify and fix all the bugs in a distributed system, even if unit tests cover the entire system. Distributed systems are prone to errors. The optimal solution is not to eliminate every issue, but to tolerate failures and recover from them automatically. In a system composed of microservices, any microservice may stop responding, and this is completely normal. We only need to ensure enough redundancy and backup, which is what resilience or high availability design means.
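The "tolerate and recover" idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the replicas are plain Python callables standing in for redundant service instances, and `call_with_failover`, `unhealthy`, and `healthy` are hypothetical names, not part of any real framework.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(
    replicas: Sequence[Callable[[], str]],
    retries_per_replica: int = 2,
) -> str:
    """Try each redundant replica in turn, retrying before moving on.

    Failures are treated as normal events to recover from,
    not as bugs that must never occur.
    """
    last_error: Optional[Exception] = None
    for replica in replicas:
        for _ in range(retries_per_replica):
            try:
                return replica()
            except ConnectionError as exc:
                last_error = exc  # remember the failure and keep going
    raise RuntimeError("all replicas failed") from last_error


def unhealthy() -> str:
    """Hypothetical instance that has stopped responding."""
    raise ConnectionError("instance not responding")


def healthy() -> str:
    """Hypothetical instance that answers normally."""
    return "ok"


# The caller still gets an answer even though the first instance is down.
print(call_with_failover([unhealthy, healthy]))
```

In a real deployment this logic usually lives in a load balancer, service mesh, or client library rather than in application code; the sketch only shows why redundancy lets the overall system stay available while individual services fail.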
"Create independent subsystems to reduce the communication cost."
The diagram represents a specific application of the internal relationship between an organization and system design according to Conway's first law. Simply put, set up a team suitable for the system that you want. If you have a front-end team, a Java back-end development team, a DBA team, and an O&M team, your system will look like the following:
Instead, if business boundaries create divisions in your system and all members turn their modules into small systems or products to address the same business goals, your larger system will look like a microservice architecture as shown in the following:
The guiding idea for microservice teams should be "inter-operate, not integrate." To inter-operate is to define system boundaries and interfaces and to give each team the full stack, so that the team is completely autonomous. When teams are set up this way, most communication happens inside each subsystem, where it is cheap. The result is less dependency between systems and lower inter-system communication costs.
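A minimal sketch of "inter-operate, not integrate" follows. It assumes two hypothetical teams, one owning inventory and one owning orders; all class and method names are invented for illustration. The order side depends only on a small published interface and never reaches into the inventory team's storage.

```python
from typing import Protocol


class InventoryAPI(Protocol):
    """The published boundary: the only thing other teams may rely on."""

    def reserve(self, sku: str, quantity: int) -> bool: ...


class InventoryService:
    """Owned end to end by the (hypothetical) inventory team."""

    def __init__(self) -> None:
        # Private storage: other teams never read or write this directly.
        self._stock = {"book": 3}

    def reserve(self, sku: str, quantity: int) -> bool:
        available = self._stock.get(sku, 0)
        if quantity <= 0 or quantity > available:
            return False
        self._stock[sku] = available - quantity
        return True


class OrderService:
    """Owned by the (hypothetical) order team; knows only the interface."""

    def __init__(self, inventory: InventoryAPI) -> None:
        self._inventory = inventory

    def place_order(self, sku: str, quantity: int) -> str:
        reserved = self._inventory.reserve(sku, quantity)
        return "confirmed" if reserved else "rejected"


orders = OrderService(InventoryService())
print(orders.place_order("book", 2))  # confirmed
print(orders.place_order("book", 2))  # rejected: only 1 left in stock
```

Because the only shared artifact is the `reserve` contract, the inventory team can change its storage or technology stack without coordinating with the order team, which is where the lower inter-system communication cost comes from.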
"Divide and conquer."
As mentioned above, human beings are complex social animals, and communication between people is very complicated. When it comes to a system, we often choose to add manpower to reduce its complexity. For our organization, how do we address the resulting communication issues? Divide and conquer. Look at your company. Isn't it true that a line-1 manager manages fewer than 15 people, a line-2 manager manages fewer people than a line-1 manager, a line-3 manager manages even fewer people than a line-2 manager, and so on? (I am not implying that it is more difficult to manage development managers than programmers.)
Therefore, a large organization is usually divided into small teams to reduce communication costs and management issues. Here are some scenarios for you to consider.
Conway's Law also tells us that we can see organizational communication modes from system design. Each manager is responsible for a certain duty on a small part of a large system. In this way, there are communication boundaries between them and the larger system. As such, the larger system incorporates smaller division teams in charge of the smaller systems (microservice serves well for this).
Let us have a look at how Conway's Law provided the theoretical basis for microservices half a century ago.
Here are some practical suggestions:
When looking at the following microservice criteria, we can easily see the close relationship between microservices and Conway's Law:
This article introduces Conway's laws and explores whether they offer a theoretical explanation of the concept of microservices. It discusses the four laws and the application of each. The first law describes the connection between communication and system design. The second law is about completing tasks efficiently: perfection is not attainable, so it should not be a reason to delay a task. Instead, people should focus on completing tasks on time and follow up with regular improvements. The third law describes the homomorphism that exists between linear systems and linear organizational structures. Finally, the fourth law discusses how people can use the "divide and conquer" approach to reduce the complexity and cost of communication within large enterprises.