By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
This set of tutorials focuses on giving you practical experience on using Docker Compose when working with containers on Alibaba Cloud Elastic Compute Service (ECS).
Part 4 of this series looked at productivity tips and best practices for working with Docker Compose limits. In this final part of the series, we will look at parallelism when updating Docker containers and then conclude the tutorial series.
The parallelism setting under update_config controls how quickly a service is updated, that is, how many containers are replaced at a time. Containers that are started fresh are started as quickly as possible; the update_config settings do not apply to them.
Therefore, to test this, we need a set of already running containers so that we can observe parallelism during the update process.
First, parallelism: 1 (the default).
Add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: sleep 600
deploy:
replicas: 6
update_config:
parallelism: 1
Run:
docker stack rm mystack
docker stack deploy -c docker-compose.yml mystack
docker ps -a
Around 3 seconds later we have 6 containers running. This first docker stack deploy ignored the parallelism option, because it created new containers rather than updating existing ones.
Unfortunately, if we just rerun docker stack deploy, we still will not see the parallelism option in action. Docker is too clever: it reads the docker-compose.yml file, sees that nothing changed, and therefore does not update any containers in our mystack.
To prove this, start another console command session and run docker events
Back in the original shell, run the following 3 times:
docker stack deploy -c docker-compose.yml mystack
Observe the other console output:
Expected output:
2018-11-08T12:48:34.385699607+02:00 service update qdzqiek7c59mkxznatszsh13j (name=mystack_alpine)
2018-11-08T12:48:37.423780596+02:00 service update qdzqiek7c59mkxznatszsh13j (name=mystack_alpine)
2018-11-08T12:48:39.804717121+02:00 service update qdzqiek7c59mkxznatszsh13j (name=mystack_alpine)
The service is updated 3 separate times, but no containers are stopped and no fresh containers are started.
Therefore, to force an update we need to make a change to docker-compose.yml.
The easiest change is to append a digit to the sleep 600 value, for example sleep 6001.
Do this now, and rerun:
docker stack deploy -c docker-compose.yml mystack
Observe the other console output (not shown here). There is a lot of activity: new containers are created and old ones are killed.
If you run docker ps -a, you will see the new containers at the top and the exited ones below. Note that I changed the sleep to 6001.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bafa8e4e1d62 alpine:3.8 "sleep 6001" 21 seconds ago Up 4 seconds mystack_alpine.5.mtw5rzfpyd6dwnh9mez7p7hlk
a80b531eff59 alpine:3.8 "sleep 6001" 21 seconds ago Up 4 seconds mystack_alpine.6.nl9culv3otymevgjufamkkwld
fa45c52b0825 alpine:3.8 "sleep 6001" 21 seconds ago Up 4 seconds mystack_alpine.1.oe2d85c55uf1qlbvv1pozcsgx
2565cfbda1db alpine:3.8 "sleep 6001" 21 seconds ago Up 5 seconds mystack_alpine.4.5zaqrvwv32ou8vmdvnbu21qtn
b69ceeaf69a1 alpine:3.8 "sleep 6001" 21 seconds ago Up 5 seconds mystack_alpine.2.utc38k1pg124zx65ae1s8qo5g
9669904d0bb1 alpine:3.8 "sleep 6001" 21 seconds ago Up 4 seconds mystack_alpine.3.zbltrdwmk0omxtkywuwhlw9ub
dc8566cc12ae alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.3.bi13jj6v7f2s3b31yc6k9dmf0
9d385bfd3565 alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.6.g8w5a0fe0ufcum2y2lhd0i1dq
58f14d78f436 alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.1.zotzhrpqblzzyo62urafwgzcs
2090bb37bb31 alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.2.loezk57p62tkfohgzbh1tc1j8
c8df0b31e188 alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.4.619ms1rkhar35un6x4g5ulc3h
c85a0f2db1e0 alpine:3.8 "sleep 600" 9 minutes ago Exited (137) 8 seconds ago mystack_alpine.5.odw21g73i1p62s90lpj1936xv
We just witnessed container updates happening. However, the purpose of this part of the tutorial is to show how parallelism works:
Stop the previous docker events command in your other console and enter this:
docker events --filter event=create --filter event=kill
Make another digit change to the sleep time in docker-compose.yml (for example, sleep 6002).
Rerun:
docker stack deploy -c docker-compose.yml mystack
Repeatedly run docker ps -a in the first console. After about a minute the update process will be done.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
84da4828fea3 alpine:3.8 "sleep 6002" 3 seconds ago Created mystack_alpine.4.ludkp9e9ec63tf27n05j2h8rt
99d63687086c alpine:3.8 "sleep 6002" 18 seconds ago Up 3 seconds mystack_alpine.1.zbfm2q2403wg5f0626dlodab4
5d4ac8f2ae15 alpine:3.8 "sleep 6002" 32 seconds ago Up 17 seconds mystack_alpine.5.oadzbajbcr6l1rms28kb23xux
350971f0734e alpine:3.8 "sleep 6002" 47 seconds ago Up 32 seconds mystack_alpine.2.lxggijot4518tj0xl3bi36eay
95f6fcc3c898 alpine:3.8 "sleep 6002" About a minute ago Up 46 seconds mystack_alpine.6.qgje7g5r7e7e24neuqiafip0g
960174cdab88 alpine:3.8 "sleep 6002" About a minute ago Up About a minute mystack_alpine.3.v9zm2yipmvjzb673da8sbryh1
A new container appears roughly every 10 to 15 seconds. This is what you get with parallelism: 1.
Investigate the second console output: docker events
You will see one new container created, then one old container killed, and so on for all 6 containers.
To speed this update process up, let's increase parallelism to 3 in our docker-compose.yml:
nano docker-compose.yml
parallelism: 3
Rerun
docker stack deploy -c docker-compose.yml mystack
Remember to also change the sleep value so that an update is triggered (I changed it to 6003). If you run docker ps -a you will see the new containers at the top and the exited ones below: 3 new containers appear immediately, and the remaining 3 about 15 seconds later.
So surely the ideal is parallelism: 6, updating all 6 in one go?
Unfortunately, this is not the case. If the update fails, you end up with 6 failed new containers, while the 6 previously running containers have all exited.
Fortunately, max_failure_ratio and failure_action deal with such problem cases, as we discuss next.
You can use these two settings to lessen the damage caused by failed updates.
max_failure_ratio specifies the failure rate to tolerate during an update: 0.1 means 10 percent of tasks may fail.
failure_action specifies what to do if an update fails: continue, rollback, or pause (default: pause).
Right now we still have our previous containers running. We are going to update them with 6 new containers that each fail immediately.
Add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: exit 1
deploy:
replicas: 6
update_config:
parallelism: 1
max_failure_ratio: 0
failure_action: pause
Note the command is exit 1, meaning the container exits with error code 1.
The default failure_action is pause, but we include it anyway for clarity.
Use your other console command session and run docker events
Deploy these failing containers:
docker stack deploy -c docker-compose.yml mystack
Run docker ps -a. After around 30 seconds you will see this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cbcaf8fce479 alpine:3.8 "exit 1" 6 seconds ago Created mystack_alpine.2.zvk1x6hevpt3x0q6aha0616sx
3ce0f3bca6c8 alpine:3.8 "exit 1" 12 seconds ago Created mystack_alpine.2.hyikxli7cwuk0xauaqw87epu0
4ae02b292f54 alpine:3.8 "exit 1" 18 seconds ago Created mystack_alpine.2.lseuovn0g4imn75q1eufnyfx9
1ea70f30f397 alpine:3.8 "exit 1" 24 seconds ago Created mystack_alpine.2.tfwagwvevh9vdxyne7cfy41fa
2eeef13d4240 alpine:3.8 "exit 1" 30 seconds ago Created mystack_alpine.2.n5qny1d5sbwah7fgsa83eabat
b926e22199d1 alpine:3.8 "sleep 6003" 21 minutes ago Up 20 minutes mystack_alpine.5.w3ll2y30r1b75137fbbqak1rf
248f8ffe019e alpine:3.8 "sleep 6003" 21 minutes ago Up 20 minutes mystack_alpine.1.62dpe6cgrtmkercmn2bdlo3j3
815143b43f11 alpine:3.8 "sleep 6003" 21 minutes ago Up 21 minutes mystack_alpine.4.enk3mweaht4zqre0jehm2nyn1
c13461b6f58c alpine:3.8 "sleep 6003" 21 minutes ago Up 21 minutes mystack_alpine.3.9jfg8kc1l0ps6km6hv5wlzddc
f2dc173cbf21 alpine:3.8 "sleep 6003" 21 minutes ago Up 21 minutes mystack_alpine.6.8mo8t73z58jbf1e9vhvzjup53
New failing containers are created at the top of the list (note they are all attempts at the same task, mystack_alpine.2), while the old containers are still running below. Since the new containers never manage to run, they remain in the Created state only.
Observe the events console: Docker is continually destroying and recreating these new containers in an attempt to get them running. Clearly, pause does not simply pause here.
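If you want to see how the swarm itself reports this state, you can also inspect the service. This is an optional check, not part of the original exercise; the exact wording of the output varies by Docker version, but it includes an update status section that should report the update as paused after the failures:
docker service inspect --pretty mystack_alpine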
Clean up all those containers (prune deletes containers that stack rm missed; unfortunately, that happens often):
docker stack rm mystack
docker container prune -f
pause does not work as I expected, so let's test failure_action: rollback.
To test failure_action: rollback we need to start fresh.
Add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: sleep 600
deploy:
replicas: 6
Deploy the working containers:
docker stack deploy -c docker-compose.yml mystack
To create 6 failing containers, add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: exit 1
deploy:
replicas: 6
update_config:
parallelism: 1
max_failure_ratio: 0
failure_action: rollback
Deploy the error containers:
docker stack deploy -c docker-compose.yml mystack
Observe the results: run docker ps -a repeatedly for around 30 seconds.
Carefully observe the first 3 containers to see if you can determine what is happening.
docker ps -a
Here are my final results:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c2d1ae806906 alpine:3.8 "sleep 600" 25 seconds ago Up 22 seconds mystack_alpine.2.soht49wefpwm1nvm1gbl7ru1l
455a5907758a alpine:3.8 "exit 1" 40 seconds ago Created mystack_alpine.2.yr0zkfu9n40s0rbezfhjtl6yu
3c254ba9a72b alpine:3.8 "sleep 600" 2 minutes ago Exited (137) 26 seconds ago mystack_alpine.2.04g1vrnoomagvv89aobbbzmxz
b635a1e52147 alpine:3.8 "sleep 600" 2 minutes ago Up 2 minutes mystack_alpine.1.gibfdph75s0o46s5h3x96csm2
0ac32ac0ad34 alpine:3.8 "sleep 600" 2 minutes ago Up 2 minutes mystack_alpine.3.oxs3mjm7vp3c6jbc1kj2kz990
33554d287fe9 alpine:3.8 "sleep 600" 2 minutes ago Up 2 minutes mystack_alpine.5.ds3lra1qvr9y8e8b1xi2cn5c0
f381b1250167 alpine:3.8 "sleep 600" 2 minutes ago Up 2 minutes mystack_alpine.4.t4gv1gor6aul3b53ei6pcxu5e
fd97395ba2ac alpine:3.8 "sleep 600" 2 minutes ago Up 2 minutes mystack_alpine.6.n1nshrlnywqcrvn5u2x93nr10
First the new container is created:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
455a5907758a alpine:3.8 "exit 1" 7 seconds ago Created mystack_alpine.2.yr0zkfu9n40s0rbezfhjtl6yu
Around 5 seconds later, one old container exits.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3c254ba9a72b alpine:3.8 "sleep 600" About a minute ago Exited (137)
Then Docker determines that the new container cannot reach the running state, so it rolls back: it recreates one container with the previous settings from docker-compose.yml.
This container is listed right at the top. Note its status, Up 22 seconds: it is working.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c2d1ae806906 alpine:3.8 "sleep 600" 25 seconds ago Up 22 seconds mystack_alpine.2.soht49wefpwm1nvm1gbl7ru1l
Slowly scroll through your output and see if the same thing happened on your computer.
Rollback works great.
Conclusion: the settings below are useful in production:
update_config:
  parallelism: 1
  max_failure_ratio: 0
  failure_action: rollback
From https://docs.docker.com/compose/compose-file/#update_config
monitor: Duration after each task update to monitor for failure (ns|us|ms|s|m|h) (default 0s).
There are no exercises here, but this setting is important.
As you just read, it is the duration after each task update during which Docker monitors for failure.
The application in the container must have sufficient time to do some work before Docker can judge it a failure.
The default value of 0 seconds is not suitable for practical purposes: it only confirms that the container started up successfully.
Let's assume you run a web server inside the container, and further assume one website visit per minute. A monitor duration of 1 minute is then probably too short. Instead, you could set it to at least 5 minutes, giving the application an opportunity to crash before the update is judged successful.
You must determine an appropriate monitor duration for your applications at your workplace.
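For reference only (this is not one of the exercises above, and the 5-minute value is just an illustrative assumption), monitor is added under update_config like any of its other settings:
version: "3.7"
services:
  alpine:
    image: alpine:3.8
    command: sleep 600
    deploy:
      replicas: 6
      update_config:
        parallelism: 1
        max_failure_ratio: 0
        failure_action: rollback
        # illustrative: watch each updated task for 5 minutes before
        # considering that step of the update successful
        monitor: 5m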
Health checks are often no more than asking the question: can I ping it?
The monitor-for-failure test only detects whether the container as a whole crashes. It does not tell you whether the application in the container still works in the context of the other containers it must cooperate with.
For example, if you deploy a new Apache version but fail to ensure it can load all required modules, health-check pings and wgets of static HTML pages will work perfectly, but the moment PHP + MySQL pages need to be served it will crash.
That is why you should give the application in a newly updated container sufficient time to run before Docker tests it for failure.
We want to develop health checks that test not only connectivity, but also functionality.
Such health checks can flag an unhealthy new container within seconds, minutes before it gets (and fails at) real production work.
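Compose supports a healthcheck section that can carry such a functional test. The sketch below is only an illustration of the idea, not part of this tutorial's exercises; the image name, URL, and timing values are assumptions you would replace with a request that exercises your real application logic (for example a PHP + MySQL page rather than a static page):
version: "3.7"
services:
  web:
    image: your-web-image        # hypothetical image; must contain wget (or curl)
    healthcheck:
      # hit a page that exercises real functionality, not just connectivity
      test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost/health.php || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3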
Below is some docker events output I gathered by experimenting with different parallelism settings. (I redirected the events to a file in /tmp and processed that file afterwards.)
Note that two kill lines belong to one container: Docker first sends signal 15 (SIGTERM); if the container is not dead quickly enough, it sends signal 9 (SIGKILL).
parallelism: 1
cut -c1-60 /tmp/events | egrep 'create|kill'
2018-11-03T10:01:35.148583578+02:00 container create 8b51499
2018-11-03T10:01:37.840231366+02:00 container kill be8f3715a
2018-11-03T10:01:37.865611953+02:00 container kill be8f3715a
2018-11-03T10:01:39.886188372+02:00 container create a38781a
2018-11-03T10:01:42.572866743+02:00 container kill e498e5f5f
2018-11-03T10:01:42.598606635+02:00 container kill e498e5f5f
2018-11-03T10:01:44.423905486+02:00 container create 64ae4c0
2018-11-03T10:01:47.123993008+02:00 container kill 914343611
2018-11-03T10:01:47.146988704+02:00 container kill 914343611
2018-11-03T10:01:48.972005129+02:00 container create b37cef5
2018-11-03T10:01:51.642712373+02:00 container kill 92619e0a6
2018-11-03T10:01:51.667003244+02:00 container kill 92619e0a6
2018-11-03T10:01:53.497100262+02:00 container create 8a73470
2018-11-03T10:01:56.163374613+02:00 container kill 420dc4d89
2018-11-03T10:01:56.188237090+02:00 container kill 420dc4d89
2018-11-03T10:01:58.000843644+02:00 container create 41b4480
2018-11-03T10:02:00.699576981+02:00 container kill c8f4d973c
2018-11-03T10:02:00.721565297+02:00 container kill c8f4d973
parallelism: 2
cut -c1-60 /tmp/events | egrep 'create|kill'
2018-11-03T10:08:47.299682233+02:00 container create 6f1df52
2018-11-03T10:08:47.567222566+02:00 container create ea9bf95
2018-11-03T10:08:49.943237084+02:00 container kill 8b51499ad
2018-11-03T10:08:49.958679991+02:00 container kill 64ae4c05c
2018-11-03T10:08:49.977677725+02:00 container kill 8b51499ad
2018-11-03T10:08:49.997521920+02:00 container kill 64ae4c05c
2018-11-03T10:08:52.539334772+02:00 container create cdbbef8
2018-11-03T10:08:52.812900162+02:00 container create 16e1af2
2018-11-03T10:08:55.157361545+02:00 container kill b37cef51e
2018-11-03T10:08:55.169221551+02:00 container kill 8a73470b2
2018-11-03T10:08:55.193477357+02:00 container kill b37cef51e
2018-11-03T10:08:55.207277169+02:00 container kill 8a73470b2
2018-11-03T10:08:57.830146930+02:00 container create 0ab17e5
2018-11-03T10:08:57.949710902+02:00 container create 9cc8547
2018-11-03T10:09:00.233887111+02:00 container kill a38781a0f
2018-11-03T10:09:00.257647812+02:00 container kill 41b4480ad
2018-11-03T10:09:00.272834309+02:00 container kill a38781a0f
2018-11-03T10:09:00.288598877+02:00 container kill 41b4480ad
parallelism: 3
cut -c1-60 /tmp/events | egrep 'create|kill'
2018-11-03T10:11:34.283896923+02:00 container create 8a0373b
2018-11-03T10:11:34.583536405+02:00 container create 61cbe75
2018-11-03T10:11:34.803563295+02:00 container create a2bd707
2018-11-03T10:11:36.854815108+02:00 container kill cdbbef891
2018-11-03T10:11:36.861978752+02:00 container kill 0ab17e57f
2018-11-03T10:11:36.890035520+02:00 container kill ea9bf9502
2018-11-03T10:11:36.899725135+02:00 container kill cdbbef891
2018-11-03T10:11:36.905718703+02:00 container kill 0ab17e57f
2018-11-03T10:11:36.922317316+02:00 container kill ea9bf9502
2018-11-03T10:11:39.891013146+02:00 container create 7576427
2018-11-03T10:11:40.238136177+02:00 container create a26d947
2018-11-03T10:11:40.439589543+02:00 container create 53002e5
2018-11-03T10:11:42.434787914+02:00 container kill 16e1af20f
2018-11-03T10:11:42.445537379+02:00 container kill 9cc854731
2018-11-03T10:11:42.485085063+02:00 container kill 9cc854731
2018-11-03T10:11:42.490162686+02:00 container kill 16e1af20f
2018-11-03T10:11:42.498272764+02:00 container kill 6f1df5233
2018-11-03T10:11:42.547462663+02:00 container kill 6f1df523
parallelism: 6
cut -c1-60 /tmp/events | egrep 'create|kill'
2018-11-03T10:13:22.444286947+02:00 container create bb4b2db
2018-11-03T10:13:22.838989116+02:00 container create a00d0b1
2018-11-03T10:13:23.039740661+02:00 container create f1f9090
2018-11-03T10:13:23.595395816+02:00 container create 568b219
2018-11-03T10:13:23.824193225+02:00 container create 77d7d22
2018-11-03T10:13:24.191986311+02:00 container create 1ea8ad8
2018-11-03T10:13:25.105183046+02:00 container kill 8a0373b67
2018-11-03T10:13:25.146410226+02:00 container kill 8a0373b67
2018-11-03T10:13:25.150991208+02:00 container kill 53002e5b3
2018-11-03T10:13:25.190384877+02:00 container kill 75764275f
2018-11-03T10:13:25.204178523+02:00 container kill a2bd707bc
2018-11-03T10:13:25.230797581+02:00 container kill a26d9476c
2018-11-03T10:13:25.234104353+02:00 container kill 61cbe7540
2018-11-03T10:13:25.252980697+02:00 container kill 53002e5b3
2018-11-03T10:13:25.268581894+02:00 container kill 75764275f
2018-11-03T10:13:25.283548856+02:00 container kill a2bd707bc
2018-11-03T10:13:25.299920739+02:00 container kill a26d9476c
2018-11-03T10:13:25.306631692+02:00 container kill 61cbe7540
The restart_policy option specifies how to restart containers when they exit.
The default value for condition is any: the container is restarted whether it fails or exits successfully.
restart_policy is applied only when deploying a stack in swarm mode.
restart is applied when deploying with docker-compose up, i.e. when starting a single service.
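For reference, restart_policy lives under deploy and takes the sub-options below; this is only a sketch, and the values shown are illustrative assumptions rather than ones used in this tutorial:
deploy:
  restart_policy:
    condition: any      # none, on-failure, or any (default: any)
    delay: 0s           # wait between restart attempts (default: 0s)
    max_attempts: 3     # give up after this many restarts (default: never give up)
    window: 30s         # how long to wait before deciding a restart succeeded (default: decide immediately)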
To see how this works, we will have our container exit successfully after a 3-second sleep. We expect the container to be restarted automatically.
Add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: sleep 3
deploy:
replicas: 1
docker stack deploy -c docker-compose.yml mystack
After running docker ps -a repeatedly for 25 seconds you will see the trend: the container exits after 3 seconds and a new one is automatically created in its place. The default restart_policy works as expected.
docker ps -a
Expected output after around 25 seconds:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
42738b793169 alpine:3.8 "sleep 3" 6 seconds ago Up Less than a second mystack_alpine.1.wopmb7xzyakbftkwsk9eq0goo
95e00ae7c883 alpine:3.8 "sleep 3" 15 seconds ago Exited (0) 7 seconds ago mystack_alpine.1.h0n7hh12bn4jgb35bozpirm9r
bac92d42ca3f alpine:3.8 "sleep 3" 25 seconds ago Exited (0) 16 seconds ago mystack_alpine.1.2xujcjgypj9kbdcwsw0g0ysw8
0998efbcba8f alpine:3.8 "sleep 3" 34 seconds ago Exited (0) 26 seconds ago mystack_alpine.1.puqpmp9u13ivqvclah5cmidgx
This will go on forever. Fortunately, we can limit the number of restart attempts.
max_attempts specifies how many times to attempt to restart a container before giving up (default: never give up).
Add the following to your docker-compose.yml using
nano docker-compose.yml
version: "3.7"
services:
alpine:
image: alpine:3.8
command: sleep 2.22
deploy:
replicas: 1
restart_policy:
max_attempts: 3
Deploy this using:
docker stack deploy -c docker-compose.yml mystack
If you run docker ps -a repeatedly, you will notice that after about 20 seconds only 3 new containers have been created. Then the automatic restarting stops, as expected.
You can also adjust how long to wait between restart attempts using delay.
Its default value is zero, as you have seen: restarts happen immediately.
You can set it to any duration you like. This tutorial does not cover it, but you are welcome to perform a quick test.
Finally there is also a window option:
window: How long to wait before deciding if a restart has succeeded, specified as a duration (default: decide immediately).
Feel free to test this on your own as well.
For an even more interesting exercise, experiment with the interaction between delay and window.
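A minimal sketch of such an experiment (the 5s and 10s values are just illustrative assumptions; pick your own) could look like this:
version: "3.7"
services:
  alpine:
    image: alpine:3.8
    command: sleep 3
    deploy:
      replicas: 1
      restart_policy:
        max_attempts: 3
        delay: 5s      # wait 5 seconds between restart attempts (illustrative)
        window: 10s    # wait 10 seconds before deciding a restart succeeded (illustrative)
Deploy it with docker stack deploy -c docker-compose.yml mystack and watch docker ps -a as before.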
This concludes the tutorial set, but it only marks the start of your Docker Compose training on Alibaba Cloud Elastic Compute Service (ECS). You should now have a solid understanding of, and considerable practical experience with, several Docker Compose configuration options.
To learn more about Docker compose, visit the official documentation at https://docs.docker.com/compose/compose-file/
Or check out Alibaba Cloud's Container Service to learn how you can reap the full benefits of containerization.