By Bruce Wu
With the popularization of container technology, more and more applications are container-based. Containers are used widely, but many container users overlook a simple but important issue: the size of container images. This article briefly explains why simplifying container images matters, and shows some common techniques for minimizing Java images, using a Spring Boot-based Java application as an example.
Simplifying container images matters for two reasons: security and agility.
Removing unnecessary components from the image reduces the attack surface and the associated security risks. Docker also allows you to limit operations within your container by using Seccomp, and to configure security policies for the container by using AppArmor. However, using these tools well requires considerable security expertise, whereas trimming the image is a much simpler first step.
A simplified container image also improves deployment speed. Suppose that traffic spikes suddenly and you need to start more containers to handle the load. If some hosts do not yet have the target image, they must pull it before they can start the container. In this case, a smaller image is pulled faster and shortens the time needed to scale out. In addition, smaller images can be built faster and reduce storage and transmission costs.
You can perform the following steps to containerize a Java application:
The example used in this section is spring-boot-docker, a Spring Boot-based Java application. The unoptimized Dockerfile used in this example is as follows:
FROM maven:3.5-jdk-8
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
ENTRYPOINT ["java","-jar","/usr/src/app/target/spring-boot-docker-1.0.0.jar"]
The application is built with Maven, and maven:3.5-jdk-8 is specified as the base image in the Dockerfile. This base image is 635 MB, and the final image built this way is 719 MB, which is quite large. There are two reasons: the base image itself is large, and Maven downloads many JAR packages during the build, all of which remain in the final image.
To run a Java application, you need only the Java Runtime Environment (JRE). You do not need Maven or the compiling and debugging tools of the Java Development Kit (JDK). Therefore, a straightforward optimization is to separate the image that compiles and packages the Java source code from the image that runs the Java application. Before Docker 17.05, this required maintaining two Dockerfiles, which made image building more complex. Starting from Docker 17.05, the multi-stage builds feature allows you to use multiple FROM statements in one Dockerfile. Each FROM statement can specify a different base image and starts a completely new build stage. You can copy the output of a previous stage into a later stage, and keep only the necessary content in the final image. The optimized Dockerfile is as follows:
FROM maven:3.5-jdk-8 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
FROM openjdk:8-jre
ARG DEPENDENCY=/usr/src/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
The Dockerfile uses maven:3.5-jdk-8 as the build image in the first stage, and openjdk:8-jre as the base image to run the Java application. Only the .class files compiled in the first stage are copied to the final image, together with the third-party JAR dependencies. The size of the image is reduced to 459 MB after this optimization.
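The second stage assumes that the Maven build leaves the unpacked contents of the Spring Boot JAR under target/dependency. If the project's pom.xml does not do this, a minimal sketch of a build stage with an explicit unpack step might look like the following. The last RUN instruction is an assumption and not part of the original example; it relies on the jar tool from the JDK in the build image and on the build producing exactly one .jar file in the target directory.

```dockerfile
FROM maven:3.5-jdk-8 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
# Assumed extra step: unpack the packaged JAR so that BOOT-INF and META-INF
# exist under target/dependency for the COPY --from=build instructions.
RUN mkdir -p /usr/src/app/target/dependency \
    && cd /usr/src/app/target/dependency \
    && jar -xf ../*.jar
```

The runtime stage can then stay exactly as shown above, because it only reads from the target/dependency directory of the build stage.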
Although the multi-stage build has reduced the size of the final image, 459 MB is still too large. A closer look shows that the base image openjdk:8-jre alone is 443 MB. Therefore, the next optimization step is to reduce the size of the base image.
Distroless, an open source project of Google, was developed to solve this problem. Distroless images contain only the application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution. Currently, Distroless provides base images for applications running in environments such as Java, Python, Node.js and .NET.
The Dockerfile that uses a distroless image is as follows:
FROM maven:3.5-jdk-8 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
FROM gcr.io/distroless/java
ARG DEPENDENCY=/usr/src/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
The only difference between this Dockerfile and the previous one is that the base image for running the application is changed from openjdk:8-jre (443 MB) to gcr.io/distroless/java (119 MB). As a result, the size of the final image becomes 135 MB.
The main inconvenience of using a distroless image is that it does not contain a shell, so you cannot use docker exec to open a shell in a running container for debugging. The debug image of distroless provides a BusyBox shell, but you have to repackage the image and redeploy the container, which does not help with containers that have already been deployed from non-debug images. From a security point of view, the lack of a shell can be an advantage because attackers cannot use one either.
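As a rough sketch of the debug option mentioned above, the runtime stage could be switched to the debug variant of the distroless image. The tag name debug follows the convention described by the distroless project and should be treated as an assumption here; check which tags are actually published for the image you use.

```dockerfile
# Debug variant of the runtime stage; the build stage stays unchanged.
# Assumption: the image publishes a "debug" tag that adds a BusyBox shell.
FROM gcr.io/distroless/java:debug
ARG DEPENDENCY=/usr/src/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
```

With an image built this way, overriding the entrypoint with sh when starting the container (the --entrypoint option of docker run) should open the BusyBox shell instead of the application.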
If you do need shell access to a running container and still want to minimize the image size, you can use an alpine image as the base image. Alpine images are remarkably small: the base image is only about 4 MB.
The Dockerfile that uses an alpine image is as follows:
FROM maven:3.5-jdk-8 AS build
COPY src /usr/src/app/src
COPY pom.xml /usr/src/app
RUN mvn -f /usr/src/app/pom.xml clean package
FROM openjdk:8-jre-alpine
ARG DEPENDENCY=/usr/src/app/target/dependency
COPY --from=build ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=build ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=build ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","hello.Application"]
Instead of directly using the base alpine image, we chose openjdk:8-jre-alpine (83 MB) as the base image. openjdk:8-jre-alpine is built on alpine and contains the Java runtime. The size of the image built with this Dockerfile is 99.2 MB, which is smaller than the one built on the distroless image.
Run the command docker exec -ti <container_id> sh to open a shell in a running container.
Distroless and alpine images both provide very small base images. Which one should you use in production? If security is your primary concern, distroless is recommended because the image contains essentially nothing executable beyond the Java runtime and your packaged application. If you are more concerned about the size of the image, you can go with alpine.
In addition to the aforementioned tricks, you can perform the following operations to further reduce the image size:
For more tips on optimizing Dockerfiles, see Best practices for writing Dockerfiles.
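One practice from that guide, ordering instructions so that the build cache is reused, can be sketched for the build stage of this example. It does not shrink the final image, but it can shorten rebuilds when only the source code changes. The dependency:go-offline step is an assumption added for illustration and is not part of the original example; it may not pre-fetch every artifact that a particular build needs.

```dockerfile
FROM maven:3.5-jdk-8 AS build
# Copy only the POM first so that this layer and the next one stay cached
# until pom.xml itself changes.
COPY pom.xml /usr/src/app/
# Assumed extra step: pre-fetch dependencies into their own layer.
RUN mvn -f /usr/src/app/pom.xml dependency:go-offline
# Source changes invalidate the cache only from this point on.
COPY src /usr/src/app/src
RUN mvn -f /usr/src/app/pom.xml clean package
```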