Cloud-Native Storage: Container Storage and Kubernetes Storage Volumes

This article explores the opportunities and challenges of cloud-native storage and is the second article of the series on the concepts of container storage.

By Kan Junbao (Junbao), Alibaba Cloud Senior Technical Expert

This series of articles on cloud-native storage explains the concepts, features, requirements, principles, usage, and cases of cloud-native storage. It aims to explore the new opportunities and challenges of cloud-native storage technology. This article is the second in the series and explains the concepts of container storage. If you are not familiar with this concept, I suggest that you read the first article of this series, "Cloud-Native Storage: The Cornerstone of Cloud-Native Applications."

Docker storage volumes and Kubernetes storage volumes are essential for cloud-native storage.

A Docker storage volume is a form of storage organization for container services on a single node. It is based on data storage and container runtime technologies.
A Kubernetes storage volume is designed for storage orchestration in container clusters. It focuses on application-specific storage services.

Docker Storage

Docker is widely used in part because of how it organizes container images at runtime. Multiple containers on the same node can share the same image resource, or more precisely, the same image layers, through container image reuse. This means image files are not copied or loaded each time a container is started, which reduces host storage usage and speeds up container startup.

1. Container Read/Write Layer

The same image resource can be shared by different running containers, and data can be shared by different images. This improves the storage efficiency of nodes. An image is divided into multiple data layers, and the data of each layer is superimposed and overwritten. This structure enables image data sharing.

Each layer of a container image is read-only so that image data can be shared by multiple containers. In practice, when you start a container by using an image, you can read and write this image in the container. How is this done?

When a container uses an image, the container adds a read/write layer at the top of all image layers. Each running container mounts a read/write layer on top of all layers of the current image. All operations on the container are completed at this layer. When the container is released, the read/write layer is also released.

As shown in the preceding figure, three containers exist on the node. Container 1 and Container 2 run based on Image 1, and Container 3 runs based on Image 2.

The image storage layers are explained as follows:

The node contains six image layers from Layer 1 to Layer 6.
Image 1 consists of Layer 1, Layer 3, Layer 4, and Layer 5.
Image 2 consists of Layer 2, Layer 3, Layer 5, and Layer 6.

The two images share Layer 3 and Layer 5.

Container storage is explained as follows:

Container 1 is started by using Image 1.
Container 2 is started by using Image 1.
Container 3 is started by using Image 2.
Container 1 and Container 2 share Image 1. Each container has an independent writable layer. Container 3 shares Layer 3 and Layer 5 with Container 1 and Container 2.

Data sharing based on the layered structure of container images can significantly reduce the host storage usage by the container service.
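The saving can be made concrete with a small Python sketch. The layer-to-image mapping comes from the example above. The layer sizes are invented purely for illustration and are not real measurements.

```python
# Hypothetical layer sizes in MB; the article does not give real sizes.
LAYER_SIZE_MB = {1: 50, 2: 60, 3: 120, 4: 30, 5: 80, 6: 40}

# Layer membership taken from the example above.
IMAGES = {
    "image1": {1, 3, 4, 5},
    "image2": {2, 3, 5, 6},
}


def shared_layers(images):
    """Return the layers that every image in the mapping has in common."""
    return set.intersection(*images.values())


def storage_with_sharing(images, sizes):
    """Each distinct layer is stored once on the node."""
    distinct = set().union(*images.values())
    return sum(sizes[layer] for layer in distinct)


def storage_without_sharing(images, sizes):
    """Each image keeps a private copy of all of its layers."""
    return sum(sizes[layer] for layers in images.values() for layer in layers)
```

With these assumed sizes, the difference between the two totals is exactly the size of the shared layers (Layer 3 and Layer 5), because they are stored once instead of twice.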

In the container image structure with the read/write layer, data is read and written in the following way:

When data is read and different layers contain the same file, the copy at the upper layer hides the copy at the lower layer.

Data is written at the uppermost read/write layer when you modify a file in a container. The technologies involved are copy-on-write (CoW) and allocate-on-demand.

(1) CoW

CoW indicates that data is copied only when it is written. It is applicable to scenarios where existing files are modified. CoW allows all containers to share the file system of an image and read all data from this image. When you write a file, this file is copied to the uppermost read/write layer of the image for modification. For all the containers that share the same image, each container writes the file copies instead of the original files of the image. When multiple containers write the same file, each container creates a copy of this file in its file system and modifies this copy independently.

(2) Allocate-on-demand

Storage space is allocated only when new data is actually written. This improves the utilization of storage resources. For example, disk space is allocated to a container only when new files are written to this container. Disk space is not pre-allocated during container startup.
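The read path, CoW, and allocate-on-demand can be sketched together as a toy model in Python. This is not how a real storage driver is implemented. The class and method names are hypothetical, and files are modeled as strings in dictionaries.

```python
class LayeredFS:
    """Toy model of read-only image layers plus one writable container layer."""

    def __init__(self, image_layers):
        # image_layers: list of dicts (path -> content), lowest layer first.
        # The dicts are shared between containers and never modified here.
        self.image_layers = image_layers
        self.rw_layer = {}  # private to this container, empty at startup

    def read(self, path):
        # Search from the top down: the read/write layer first, then the
        # image layers from uppermost to lowest. The first hit wins.
        if path in self.rw_layer:
            return self.rw_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # Copy-on-write: the file is copied up into the read/write layer
        # the first time it is modified; the image layer stays untouched.
        if path not in self.rw_layer:
            self.rw_layer[path] = self.read(path)
        self.rw_layer[path] += data

    def write_new(self, path, data):
        # Allocate-on-demand: space is consumed only when a file is written.
        self.rw_layer[path] = data
```

Two `LayeredFS` instances built from the same list of layers behave like two containers started from one image: a change made through one instance lands in its own `rw_layer` and is invisible to the other.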

2. Storage Drivers

Storage drivers are used to manage container data at each layer to enable image sharing among containers. Storage drivers support read and write operations on files. The storage drivers of containers store and manage data at the read/write layer. Common storage drivers include:

AUFS
OverlayFS
Devicemapper
Btrfs
ZFS

The following section explains how AUFS works.

AUFS is a type of union file system (UFS) and a file-level storage driver.

AUFS is a layered file system that can transparently superimpose one or more existing file systems to form a single layer. It mounts different directories into the same virtual file system.

You can superimpose and modify files layer by layer. Only the file system at the uppermost layer is writable, whereas the file systems at lower layers are read-only.

When you modify a file, AUFS creates a copy of this file and uses CoW to transfer this copy from the read-only layer to the writable layer for modification. The modified file is stored at the writable layer.

In Docker, the uppermost writable layer is the container's read/write layer, and all the lower layers are image layers.

3. Docker Data Volumes

Applications that run inside a container read and write data at the read/write layer of the container. The image layers and the read/write layer together form the container's internal file system, which handles storage inside the container. A container data volume allows applications inside a container to interact with external storage. This volume is similar to an external storage device, like a USB flash drive.

Containers store data temporarily. The stored data is deleted when containers are released. After you mount external storage to a container file system by using a data volume, the application can reference external data or persistently store its generated data in the data volume. Therefore, container data volumes provide a method for data persistence.

Container storage consists of multiple read-only layers (image layers), a read/write layer, and external storage (data volume).

Container data volumes can be divided into single-node data volumes and cluster data volumes. A single-node data volume is a data volume that the container service mounts to a node. Docker volumes are typical single-node data volumes. Cluster data volumes provide cluster-level data volume orchestration capabilities. Kubernetes data volumes are typical cluster data volumes.

A Docker volume is a directory that can be used by multiple containers simultaneously. It is independent of the UFS and provides the following features:

Data volumes can be shared and reused among containers.
Storage drivers only support write operations at the writable layer, whereas data volumes support direct read and write operations on external storage, which is more efficient.
Data volumes are updated by reading and writing external storage. This does not affect images and the read/write layer of containers.
Data volumes can exist until they are no longer used by containers.

(1) Types of Docker Data Volumes

Bind: You can directly mount host directories and files to containers.

Mount operations only use absolute paths on the host. Host directories can be automatically created.
You can modify any files in container-mounted directories. This makes applications easier to use but also introduces security threats.

Volume: You can enable this mode when you use third-party data volumes.

Volume command on the command-line interface (CLI): docker volume (create/rm)
The Volume mode is provided by Docker, so it cannot be used in non-Docker environments.
In Volume mode, data volumes are divided into named volumes and anonymous volumes. The only difference between them is that the names of anonymous volumes are random codes.
Data volume drivers can be extended to support access by more types of external storage.

Tmpfs is a non-persistent volume type that stores data in host memory. Tmpfs data is lost when the container stops.

(2) Syntax for Mounting in Bind Mode

-v: src:dst:opts: This is only applicable to single-node data volumes.

"src" specifies a volume mapping source, host directory, or file. It must be an absolute address.
"dst" specifies the destination address inside the container for the mount operation.
(Optional) "opts" specifies a mount attribute. The options include ro, consistent, delegated, cached, z, and Z.
The options "consistent", "delegated", and "cached" tune how strictly the host and container views of a mount are kept consistent on macOS.
The options "Z" and "z" are used to configure SELinux labels for host directories.

Example:

$ docker run -d --name devtest -v /home:/data:ro,rslave nginx
$ docker run -d --name devtest --mount type=bind,source=/home,target=/data,readonly,bind-propagation=rslave nginx
$ docker run -d --name devtest -v /home:/data:z nginx

(3) Syntax for Mounting in Volume Mode

-v: src:dst:opts: This is only applicable to single-node data volumes.

"src" specifies a volume mapping source. It can be set to the name of a data volume or left empty.
"dst" specifies the destination directory inside the container.
(Optional) "opts" specifies a mount attribute. The option "ro" specifies read-only.

Example:

$ docker run -d --name devtest -v myvol:/app:ro nginx
$ docker run -d --name devtest --mount source=myvol2,target=/app,readonly nginx
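The two `-v` syntaxes above differ only in what "src" looks like. The following Python sketch shows one way to tell them apart. It is a simplified illustration for Linux-style paths, not Docker's actual parser, and the names `VolumeSpec` and `parse_volume_flag` are hypothetical.

```python
from typing import NamedTuple, Tuple


class VolumeSpec(NamedTuple):
    kind: str              # "bind", "named", or "anonymous"
    src: str               # host path or volume name; "" for anonymous volumes
    dst: str               # path inside the container
    opts: Tuple[str, ...]  # mount options such as "ro" or "rslave"


def parse_volume_flag(value: str) -> VolumeSpec:
    """Split a -v argument of the form [src:]dst[:opts]."""
    parts = value.split(":")
    if len(parts) == 1:
        src, dst, opts = "", parts[0], ""
    elif len(parts) == 2:
        src, dst = parts
        opts = ""
    elif len(parts) == 3:
        src, dst, opts = parts
    else:
        raise ValueError(f"too many ':' separated fields: {value!r}")

    if not dst.startswith("/"):
        raise ValueError("the container path must be absolute")

    if src == "":
        kind = "anonymous"  # no source given: Docker generates a volume name
    elif src.startswith("/"):
        kind = "bind"       # absolute host path: Bind mode
    else:
        kind = "named"      # anything else is treated as a volume name

    return VolumeSpec(kind, src, dst, tuple(o for o in opts.split(",") if o))
```

For instance, `parse_volume_flag("/home:/data:ro,rslave")` is classified as a bind mount, while `parse_volume_flag("myvol:/app:ro")` is classified as a named volume.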

4. Usage of Docker Data Volumes

This section explains how to use Docker data volumes.

(1) Volume Type

Anonymous data volumes: docker run -d -v /data3 nginx

By default, the directory /var/lib/docker/volumes/{volume-id}/_data is created on the host for mapping purposes.

Named data volumes: docker run -d -v nas1:/data3 nginx

If the nas1 volume cannot be found, a volume of the default type (local) is created.

(2) Bind Mode

docker run -d -v /test:/data nginx

If the host does not contain the /test directory, this directory is created by default.

(3) Data Volume Containers

A volume container is a running container. Other containers can inherit the data volumes mounted to this container. All the mounts of the container are reflected in the reference containers.

docker run -d --volumes-from nginx1 -v /test1:/data1 nginx

The preceding command is used to inherit all data volumes from a configured container, including custom volumes.

(4) Data Volume Mount Propagation

You can configure mount propagation for Docker volumes by adding a propagation option to the mount definition.

Private: Mounts are not propagated. The mounts in the source and destination directories are not propagated to each other.
Shared: Mounts are propagated between the source and destination directories.
Slave: Mounts of the source object can be propagated to the destination object, but not vice versa.
Rprivate: This implements Private recursion, which is the default mode.
Rshared: This implements Shared recursion.
Rslave: This implements Slave recursion.

Examples:

$ docker run -d -v /home:/data:shared nginx
The directories mounted to the /home directory of the host are available in the /data directory of the container, and vice versa.
$ docker run -d -v /home:/data:slave nginx
The directories mounted to the /home directory of the host are available in the /data directory of the container, but not vice versa.
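The six propagation modes reduce to three base behaviors plus a recursive flag. The Python lookup below restates the list above in that form. The function name and the direction keys are hypothetical labels chosen for this sketch.

```python
# Which directions a new mount event travels, per base propagation mode.
# "host_to_container": a mount made under the source directory on the host
# appears under the destination directory in the container.
# "container_to_host": the reverse direction.
PROPAGATION = {
    "private": {"host_to_container": False, "container_to_host": False},
    "shared": {"host_to_container": True, "container_to_host": True},
    "slave": {"host_to_container": True, "container_to_host": False},
}


def propagation_rules(mode: str) -> dict:
    """Return the direction flags plus whether the mode applies recursively."""
    recursive = mode.startswith("r")
    base = mode[1:] if recursive else mode
    if base not in PROPAGATION:
        raise ValueError(f"unknown propagation mode: {mode}")
    return {**PROPAGATION[base], "recursive": recursive}
```

Calling `propagation_rules("rslave")` reports one-way propagation from host to container with the recursive flag set, which matches the Rslave entry in the list.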

(5) Visibility of Data Volume Mounts

Mount visibility in Volume mode:

Empty local directories and empty image directories: No special operations are required.
Empty local directories and non-empty image directories: Copy the content of image directories to the host. The content is retained when the container is deleted.
Non-empty local directories and empty image directories: Map the content of local directories to the container.
Non-empty local directories and non-empty image directories: Map the content of local directories to the container. The content of container directories is hidden.

Mount visibility in Bind mode: This is determined by host directories.

Empty local directories and empty image directories: No special operations are required.
Empty local directories and non-empty image directories: The container directories become empty.
Non-empty local directories and empty image directories: Map the content of local directories to the container.
Non-empty local directories and non-empty image directories: Map the content of local directories to the container. The content of container directories is hidden.
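The eight cases above collapse into a single rule with one exception. The Python sketch below encodes that rule as a decision function. It models directory content as sets of file names and is only an illustration of the visibility rules, not of Docker's implementation.

```python
def container_view(mode: str, host_files: set, image_files: set) -> set:
    """Return the files visible at the mount point inside the container.

    mode is "volume" or "bind". host_files is the content of the volume or
    host directory before the mount. image_files is what the image ships
    at the mount point.
    """
    if mode not in ("volume", "bind"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "volume" and not host_files:
        # Exception: an empty volume is populated from the image directory.
        return set(image_files)
    # In every other case the host side wins and image content is hidden.
    return set(host_files)
```

The only case where image content survives the mount is an empty volume in Volume mode. In Bind mode, an empty host directory leaves the container directory empty.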

5. Docker Data Volume Plug-ins

You can use Docker data volumes to mount the external storage of containers to container file systems. To allow containers to support more external storage classes, Docker supports the mounting of different types of storage services through storage plug-ins. The extension plug-ins are also known as volume drivers. You can develop a storage plug-in for each storage class.

Multiple storage plug-ins can be deployed on a single node.
A storage plug-in manages the mounting service of a specific storage class.

The Docker daemon communicates with volume drivers in the following ways:

Sock file: stored in the /run/docker/plugins directory in Linux
Spec file: defined by /etc/docker/plugins/convoy.spec
JSON file: defined by /usr/lib/docker/plugins/infinit.json
Interfaces: Create, Remove, Mount, Path, Unmount, Get, List, and Capabilities

Example:

$ docker volume create --driver nas -o diskid="" -o host="10.46.225.247" -o path="/nas1" -o mode="" --name nas1
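To give a feel for what a volume driver does behind these interfaces, the following Python sketch dispatches the listed calls against an in-memory table. It is a toy: it mounts nothing and opens no socket. The endpoint names and JSON field shapes are written from memory of the Docker volume plug-in protocol and should be treated as assumptions to verify against the official reference. The class name and mount root are hypothetical.

```python
import json


class InMemoryVolumeDriver:
    """Toy handler for volume driver requests; no real storage is mounted."""

    def __init__(self, root="/mnt/toy-driver"):  # hypothetical mount root
        self.root = root
        self.volumes = {}  # volume name -> options passed at creation

    def _describe(self, name):
        return {"Name": name, "Mountpoint": f"{self.root}/{name}"}

    def handle(self, endpoint: str, body: str) -> str:
        req = json.loads(body) if body else {}
        name = req.get("Name", "")
        missing = {"Err": f"volume {name} not found"}

        if endpoint == "/Plugin.Activate":
            resp = {"Implements": ["VolumeDriver"]}
        elif endpoint == "/VolumeDriver.Create":
            self.volumes[name] = req.get("Opts") or {}
            resp = {"Err": ""}
        elif endpoint == "/VolumeDriver.Remove":
            self.volumes.pop(name, None)
            resp = {"Err": ""}
        elif endpoint in ("/VolumeDriver.Mount", "/VolumeDriver.Path"):
            if name in self.volumes:
                resp = {"Mountpoint": self._describe(name)["Mountpoint"], "Err": ""}
            else:
                resp = missing
        elif endpoint == "/VolumeDriver.Unmount":
            resp = {"Err": ""}
        elif endpoint == "/VolumeDriver.Get":
            if name in self.volumes:
                resp = {"Volume": self._describe(name), "Err": ""}
            else:
                resp = missing
        elif endpoint == "/VolumeDriver.List":
            resp = {"Volumes": [self._describe(n) for n in self.volumes], "Err": ""}
        elif endpoint == "/VolumeDriver.Capabilities":
            resp = {"Capabilities": {"Scope": "local"}}
        else:
            resp = {"Err": f"unsupported endpoint {endpoint}"}
        return json.dumps(resp)
```

A real driver would serve these handlers over the socket or address advertised in its sock, spec, or JSON file, and would perform the actual mount in the Mount handler.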

Docker volume drivers can be used to manage data volumes in single-node container environments or on the Swarm platform. Currently, Docker volume drivers are less used because Kubernetes has become increasingly popular. For more information about Docker volume drivers, see https://docs.docker.com/engine/extend/plugins_volume/

Kubernetes Storage Volumes

1. Basic Concepts

As mentioned above, data volumes can be used to persistently store container data. Below, we will discuss how to define storage for loads or pods in a Kubernetes orchestration system during runtime. Kubernetes is a container orchestration system that is designed for the management and deployment of container applications throughout the cluster. Therefore, we need to define application storage in Kubernetes based on clusters. Kubernetes storage volumes define the relationship between applications and storage in a Kubernetes environment. The following sections explain related concepts.

(1) Data Volumes

A data volume defines the details of external storage and is embedded in a pod. In essence, a data volume records information about external storage for the Kubernetes system. When loads need external storage, the system queries related information in the data volume and mounts external storage.

A data volume has the same lifecycle as the pod where it resides. When the pod is deleted, the data volume disappears (the data is not deleted) at the same time.
Storage details are defined in an orchestration template and are perceived during the application orchestration process.
You can define multiple volumes of the same or different storage classes for a load, also known as a pod.
Each container of a pod can reference one or more volumes. Different containers can share the same volume.

Common types of Kubernetes volumes include:

Local storage: This includes HostPath and emptyDir. These storage volumes store data on specific nodes of the cluster and do not drift with applications. The stored data is unavailable when the nodes are down.
Network storage: This includes Ceph, GlusterFS, network file system (NFS), and Internet Small Computer System Interface (iSCSI). These storage volumes store data through remote storage services. When you use these storage volumes, you need to mount storage services locally.
Secret and ConfigMap: These storage volumes store the object information of a cluster and do not belong to any nodes. Object data is mounted as volumes to nodes for use by applications.
Container Storage Interface (CSI) and FlexVolume: These are two extensions of data volumes and can be viewed as a type of abstract data volume. Each extension can be divided into different storage classes.
Persistent Volume Claim (PVC): This is a mechanism for defining data volumes. A PVC abstracts a data volume into a pod-independent object. The storage information that is defined for or associated with this object is stored in a storage volume and used when Kubernetes loads are mounted.

Examples of volume templates:

volumes:
  - name: hostpath
    hostPath:
      path: /data
      type: Directory
---
volumes:
  - name: disk-ssd
    persistentVolumeClaim:
      claimName: disk-ssd-web-0
  - name: default-token-krggw
    secret:
      defaultMode: 420
      secretName: default-token-krggw
---
volumes:
  - name: "oss1"
    flexVolume:
      driver: "alicloud/oss"
      options:
        bucket: "docker"
        url: "oss-cn-hangzhou.aliyuncs.com"

(2) PVC and PV

Kubernetes storage volumes are used at the cluster level instead of the node level.
Kubernetes storage volumes include PVC, Persistent Volume (PV), and StorageClass (SC) objects. These objects are independent of application loads, also known as pods, and associated by using an orchestration template.
Each Kubernetes storage volume has its own lifecycle, which is independent of the pod lifecycle.

PVCs are a type of abstract storage volume in Kubernetes and represent the data volume of a specific storage class. PVCs are designed to separate storage from application orchestration. A PVC object abstracts storage details and implements storage volume orchestration. This makes storage volume objects independent of application orchestration in Kubernetes and decouples applications from storage at the orchestration layer.

PVs are a specific type of storage volume in Kubernetes. A PV object defines a specific storage class and a set of volume parameters. All information about the target storage service is stored in a PV object. Kubernetes references the PV-stored information for mounting.

The following figure shows the relationships among loads, PVC objects, and PV objects.

A PV object can be used to separate storage from application orchestration and mount data volumes. So what is the purpose of combining PVC and PV objects? Through the combined use of PVC and PV objects, Kubernetes implements secondary abstraction of storage volumes. A PV object describes a specific storage class by defining storage details. Users do not want to study underlying details when they use storage services at the application layer. Therefore, it is not a user-friendly practice to define specific storage services at the application orchestration layer. To fix this problem, Kubernetes implements secondary abstraction of storage services. Kubernetes exposes only the parameters that users care about and uses PVC objects to abstract underlying PV objects. Therefore, PVC and PV have different focuses. A PVC object focuses on users' storage needs and provides a unified way to define storage. A PV object focuses on storage details, allowing users to define specific storage classes and storage mount parameters.

Specifically, the application layer declares a storage need (PVC), and Kubernetes selects the PV object that best fits this PVC object and binds them together. PVCs are a type of storage object used by applications and belong to the application domain. That means the PVC object resides in the same namespace as the application. PVs are a type of storage object that belongs to the storage domain instead of a namespace.

PVC and PV objects have the following attributes:

A PVC object is always paired with a PV object. A PVC object must be bound to a PV object before it can be consumed by an application or a pod.
One PVC object is bound to only one PV object. One PV object cannot be bound to multiple PVC objects, and one PVC object cannot be bound to multiple PV objects.
PVCs are storage objects at the application layer and belong to a specific namespace.
PVs are storage objects at the storage layer. They belong to a cluster instead of a namespace. PV objects are managed by the storage O&M personnel.
Pods consume PVC objects and PVC objects consume PV objects. A PV object defines a specific storage medium.

(3) PVC Definition

The PVC definition template is as follows:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: disk-ssd-web-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: alicloud-disk-available
  volumeMode: Filesystem

The PVC-defined storage interfaces are related to the storage access mode, resource capacity, and volume mode. The main parameters are described as follows:

"accessModes" defines the mode of access to storage volumes. The options include ReadWriteOnce, ReadWriteMany, and ReadOnlyMany.

"ReadWriteOnce" specifies that a storage volume can be mounted in read/write mode by a single node at a time. Pods that run on that node can share the volume.
"ReadWriteMany" specifies that a storage volume can be mounted in read/write mode by multiple nodes simultaneously.
"ReadOnlyMany" specifies that a storage volume can be mounted in read-only mode by multiple nodes simultaneously.

Note: The preceding access modes are only declared at the orchestration layer. Whether stored files are readable and writable is determined by specific storage plug-ins.

"storage" defines the storage capacity that the specified PVC object is expected to provide. The defined data size is only declared at the orchestration layer. The actual storage capacity is determined by the type of the underlying storage service.

"volumeMode" defines the mode of mounting storage volumes. The options include Filesystem and Block.

"Filesystem" specifies that data volumes are mounted as file systems for use by applications.

"Block" specifies that data volumes are mounted as block devices for use by applications.

(4) PV Definition

The following example illustrates how to orchestrate the PV objects of data volumes in cloud disks.

apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    failure-domain.beta.kubernetes.io/region: cn-shenzhen
    failure-domain.beta.kubernetes.io/zone: cn-shenzhen-e
  name: d-wz9g2j5qbo37r2lamkg4
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 30Gi
  flexVolume:
    driver: alicloud/disk
    fsType: ext4
    options:
      VolumeId: d-wz9g2j5qbo37r2lamkg4
  persistentVolumeReclaimPolicy: Delete
  storageClassName: alicloud-disk-available
  volumeMode: Filesystem

"accessModes" defines the mode of accessing storage volumes. The options include ReadWriteOnce, ReadWriteMany, and ReadOnlyMany. These options are the same as for PVC.
"capacity" defines the storage volume capacity.
"persistentVolumeReclaimPolicy" defines the reclaim policy, that is, how to process a PV object when the bound PVC object is deleted. The options include Delete and Retain. This parameter will be described in the "Dynamic Data Volumes" section.
"storageClassName" defines the storage class name used by a storage volume. This parameter will be described in the "Dynamic Data Volumes" section.
"volumeMode" has the same meaning as the "volumeMode" parameter for PVC.
"flexVolume" defines a specific abstract storage class. The sub-configuration items define a specific storage class and a set of storage parameters.

(5) PVC-PV Binding

Only PVC-bound PV objects can be consumed by pods. The PVC-PV binding process is the process of PV consumption. Only a PV object that meets the following requirements can be bound to a PVC object:

VolumeMode: The PV object to be consumed must be in the same volume mode as the PVC object.
AccessMode: The PV object to be consumed must support the access mode requested by the PVC object.
StorageClassName: If this parameter is defined for a PVC object, only a PV object that has the corresponding parameters defined can be bound to this PVC object.
LabelSelector: The appropriate PV object is selected from a PV list through label matching.
storage: The PV object to be consumed must have a storage capacity not less than that of the PVC object.

Only a PV object that meets the preceding requirements can be bound to the PVC object.

If multiple PV objects meet requirements, the most appropriate PV object is selected for binding. Generally, the PV object with the minimum capacity is selected. If multiple PV objects have the same minimum capacity, one of them is randomly selected.

If no PV objects meet the preceding requirements, the PVC object enters the pending state until a conforming PV object appears.
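The matching rules above can be sketched in Python as a filter followed by a smallest-capacity pick. This is a simplified model and not the Kubernetes controller: capacities are plain integers in GiB, label selectors support only exact key-value matches, a tie is resolved by list order rather than randomly, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PV:
    name: str
    capacity_gi: int
    access_modes: List[str]
    volume_mode: str = "Filesystem"
    storage_class: str = ""
    labels: Dict[str, str] = field(default_factory=dict)
    claim: Optional[str] = None  # name of the bound PVC, if any


@dataclass
class PVC:
    name: str
    request_gi: int
    access_modes: List[str]
    volume_mode: str = "Filesystem"
    storage_class: str = ""
    selector: Dict[str, str] = field(default_factory=dict)


def matches(pv: PV, pvc: PVC) -> bool:
    """Apply the five binding requirements to one unbound PV."""
    return (
        pv.claim is None  # one PV is bound to at most one PVC
        and pv.volume_mode == pvc.volume_mode
        and set(pvc.access_modes) <= set(pv.access_modes)
        and pv.storage_class == pvc.storage_class
        and all(pv.labels.get(k) == v for k, v in pvc.selector.items())
        and pv.capacity_gi >= pvc.request_gi
    )


def bind(pvc: PVC, pvs: List[PV]) -> Optional[PV]:
    """Bind the PVC to the smallest matching PV, or return None (pending)."""
    candidates = [pv for pv in pvs if matches(pv, pvc)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda pv: pv.capacity_gi)
    chosen.claim = pvc.name
    return chosen
```

A return value of `None` corresponds to the PVC object staying in the pending state until a conforming PV object appears.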

2. Static and Dynamic Storage Volumes

As we have learned earlier, PVCs are secondary storage abstractions for application services. A PVC object provides simple storage definition interfaces. PVs are storage abstractions with complex details. PV objects are generally defined and maintained by the cluster management personnel.

Storage volumes are divided into dynamic storage volumes and static storage volumes based on the PV creation method.

Static storage volumes are PV objects created by administrators.
Dynamic storage volumes are PV objects created by the Provisioner plug-in.

(1) Static Storage Volumes

A cluster administrator analyzes the storage needs of the cluster and pre-allocates storage media. The administrator also creates PV objects to be consumed by PVC objects. If PVC needs are defined in loads, Kubernetes binds PVC and PV objects according to relevant rules. This allows applications to access storage services.

(2) Dynamic Storage Volumes

A cluster administrator configures a backend storage pool and creates a storage class template. When a PVC object needs to consume a PV object, the Provisioner plug-in dynamically creates a PV object based on the PVC needs and the details of the storage class.

Dynamic and static storage volumes are compared as follows:

Both dynamic and static storage volumes are consumed through the same chain (pod, then PVC object, then PV object) and are defined by the same object templates.
Dynamic storage volumes are PV objects automatically created by plug-ins, whereas static storage volumes are PV objects manually created by cluster administrators.

Dynamic storage volumes provide the following advantages:

Dynamic volumes allow Kubernetes to implement automatic PV lifecycle management. PV objects are created and deleted by the Provisioner plug-in.
PV objects can be created automatically, which simplifies configuration and reduces the workload of system administrators.
Dynamic volumes maintain consistency between the PVC-required storage capacity and the PV capacity that is configured by the Provisioner plug-in. This optimizes storage capacity planning.

(3) Implementation Process for Dynamic Volumes

When you declare a PVC object, you can add the StorageClassName field to this PVC object. This allows the Provisioner plug-in to create a suitable PV object based on the definition of StorageClassName when no PV object in the cluster fits the declared PVC object. This process can be viewed as the creation of a dynamic data volume by the Provisioner plug-in. The created PV object is associated with the PVC object based on StorageClassName.

A storage class can be viewed as the template used to create a PV storage volume. When a PVC object triggers the automatic PV creation process, a PV object is created by using the content of a storage class. The content includes the name of the target Provisioner plug-in, a set of parameters used for PV creation, and the reclaim mode.

A storage class template is defined as follows:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-topology
parameters:
  type: cloud_ssd
provisioner: diskplugin.csi.alibabacloud.com
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

"provisioner" specifies the name of a registration plug-in, which is used to create a PV object. A storage class defines only one Provisioner plug-in.
"parameters" specifies a set of parameters used to create a data volume. In this example, an SSD-type cloud disk is created.
"reclaimPolicy" specifies the value of the persistentVolumeReclaimPolicy field used to create a PV object. The options include Delete and Retain. "Delete" specifies that a dynamically created PV object is automatically released when the bound PVC object is released. "Retain" indicates that a PV object is dynamically created, but must be released by the administrator.
"allowVolumeExpansion" specifies whether PV objects created from the current storage class can be expanded dynamically. The default value is "false". This parameter only permits or forbids expansion at the orchestration layer. Whether expansion actually works is determined by the underlying storage plug-in.
"volumeBindingMode" specifies the time when PV objects are dynamically created. The options include Immediate (immediate creation) and WaitForFirstConsumer (delayed creation).

When you create a PVC declaration, Kubernetes finds a suitable PV object in the cluster to be bound to the created PVC object. If no suitable PV object exists, the following process is triggered:

The Volume Provisioner watches for this PVC object. If StorageClassName is defined for the PVC object and the storage class names this Provisioner plug-in as its provisioner, the Provisioner plug-in triggers the PV creation process.
The Provisioner plug-in creates a PV object based on the PVC-defined parameters, such as Size, VolumeMode, and AccessModes, as well as the storage class-defined parameters, such as ReclaimPolicy and Parameters.
The Provisioner plug-in creates a data volume in the storage medium by calling the API or by other means. After a data volume is created, the Provisioner plug-in creates a PV object.
The created PV object is bound to the PVC object so that pods can be started.
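The four steps can be sketched as a single reconcile function in Python. This is a toy model under stated assumptions: the backend storage API is replaced by a dictionary, volume IDs are generated locally, and the class and field names are hypothetical rather than taken from any real provisioner.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StorageClass:
    name: str
    provisioner: str
    parameters: Dict[str, str] = field(default_factory=dict)
    reclaim_policy: str = "Delete"


@dataclass
class Claim:
    name: str
    request_gi: int
    storage_class_name: str
    access_mode: str = "ReadWriteOnce"
    volume_mode: str = "Filesystem"
    bound_pv: Optional[str] = None


class Provisioner:
    def __init__(self, name: str):
        self.name = name
        self.backend = {}  # volume id -> size; stands in for the storage API
        self.pvs = {}      # PV name -> PV description

    def reconcile(self, claim: Claim, classes: Dict[str, StorageClass]):
        """Handle one pending claim; return the PV name or None if skipped."""
        sc = classes.get(claim.storage_class_name)
        # Only act on unbound claims whose storage class names this plug-in.
        if claim.bound_pv or sc is None or sc.provisioner != self.name:
            return None
        # Create the data volume in the storage backend first.
        volume_id = f"vol-{len(self.backend) + 1}"
        self.backend[volume_id] = claim.request_gi
        # Build the PV from claim fields and storage class fields.
        self.pvs[volume_id] = {
            "capacity_gi": claim.request_gi,
            "access_mode": claim.access_mode,
            "volume_mode": claim.volume_mode,
            "storage_class_name": sc.name,
            "reclaim_policy": sc.reclaim_policy,
            "parameters": dict(sc.parameters),
            "claim": claim.name,
        }
        # Bind the new PV to the claim so that pods can start.
        claim.bound_pv = volume_id
        return volume_id
```

Because the PV capacity is copied from the claim, the created volume always matches the requested size, which is the capacity consistency advantage mentioned earlier.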

(4) Delayed Binding of Dynamic Data Volumes

Certain types of storage, such as Alibaba Cloud disks, impose limitations on the mount attribute. For example, data volumes can only be mounted to nodes in the same zone as these volumes. This type of storage volume produces the following problems:

A data volume is created in Zone A, but Zone A has no available node resources. As a result, the created volume cannot be mounted to a pod upon startup.
When administrators plan PVC and PV objects, they cannot determine in which zones they can create multiple PV objects for backup.

The storage class template provides the volumeBindingMode field to fix the preceding problems. When this field is set to WaitForFirstConsumer, the Provisioner plug-in does not create a data volume as soon as it sees a pending PVC object. It waits until a pod consumes the PVC object.

The detailed process is as follows:

When the Provisioner plug-in sees a PVC object in the pending state, it does not create a data volume immediately. It waits until a pod consumes the PVC object.
If a pod consumes the PVC object and the scheduler determines that the PVC object enables delayed binding, then the PV scheduling process continues. The scheduler patches the scheduling result to the metadata of the PVC object. Storage scheduling will be described in a later article.
When the Provisioner plug-in determines that scheduling information has been written to the PVC object, it retrieves location information such as the zone and node based on the scheduling information to create a data volume. Then, the Provisioner plug-in triggers the creation process.

With delayed binding, the application workload is scheduled first, which ensures that sufficient resources are available to the pod before the dynamic volume is created. This also ensures that data volumes are created in zones with available resources and improves the accuracy of storage planning.

We recommend that you use the delayed binding feature when you create dynamic volumes in a multi-zone cluster. The preceding configuration process is supported by Alibaba Cloud Container Service for Kubernetes (ACK) clusters.
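
A minimal sketch of a storage class with delayed binding enabled might look like the following. The volumeBindingMode field and its WaitForFirstConsumer value are the ones described above. The name example-disk-delayed is hypothetical, and the provisioner and parameters values are assumptions modeled on the Alibaba Cloud disk CSI plug-in.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-disk-delayed                    # hypothetical name
provisioner: diskplugin.csi.alibabacloud.com    # assumed provisioner name
parameters:
  type: cloud_ssd                               # assumed, provisioner-specific parameter
reclaimPolicy: Delete
# Delay volume creation and binding until a pod that uses the PVC is scheduled.
volumeBindingMode: WaitForFirstConsumer
```

A PVC object that references this storage class is expected to stay in the Pending state until a pod consumes it. If the field is omitted, the default mode is Immediate, and the volume is created as soon as the PVC object is created.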

3. Example

The following example illustrates how pods consume PVC and PV objects:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nas-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  selector:
    matchLabels:
      alicloud-pvname: nas-csi-pv
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nas-csi-pv
  labels:
    alicloud-pvname: nas-csi-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  flexVolume:
    driver: "alicloud/nas"
    options:
      server: "***-42ad.cn-shenzhen.extreme.nas.aliyuncs.com"
      path: "/share/nas"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nas
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx1
        image: nginx:1.8
      - name: nginx2
        image: nginx:1.7.9
        volumeMounts:
          - name: nas-pvc
            mountPath: "/data"
      volumes:
        - name: nas-pvc
          persistentVolumeClaim:
            claimName: nas-pvc

Template explanation:

The preceding application is an NGINX service that is orchestrated in Deployment mode. Each pod includes two containers: nginx1 and nginx2.
The template uses the volumes field to declare the data volume for the application. The data volume references the PVC object nas-pvc.
Within the application, the data volume nas-pvc is mounted to the /data directory of the nginx2 container. No data volume is mounted to the nginx1 container.
The PVC object nas-pvc requests a storage volume of at least 50 GiB with the ReadWriteOnce access mode. Its label selector restricts binding to PV objects that carry the label alicloud-pvname: nas-csi-pv.
The PV object nas-csi-pv is defined as a 50 GiB storage volume with the ReadWriteOnce access mode, the Retain reclaim policy, and the FlexVolume type. This PV object carries the label alicloud-pvname: nas-csi-pv.

According to the PVC-PV binding logic, this PV object meets the PVC consumption requirements. Therefore, the PVC object is bound to the PV object and mounted to a pod.

Summary

This article gives a detailed explanation of container storage, including single-node Docker data volumes and cluster-level Kubernetes data volumes. Kubernetes data volumes are designed for cluster-level storage orchestration and can be mounted to nodes. Kubernetes provides a sophisticated architecture to implement complex storage volume orchestration capabilities. The next article will explain the Kubernetes storage architecture and its implementation process.
