PouchContainer is a lightweight, open-source, enterprise-grade rich container engine developed by Alibaba. It is designed for building a container ecosystem and features high performance, strong isolation, and low overhead.
A container network is the basis for inter-container communication and an important concept in container operation. This article describes how PouchContainer establishes a network and connects containers to it, and walks through the source code to explain the network connection process in full.
PouchContainer adopts the Container Network Model (CNM) introduced by Docker to implement inter-container communication based on libnetwork. The CNM has three main concepts: sandbox, endpoint, and network.
A sandbox represents the network stack configuration of a container, including the container's network interfaces (NICs), route table, and DNS settings. A sandbox can be implemented as a Linux network namespace, a FreeBSD jail, or a similar mechanism. A sandbox may contain multiple endpoints.
An endpoint connects a sandbox to a network. An endpoint can be established by a veth pair, an Open vSwitch internal port, or other methods. An endpoint belongs to only one network and only one sandbox.
A network is a set of endpoints that communicate with each other. A network can be established by a Linux bridge, a VLAN, or other methods. A network may contain multiple endpoints.
As shown in the following figure, Container A and Container B belong to the backend network and form a network by using their respective endpoints (highlighted in purple); Container B and Container C belong to the frontend network and form a network by using their respective endpoints (highlighted in blue). Container A can communicate with Container B, and Container B can communicate with Container C.
Though Container B has two endpoints corresponding to the backend network and frontend network, the two endpoints do not communicate with each other. The backend network and frontend network are isolated from each other, though they are connected to the same container. Container A and Container C cannot communicate with each other.
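The relationship between sandboxes, endpoints, and networks can be sketched in a few lines of Go. This is only an illustrative model of the figure's topology, not libnetwork's actual types: the names Network, Endpoint, Sandbox, Join, CanCommunicate, and buildExample are all hypothetical.

```go
package main

import "fmt"

// Network is a named group of endpoints that can reach each other.
type Network struct{ Name string }

// Endpoint joins exactly one sandbox to exactly one network.
type Endpoint struct{ Network *Network }

// Sandbox stands in for a container's network stack and may hold several endpoints.
type Sandbox struct {
	Container string
	Endpoints []*Endpoint
}

// Join creates an endpoint on the network and attaches it to the sandbox.
func (s *Sandbox) Join(n *Network) {
	s.Endpoints = append(s.Endpoints, &Endpoint{Network: n})
}

// CanCommunicate reports whether two sandboxes have endpoints on a common network.
func CanCommunicate(a, b *Sandbox) bool {
	for _, ea := range a.Endpoints {
		for _, eb := range b.Endpoints {
			if ea.Network == eb.Network {
				return true
			}
		}
	}
	return false
}

// buildExample wires up Containers A, B, and C as described above.
func buildExample() (a, b, c *Sandbox) {
	backend := &Network{Name: "backend"}
	frontend := &Network{Name: "frontend"}
	a = &Sandbox{Container: "A"}
	b = &Sandbox{Container: "B"}
	c = &Sandbox{Container: "C"}
	a.Join(backend)
	b.Join(backend)
	b.Join(frontend)
	c.Join(frontend)
	return a, b, c
}

func main() {
	a, b, c := buildExample()
	fmt.Println("A <-> B:", CanCommunicate(a, b))
	fmt.Println("B <-> C:", CanCommunicate(b, c))
	fmt.Println("A <-> C:", CanCommunicate(a, c))
}
```

Because reachability is decided per network, Container B's two endpoints do not bridge the backend and frontend networks in this model.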
Bridge mode is the default network mode of PouchContainer. If you do not specify a network mode with the --net parameter when creating a container, the container is created in bridge mode.
When pouchd starts, it automatically creates a virtual bridge on the host. Each Pouch container started on the host connects to this virtual bridge, and the containers communicate over the Layer-2 network that the bridge forms.
If you select host mode when starting a container, the container does not get an independent network namespace and instead shares the network namespace of the host. The container has no NIC or IP address configuration of its own, so it uses the IP address and ports of the host. Its file system and PID namespace remain isolated from those of the host.
A container created in container mode shares a network namespace with an existing container and uses its veth device pair.
A container created in none mode has an independent network namespace but no network configuration, so it cannot initially communicate with other containers. After the container is created, you can add a NIC and configure an IP address to let it communicate with the containers in the same network.
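The four modes differ mainly in whether the container gets its own network namespace and whether that namespace is configured at creation time. The following Go sketch summarizes this. It is not PouchContainer code: ModeInfo and ParseNetMode are hypothetical names, and the "container:<name>" spelling of container mode is an assumption borrowed from Docker's convention. Any other value is rejected here for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// ModeInfo summarizes how a mode treats the container's network namespace.
type ModeInfo struct {
	Mode         string // "bridge", "host", "container", or "none"
	Target       string // container whose namespace is shared (container mode only)
	OwnNamespace bool   // true if the container gets its own network namespace
	Configured   bool   // true if networking is usable right after creation
}

// ParseNetMode interprets a --net value. An empty value falls back to bridge mode.
func ParseNetMode(value string) (ModeInfo, error) {
	switch {
	case value == "" || value == "bridge":
		return ModeInfo{Mode: "bridge", OwnNamespace: true, Configured: true}, nil
	case value == "host":
		return ModeInfo{Mode: "host", OwnNamespace: false, Configured: true}, nil
	case value == "none":
		return ModeInfo{Mode: "none", OwnNamespace: true, Configured: false}, nil
	case strings.HasPrefix(value, "container:"):
		// Assumed syntax: the target container follows the colon.
		target := strings.TrimPrefix(value, "container:")
		if target == "" {
			return ModeInfo{}, fmt.Errorf("container mode needs a target container")
		}
		return ModeInfo{Mode: "container", Target: target, Configured: true}, nil
	default:
		return ModeInfo{}, fmt.Errorf("unsupported network mode %q in this sketch", value)
	}
}

func main() {
	for _, v := range []string{"", "host", "container:web", "none"} {
		info, err := ParseNetMode(v)
		fmt.Printf("%q -> %+v (err: %v)\n", v, info, err)
	}
}
```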
A network is a unique and identifiable endpoint group, where the endpoints can communicate with each other. From the CNM's perspective, endpoints can be simply considered as veth device pairs. The sandbox in a container may have multiple endpoints, each of which represents a connection to a network.
// daemon/mgr/container.go
// Connect is used to connect a container to a network.
func (mgr *ContainerManager) Connect(ctx context.Context, name string, networkIDOrName string, epConfig *types.EndpointSettings) error {
    c, err := mgr.container(name)
    ⋯⋯
    n, err := mgr.NetworkMgr.Get(context.Background(), networkIDOrName)
    ⋯⋯
    if c.State.Status != types.StatusRunning {
        if c.State.Status == types.StatusDead {
            ⋯⋯
        }
        if err := mgr.updateNetworkConfig(c, n.Name, epConfig); err != nil {
            return err
        }
    } else if err := mgr.connectToNetwork(ctx, c, networkIDOrName, epConfig); err != nil {
        return err
    }
    return c.Write(mgr.Store)
}
The Connect function looks up the container and the network from its input parameters. The epConfig parameter carries the options passed as flags on the CLI, such as the container alias in the network and a specified IP address range.
Connect then checks c.State.Status to determine the container status. A container in the dead state does not support the Connect operation. If the container is not running but is not dead, updateNetworkConfig() is called to add the input epConfig to the network configuration of the container. In this case, no NIC is allocated to the container, so it is not yet connected to the network. If the container is running, connectToNetwork() is called. connectToNetwork() completes the NIC configuration based on the given network and container, allocates a NIC on the host, and adds the NIC to the sandbox of the container. This connects the container to the network. The specific process is described below.
Finally, c.Write(mgr.Store) writes the configuration that the container uses for the network connection to the container's metadata, which persists it. Without this step, the established connection would be a one-time connection, and all data and configurations would be lost after pouchd restarts.
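The status-based branching in Connect can be restated as a small decision function. The sketch below is a simplification for illustration only: Status, Action, ErrDead, and DecideConnect are hypothetical names, and the real function performs the network operations rather than returning a label.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a stand-in for the container status checked by Connect.
type Status string

const (
	StatusRunning Status = "running"
	StatusStopped Status = "stopped"
	StatusDead    Status = "dead"
)

// Action names what Connect would do for a given status.
type Action string

const (
	ActionNone         Action = "none"
	ActionUpdateConfig Action = "update-config-only" // record epConfig, allocate no NIC
	ActionConnectNow   Action = "connect-now"        // create the endpoint and join the sandbox
)

// ErrDead is returned for containers that cannot be connected.
var ErrDead = errors.New("container is dead, cannot connect to network")

// DecideConnect mirrors the branching described above.
func DecideConnect(s Status) (Action, error) {
	if s == StatusDead {
		return ActionNone, ErrDead
	}
	if s != StatusRunning {
		return ActionUpdateConfig, nil
	}
	return ActionConnectNow, nil
}

func main() {
	for _, s := range []Status{StatusRunning, StatusStopped, StatusDead} {
		action, err := DecideConnect(s)
		fmt.Printf("%s -> %s (err: %v)\n", s, action, err)
	}
}
```

In both non-error cases the container's metadata is written afterward, so the recorded configuration can still be found after a daemon restart.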
func (mgr *ContainerManager) connectToNetwork(ctx context.Context, container *Container, networkIDOrName string, epConfig *types.EndpointSettings) (err error) {
    ⋯⋯
    endpoint := mgr.buildContainerEndpoint(container)
    endpoint.Name = network.Name
    endpoint.EndpointConfig = epConfig
    if _, err := mgr.NetworkMgr.EndpointCreate(ctx, endpoint); err != nil {
        ⋯⋯
    }
    return mgr.updateNetworkConfig(container, networkIDOrName, endpoint.EndpointConfig)
}
The endpoint combines three pieces of information, which come from the container, the network, and the flag configuration in the connect command, respectively. buildContainerEndpoint() has relatively simple logic: it obtains the container information that the endpoint requires. NetworkMgr's EndpointCreate() is then called to perform the actual construction.
// EndpointCreate is used to create network endpoint.
func (nm *NetworkManager) EndpointCreate(ctx context.Context, endpoint *types.Endpoint) (string, error) {
    ⋯⋯
    // create endpoint
    epOptions, err := endpointOptions(n, endpoint)
    if err != nil {
        return "", err
    }
    endpointName := containerID[:8]
    ep, err := n.CreateEndpoint(endpointName, epOptions...)
    if err != nil {
        return "", err
    }
    ⋯⋯
    // create sandbox
    sb := nm.getNetworkSandbox(containerID)
    if sb == nil {
        sandboxOptions, err := buildSandboxOptions(nm.config, endpoint)
        ⋯⋯
        sb, err = nm.controller.NewSandbox(containerID, sandboxOptions...)
        ⋯⋯
    }
    // endpoint joins into sandbox
    joinOptions, err := joinOptions(endpoint)
    ⋯⋯
    if err := ep.Join(sb, joinOptions...); err != nil {
        return "", fmt.Errorf("failed to join sandbox(%v)", err)
    }
    // update endpoint settings
    epInfo := ep.Info()
    if epInfo.Gateway() != nil {
        endpointConfig.Gateway = epInfo.Gateway().String()
    }
    if epInfo.GatewayIPv6().To16() != nil {
        endpointConfig.IPV6Gateway = epInfo.GatewayIPv6().String()
    }
    endpoint.ID = ep.ID()
    endpointConfig.EndpointID = ep.ID()
    endpointConfig.NetworkID = n.ID()
    iface := epInfo.Iface()
    if iface != nil {
        if iface.Address() != nil {
            mask, _ := iface.Address().Mask.Size()
            endpointConfig.IPPrefixLen = int64(mask)
            endpointConfig.IPAddress = iface.Address().IP.String()
        }
        if iface.MacAddress() != nil {
            endpointConfig.MacAddress = iface.MacAddress().String()
        }
    }
    return endpointName, nil
}
The endpoint is created by calling libnetwork. endpointOptions() is called first to build the EndpointOption parameters required by the interface. EndpointOption is a setter function type, which passes different options to the network and endpoint interfaces. Then, libnetwork's CreateEndpoint() is called to perform the actual construction.
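A "setter function type" is the pattern often called functional options in Go: each option is a function that mutates the object under construction, and the constructor applies them in order. The sketch below illustrates the pattern in isolation. EndpointSpec, Option, WithIP, WithAlias, and NewEndpointSpec are hypothetical names and do not reproduce libnetwork's own option set.

```go
package main

import "fmt"

// EndpointSpec is a hypothetical stand-in for the endpoint being configured.
type EndpointSpec struct {
	Name    string
	IP      string
	Aliases []string
}

// Option is a setter function: it receives the spec and changes one aspect of it.
type Option func(*EndpointSpec)

// WithIP returns an option that requests a specific IP address.
func WithIP(ip string) Option {
	return func(e *EndpointSpec) { e.IP = ip }
}

// WithAlias returns an option that adds a network alias.
func WithAlias(alias string) Option {
	return func(e *EndpointSpec) { e.Aliases = append(e.Aliases, alias) }
}

// NewEndpointSpec builds a spec and applies every option in order.
func NewEndpointSpec(name string, opts ...Option) *EndpointSpec {
	e := &EndpointSpec{Name: name}
	for _, opt := range opts {
		opt(e)
	}
	return e
}

func main() {
	// Options are collected first, then passed with the variadic "..." syntax.
	opts := []Option{WithIP("172.17.0.5"), WithAlias("db")}
	e := NewEndpointSpec("0a1b2c3d", opts...)
	fmt.Printf("%+v\n", *e)
}
```

The appeal of this design is that the constructor's signature stays stable while the set of supported options grows.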
A sandbox represents the unique network namespace of the container and is created through libnetwork. Existing sandboxes are first traversed to find the one that corresponds to the container, and that sandbox is returned if it exists. In host mode, the container uses the network namespace of the host, and the returned sandbox is empty. When no sandbox is returned, a new sandbox is created for subsequent operations.
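The lookup-then-create step can be sketched as follows. This is a minimal model that assumes sandboxes are keyed by container ID, as the getNetworkSandbox(containerID) call suggests. Sandbox, Controller, find, and GetOrCreate are hypothetical names, and a real sandbox would also carry namespace options.

```go
package main

import "fmt"

// Sandbox is a hypothetical stand-in for a container's network sandbox.
type Sandbox struct{ ContainerID string }

// Controller keeps track of every sandbox created so far.
type Controller struct{ sandboxes []*Sandbox }

// find walks the existing sandboxes and returns the one owned by containerID, or nil.
func (c *Controller) find(containerID string) *Sandbox {
	for _, sb := range c.sandboxes {
		if sb.ContainerID == containerID {
			return sb
		}
	}
	return nil
}

// GetOrCreate reuses an existing sandbox and creates one only when the lookup fails.
// The boolean reports whether a new sandbox was created.
func (c *Controller) GetOrCreate(containerID string) (*Sandbox, bool) {
	if existing := c.find(containerID); existing != nil {
		return existing, false
	}
	sb := &Sandbox{ContainerID: containerID}
	c.sandboxes = append(c.sandboxes, sb)
	return sb, true
}

func main() {
	c := &Controller{}
	_, created := c.GetOrCreate("0a1b2c3d4e5f")
	fmt.Println("first call created a sandbox:", created)
	_, created = c.GetOrCreate("0a1b2c3d4e5f")
	fmt.Println("second call created a sandbox:", created)
}
```

Reusing the sandbox is what lets a container that is connected to several networks hold all of its endpoints in one network namespace.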
Adding an endpoint to a sandbox is equivalent to allocating a NIC to a container. The NIC is the core for establishing a connection. The container connects to the network by using the virtual NIC to communicate with other containers.
Finally, the changes are synchronized to the endpoint configuration.
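The IPAddress and IPPrefixLen fields are derived from the interface address with Go's standard net package, where Mask.Size() returns the number of leading one bits in the mask. The sketch below reproduces just that derivation from a CIDR string. EndpointSettings here is a trimmed, hypothetical struct, and SettingsFromCIDR and the sample address are illustrative.

```go
package main

import (
	"fmt"
	"net"
)

// EndpointSettings is a trimmed, hypothetical version of the endpoint configuration.
type EndpointSettings struct {
	IPAddress   string
	IPPrefixLen int64
}

// SettingsFromCIDR derives the address and prefix length the same way the
// code above does from iface.Address(), which is a *net.IPNet.
func SettingsFromCIDR(cidr string) (EndpointSettings, error) {
	ip, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return EndpointSettings{}, err
	}
	// Keep the host address (not the network address) together with the mask.
	addr := &net.IPNet{IP: ip, Mask: ipnet.Mask}
	ones, _ := addr.Mask.Size()
	return EndpointSettings{
		IPAddress:   addr.IP.String(),
		IPPrefixLen: int64(ones),
	}, nil
}

func main() {
	s, err := SettingsFromCIDR("172.17.0.2/16")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("address=%s prefix=%d\n", s.IPAddress, s.IPPrefixLen)
}
```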
A PouchContainer network is established step by step. First, a sandbox is created to identify a container with a unique network namespace during communication. An endpoint is then created to enable NIC-based communication. NICs are added to sandboxes to enable inter-container communication. This completes the process of a network connection setup.