By Hongliang Tian, Senior technical expert of Ant Group and head of Occlum open source.
Cloud computing, big data, and artificial intelligence have brought us into an era of data explosion. How can we use the value generated by massive data while ensuring data security and user privacy? This is a common concern for users, enterprises, and regulatory authorities.
Confidential computing has emerged in recent years to address this problem. It uses trusted execution environment (TEE) technology to keep data encrypted and strongly isolated at all times, thus ensuring the security and privacy of users' data. Confidential computing can solve the trust problem in many scenarios, including data fusion and joint analysis between multiple untrusted organizations, confidentiality protection of smart contracts on the blockchain, defense of public cloud platforms against external or internal attacks, and security protection of highly sensitive information (such as cryptographic materials and medical files).
Confidential computing relies on TEE technology such as Intel SGX, the most mature cloud TEE technology. However, TEEs bring additional functional limitations and compatibility problems, which are a major obstacle for developers of confidential computing applications and make application development difficult.
This article analyzes the challenges and pain points that SGX application developers currently face and explains how Occlum, an open-source TEE OS developed in-house by Ant Group, lowers the threshold for SGX application development to help everyone take advantage of confidential computing.
Since SGX applications are based on this partitioned architecture, application developers usually need to use some SGX SDKs, such as Intel SGX SDK, Open Enclave SDK, Google Asylo, or Apache Rust SGX SDK. However, no matter which SDK is used, developers will encounter the following difficulties in their development:
The difficulties above make developing applications for SGX quite tricky, which restricts the popularity and acceptance of both SGX and confidential computing.
Occlum is an open-source TEE OS of Ant Group, which can lower the development threshold of SGX applications. We need to learn three commands in Occlum: new, build, and run. This section uses Occlum to run a Hello World program in SGX.
Here is a very simple Hello World program:
$ cat hello_world.c
#include <stdio.h>

int main() {
    printf("Hello World!\n");
    return 0;
}
First, we compile the program with the GCC toolchain (occlum-gcc) provided by Occlum and verify that it works properly on Linux:
$ occlum-gcc hello_world.c -o hello_world
$ ./hello_world
Hello World!
Then, we create an Occlum instance directory for this program (using the occlum new command):
$ occlum new occlum_hello
$ cd occlum_hello
The command creates a directory named occlum_hello and prepares some necessary files (such as the configuration file Occlum.json) and subdirectories (such as image/) in the directory.
Next, we build an Occlum enclave file and a trusted image (using the occlum build command) based on the newly compiled hello_world:
$ cp ../hello_world image/bin
$ occlum build
Finally, we run hello_world in SGX (using the occlum run command):
$ occlum run /bin/hello_world
Hello World!
More complex programs can also be ported into SGX through Occlum using a process similar to the one listed above. Users can freely choose their programming language, such as Java, Python, or Go, and do not need to modify the application code (or only need to modify a small amount of it) or understand the partitioned programming model of SGX. Occlum allows application developers to focus their efforts on writing applications rather than porting them to SGX.
After understanding Occlum's basic usage, readers will naturally be curious about the technical principle of Occlum. Why is Occlum's user interface designed like this? What is the technical architecture behind the simple interfaces? This section tries to answer these questions.
One of Occlum's design concepts is Enclave-as-a-Container. In the cloud-native era, containers are of paramount importance and are everywhere. The most common container implementation is based on Linux cgroups and namespaces (such as Docker), but there are also virtualization-based implementations (such as Kata). We have observed that a TEE or enclave can also be used as a container implementation method. Therefore, we purposefully designed Occlum's user interface to be close to Docker and the OCI standards to provide a consistent user experience. In addition to the aforementioned new, build, and run commands, Occlum provides commands such as start, exec, stop, and kill, which have a similar meaning to the commands with the same names in Docker.
Complex implementation details are behind a simple user interface. In order to describe the technical principles of Occlum at a higher level, we will discuss them from the perspectives of a trusted development environment and untrusted deployment environment.
In a trusted development environment (the upper part in the figure above), users run occlum build to package and create trusted images. A Merkle tree is used to ensure that any tampering with a trusted image after it is uploaded to an untrusted deployment environment can be detected. The content of the trusted image is the rootfs loaded when Occlum starts. Its organizational structure is similar to that of a usual Unix operating system, and its content is determined by the user.
In an untrusted deployment environment (the lower part in the figure above), users run occlum run to start a new Occlum enclave. The Occlum TEE OS in the enclave loads and executes the corresponding applications from the trusted image. Occlum provides Linux-compatible system calls to applications, so applications can run in an enclave without modification (or with only a few modifications). The memory state of applications is protected by the enclave, and the file I/O of applications is automatically encrypted and decrypted by Occlum. This way, the confidentiality and integrity of application data are protected both in memory and in external storage.
In addition to providing container-like, user-friendly interfaces, Occlum has three main features:
The links below provide more information:
A Brief Discussion about Confidential Computing: Inclavare Containers
Dragonfly Releases the Nydus Container Image Acceleration Service