Write Code with a Mobile Phone: An Exploration of Online Programming Capabilities Based on Serverless

This article will take Alibaba Cloud Function Compute (FC) as an example to implement a Python-oriented online programming feature through the Serverless architecture.

As computing has spread, more people have taken up programming, and many online programming platforms have emerged as a result. Taking Python platforms as an example, they fall into two categories:

OJ (online judge) platforms evaluate submissions online and execute code in a blocking manner. The user submits the code and the standard input together, and the result is returned in one piece when the program finishes.
Learning and tool platforms, such as Anycodes, execute code in a non-blocking manner. Users can enter input and see the output of the running code in real time.

Whatever the type, the core module of an online programming platform (the code executor or the judge) is worth studying. On the one hand, such websites need strict security mechanisms: they must determine whether a submitted program contains malicious code or infinite loops or could damage the system, whether it must run in isolation, and whether it could read code submitted by other users.

On the other hand, these platforms consume a lot of resources, especially during programming contests, when they must scale out quickly and, if necessary, use large clusters. Such workloads are also event-triggered: each code execution is largely independent of the executions before and after it.

With the development of Serverless architecture, many people have found that the request-level isolation and extreme elasticity of Serverless architecture can solve the security risk and resource consumption problems encountered by traditional online programming platforms. The pay-as-you-go mode of Serverless architecture can help reduce costs while maintaining the performance of online programming features. Therefore, an increasing number of people have begun to learn about the development of online programming functions through Serverless architecture. This article will take Alibaba Cloud Function Compute (FC) as an example to implement a Python-oriented online programming feature through the Serverless architecture and optimize this feature to make it closer to the local code execution experience.

Development of Online Programming Function

A simple, typical online code execution module usually needs the following capabilities:

Execute code online
Support content input by users
Return results (standard output, standard errors...)

Under the Serverless architecture, the business logic of online programming is concentrated in the code execution module. The process is: receive the program information (code, standard input, and so on) sent by the client, save the code to a local file, execute it, and return the results. The following figure shows the process of the entire architecture:

The code execution part can be implemented with Popen() from Python's subprocess module. When using Popen(), two special values need to be understood:

subprocess.PIPE: A special value for the stdin, stdout, and stderr parameters of Popen, indicating that a new pipe to the child process should be created
subprocess.STDOUT: A special value for the stderr parameter of Popen, indicating that the standard error of the child process is redirected into its standard output

With these, we can feed standard input (stdin) to the program and capture its standard output (stdout) and standard error (stderr).

Sample code is listed below:

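# -*- coding: utf-8 -*-
import subprocess

fileName = "/tmp/demo.py"          # example value: path of the script to run
input_data = "serverless devs\n"   # example value: text fed to the child's stdin

child = subprocess.Popen("python %s" % (fileName),
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         shell=True)
# communicate() blocks until the child exits and returns (stdout, stderr);
# stderr is None here because it is merged into stdout
output = child.communicate(input=input_data.encode("utf-8"))
print(output)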

In addition to executing code, the function must receive the user's code and store it inside the function instance. Here we need to pay attention to the read and write permissions of directories in the instance. Normally, if no disk is mounted in FC, only the /tmp/ directory is writable, so in this project the temporary code passed by the user is written to /tmp/. Because instances can be reused across requests, a file left by one request may still exist during the next one, so we give each temporary code file a random name. For example:

# -*- coding: utf-8 -*-
import random
randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))
path = "/tmp/%s" % randomStr(5)
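
Five random letters can still collide when a reused instance handles many requests. As a hedged alternative, the sketch below lets Python's standard tempfile module choose a unique file name. The helper name save_code is hypothetical and not part of the original project; on FC one would presumably pass "/tmp" as the directory.

```python
import os
import tempfile

def save_code(code, directory=None):
    """Write user code to a uniquely named file and return its path."""
    # directory=None falls back to the system temporary directory
    fd, path = tempfile.mkstemp(suffix=".py", dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(code)
    return path

first = save_code("print('a')")
second = save_code("print('b')")
print(first != second)  # two calls never share a file name

# Remove the demo files again
os.remove(first)
os.remove(second)
```

mkstemp creates the file atomically, so two concurrent requests cannot end up writing to the same path.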

The complete implementation code is listed below:

# -*- coding: utf-8 -*-
import json
import uuid
import random
import subprocess
# Random string
randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))
# Response
class Response:
   def __init__(self, start_response, response, errorCode=None):
       self.start = start_response
       responseBody = {
           'Error': {"Code": errorCode, "Message": response},
       } if errorCode else {
           'Response': response
       }
       # uuid is added by default to facilitate later positioning.
       responseBody['ResponseId'] = str(uuid.uuid1())
       self.response = json.dumps(responseBody)

   def __iter__(self):
       status = '200'
       response_headers = [('Content-type', 'application/json; charset=UTF-8')]
       self.start(status, response_headers)
       yield self.response.encode("utf-8")

def WriteCode(code, fileName):
   try:
       with open(fileName, "w") as f:
           f.write(code)
       return True
   except Exception as e:
       print(e)
       return False

def RunCode(fileName, input_data=""):
   child = subprocess.Popen("python %s" % (fileName),
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            shell=True)
   output = child.communicate(input=input_data.encode("utf-8"))
   return output[0].decode("utf-8")

def handler(environ, start_response):
   try:
       request_body_size = int(environ.get('CONTENT_LENGTH', 0))
   except (ValueError):
       request_body_size = 0
   requestBody = json.loads(environ['wsgi.input'].read(request_body_size).decode("utf-8"))

   code = requestBody.get("code", None)
   inputData = requestBody.get("input", "")
   fileName = "/tmp/" + randomStr(5)
   responseData = RunCode(fileName, inputData) if code and WriteCode(code, fileName) else "Error"
   return Response(start_response, {"result": responseData})

After writing the core business logic, we can deploy the code to Alibaba Cloud FC. Once deployment is complete, we obtain the temporary test address of the interface and test it with Postman. Let's take Python's print statement as an example:

print('HELLO WORLD')

When we send a POST request with the code as a parameter, the response is shown below:

The system outputs the expected result: HELLO WORLD. This completes the test of standard output. Next, we test standard error by deliberately breaking the code:

Print('HELLO WORLD)

Use the same method, perform the code execution again, and obtain the result:

The Python error message is what we expected, which completes the test of standard error. Next, we test standard input. Because child.communicate() blocks until the program exits, the code and the standard input content must be sent to the server together. The test code is listed below:

tempInput = input('please input: ')
print('Output: ', tempInput)

The standard input content for the test is serverless devs.
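
To make the request format concrete, here is a minimal sketch of the JSON body that carries the code and the input, plus a helper that unpacks a response shaped like the one the Response class above builds. The helper names build_request and parse_result are hypothetical, and the sample response values are made up for illustration.

```python
import json

def build_request(code, input_data=""):
    """Serialize the two fields that the handler reads from the request body."""
    return json.dumps({"code": code, "input": input_data}).encode("utf-8")

def parse_result(response_body):
    """Return the program output, or raise if the service reported an error."""
    payload = json.loads(response_body)
    if "Error" in payload:
        raise RuntimeError(payload["Error"]["Message"])
    return payload["Response"]["result"]

code = "tempInput = input('please input: ')\nprint('Output: ', tempInput)"
print(build_request(code, "serverless devs").decode("utf-8"))

# A response with the same shape as the handler's output (values are placeholders)
sample = '{"Response": {"result": "example output"}, "ResponseId": "example-id"}'
print(parse_result(sample))
```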

When we use the same method to initiate the request, we can see that:

The result is what we expected. So far, we have completed a simple interface for online programming services. This interface is only a first version intended for learning, and it leaves plenty of room for optimization:

Handling of timeout
Code cleaning after being executed
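
Both points can be sketched with the standard library alone. The version below is an assumption about how one might do it, not the article's implementation: subprocess.run() enforces a time limit, and a finally block deletes the temporary file. The function name run_code_with_timeout and the error message text are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

def run_code_with_timeout(code, input_data="", timeout=5):
    """Run code in a child interpreter, enforce a time limit, and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(code)
        completed = subprocess.run(
            [sys.executable, path],
            input=input_data.encode("utf-8"),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, as in RunCode
            timeout=timeout,           # the child is killed if it runs longer
        )
        return completed.stdout.decode("utf-8", "ignore")
    except subprocess.TimeoutExpired:
        return "Error: execution timed out after %s seconds" % timeout
    finally:
        os.remove(path)  # cleanup happens on success, failure, and timeout

print(run_code_with_timeout("print('HELLO WORLD')"))
print(run_code_with_timeout("while True: pass", timeout=1))
```

Passing the command as a list instead of a shell string also avoids shell interpretation of the file name.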

However, this interface also exposes a problem: code execution is blocking. We cannot provide input continuously or see output in real time, and any input must be sent to the server together with the code. This is the OJ mode in common use today. For a general-purpose online programming tool, the project needs further work so that code runs in a non-blocking way and input and output can happen continuously.

Code Executor: Closer to "Local" Experience

Let's take a piece of code as an example:

import time
print("hello world")
time.sleep(10)
tempInput = input("please: ")
print("Input data: ", tempInput)

When we execute this Python code locally, the user sees the following behavior:

The system outputs hello world.
The system waits for ten seconds.
The system prompts us with please: and we can input a string at this time.
The system outputs Input data and the string we just input.

However, if this code is run on a traditional OJ or on the online programming system we have just implemented, the behavior is different:

The code is passed to the system along with what we want to input.
The system waits for ten seconds.
The system outputs hello world, please, Input data, and the content that we input.

The gap between OJ-style online programming and local tools is large, at least in terms of experience. To narrow it, we can upgrade the architecture described above and, by combining asynchronous function invocation with Python's pexpect.spawn(), implement an online programming function whose experience is closer to local programming:

The entire project includes two functions and two storage buckets:

Business Logic Function: This function handles the business logic, including creating code execution tasks (the executor is invoked asynchronously through an OSS trigger), returning output results, and passing standard input to the running task.
Executor Function: This function executes the user's code. It is triggered by OSS, downloads the code, executes it, reads the input content, and writes the output results. The code is read from the code storage bucket; the input content is read from, and the output results are written to, the business storage bucket.
Code Storage Bucket: This bucket stores code. When the user requests a code run, the business logic function receives the user's code and stores it in this bucket, which then triggers the asynchronous task.
Business Storage Bucket: This bucket stores intermediate data, namely the cached output and input. An OSS lifecycle rule can be applied so that this data expires automatically.

The code of this scheme is split into two functions: one handles the business logic, and the other provides the core code execution capability with an experience closer to local execution.

The business logic processing function can:

Receive the user's code, generate a code execution ID, and save the code to OSS, which asynchronously triggers the executor function. It then returns the generated code execution ID.
Obtain the user's input information and code execution ID and store the content in OSS.
Obtain the output result of the code, read the execution result from OSS according to the code execution ID specified by the user, and return it to the user.

The overall business logic is shown below:

The implementation code is listed below:

# -*- coding: utf-8 -*-

import os
import oss2
import json
import uuid
import random

# Basic configuration information
AccessKey = {
   "id": os.environ.get('AccessKeyId'),
   "secret": os.environ.get('AccessKeySecret')
}

OSSCodeConf = {
   'endPoint': os.environ.get('OSSConfEndPoint'),
   'bucketName': os.environ.get('OSSConfBucketCodeName'),
   'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

OSSTargetConf = {
   'endPoint': os.environ.get('OSSConfEndPoint'),
   'bucketName': os.environ.get('OSSConfBucketTargetName'),
   'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

# Obtain/upload files from/to OSS temporary address
auth = oss2.Auth(AccessKey['id'], AccessKey['secret'])
codeBucket = oss2.Bucket(auth, OSSCodeConf['endPoint'], OSSCodeConf['bucketName'])
targetBucket = oss2.Bucket(auth, OSSTargetConf['endPoint'], OSSTargetConf['bucketName'])

# Random string
randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))

# Response
class Response:
   def __init__(self, start_response, response, errorCode=None):
       self.start = start_response
       responseBody = {
           'Error': {"Code": errorCode, "Message": response},
       } if errorCode else {
           'Response': response
       }
       # uuid is added by default to facilitate later positioning
       responseBody['ResponseId'] = str(uuid.uuid1())
       self.response = json.dumps(responseBody)

   def __iter__(self):
       status = '200'
       response_headers = [('Content-type', 'application/json; charset=UTF-8')]
       self.start(status, response_headers)
       yield self.response.encode("utf-8")

def handler(environ, start_response):
   try:
       request_body_size = int(environ.get('CONTENT_LENGTH', 0))
   except (ValueError):
       request_body_size = 0
   requestBody = json.loads(environ['wsgi.input'].read(request_body_size).decode("utf-8"))

   reqType = requestBody.get("type", None)

   if reqType == "run":
       # Execute the code
       code = requestBody.get("code", None)
       runId = randomStr(10)
       codeBucket.put_object(runId, code.encode("utf-8"))
       responseData = runId
   elif reqType == "input":
       # Input content
       inputData = requestBody.get("input", None)
       runId = requestBody.get("id", None)
       targetBucket.put_object(runId + "-input", inputData.encode("utf-8"))
       responseData = 'ok'
   elif reqType == "output":
       # Obtain the result
       runId = requestBody.get("id", None)
       targetBucket.get_object_to_file(runId + "-output", '/tmp/' + runId)
       with open('/tmp/' + runId) as f:
           responseData = f.read()
   else:
       responseData = "Error"

   return Response(start_response, {"result": responseData})

The executor function is mainly triggered by a code storage bucket to perform code execution. This part mainly includes:

Obtain the code from the bucket and execute the code through pexpect.spawn()
Obtain non-blocking execution results with pexpect.spawn().read_nonblocking() and write them to OSS
Input the content through pexpect.spawn().sendline()
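
The read, wait, and send-input loop described above can be illustrated without third-party packages. The sketch below is a stand-in that uses select and os.read on a pipe instead of pexpect, assumes a POSIX system, and keeps output in memory where the executor would write to OSS. The function name run_interactive and its parameters are hypothetical.

```python
import os
import select
import subprocess
import sys
import time

def run_interactive(code, inputs, poll_timeout=0.05, max_seconds=10.0):
    """Run code in a child process, reading output in small steps and
    sending one queued input line whenever the child has gone quiet."""
    child = subprocess.Popen(
        [sys.executable, "-u", "-c", code],  # -u: unbuffered child output
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    fd = child.stdout.fileno()
    pending = list(inputs)
    output = ""
    deadline = time.monotonic() + max_seconds
    while time.monotonic() < deadline:
        # Wait briefly for output instead of blocking until the child exits
        ready, _, _ = select.select([fd], [], [], poll_timeout)
        if ready:
            data = os.read(fd, 4096)
            if not data:  # an empty read means the child closed its output (EOF)
                break
            output += data.decode("utf-8", "ignore")
            # In the article's design, the output would be stored in OSS here
        elif pending:
            # No output for a while: assume the child may be waiting for input
            child.stdin.write((pending.pop(0) + "\n").encode("utf-8"))
            child.stdin.flush()
    else:
        child.kill()  # time limit reached without seeing EOF
    child.stdin.close()
    child.wait()
    return output

demo = 'print("hello world")\nname = input("please: ")\nprint("Input data:", name)'
print(run_interactive(demo, ["serverless devs"]))
```

pexpect adds a pseudo-terminal on top of this idea, so the child behaves as if it were attached to a real console.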

The overall process is shown below:

The implementation code is listed below:

# -*- coding: utf-8 -*-

import os
import re
import oss2
import json
import time
import pexpect

# Basic configuration information
AccessKey = {
   "id": os.environ.get('AccessKeyId'),
   "secret": os.environ.get('AccessKeySecret')
}

OSSCodeConf = {
   'endPoint': os.environ.get('OSSConfEndPoint'),
   'bucketName': os.environ.get('OSSConfBucketCodeName'),
   'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

OSSTargetConf = {
   'endPoint': os.environ.get('OSSConfEndPoint'),
   'bucketName': os.environ.get('OSSConfBucketTargetName'),
   'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

# Obtain/upload files from/to OSS temporary address
auth = oss2.Auth(AccessKey['id'], AccessKey['secret'])
codeBucket = oss2.Bucket(auth, OSSCodeConf['endPoint'], OSSCodeConf['bucketName'])
targetBucket = oss2.Bucket(auth, OSSTargetConf['endPoint'], OSSTargetConf['bucketName'])


def handler(event, context):
   event = json.loads(event.decode("utf-8"))

   for eveEvent in event["events"]:

       # Obtain the object
       code = eveEvent["oss"]["object"]["key"]
       localFileName = "/tmp/" + eveEvent["oss"]["object"]["eTag"]

       # Download the code
       codeBucket.get_object_to_file(code, localFileName)

       # Execute the code
       foo = pexpect.spawn('python %s' % localFileName)

       outputData = ""

       startTime = time.time()

       # timeout can be identified by file name
       try:
           timeout = int(re.findall("timeout(.*?)s", code)[0])
       except (IndexError, ValueError):
           timeout = 60

       # time.time() returns seconds, so compare the elapsed time directly
       while time.time() - startTime <= timeout:
           try:
               tempOutput = foo.read_nonblocking(size=999999, timeout=0.01)
               tempOutput = tempOutput.decode("utf-8", "ignore")

               if len(str(tempOutput)) > 0:
                   outputData = outputData + tempOutput

               # Output the data and store in OSS
               targetBucket.put_object(code + "-output", outputData.encode("utf-8"))

           except Exception as e:

               print("Error: ", e)

               # An input request is blocked
               if str(e) == "Timeout exceeded.":

                   try:
                       # Read the data from OSS
                       targetBucket.get_object_to_file(code + "-input", localFileName + "-input")
                       targetBucket.delete_object(code + "-input")
                       with open(localFileName + "-input") as f:
                           inputData = f.read()
                       if inputData:
                           foo.sendline(inputData)
                   except:
                       pass

               # Execution completed and output results
               elif "End Of File (EOF)" in str(e):
                   targetBucket.put_object(code + "-output", outputData.encode("utf-8"))
                   return True

               # An exception occurs
               else:

                   outputData = outputData + "\n\nException: %s" % str(e)
                   targetBucket.put_object(code + "-output", outputData.encode("utf-8"))

                   return False
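
The executor reads an optional time limit from the object key and falls back to 60 seconds. A stricter way to parse it might look like the sketch below. The helper name parse_timeout and the example key format are assumptions for illustration; the business logic function shown earlier generates purely random keys.

```python
import re

def parse_timeout(object_key, default=60):
    """Return the seconds encoded as 'timeout<N>s' in the key, or the default."""
    match = re.search(r"timeout(\d+)s", object_key)
    return int(match.group(1)) if match else default

print(parse_timeout("abcdefghij-timeout30s"))  # 30
print(parse_timeout("abcdefghij"))             # 60
```

Matching digits only means a key that merely contains the word timeout does not raise a conversion error.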

After we finish writing the core business logic, we can deploy the project online.

After deployment, we test the interface with Postman. This time the test code needs broader coverage, including output, input, and sleep():

import time
print('hello world')
time.sleep(10)
tempInput = input('please: ')
print('Input data: ', tempInput)

After we send a request to execute the code through Postman, the system returns a code execution ID:

This ID identifies the entire execution task. We pass it to the output interface to obtain the results.

Since the code contains…

time.sleep(10)

…when getting the results quickly, we cannot see the second half of the output results. We can set up a polling task and constantly refresh the interface through the ID.
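
A polling task of this kind could be sketched as follows. The generator calls a fetch function repeatedly and yields only the text that is new since the previous call. Here fetch_output is a hypothetical stand-in for the HTTP request that sends {"type": "output", "id": ...} to the business logic function; the demo replaces it with canned snapshots.

```python
import time

def poll_new_output(fetch_output, run_id, interval=1.0, attempts=30):
    """Call fetch_output(run_id) repeatedly and yield only newly added text."""
    seen = ""
    for _ in range(attempts):
        current = fetch_output(run_id)
        if current.startswith(seen) and len(current) > len(seen):
            yield current[len(seen):]
            seen = current
        time.sleep(interval)

# Stand-in for the HTTP call: each call returns a snapshot of the stored output
snapshots = iter(["hello world\n", "hello world\n", "hello world\nplease: "])

def fake_fetch(run_id):
    return next(snapshots)

for piece in poll_new_output(fake_fetch, "example-id", interval=0.01, attempts=3):
    print(repr(piece))
```

Because the executor rewrites the whole output object each time, the client only needs to remember how much it has already shown.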

After ten seconds, execution reaches the input statement:

tempInput = input('please: ')

We input content through the input interface.

After completion, we can see that the input succeeded (result: ok). We then continue to refresh the output request from the previous step.

We now have the complete output.

Compared with the first online programming function, this code executor is more complicated. In practice, however, it reproduces the behavior of local execution, such as sleeping, blocking on input, and incremental output, much more faithfully.

Summary

This article gives some brief introductions about:

Basic usage of HTTP triggers and OSS triggers
Basic usage of FC components and OSS components and implementation of inter-component dependencies

This article also indirectly answers a common question: in a project, does each interface need its own function, or can multiple interfaces share one function?

The answer depends on the needs of the business. If several interfaces serve the same kind of purpose and consume similar resources, they can share one function and be distinguished by path. If they differ significantly in resource consumption or in the type, scale, or category of work, it is better to put them in separate functions.
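
As a rough illustration of the first option, the sketch below routes several paths inside a single WSGI-style handler. The paths /run and /output and the two placeholder functions are hypothetical. The business logic function in this article distinguishes requests by a type field in the body instead, which follows the same idea.

```python
import json

def run_code(body):
    # Placeholder for the "run" interface
    return {"result": "run requested"}

def read_output(body):
    # Placeholder for the "output" interface
    return {"result": "output requested"}

ROUTES = {"/run": run_code, "/output": read_output}

def handler(environ, start_response):
    """Dispatch to an interface by request path inside one function."""
    route = ROUTES.get(environ.get("PATH_INFO", ""))
    if route is None:
        status, payload = "404 Not Found", {"Error": "unknown path"}
    else:
        status, payload = "200 OK", route({})
    start_response(status, [("Content-type", "application/json; charset=UTF-8")])
    return [json.dumps(payload).encode("utf-8")]

# Local check with a fake start_response callback
statuses = []
body = handler({"PATH_INFO": "/run"}, lambda status, headers: statuses.append(status))
print(statuses[0], body[0].decode("utf-8"))
```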

This article is only an introduction. Both the problem judging machine in the OJ system and the executor part of online programming tools can be combined with Serverless architecture. This combination can solve the traditional online programming pain points (security risk, resource consumption, concurrency, and traffic instability) and bring out the value of Serverless in a new field.
