As computer science and technology have developed, more people have taken up programming, and many online programming platforms have emerged as a result. Take the online programming platforms for Python as an example. They can be divided into two categories:
Regardless of the type, the core module of such a platform (the code executor or the problem judging machine) is of great research value. On the one hand, these websites usually need strict security mechanisms to determine whether a submitted program contains malicious code or infinite loops, whether it could damage the computer system, whether it needs to run in isolation, and whether it could read code submitted by other users while it runs.
On the other hand, these platforms usually consume a lot of resources, especially during competitions, when they need to scale out quickly and, if necessary, use large-scale clusters. Such websites are also trigger-based: each code execution is independent of the executions before and after it.
With the development of Serverless architecture, many people have found that its request-level isolation and extreme elasticity can address the security risks and resource consumption problems of traditional online programming platforms. Its pay-as-you-go model can help reduce costs while maintaining the performance of online programming features. Therefore, a growing number of people have begun to build online programming features on Serverless architecture. This article takes Alibaba Cloud Function Compute (FC) as an example, implements a Python-oriented online programming feature on Serverless architecture, and then optimizes the feature to bring it closer to the experience of running code locally.
A simple and typical online programming function on an online execution module usually requires the following capabilities:
Under the Serverless architecture, the business logic required for online programming is concentrated in the code execution module. The process is: receive the program information (including code, standard input...) sent by the client, cache the code locally, execute the code, and return the results. The following figure shows the process of the entire architecture:
The code execution part can be implemented with Popen() from the subprocess module of the Python standard library. When using Popen(), several important concepts need to be clarified:
With these, we can supply standard input (stdin) and capture standard output (stdout) and standard error (stderr).
Sample code is listed below:
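# -*- coding: utf-8 -*-
import subprocess

# fileName is the path of the script to run; input_data is the text sent to its stdin
child = subprocess.Popen("python %s" % (fileName),
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,  # merge stderr into stdout
                         shell=True)
# communicate() waits for the program to exit and returns a (stdout, stderr) tuple
output = child.communicate(input=input_data.encode("utf-8"))
print(output)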
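Since submitted programs may contain infinite loops, one possible hardening of the sample above is to bound the run time. The following is a minimal sketch under stated assumptions, not the implementation used in this article: the helper name `run_with_timeout` and the 5-second default are illustrative, and the sketch passes an argument list with `sys.executable` instead of a shell string.

```python
import subprocess
import sys


def run_with_timeout(file_name, input_data="", timeout=5):
    """Run a Python file, feed it stdin, and return (output, timed_out).

    Hypothetical helper: the name and the default timeout are illustrative.
    """
    child = subprocess.Popen(
        [sys.executable, file_name],     # argument list, so no shell is involved
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,        # merge stderr into stdout, as in the sample
    )
    try:
        output, _ = child.communicate(input=input_data.encode("utf-8"), timeout=timeout)
        return output.decode("utf-8", "replace"), False
    except subprocess.TimeoutExpired:
        child.kill()                     # stop the runaway program
        output, _ = child.communicate()  # collect what was printed before the kill
        return output.decode("utf-8", "replace"), True
```

The second element of the returned tuple lets the caller tell a normal result apart from a program that was stopped for running too long.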
In addition to executing the code, we must receive the user's code and store it inside the function instance. Here we need to pay attention to the read and write permissions of directories in the function instance. Normally, if no disk is mounted in FC, only the /tmp/ directory is writable. Therefore, in this project, the temporary code passed by the user must be written to /tmp/. We also need to consider instance reuse: because one instance may serve several requests, each piece of temporary code should get its own file name. For example:
# -*- coding: utf-8 -*-
import random

randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))
path = "/tmp/%s" % randomStr(5)
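A random five-letter name can in principle repeat when an instance is reused many times. As an alternative sketch (an assumption, not what this article uses), the standard `tempfile` module can create a file whose name is guaranteed to be new. The helper name `make_code_path` and the `code-` prefix are hypothetical.

```python
import os
import tempfile


def make_code_path(directory=None):
    """Create an empty, uniquely named file and return its path.

    Hypothetical helper; on Function Compute the directory would be /tmp/.
    """
    directory = directory or tempfile.gettempdir()
    # mkstemp creates the file atomically, so two requests served by the
    # same instance cannot end up with the same path.
    fd, path = tempfile.mkstemp(prefix="code-", suffix=".py", dir=directory)
    os.close(fd)  # only the path is needed; the caller reopens the file to write
    return path
```

The caller remains responsible for deleting the file after the run, for example with `os.remove(path)`.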
The complete implementation code is listed below:
# -*- coding: utf-8 -*-
import json
import uuid
import random
import subprocess

# Random string
randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))


# Response
class Response:
    def __init__(self, start_response, response, errorCode=None):
        self.start = start_response
        responseBody = {
            'Error': {"Code": errorCode, "Message": response},
        } if errorCode else {
            'Response': response
        }
        # uuid is added by default to facilitate later positioning.
        responseBody['ResponseId'] = str(uuid.uuid1())
        self.response = json.dumps(responseBody)

    def __iter__(self):
        status = '200'
        response_headers = [('Content-type', 'application/json; charset=UTF-8')]
        self.start(status, response_headers)
        yield self.response.encode("utf-8")


def WriteCode(code, fileName):
    try:
        with open(fileName, "w") as f:
            f.write(code)
        return True
    except Exception as e:
        print(e)
        return False


def RunCode(fileName, input_data=""):
    child = subprocess.Popen("python %s" % (fileName),
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT,
                             shell=True)
    output = child.communicate(input=input_data.encode("utf-8"))
    return output[0].decode("utf-8")


def handler(environ, start_response):
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except (ValueError):
        request_body_size = 0
    requestBody = json.loads(environ['wsgi.input'].read(request_body_size).decode("utf-8"))
    code = requestBody.get("code", None)
    inputData = requestBody.get("input", "")
    fileName = "/tmp/" + randomStr(5)
    responseData = RunCode(fileName, inputData) if code and WriteCode(code, fileName) else "Error"
    return Response(start_response, {"result": responseData})
After writing the core business logic, we can deploy the code to Alibaba Cloud FC. After the deployment is completed, we can obtain the temporary test address of the interface and test the interface through PostMan. Let's take the output statement of Python as an example:
print('HELLO WORLD')
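As an alternative to PostMan, the same request could be issued from Python. The sketch below only builds the request; the `code` and `input` keys match what the handler reads, while the helper name `build_run_request` and the URL are placeholders that must be replaced with the test address shown after deployment.

```python
import json
import urllib.request


def build_run_request(url, code, input_data=""):
    """Build (but do not send) the POST request expected by the handler."""
    body = json.dumps({"code": code, "input": input_data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address: replace it with the test address of the deployed function.
request = build_run_request("https://example.com/invoke", "print('HELLO WORLD')")

# Sending it would look like this (commented out because the URL is a placeholder):
# with urllib.request.urlopen(request) as resp:
#     print(json.loads(resp.read().decode("utf-8")))
```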
When we send the code as a parameter in a POST request, the response is shown below:
The system outputs the expected result: HELLO WORLD. So far, we have completed the test of the standard output function. Next, we will test standard error. To do so, we deliberately break the output code:
Print('HELLO WORLD)
Executing the code again in the same way gives the following result:
The error message of Python is in line with our expectations. So far, we have completed the test of the standard error function of the online programming function. Next, we will test the standard input function. Since the way we call subprocess.Popen() is blocking (communicate() waits until the program exits), we need to send the code and the standard input content to the server together. The code tested is listed below:
tempInput = input('please input: ')
print('Output: ', tempInput)
The standard input content for the test is serverless devs.
When we use the same method to initiate the request, we can see that:
The result is what we expected. So far, we have completed a simple interface for online programming services. This interface is only a preliminary version intended for learning, and it leaves a lot of room for optimization:
This interface also exposes a problem: code execution is blocking. We cannot provide continuous input or see output in real time. Even when input is needed, the code and the input content must be sent to the server together. This mode is similar to the OJ (online judge) mode commonly used today. For simple online programming, however, the project needs further optimization so that code is executed in a non-blocking way and input and output can be exchanged continuously.
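To make the difference concrete, here is a minimal standard-library sketch of the output half of non-blocking execution: it yields each line as soon as the child process prints it, instead of waiting for the process to exit. It is an illustration only. The helper name `stream_output` is hypothetical, and the sketch does not handle interactive input.

```python
import subprocess
import sys
import time


def stream_output(file_name):
    """Yield (seconds_since_start, line) as soon as the child prints each line."""
    start = time.time()
    child = subprocess.Popen(
        [sys.executable, "-u", file_name],  # -u disables the child's output buffering
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # readline() returns each line when it is written, whereas communicate()
    # returns everything only after the process has exited.
    for raw in iter(child.stdout.readline, b""):
        yield time.time() - start, raw.decode("utf-8", "replace").rstrip("\r\n")
    child.stdout.close()
    child.wait()
```

A caller can loop over `stream_output(path)` and forward every line to the client immediately, which is the behavior the blocking interface cannot provide.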
Let's take a piece of code as an example:
import time
print("hello world")
time.sleep(10)
tempInput = input("please: ")
print("Input data: ", tempInput)
When we execute this Python code locally, the user sees the following behavior:
However, if this code is run on a traditional OJ or on the online programming system we have just implemented, the behavior is different:
The online programming experience in OJ mode differs considerably from that of local tools. To reduce this inconsistency, we can upgrade the architecture described above and, through the asynchronous triggering of functions and the pexpect.spawn() method of Python, implement an online programming function whose user experience is closer to local programming:
The entire project includes two functions and two storage buckets:
To bring online code execution closer to the local experience, the code of this scheme is divided into two functions: one performs business logic processing, and the other provides the core capability of online programming (code execution).
The business logic processing function can:
The overall business logic is shown below:
The implementation code is listed below:
# -*- coding: utf-8 -*-
import os
import oss2
import json
import uuid
import random

# Basic configuration information
AccessKey = {
    "id": os.environ.get('AccessKeyId'),
    "secret": os.environ.get('AccessKeySecret')
}
OSSCodeConf = {
    'endPoint': os.environ.get('OSSConfEndPoint'),
    'bucketName': os.environ.get('OSSConfBucketCodeName'),
    'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}
OSSTargetConf = {
    'endPoint': os.environ.get('OSSConfEndPoint'),
    'bucketName': os.environ.get('OSSConfBucketTargetName'),
    'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

# Obtain/upload files from/to OSS temporary address
auth = oss2.Auth(AccessKey['id'], AccessKey['secret'])
codeBucket = oss2.Bucket(auth, OSSCodeConf['endPoint'], OSSCodeConf['bucketName'])
targetBucket = oss2.Bucket(auth, OSSTargetConf['endPoint'], OSSTargetConf['bucketName'])

# Random string
randomStr = lambda num=5: "".join(random.sample('abcdefghijklmnopqrstuvwxyz', num))


# Response
class Response:
    def __init__(self, start_response, response, errorCode=None):
        self.start = start_response
        responseBody = {
            'Error': {"Code": errorCode, "Message": response},
        } if errorCode else {
            'Response': response
        }
        # uuid is added by default to facilitate later positioning
        responseBody['ResponseId'] = str(uuid.uuid1())
        self.response = json.dumps(responseBody)

    def __iter__(self):
        status = '200'
        response_headers = [('Content-type', 'application/json; charset=UTF-8')]
        self.start(status, response_headers)
        yield self.response.encode("utf-8")


def handler(environ, start_response):
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except (ValueError):
        request_body_size = 0
    requestBody = json.loads(environ['wsgi.input'].read(request_body_size).decode("utf-8"))
    reqType = requestBody.get("type", None)
    if reqType == "run":
        # Execute the code
        code = requestBody.get("code", None)
        runId = randomStr(10)
        codeBucket.put_object(runId, code.encode("utf-8"))
        responseData = runId
    elif reqType == "input":
        # Input content
        inputData = requestBody.get("input", None)
        runId = requestBody.get("id", None)
        targetBucket.put_object(runId + "-input", inputData.encode("utf-8"))
        responseData = 'ok'
    elif reqType == "output":
        # Obtain the result
        runId = requestBody.get("id", None)
        targetBucket.get_object_to_file(runId + "-output", '/tmp/' + runId)
        with open('/tmp/' + runId) as f:
            responseData = f.read()
    else:
        responseData = "Error"
    return Response(start_response, {"result": responseData})
The executor function is triggered by the code storage bucket and executes the submitted code. This part mainly includes:
The overall process is shown below:
The implementation code is listed below:
# -*- coding: utf-8 -*-
import os
import re
import oss2
import json
import time
import pexpect

# Basic configuration information
AccessKey = {
    "id": os.environ.get('AccessKeyId'),
    "secret": os.environ.get('AccessKeySecret')
}
OSSCodeConf = {
    'endPoint': os.environ.get('OSSConfEndPoint'),
    'bucketName': os.environ.get('OSSConfBucketCodeName'),
    'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}
OSSTargetConf = {
    'endPoint': os.environ.get('OSSConfEndPoint'),
    'bucketName': os.environ.get('OSSConfBucketTargetName'),
    'objectSignUrlTimeOut': int(os.environ.get('OSSConfObjectSignUrlTimeOut'))
}

# Obtain/upload files from/to OSS temporary address
auth = oss2.Auth(AccessKey['id'], AccessKey['secret'])
codeBucket = oss2.Bucket(auth, OSSCodeConf['endPoint'], OSSCodeConf['bucketName'])
targetBucket = oss2.Bucket(auth, OSSTargetConf['endPoint'], OSSTargetConf['bucketName'])


def handler(event, context):
    event = json.loads(event.decode("utf-8"))
    for eveEvent in event["events"]:
        # Obtain the object
        code = eveEvent["oss"]["object"]["key"]
        localFileName = "/tmp/" + eveEvent["oss"]["object"]["eTag"]
        # Download the code
        codeBucket.get_object_to_file(code, localFileName)
        # Execute the code
        foo = pexpect.spawn('python %s' % localFileName)
        outputData = ""
        startTime = time.time()
        # timeout can be identified by file name
        try:
            timeout = int(re.findall("timeout(.*?)s", code)[0])
        except (IndexError, ValueError):
            timeout = 60
        # time.time() is measured in seconds, the same unit as timeout
        while time.time() - startTime <= timeout:
            try:
                tempOutput = foo.read_nonblocking(size=999999, timeout=0.01)
                tempOutput = tempOutput.decode("utf-8", "ignore")
                if len(str(tempOutput)) > 0:
                    outputData = outputData + tempOutput
                    # Output the data and store in OSS
                    targetBucket.put_object(code + "-output", outputData.encode("utf-8"))
            except Exception as e:
                print("Error: ", e)
                # An input request is blocked
                if str(e) == "Timeout exceeded.":
                    try:
                        # Read the data from OSS
                        targetBucket.get_object_to_file(code + "-input", localFileName + "-input")
                        targetBucket.delete_object(code + "-input")
                        with open(localFileName + "-input") as f:
                            inputData = f.read()
                        if inputData:
                            foo.sendline(inputData)
                    except Exception:
                        # No input has been uploaded yet; keep waiting
                        pass
                # Execution completed and output results
                elif "End Of File (EOF)" in str(e):
                    targetBucket.put_object(code + "-output", outputData.encode("utf-8"))
                    return True
                # An exception occurs
                else:
                    outputData = outputData + "\n\nException: %s" % str(e)
                    targetBucket.put_object(code + "-output", outputData.encode("utf-8"))
                    return False
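The executor reads an optional timeout from the object name and falls back to 60 seconds. The sketch below isolates that step with a stricter, digits-only pattern. The helper name `parse_timeout` and the example key `abc-timeout30s` are hypothetical; run IDs generated by the business logic function consist only of random lowercase letters, so with that naming the default always applies.

```python
import re

DEFAULT_TIMEOUT = 60  # seconds, the same fallback used in the executor


def parse_timeout(object_key, default=DEFAULT_TIMEOUT):
    """Return the number of seconds encoded as 'timeout<N>s' in the key."""
    match = re.search(r"timeout(\d+)s", object_key)
    return int(match.group(1)) if match else default


print(parse_timeout("abc-timeout30s"))  # key with an explicit timeout
print(parse_timeout("abcdefghij"))      # plain run ID, falls back to the default
```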
After we finish writing the core business logic, we can deploy the project online.
After deploying the project, we test the interface through PostMan. This time, the test code needs broader coverage, including output printing, input, sleep(), and other methods.
import time
print('hello world')
time.sleep(10)
tempInput = input('please: ')
print('Input data: ', tempInput)
After we initiate a request to execute the code through PostMan, the system returns the expected code execution ID:
This code execution ID identifies the entire request task. With it, we can fetch the result through the output interface.
Since the code contains…
time.sleep(10)
…we cannot see the second half of the output if we fetch the result right away. We can set up a polling task that repeatedly requests the output interface with the ID.
After ten seconds, execution reaches the input statement:
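Such a polling task could be sketched as follows. This is an illustration under assumptions: `fetch_output` is a hypothetical callable that wraps the "output" request and returns the text stored so far for a run ID, and the interval and attempt limit are arbitrary.

```python
import time


def poll_output(fetch_output, run_id, interval=1.0, max_attempts=30):
    """Yield the output each time it changes.

    fetch_output is a hypothetical callable that wraps the "output" request
    and returns the text stored so far for run_id.
    """
    last = None
    for _ in range(max_attempts):
        current = fetch_output(run_id)
        if current != last:
            yield current  # only report new content to the caller
            last = current
        time.sleep(interval)
```

Because the generator yields only when the stored output changes, the client can redraw its console exactly when new content arrives.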
tempInput = input('please: ')
We input content through the input interface.
Once the request completes, we can see that the input succeeded (result: ok). We then continue to refresh the output request from the previous step.
We have now obtained the complete output.
Compared with the preceding online programming function, this code executor, whose experience is closer to local tools, is more complicated. In practice, however, it better simulates what happens when code is executed locally, such as sleeping, blocking, and incremental output.
This article gives some brief introductions about:
This article also touches on a common question: in a project, should each interface have its own function, or can multiple interfaces share one function?
To answer this question, the most important thing is to focus on the demands of the business. If multiple interfaces serve the same purpose (or are of the same kind) and consume similar resources, it is fine to serve them from one function and distinguish them by path. If the interfaces differ greatly in resource consumption or in function type, size, or category, it is better to put them in separate functions.
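When several lightweight interfaces share one function, a dispatch table is one simple way to keep them apart. The sketch below routes on the `type` field used by the business logic function in this article; the three handler bodies are placeholders and all function names are hypothetical.

```python
def handle_run(body):
    # Placeholder for storing the submitted code and returning a run ID.
    return {"action": "run", "code": body.get("code")}


def handle_input(body):
    # Placeholder for forwarding standard input to a running program.
    return {"action": "input", "id": body.get("id")}


def handle_output(body):
    # Placeholder for fetching the output produced so far.
    return {"action": "output", "id": body.get("id")}


ROUTES = {"run": handle_run, "input": handle_input, "output": handle_output}


def dispatch(body):
    """Route one request body to the handler registered for its type."""
    route = ROUTES.get(body.get("type"))
    if route is None:
        return {"error": "unknown request type"}
    return route(body)
```

The executor, by contrast, has a different trigger and a much longer run time, which is why it lives in a function of its own.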
This article is only an introduction. Both the problem judging machine in the OJ system and the executor part of online programming tools can be combined with Serverless architecture. This combination can solve the traditional online programming pain points (security risk, resource consumption, concurrency, and traffic instability) and bring out the value of Serverless in a new field.