In an ACK Serverless cluster, you can create pods on demand to meet your business requirements. The system stops billing a pod after the pod lifecycle ends. Therefore, you do not need to reserve computing resources for Spark tasks. This resolves the issue of insufficient computing resources and eliminates the need to scale out the cluster. In addition, you can use preemptible instances to reduce computing costs. This topic describes how to run Spark tasks in an ACK Serverless cluster.
Prerequisites
An ACK Serverless cluster is created. For more information, see Create an ACK Serverless cluster.
A kubectl client is connected to the cluster. For more information, see Connect to an ACK cluster by using kubectl.
Procedure
Deploy the ack-spark-operator chart by using one of the following methods:
Log on to the Container Service for Kubernetes (ACK) console. In the left-side navigation pane, open the application catalog and select ack-spark-operator to deploy the chart.
Run the helm command to manually deploy the chart.
Note: The Helm version must be V3 or later.
# Create a service account.
kubectl create serviceaccount spark
# Grant permissions.
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
# Install the operator.
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install incubator/sparkoperator --namespace default --set operatorImageName=registry.cn-hangzhou.aliyuncs.com/acs/spark-operator --set operatorVersion=ack-2.4.5-latest --generate-name
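If you are not sure which Helm version your client runs, you can check it before you install the chart:
helm version --short
The output must report a v3.x.x client; otherwise, upgrade Helm before you continue.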
After you deploy the chart, run the following command to check whether spark-operator is started:
kubectl -n spark-operator get pod
Expected output:
NAME                                  READY   STATUS      RESTARTS   AGE
ack-spark-operator-7698586d7b-pvwln   1/1     Running     0          5m9s
ack-spark-operator-init-26tvh         0/1     Completed   0          5m9s
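If the operator pod is not in the Running or Completed state, you can inspect its logs to troubleshoot. The following example uses the pod name from the preceding output; the random suffix differs in your cluster:
kubectl -n spark-operator logs ack-spark-operator-7698586d7b-pvwln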
Create a file named spark-pi.yaml and copy the following content into the file:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  arguments:
    - "1000"
  sparkConf:
    "spark.scheduler.maxRegisteredResourcesWaitingTime": "3000s"
    "spark.kubernetes.allocation.batch.size": "1"
    "spark.rpc.askTimeout": "36000s"
    "spark.network.timeout": "36000s"
    "spark.rpc.lookupTimeout": "36000s"
    "spark.core.connection.ack.wait.timeout": "36000s"
    "spark.executor.heartbeatInterval": "10000s"
  type: Scala
  mode: cluster
  image: "registry.aliyuncs.com/acs/spark:ack-2.4.5-latest"
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar"
  sparkVersion: "2.4.5"
  restartPolicy:
    type: Never
  driver:
    cores: 4
    coreLimit: "4"
    annotations:
      k8s.aliyun.com/eci-image-cache: "true"
    memory: "6g"
    memoryOverhead: "2g"
    labels:
      version: 2.4.5
    serviceAccount: spark
  executor:
    annotations:
      k8s.aliyun.com/eci-image-cache: "true"
    cores: 2
    instances: 1
    memory: "3g"
    memoryOverhead: "1g"
    labels:
      version: 2.4.5
Deploy a Spark task.
Run the following command to deploy a Spark task:
kubectl apply -f spark-pi.yaml
Expected output:
sparkapplication.sparkoperator.k8s.io/spark-pi created
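The operator tracks the task through the SparkApplication custom resource defined in spark-pi.yaml. As an additional check, you can query that resource directly:
kubectl get sparkapplication spark-pi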
Run the following command to view the deployment status of the Spark task:
kubectl get pod
Expected output:
NAME              READY   STATUS    RESTARTS   AGE
spark-pi-driver   1/1     Running   0          2m12s
The output shows that the pod is in the Running state, which indicates that the Spark task is running.
Run the following command to view the deployment status of the Spark task again:
kubectl get pod
Expected output:
NAME              READY   STATUS      RESTARTS   AGE
spark-pi-driver   0/1     Completed   0          2m54s
The output shows that the pod is in the Completed state, which indicates that the Spark task has completed.
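To view the events and the final state that the operator records for the task, you can describe the SparkApplication resource:
kubectl describe sparkapplication spark-pi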
Run the following command to view the computing result of the Spark task:
kubectl logs spark-pi-driver | grep Pi
Expected output:
20/04/30 07:27:51 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 11.031 s
20/04/30 07:27:51 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 11.137920 s
Pi is roughly 3.1414371514143715
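When you no longer need the example task, you can delete the SparkApplication resource. The operator is expected to clean up the associated driver pod as well:
kubectl delete -f spark-pi.yaml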
Optional: To use preemptible instances, add annotations for preemptible instances to the pods.
For more information about how to add annotations for preemptible instances, see Use preemptible instances.
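The following is a minimal sketch of what the executor section of spark-pi.yaml could look like with spot annotations added. The keys k8s.aliyun.com/eci-spot-strategy and k8s.aliyun.com/eci-spot-price-limit and their values are assumptions based on the elastic container instance spot annotations; verify the exact keys, values, and the currency unit of the price limit in Use preemptible instances.
  executor:
    annotations:
      k8s.aliyun.com/eci-image-cache: "true"
      # Assumed annotation: bid with a price cap. SpotAsPriceGo follows the
      # market price instead of setting a cap.
      k8s.aliyun.com/eci-spot-strategy: "SpotWithPriceLimit"
      # Assumed annotation: the highest hourly price you accept. The currency
      # depends on your region; see Use preemptible instances.
      k8s.aliyun.com/eci-spot-price-limit: "0.5"
    cores: 2
    instances: 1
    memory: "3g"
    memoryOverhead: "1g"
    labels:
      version: 2.4.5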