You can manage jobs in the Alibaba Cloud E-MapReduce (EMR) console or by using kubectl or APIs. This topic describes how to use kubectl to manage Spark jobs.
Prerequisites
A Spark cluster is created on the EMR on ACK page of the new EMR console. For more information, see Getting started.
Procedure
- Connect to an Alibaba Cloud Container Service for Kubernetes (ACK) cluster by using kubectl. For more information, see Connect to ACK clusters by using kubectl. You can also connect to the ACK cluster by calling an API operation. For more information, see Use the Kubernetes API.
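Before running the job-management commands below, it can help to confirm that kubectl is actually talking to the intended cluster. The following is a minimal sketch; the check_cluster helper is our own illustration, not an EMR or ACK command:

```shell
# Sketch: confirm kubectl connectivity before managing Spark jobs.
# check_cluster is a hypothetical helper name, not part of EMR tooling.
check_cluster() {
  # "kubectl cluster-info" exits non-zero if the kubeconfig is missing
  # or the API server is unreachable.
  if kubectl cluster-info >/dev/null 2>&1; then
    echo "cluster reachable"
  else
    echo "kubectl cannot reach the cluster; check your kubeconfig" >&2
    return 1
  fi
}

# Example (requires a configured kubeconfig):
# check_cluster
```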
- Run the following commands to manage a job.
- View the job status. Syntax:
kubectl describe SparkApplication <Job name> --namespace <Namespace in which the cluster resides>
The following information is returned:
Name:         spark-pi-simple
Namespace:    c-48e779e0d9ad****
Labels:       <none>
Annotations:  <none>
API Version:  sparkoperator.k8s.io/v1beta2
Kind:         SparkApplication
Metadata:
  Creation Timestamp:  2021-07-22T06:25:33Z
  Generation:          1
  Resource Version:    7503740
  UID:                 930874ad-bb17-47f1-a556-55118c1d****
Spec:
  Arguments:
    1000
  Driver:
    Core Limit:  1000m
    Cores:       1
    Memory:      4g
  Executor:
    Core Limit:       1000m
    Cores:            1
    Instances:        1
    Memory:           8g
    Memory Overhead:  1g
  Image:                  registry-vpc.cn-hangzhou.aliyuncs.com/emr/spark:emr-2.4.5-1.0.0
  Main Application File:  local:///opt/spark/examples/target/scala-2.11/jars/spark-examples_2.11-2.4.5.jar
  Main Class:             org.apache.spark.examples.SparkPi
  Spark Version:          2.4.5
  Type:                   Scala
Status:
  Application State:
    State:  RUNNING
  Driver Info:
    Pod Name:                spark-pi-simple-driver
    Web UI Address:          172.16.230.240:4040
    Web UI Ingress Address:  spark-pi-simple.c-48e779e0d9ad4bfd.c7f6b768c34764c27ab740bdb1fc2a3ff.cn-hangzhou.alicontainer.com
    Web UI Ingress Name:     spark-pi-simple-ui-ingress
    Web UI Port:             4040
    Web UI Service Name:     spark-pi-simple-ui-svc
  Execution Attempts:  1
  Executor State:
    spark-pi-1626935142670-exec-1:  RUNNING
  Last Submission Attempt Time:     2021-07-22T06:25:33Z
  Spark Application Id:             spark-15b44f956ecc40b1ae59a27ca18d****
  Submission Attempts:              1
  Submission ID:                    d71f30e2-9bf8-4da1-8412-b585fd45****
  Termination Time:                 <nil>
Events:
  Type    Reason                     Age   From            Message
  ----    ------                     ----  ----            -------
  Normal  SparkApplicationAdded      17s   spark-operator  SparkApplication spark-pi-simple was added, enqueuing it for submission
  Normal  SparkApplicationSubmitted  14s   spark-operator  SparkApplication spark-pi-simple was submitted successfully
  Normal  SparkDriverRunning         13s   spark-operator  Driver spark-pi-simple-driver is running
  Normal  SparkExecutorPending       7s    spark-operator  Executor spark-pi-1626935142670-exec-1 is pending
  Normal  SparkExecutorRunning       6s    spark-operator  Executor spark-pi-1626935142670-exec-1 is running
Replace <Job name> and <Namespace in which the cluster resides> based on your business requirements. To view the namespace, log on to the EMR console and go to the Cluster Details tab of the cluster details page. To obtain the job name, go to the same tab; the created jobs are displayed in the Jobs section.
- Terminate and delete a job. Syntax:
kubectl delete SparkApplication <Job name> -n <Namespace in which the cluster resides>
The following information is returned:
sparkapplication.sparkoperator.k8s.io "spark-pi-simple" deleted
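If you run these commands often, the describe and delete operations above can be wrapped in small shell helpers. This is only a sketch; the function names are our own, and the job name and namespace in the example are placeholders for your own values:

```shell
# Hypothetical wrappers around the kubectl commands shown above.
# Arguments: $1 = job name, $2 = namespace in which the cluster resides.
describe_job() {
  kubectl describe SparkApplication "$1" --namespace "$2"
}

delete_job() {
  kubectl delete SparkApplication "$1" -n "$2"
}

# Example (placeholder values; requires a configured kubeconfig):
# describe_job spark-pi-simple c-d2232227b95145d3
# delete_job   spark-pi-simple c-d2232227b95145d3
```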
- View the logs of a job. Syntax:
kubectl logs <Job name-driver> -n <Namespace in which the cluster resides>
Note: If the job name is spark-pi-simple and the namespace is c-d2232227b95145d3, run the following command:
kubectl logs spark-pi-simple-driver -n c-d2232227b95145d3
Information similar to the following output is returned:
......
Pi is roughly 3.141488791414888
21/07/22 14:37:57 INFO SparkContext: Successfully stopped SparkContext
21/07/22 14:37:57 INFO ShutdownHookManager: Shutdown hook called
21/07/22 14:37:57 INFO ShutdownHookManager: Deleting directory /var/data/spark-b6a43b55-a354-44d7-ae5e-45b8b1493edb/spark-56aae0d1-37b9-4a7d-9c99-4e4ca12deb4b
21/07/22 14:37:57 INFO ShutdownHookManager: Deleting directory /tmp/spark-e2500491-6ed7-48d7-b94e-a9ebeb899320
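When scripting these steps, you may want to wait until the driver finishes before collecting its logs. The sketch below polls the application state field shown in the describe output (Status > Application State > State) via JSONPath; the helper name, job name, and namespace are placeholders, not part of the EMR tooling:

```shell
# Sketch: poll a SparkApplication until it reaches a terminal state,
# then print the driver logs.
# Arguments: $1 = job name, $2 = namespace in which the cluster resides.
wait_and_fetch_logs() {
  job="$1"; ns="$2"
  while true; do
    # .status.applicationState.state is populated by the Spark operator,
    # e.g. RUNNING, COMPLETED, or FAILED.
    state=$(kubectl get SparkApplication "$job" -n "$ns" \
      -o jsonpath='{.status.applicationState.state}')
    echo "state: $state"
    case "$state" in
      COMPLETED|FAILED) break ;;
    esac
    sleep 10
  done
  # The driver pod is named "<job name>-driver", as shown above.
  kubectl logs "${job}-driver" -n "$ns"
}

# Example (placeholder values; requires a configured kubeconfig):
# wait_and_fetch_logs spark-pi-simple c-d2232227b95145d3
```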