After you create a cluster, you can use the manual script execution feature to run a
script on multiple nodes in the cluster at the same time. This topic describes how to
add and run a script manually.
Background information
The manual script execution feature is suitable for long-running clusters. For temporary
clusters, we recommend that you use bootstrap actions to initialize the clusters.
For information about bootstrap actions, see Manage bootstrap actions.
Manual execution scripts are similar to bootstrap action scripts. After you create
a cluster, you can use the manual script execution feature to install software and
services that are not pre-installed in EMR clusters. Examples:
- Use YUM to install software whose installation package is available.
- Download public software from the Internet.
- Install software to read your data from Object Storage Service (OSS).
- Install and run a service, such as Flink or Impala. Scripts of this kind are more
complex to write.
Prerequisites
- An E-MapReduce (EMR) cluster is created. For more information, see Create a cluster.
- The cluster is in the Running state. Scripts cannot run in clusters that are in other
states.
- Cluster scripts are developed or obtained and uploaded to OSS. For more information
about cluster scripts, see Example.
Limits
- Only one cluster script can run in a cluster at a time. You cannot submit another
cluster script while one is in progress.
- Up to 10 cluster script records can be retained for each cluster. If 10 records
already exist, you must delete earlier records before you create new cluster scripts.
- A cluster script may succeed on some nodes, but fail on others. For example, the script
may fail to run because you restart a node. After you resolve the issue, you can rerun
the cluster script on the failed nodes. After you scale out a cluster, you can also
run cluster scripts on the added nodes.
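Because a cluster script can be rerun on nodes where it failed, it helps if each step in the script is safe to repeat. The following is a minimal sketch of one way to do that and is not part of EMR: the run_once helper and the MARKER_DIR location are hypothetical names chosen for illustration. The helper writes a marker file after a step succeeds and skips that step on later runs.

```shell
#!/bin/bash
# Hypothetical helper: run a named step only if it has not already succeeded
# on this node. MARKER_DIR is an assumed location, not an EMR convention.
MARKER_DIR="${MARKER_DIR:-/var/tmp/cluster-script-markers}"

run_once() {
  local step="$1"; shift
  local marker="${MARKER_DIR}/${step}.done"

  mkdir -p "$MARKER_DIR" || return 1
  if [ -e "$marker" ]; then
    echo "skip: ${step} already completed"
    return 0
  fi

  # Run the step; create the marker only when the command succeeds.
  if "$@"; then
    touch "$marker"
  else
    echo "failed: ${step}" >&2
    return 1
  fi
}
```

For example, a script could call run_once install-tools yum install -y <package> so that a rerun on a node where the package is already installed prints a skip message instead of repeating the step.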
Procedure
- Go to the Script Operation tab.
  - Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
  - In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  - On the EMR on ECS page, find the desired cluster and click the name of the cluster.
  - On the page that appears, click the Script Operation tab.
- On the Script Operation tab, click the Manual Execution tab.
- Click Create and Execute.
- In the Add Manual Execution Script dialog box, configure the following parameters:
  - Name: enter a name for the script.
  - Script Address: select a folder from the drop-down list.
  - Execution Node: select the nodes on which you want to run the script.
  - Parameter: enter custom parameters.
  Note
  - We recommend that you test the script on a single node before you run the script on
  the entire cluster.
  - The script path must be in the oss://**/*.sh format.
- Click OK.
After the script is created, it appears in the cluster script list in the Running
state. A script can be in the Running, Complete, or Submit Failed state.
- To view the task progress, click Operation History in the upper-right corner.
- To view the details of the script, click Details in the Actions column. Each node on
which the script runs can be in the Waiting, Running, Complete, Failed, Submit Failed,
or Cancel state.
- To delete the script, click Delete in the Actions column.
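The procedure requires the script path to be in the oss://**/*.sh format. As a rough sketch, a path can be checked locally before you enter it in the console. The is_valid_script_path function is a hypothetical helper, and its pattern is only one reading of that format: the oss:// scheme, a non-empty bucket name, and an object name that ends in .sh.

```shell
#!/bin/bash
# Hypothetical check: does the path look like oss://<bucket>/<object>.sh ?
# This only inspects the string; it does not contact OSS.
is_valid_script_path() {
  case "$1" in
    # oss:// + bucket starting with a non-slash character + "/" + name ending in .sh
    oss://[!/]*/?*.sh) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_valid_script_path "oss://<yourBucketName>/scripts/install.sh" succeeds, while a local path such as /tmp/install.sh is rejected.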
Example
Similar to bootstrap action scripts, a manual execution script can specify an object to
download from OSS. For example, the following script downloads the object
oss://<yourBucketName>/<yourFile>.tar.gz to the node on which the script runs and
decompresses the object to the /<yourDir> directory.
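#!/bin/bash
# Download the archive from OSS to the current directory on the node.
osscmd --id=<yourAccessKeyId> --key=<yourAccessKeySecret> --host=oss-cn-hangzhou-internal.aliyuncs.com get oss://<yourBucketName>/<yourFile>.tar.gz ./<yourFile>.tar.gz
# Create the target directory and extract the archive into it.
mkdir -p /<yourDir>
tar -zxvf ./<yourFile>.tar.gz -C /<yourDir>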
Note The OSS endpoint that you specify can be an internal endpoint, a public endpoint,
or a virtual private cloud (VPC) endpoint. If the classic network is used, you must
specify an internal endpoint. For example, the internal endpoint of OSS in the China
(Hangzhou) region is oss-cn-hangzhou-internal.aliyuncs.com. If a VPC is used, you must
specify a domain name that can be accessed from the VPC. For example, the domain name
of OSS in the China (Hangzhou) region is vpc100-oss-cn-hangzhou.aliyuncs.com.
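The endpoint choice in the note above can be sketched as a small lookup. The oss_host function is a hypothetical helper. The internal and VPC patterns simply generalize the China (Hangzhou) examples in the note to other region IDs, and the public pattern is likewise an assumption, so confirm the actual endpoint for your region before you rely on the result.

```shell
#!/bin/bash
# Hypothetical helper: print an OSS host name for a region ID and network type.
# The patterns are assumptions generalized from the China (Hangzhou) examples.
oss_host() {
  local region="$1"
  local network="$2"
  case "$network" in
    classic|internal) echo "oss-${region}-internal.aliyuncs.com" ;;
    vpc)              echo "vpc100-oss-${region}.aliyuncs.com" ;;
    public)           echo "oss-${region}.aliyuncs.com" ;;
    *)
      echo "unknown network type: ${network}" >&2
      return 1
      ;;
  esac
}
```

A download script could then pass --host="$(oss_host cn-hangzhou classic)" to osscmd instead of hard-coding the host name.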
You can also use YUM to install additional system software packages, such as
ld-linux.so.2.
#!/bin/bash
yum install -y ld-linux.so.2
By default, scripts run as the root user on the nodes in a cluster. You can run the
su hadoop command in the script to switch to the hadoop user.
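In a non-interactive script, su hadoop on its own line starts a separate shell and does not change the user for the lines that follow it. One common pattern is to pass each command to su with the -c option. The following is a minimal sketch; the run_as helper is a hypothetical name used for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a single command line as another user.
run_as() {
  local user="$1"; shift
  if [ "$(id -un)" = "$user" ]; then
    # Already running as the target user; no switch is needed.
    sh -c "$*"
  else
    # su -c runs the command string as the target user and then returns.
    su "$user" -c "$*"
  fi
}
```

For example, run_as hadoop 'whoami' would print hadoop when the script runs as root on a node where the hadoop user exists.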