Resource Orchestration Service: Deploy Apache Spark on a single instance

Last Updated: Feb 11, 2025

This topic describes how to deploy Apache Spark on a single instance by creating a stack in the Resource Orchestration Service (ROS) console.

Background information

Apache Spark is a general-purpose computing engine designed for large-scale data processing. Spark is implemented in Scala and uses Resilient Distributed Datasets (RDDs) for in-memory computing. It supports interactive queries and is well suited to workloads that rely on iterative algorithms.

You can use the Installs Spark on an ECS instance (existing VPC) sample template to create an Elastic Compute Service (ECS) instance based on existing resources and associate an elastic IP address (EIP) with the instance. The existing resources include a virtual private cloud (VPC), vSwitch, and security group. The following software versions are used in the sample template:

  • JDK 1.8.0: Java Development Kit (JDK)

  • Hadoop 2.7.7: the framework for distributed systems

  • Scala 2.12.1: the programming language

  • Apache Spark 2.1.0: the computing engine

After a stack is created by using the sample template, you can obtain the value of SparkWebSiteURL and log on to the Apache Spark management console. If you want to access the URL specified by SparkWebSiteURL over the Internet, you must configure an inbound rule for the security group to allow traffic on ports 8088 and 8080. For more information, see Add a security group rule.
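After you add the inbound rule, a quick reachability check from your own machine can help confirm that ports 8080 and 8088 accept connections. The following is a minimal sketch that uses only the Python standard library. The helper names `port_is_open` and `check_spark_ports` are made up for this illustration and are not part of ROS or the sample template.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_spark_ports(host: str, ports=(8080, 8088), timeout: float = 3.0) -> dict:
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

To use the sketch, call `check_spark_ports` with the public IP address of the instance, which is the host part of the SparkWebSiteURL value. A result of `False` for a port usually means that the security group rule is missing or that the service is not listening yet. The check only tests TCP reachability and does not verify that the web page loads.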

Step 1: Create a stack

  1. Log on to the ROS console.

  2. In the left-side navigation pane, choose Templates > Sample Templates.

  3. Search for the Installs Spark on an ECS instance (existing VPC) sample template.

  4. Click Create Stack.

  5. In the Configure Parameters step, configure the Stack Name parameter and the following parameters.

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Existing VPC ID | The VPC ID of the instance. For more information about how to create and query a VPC, see Create and manage a VPC. | vpc-bp1m6fww66xbntjyc**** |
    | VSwitch Zone ID | The zone ID of the vSwitch in the VPC. | Hangzhou Zone K |
    | VSwitch ID | The ID of the vSwitch that resides in the VPC. For more information about how to create and query a vSwitch, see Create and manage a vSwitch. | vsw-bp183p93qs667muql**** |
    | Business Security Group ID | The security group ID of the ECS instance. For more information about how to query the ID of a security group, see Search for security groups. | sg-bp15ed6xe1yxeycg7o**** |
    | Instance Type | The instance type of the ECS instance. Select a valid instance type. For more information, see Overview of instance families. | ecs.c5.large |
    | Image ID | The image ID of the ECS instance. By default, a CentOS 7 image is used. For more information, see Overview. | centos_7 |
    | Instance Password | The password of the ECS instance. | Test_12**** |
    | Public IP Bandwidth | The bandwidth of the public IP address. Valid values: 1 to 100. Unit: Mbit/s. | 5 |
    | Disk Type | The disk category. Valid values: cloud_efficiency: ultra disk; cloud_ssd: standard SSD; cloud_essd: Enterprise SSD (ESSD); cloud: basic disk; ephemeral_ssd: local SSD. For more information, see Disks. | cloud_efficiency |
    | System Disk Space | The system disk size of the ECS instance. Valid values: 40 to 500. Unit: GB. | 40 |

  6. Click Create.

  7. On the Stack Information tab, view the stack status. Wait until the stack is created. Then, click the Outputs tab to obtain the value of SparkWebSiteURL.

  8. Access the URL specified by SparkWebSiteURL and log on to the Apache Spark management console.
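The value ranges listed for the parameters in the Configure Parameters step can be checked before you submit the stack. The following is a minimal sketch of such a check in Python. The dictionary keys (`bandwidth`, `disk_category`, `system_disk_size`) and the function name are hypothetical names chosen for this illustration and are not the parameter names used by the sample template. Only the limits themselves come from this topic.

```python
# Disk categories listed for the Disk Type parameter in this topic.
VALID_DISK_CATEGORIES = {
    "cloud_efficiency",  # ultra disk
    "cloud_ssd",         # standard SSD
    "cloud_essd",        # Enterprise SSD (ESSD)
    "cloud",             # basic disk
    "ephemeral_ssd",     # local SSD
}


def validate_stack_parameters(params: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed.

    The keys read from `params` are hypothetical and used only in this sketch.
    """
    problems = []

    bandwidth = params.get("bandwidth")
    if not isinstance(bandwidth, int) or not 1 <= bandwidth <= 100:
        problems.append("Public IP Bandwidth must be an integer from 1 to 100 (Mbit/s).")

    disk_category = params.get("disk_category")
    if disk_category not in VALID_DISK_CATEGORIES:
        problems.append(f"Disk Type {disk_category!r} is not a listed disk category.")

    disk_size = params.get("system_disk_size")
    if not isinstance(disk_size, int) or not 40 <= disk_size <= 500:
        problems.append("System Disk Space must be an integer from 40 to 500 (GB).")

    return problems
```

For the example values in this topic (bandwidth 5, disk category cloud_efficiency, system disk size 40), the function returns an empty list. The sketch does not check whether the chosen instance type supports the disk category, which the console validates separately.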

Step 2: View resources

  1. In the left-side navigation pane, choose Deployment > Stacks.

  2. On the Stacks page, click the ID of the desired stack.

  3. Click the Resources tab to view the information about resources in the stack.

    The following table describes the resource in this example.

    | Resource | Quantity | Resource description | Specification description |
    | --- | --- | --- | --- |
    | ALIYUN::ECS::Instance | 1 | Creates an ECS instance to deploy Apache Spark on the instance. | An ECS instance that has the following specifications is created. Instance type: ecs.c5.large. Disk category: ultra disk. System disk size: 40 GB. Public IP address: A public IP address is allocated. |

    Note

    For pricing details of each resource, go to the relevant console or refer to the pricing documentation of that resource.