E-MapReduce:Vulnerability notice | Apache Hadoop FileUtil.unTar command injection vulnerability

Last Updated:Feb 26, 2026

On August 4, 2022, Apache Hadoop officially announced a fix for the shell command injection vulnerability CVE-2022-25168. The FileUtil.unTar API of Apache Hadoop does not escape the name of an input file before passing it to the shell. Attackers can exploit this vulnerability to inject arbitrary commands, which may damage clusters.
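
To illustrate this class of flaw, the following minimal shell sketch contrasts splicing a file name into a command string with passing it as a separate argument. It is not Hadoop code: the functions run_unsafe and run_safe are hypothetical, and echo stands in for the real tar invocation so that the example is harmless.

```shell
#!/bin/sh
# Illustration only; not taken from Hadoop. echo stands in for tar.

# Unsafe: the file name is spliced into the command string, so the shell
# parses any metacharacters it contains (here, ';').
run_unsafe() {
  sh -c "echo extracting $1"
}

# Safer: the file name is passed as a positional argument and expanded
# inside double quotes, so it is treated as data.
run_safe() {
  sh -c 'echo extracting "$1"' _ "$1"
}

name='data.tar; echo INJECTED'
run_unsafe "$name"   # the command after ';' also runs
run_safe "$name"     # the whole string is printed as one file name
```

In the unsafe variant, the text after the semicolon is executed as a second command. In the safer variant, it stays part of the file name.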

Impacts

  • Affected versions of Hadoop:

    • 2.0.0 <= Apache Hadoop <= 2.10.1

    • 3.0.0-alpha <= Apache Hadoop <= 3.2.3

    • 3.3.0 <= Apache Hadoop <= 3.3.2

  • Affected versions of EMR: EMR V3.X, EMR V4.X, EMR V5.8.X, and minor versions earlier than EMR V5.8.X.

  • Severity: High. Fix the vulnerability at the earliest opportunity.
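
The affected Hadoop version ranges listed above can be checked with a short script. The following is a minimal sketch, assuming a plain three-part version string such as 3.2.1 or 3.0.0-alpha1, with minor and patch numbers below 100 and no leading zeros. The function name is_affected is hypothetical and is not part of the patch tools.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a three-part Apache Hadoop version
# (for example 3.2.1 or 3.0.0-alpha1) is in an affected range listed above.
is_affected() {
  v=${1%%-*}                 # drop qualifiers such as -alpha1
  major=${v%%.*}
  rest=${v#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  # Encode major.minor.patch as one integer so ranges compare numerically.
  n=$(( major * 10000 + minor * 100 + patch ))
  if [ "$n" -ge 20000 ] && [ "$n" -le 21001 ]; then return 0; fi  # 2.0.0 to 2.10.1
  if [ "$n" -ge 30000 ] && [ "$n" -le 30203 ]; then return 0; fi  # 3.0.0-alpha to 3.2.3
  if [ "$n" -ge 30300 ] && [ "$n" -le 30302 ]; then return 0; fi  # 3.3.0 to 3.3.2
  return 1
}

for ver in 2.10.1 2.10.2 3.0.0-alpha1 3.2.4 3.3.2 3.3.3; do
  if is_affected "$ver"; then echo "$ver affected"; else echo "$ver not affected"; fi
done
```

On a cluster node, the first line of the hadoop version output has the form Hadoop <version>, so its second field can be passed to the function.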

Procedure

Fix the vulnerability for an EMR cluster

  1. Download the hadoop-patches-tools.zip package.

  2. Log on to the master node of your EMR cluster and place the package that you downloaded in Step 1 in the home directory of the emr-user or hadoop user.

  3. Run the following commands to switch to the emr-user or hadoop user and decompress the hadoop-patches-tools.zip package:

    • DataLake clusters, Dataflow clusters, OLAP clusters, DataServing clusters, or custom clusters

      su emr-user
      unzip hadoop-patches-tools.zip
    • Other clusters

      su hadoop
      unzip hadoop-patches-tools.zip
  4. Run the following commands to go to the hadoop-patches-tools directory and edit the hosts file:

    cd hadoop-patches-tools
    vim hosts

    Add the hostnames of all nodes in the cluster to the hosts file. Each hostname occupies a separate line. Sample file content:

    emr-header-1
    emr-worker-1
    emr-worker-2
    Important

    For clusters of EMR V3.41 or a later minor version, or clusters of EMR V5.7.0 or a later minor version, the hostnames of nodes are in a different format. Sample file content:

    core-1-1
    core-1-2
    task-1-1
    task-1-2
  5. Run the fix.sh script to fix the vulnerability:

    ./fix.sh

    After the script is run, the following information is returned:

    ### NOTICE: YOU CAN RESTORE THIS PATCH BY RUN RESTORE SCRIPT ABOVE
    $> sh ./restore.sh 20221024160338
    ### DONE

    If you want to perform a rollback, run the following command:

    sh ./restore.sh 20221024160338

    You can run the ./check.sh command to check whether the vulnerability is fixed.

    If the vulnerability is fixed, the following information is returned:

    ********************************************************************
         OK: CVE-2022-25168 is already fixed.
                              This tools updated at 2022/10/13.
    ********************************************************************
    Important

    The vulnerability is fixed after the affected JAR package is replaced. However, Security Center detects vulnerabilities by version number and may continue to report this vulnerability. After you apply the fix, you can ignore the check for this vulnerability in Security Center.

  6. Restart services.

    Restart the affected services, including HDFS, Hive, Presto, Impala, Druid, Flink, Solr, Ranger, Storm, Oozie, Spark, and Zeppelin. To reduce the impact on your business, restart the services for one node at a time.

    For example, choose More > Restart in the upper-right corner of the HDFS service page to restart the HDFS service.
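
The fix and check steps above print plain text, so their results can be captured for later use. The following is a minimal sketch, assuming the output formats shown in the samples above. The function names extract_restore_id and is_fixed are hypothetical and are not part of the patch tools.

```shell
#!/bin/sh
# Hypothetical helpers that read the output of fix.sh or check.sh on stdin.

# Print the argument that fix.sh shows after "restore.sh"
# (20221024160338 in the sample output above).
extract_restore_id() {
  sed -n 's/.*restore\.sh[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only if the check.sh output contains the OK line for this CVE.
is_fixed() {
  grep -q 'OK: CVE-2022-25168 is already fixed'
}

# Demonstration with the sample output from this document.
sample='### NOTICE: YOU CAN RESTORE THIS PATCH BY RUN RESTORE SCRIPT ABOVE
$> sh ./restore.sh 20221024160338
### DONE'
printf '%s\n' "$sample" | extract_restore_id
```

For example, you could save the fix.sh output to a file of your choice, run extract_restore_id on that file, and keep the printed value in case you later need to run restore.sh.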

Fix the vulnerability for a gateway cluster

Gateway clusters do not support password-free SSH logon. Therefore, if you use a gateway cluster, you must manually upload the patch package to each node of the gateway cluster and perform the preceding fix operations on each node.

Important
  • In the hosts file, enter only the hostname of the node on which you run the fix.

  • A gateway cluster does not contain services. You do not need to restart a service after you upload the patch package.
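
On a gateway node, the hosts file therefore contains a single line. The following is a minimal sketch, assuming that the name reported by uname -n is the hostname that the patch tools expect. Verify this on your node first. The function name write_local_hosts_file is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write the current node's name as the only entry
# in the given hosts file (defaults to ./hosts).
write_local_hosts_file() {
  uname -n > "${1:-hosts}"
}

# Demonstration against a temporary file instead of the real hosts file.
tmp_hosts=$(mktemp)
write_local_hosts_file "$tmp_hosts"
cat "$tmp_hosts"
```

To use it in the procedure, call the function without an argument from the hadoop-patches-tools directory on each gateway node.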

Fix the vulnerability when you create a cluster or scale out an existing cluster

When you create an EMR cluster, you can add a bootstrap action in the EMR console to fix the vulnerability. When you scale out an existing cluster, the system automatically fixes the vulnerability on the added nodes. To fix the vulnerability when you create a cluster, perform the following steps:

  1. Download the hadoop-patches-tools.zip package and the bootstrap_hadoop.sh script file and upload them to an Object Storage Service (OSS) path.

    In this example, the package and script file are uploaded to oss://<bucket-name>/path/to/.

  2. Add a bootstrap action in the EMR console. For more information, see Use bootstrap actions to execute scripts.

    In the Add Bootstrap Action dialog box, configure the following parameters.

    • Name: The name of the bootstrap action that you want to add. For example, you can set this parameter to fixFileUtil.unTarvulnerability.

    • Script Address: The OSS path where the script file is located. You must specify this parameter in the oss://**/*.sh format. In this example, the path is oss://<bucket-name>/path/to/bootstrap_hadoop.sh.

    • Parameter: The parameter of the bootstrap action script. The parameter is used to specify the value of the variable that is referenced in the script. In this example, the parameter is oss://<bucket-name>/path/to/hadoop-patches-tools.zip.

    • Execution Scope: Select Cluster.

    • Execution Time: Select After Component Startup.

    • Execution Failure Policy: Select Proceed.

  3. After you create the EMR cluster, restart the services, such as HDFS, Hive, Presto, Impala, Druid, Flink, Solr, Ranger, Storm, Oozie, Spark, and Zeppelin, on all nodes of the cluster. If you scale out a cluster, you need to restart only the related services that are deployed on the newly added nodes.
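
Before you enter the Script Address value in the console, you can check its format locally. The following is a minimal sketch, assuming that the oss://**/*.sh format means an oss:// prefix, at least one path separator after the bucket name, and a .sh suffix. The function name is_valid_script_address and the bucket name my-bucket are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the value looks like oss://<bucket>/.../<name>.sh.
is_valid_script_address() {
  case "$1" in
    oss://*/*.sh) return 0 ;;
    *) return 1 ;;
  esac
}

# my-bucket is a placeholder bucket name used only for this demonstration.
for addr in \
  'oss://my-bucket/path/to/bootstrap_hadoop.sh' \
  'oss://my-bucket/path/to/hadoop-patches-tools.zip'
do
  if is_valid_script_address "$addr"; then echo "ok: $addr"; else echo "invalid: $addr"; fi
done
```

The second address is rejected because it points to the ZIP package, which belongs in the Parameter field.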