Security Center: SDK for malicious file detection

Last Updated: Dec 02, 2024

SDK for malicious file detection is a feature built on the threat detection engines of Security Center. It detects common viruses, such as ransomware and mining programs, in offline files and Object Storage Service (OSS) objects to prevent malicious files from spreading and running. This topic describes how to use SDK for malicious file detection.

Usage notes

SDK for malicious file detection is a cloud detection solution. If you use this feature, your files are uploaded to the cloud for detection.

Detected virus types

Detected virus types (virus_type)

Virus type | Virus name
Backdoor | Reverse shell
DDoS | DDoS trojan
Downloader | Trojan downloader
Engtest | Engine test program
Hacktool | Attacker tool
Trojan | High-risk program
Malbaseware | Tainted basic software
MalScript | Malicious script
Malware | Malicious program
Miner | Mining software
Proxytool | Proxy
RansomWare | Ransomware
RiskWare | Riskware
Rootkit | Rootkit
Stealer | Tool that is used to steal information
Scanner | Scanner
Suspicious | Suspicious program
Virus | Infectious virus
WebShell | Webshell
Worm | Worm
AdWare | Adware
Patcher | Patch
Gametool | Gametool
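
The virus types above can be turned into a lookup so that a virus_type value from a detection result is shown with its readable name. The following is a minimal sketch and is not part of the SDK: VIRUS_TYPE_NAMES and describe_virus_type are hypothetical names, and the case-insensitive match is an assumption that guards against mixed capitalization such as RansomWare and MalScript.

```python
# Virus type -> virus name, copied from the table of detected virus types.
VIRUS_TYPE_NAMES = {
    "Backdoor": "Reverse shell",
    "DDoS": "DDoS trojan",
    "Downloader": "Trojan downloader",
    "Engtest": "Engine test program",
    "Hacktool": "Attacker tool",
    "Trojan": "High-risk program",
    "Malbaseware": "Tainted basic software",
    "MalScript": "Malicious script",
    "Malware": "Malicious program",
    "Miner": "Mining software",
    "Proxytool": "Proxy",
    "RansomWare": "Ransomware",
    "RiskWare": "Riskware",
    "Rootkit": "Rootkit",
    "Stealer": "Tool that is used to steal information",
    "Scanner": "Scanner",
    "Suspicious": "Suspicious program",
    "Virus": "Infectious virus",
    "WebShell": "Webshell",
    "Worm": "Worm",
    "AdWare": "Adware",
    "Patcher": "Patch",
    "Gametool": "Gametool",
}

# Lower-cased index so that lookups ignore capitalization (an assumption).
_LOOKUP = {key.lower(): name for key, name in VIRUS_TYPE_NAMES.items()}


def describe_virus_type(virus_type):
    """Return the readable name for a virus_type value, or 'Unknown type'."""
    if virus_type is None:
        return "Unknown type"
    return _LOOKUP.get(virus_type.strip().lower(), "Unknown type")
```

For example, describe_virus_type("Miner") returns "Mining software", and a value that is not in the table returns "Unknown type".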

Detected files

  • SDK for malicious file detection can decompress and detect unencrypted packages.

    • If you use an SDK to detect offline files, packages are not decompressed by default. You must configure decompression settings. You can specify whether to recognize and decompress packages, the maximum number of decompression levels, and the maximum number of packages that can be decompressed.

    • If you check OSS objects in the Security Center console, Security Center does not decompress OSS objects by default. You must configure decompression settings. You can configure the decompression level for all buckets or for a specific bucket.

  • SDK for malicious file detection can decrypt and check OSS objects that are encrypted by using a server-side encryption method. The following server-side encryption methods are supported for different scenarios. For more information, see Server-side encryption.

    • Server-side encryption that uses KMS-managed CMKs (SSE-KMS): You can use the default customer master key (CMK) managed by Key Management Service (KMS) or specify a CMK to encrypt or decrypt objects. You do not need to send data to KMS over networks for encryption or decryption.

    • Server-side encryption that uses OSS-managed keys (SSE-OSS): You can use a key managed by OSS to encrypt an object.

Methods to detect malicious files

  • Call the SDK on business servers to detect offline files

    Integrate SDK for malicious file detection into your business servers to detect malicious files. You can use an SDK to detect malicious files and obtain information about the malicious files from the returned results. You can also view the files on which risks are detected in the Security Center console.

    You can use SDK for Java or SDK for Python.

  • Detect OSS objects in the Security Center console

    You can detect objects of OSS buckets in the Security Center console. You can also view the objects on which risks are detected.

Detection results

Security Center assesses the risk level of each detected malicious file and classifies it as high, medium, or low. The assessment combines the results of multiple virus detection engines with security expertise, and considers both how malicious the file is likely to be and how accurate the detection is. Security Center also provides explanations and suggestions for handling the malicious files.

Log analysis

If you enabled the log analysis feature of Security Center, the records of malicious file detection are delivered to the dedicated Logstore for Security Center. For more information, see Overview of log analysis and Log types and log fields.

Scenarios

Scenario | Malicious files

Server in use | Servers are often subject to widely spread malicious files, such as worms, mining software, DDoS trojans, and malicious scripts. These malicious files can replicate and spread to deplete system resources, launch distributed denial of service (DDoS) attacks, or hijack servers for unauthorized activities.

Server under targeted attack | When servers are under targeted attacks, it is crucial to monitor for malicious files such as hacking tools, proxy tools, and backdoor programs to maintain system security and stability. These attacks are stealthy and targeted. Attackers embed malicious files to exfiltrate sensitive data, control systems, or establish footholds for deeper network infiltration.

Comprehensive environment detection | In all environments, stay alert to destructive ransomware and file-infecting viruses. Ransomware holds users hostage by encrypting their data, and file-infecting viruses can self-replicate and spread to other files. Both can cause significant data loss and potential system failure.

Office network and file storage | If you work in an office network or need to store files, remain vigilant against malicious document files, such as Office files with macro viruses and compressed packages that contain harmful payloads. Such files may masquerade as regular documents in everyday work communications and entice users to open them, which can lead to attacks such as credential theft or the activation of remote access trojans.

Limits

  • When you check OSS objects in the Security Center console, each object must be 300 MB or smaller.

  • When you call SDK for malicious file detection, each file must be 100 MB or smaller.

  • The default queries per second (QPS) for a free trial is different from the default QPS for Security Center Enterprise.

    • QPS of a free trial: 10.

    • QPS of Security Center Enterprise: 20.

  • The following types of packages can be checked: .7z, .zip, .tar, .gz, .rar, .ar, .bz2, .xz, and .lzma.

  • The maximum decompression level of a package is 5. After the decompression, up to 1,000 files can be extracted from a package, and the total size of all files cannot exceed 1 GB. Files that exceed the limit cannot be checked.

  • The speed of malicious file detection is affected by factors such as network conditions, computer performance, and limits on cloud services. SDK for malicious file detection uses queues to process requests during external request spikes. This helps improve processing capabilities in high-concurrency scenarios. When the internal queue is full, the system rejects external requests and does not process the external requests until the queue has available space.

    You can increase the queue length to process more concurrent requests. However, this method affects the detection duration of some samples. The timeout_ms parameter specifies the timeout period of the samples. Unit: milliseconds. To reduce timeout errors, we recommend that you set the timeout_ms parameter to 60000, which is equivalent to 60 seconds.

  • If you check OSS objects in the Security Center console, only the objects whose storage class is Standard or Infrequent Access (IA) can be detected. The objects whose storage class is Archive cannot be detected. For more information about storage classes, see Overview.

  • If you check OSS objects in the Security Center console, only OSS buckets in the following regions can be detected: China (Qingdao), China (Beijing), China (Zhangjiakou), China (Hohhot), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Heyuan), China (Guangzhou), China (Chengdu), China (Hong Kong), Singapore, Indonesia (Jakarta), Thailand (Bangkok), Philippines (Manila), Malaysia (Kuala Lumpur), South Korea (Seoul), Japan (Tokyo), US (Silicon Valley), UK (London), US (Virginia), and Germany (Frankfurt).
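
The numeric limits above can be checked on the client before a file is submitted. The following is a minimal sketch and is not part of the SDK: every function and constant name is hypothetical, the tier keys free_trial and enterprise are made up for illustration, and treating 1 MB as 1024 * 1024 bytes is an assumption because this topic does not define the unit.

```python
# Limits copied from the list above.
MAX_SDK_FILE_BYTES = 100 * 1024 * 1024  # Assumption: 1 MB = 1024 * 1024 bytes.
MAX_DECOMPRESS_LEVEL = 5
MAX_EXTRACTED_FILES = 1000
PACKAGE_EXTENSIONS = (".7z", ".zip", ".tar", ".gz", ".rar", ".ar", ".bz2", ".xz", ".lzma")
DEFAULT_QPS = {"free_trial": 10, "enterprise": 20}  # Hypothetical tier keys.


def is_package(file_name):
    """Return True if the file name ends with a supported package extension."""
    return file_name.lower().endswith(PACKAGE_EXTENSIONS)


def precheck_file(size_bytes):
    """Return the reasons why a file is outside the SDK limits (empty if none)."""
    problems = []
    if size_bytes > MAX_SDK_FILE_BYTES:
        problems.append("file is larger than 100 MB")
    return problems


def check_decompress_settings(max_level, max_file_count):
    """Return the reasons why decompression settings exceed the documented maximums."""
    problems = []
    if max_level > MAX_DECOMPRESS_LEVEL:
        problems.append("decompression level is greater than 5")
    if max_file_count > MAX_EXTRACTED_FILES:
        problems.append("extracted file count is greater than 1,000")
    return problems


def min_interval_seconds(tier):
    """Return the smallest gap between requests that stays within the default QPS."""
    return 1.0 / DEFAULT_QPS[tier]
```

A caller could skip files for which precheck_file returns a non-empty list and sleep for min_interval_seconds between submissions. The limits that the service enforces remain authoritative.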

Billing

Using SDK for malicious file detection consumes the quota for the feature. If you check a package, the quota consumed equals the number of files that are extracted from the package.

  • Alibaba Cloud accounts that have passed enterprise real-name verification can use the free trial of SDK for malicious file detection. If you use the free trial, you are provided a quota of 10,000 on SDK for malicious file detection.

  • If you purchase Security Center Enterprise, you can specify the quota based on your business requirements.

For more details on billing, see Billing overview.
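
The quota rule for packages can be estimated before a scan. The following is a minimal sketch and is not part of the SDK: count_zip_files and estimate_quota are hypothetical helper names, only .zip packages are handled, and the sketch assumes that a file that is not a package consumes one unit of quota and that nested packages are not expanded. The quota that Security Center reports is authoritative.

```python
import zipfile

FREE_TRIAL_QUOTA = 10_000  # Free trial quota stated in this topic.


def count_zip_files(zip_source):
    """Count the regular files (not directory entries) in a .zip package.

    zip_source can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as package:
        return sum(1 for entry in package.infolist() if not entry.is_dir())


def estimate_quota(plain_file_count, package_file_counts):
    """Estimate the quota consumed by plain files plus extracted package files.

    Assumption: each plain file consumes one unit of quota.
    """
    return plain_file_count + sum(package_file_counts)
```

For example, three plain files plus two packages that contain two and five files give an estimate of 10, which leaves 9,990 of the free trial quota under these assumptions.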

Enable the feature and detect malicious files

Prerequisites

If you use a Resource Access Management (RAM) user, make sure that the AliyunYundunSASFullAccess policy is attached to the RAM user. For more information, see Grant permissions to the RAM user.

Step 1: Enable SDK for malicious file detection

You can apply for a free trial or purchase SDK for malicious file detection.

Free trial

If your Alibaba Cloud account has passed enterprise real-name verification, you can use a free trial of SDK for malicious file detection and obtain a free quota of 10,000 for SDK for malicious file detection. Each Alibaba Cloud account can use the free trial only once.

  1. Log on to the Security Center console. In the top navigation bar, select China as the region of the asset that you want to manage.

  2. In the left-side navigation pane, choose Risk Governance > SDK for Malicious File Detection.

  3. On the SDK for Malicious File Detection page, click Try Now.

Payment

If the free trial cannot meet your requirements, you can purchase SDK for malicious file detection.

Important

If the free quota is not exhausted, the remaining free quota is added to the purchased quota after you purchase the feature.

  1. Log on to the Security Center console. In the top navigation bar, select China as the region of the asset that you want to manage.

  2. In the left-side navigation pane, choose Risk Governance > SDK for Malicious File Detection.

  3. On the SDK for Malicious File Detection page, click Buy Now. On the page that appears, select Yes for Malicious File Detection SDK, and configure the Quota for Malicious File Detection SDK parameter based on the number of total files that you want to check.

    If you purchased a paid edition of Security Center, you can specify the quota on SDK for malicious file detection. If you did not purchase a paid edition of Security Center, select a paid edition of Security Center based on your business requirements.

    • If you do not require the other features of Security Center, select Value-added Plan Edition.

    • If you want to use the other features of Security Center, such as vulnerability fixing and container threat detection, select the required edition of Security Center. For more information about the features that are supported by each edition, see Functions and features.

  4. Read and select Terms of Service, click Buy Now, and then complete the payment.

After you enable SDK for malicious file detection, you can view the remaining quota on SDK for malicious file detection on the SDK for Malicious File Detection page. If the remaining quota is insufficient for subsequent detection, you can click Upgrade Configuration to purchase an additional quota for SDK for malicious file detection. For more information, see Upgrade and downgrade Security Center.

Step 2: Detect malicious files

Select one of the following methods to detect malicious files based on your business scenario.

Important

Before the detection, ensure that your Alibaba Cloud account has enough remaining check quota. If the quota is insufficient, you can purchase more on the SDK for Malicious File Detection page under the Risk File Overview tab by clicking Upgrade Configuration.

Call the SDK on business servers to detect offline files

Preparations

  • Configure the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables.

    You can define the environment variables ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET to configure the default credentials. When you call an API operation, the system reads the AccessKey pair from the default credentials and uses the AccessKey pair to complete authentication. For more information, see Configure environment variables in Linux, macOS, and Windows.

  • Select an access method and obtain the SDK based on the following requirements.

    Access method: Java

    Version: Java Development Kit (JDK) 1.8 or later

    Procedure: Import and install SDK for Java offline. Visit the SDK code library for Java over the Internet, download SDK for Java, and then add SDK for Java to your project.

    Access method: Python

    Version: Python 3.6 or later

    Procedure: Obtain SDK for Python by using one of the following methods:

    • Use pip to install SDK for Python if you have Internet access:

      pip install -U alibabacloud_filedetect

    • Install SDK for Python offline if the project environment does not have Internet access: On a machine that has Internet access, visit the code library for Python and download SDK for Python. Upload SDK for Python to the project environment, decompress the SDK package, and run the following installation commands:

      # Switch to the root directory of SDK for Python.
      cd alibabacloud-file-detect-python-sdk-master
      # Install SDK for Python that is for the correct Python version.
      python setup.py install
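
The preparations above can be verified with a short script before the SDK is initialized. The following is a minimal sketch and is not part of the SDK: missing_credentials and python_version_ok are hypothetical helper names. The script only reports whether the two environment variables are set and whether the interpreter is Python 3.6 or later. It does not validate the AccessKey pair.

```python
import os
import sys

# Environment variables that hold the default credentials.
REQUIRED_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
)
MIN_PYTHON = (3, 6)  # Minimum Python version listed for SDK for Python.


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


def python_version_ok(version_info=None):
    """Return True if the (major, minor) version is at least MIN_PYTHON."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    for name in missing_credentials():
        print("Missing environment variable: " + name)
    if not python_version_ok():
        print("Python 3.6 or later is required.")
```

The environ and version_info parameters exist only so that the checks can be exercised without changing the real environment.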

Examples

Important

If you use SDK for Python, change the value of the path parameter in the following example.

SDK for Java

package com.aliyun.filedetect.sample;

import java.io.File;
import java.util.HashMap;
import java.util.Map;

import com.aliyun.filedetect.*;

public class Sample {

	/**
	 * Synchronous file detection operation.
	 * @param detector The detector object.
	 * @param path The path to the file that you want to detect.
	 * @param timeout_ms The timeout period. Unit: milliseconds.
	 * @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value false indicates that the system directly returns an error, and the value true indicates the system waits until the queue has space available.
	 * @throws InterruptedException 
	 */
	public static DetectResult detectFileSync(OpenAPIDetector detector, String path, int timeout_ms, boolean wait_if_queuefull) throws InterruptedException {
		if (null == detector || null == path) return null;
		DetectResult result = null;
		while(true) {
			result = detector.detectSync(path, timeout_ms);
			if (null == result) break;
			if (result.error_code != ERR_CODE.ERR_DETECT_QUEUE_FULL) break;
			if (!wait_if_queuefull) break;
			detector.waitQueueAvailable(-1);
		}
		return result;
	}
	
	/**
	 * Asynchronous file detection operation.
	 * @param detector The detector object.
	 * @param path The path to the file that you want to detect.
	 * @param timeout_ms The timeout period. Unit: milliseconds.
	 * @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value false indicates that the system directly returns an error, and the value true indicates the system waits until the queue has space available.
	 * @param callback The callback function.
	 * @throws InterruptedException 
	 */
	public static int detectFile(OpenAPIDetector detector, String path, int timeout_ms, boolean wait_if_queuefull, IDetectResultCallback callback) throws InterruptedException {
		if (null == detector || null == path || null == callback) return ERR_CODE.ERR_INIT.value();
		int result = ERR_CODE.ERR_INIT.value();
		if (wait_if_queuefull) {
			final IDetectResultCallback real_callback = callback;
			callback = new IDetectResultCallback() {
				public void onScanResult(int seq, String file_path, DetectResult callback_res) {
					if (callback_res.error_code == ERR_CODE.ERR_DETECT_QUEUE_FULL) return;
					real_callback.onScanResult(seq, file_path, callback_res);
				}
			};
		}
		while(true) {
			result = detector.detect(path, timeout_ms, callback);
			if (result != ERR_CODE.ERR_DETECT_QUEUE_FULL.value()) break;
			if (!wait_if_queuefull) break;
			detector.waitQueueAvailable(-1);
		}
		return result;
	}
	
	/**
	 * Synchronous URL file detection operation.
	 * @param detector The detector object.
	 * @param url The URL file that you want to detect.
	 * @param md5 The MD5 hash value of the file that you want to detect.
	 * @param timeout_ms The timeout period. Unit: milliseconds.
	 * @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value false indicates that the system directly returns an error, and the value true indicates the system waits until the queue has space available.
	 * @throws InterruptedException 
	 */
	public static DetectResult detectUrlSync(OpenAPIDetector detector, String url, String md5, int timeout_ms, boolean wait_if_queuefull) throws InterruptedException {
		if (null == detector || null == url || null == md5) return null;
		DetectResult result = null;
		while(true) {
			result = detector.detectUrlSync(url, md5, timeout_ms);
			if (null == result) break;
			if (result.error_code != ERR_CODE.ERR_DETECT_QUEUE_FULL) break;
			if (!wait_if_queuefull) break;
			detector.waitQueueAvailable(-1);
		}
		return result;
	}
	
	/**
	 * Asynchronous URL file detection operation.
	 * @param detector The detector object.
	 * @param url The URL file that you want to detect.
	 * @param md5 The MD5 hash value of the file that you want to detect.
	 * @param timeout_ms The timeout period. Unit: milliseconds.
	 * @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value false indicates that the system directly returns an error, and the value true indicates the system waits until the queue has space available.
	 * @param callback The callback function.
	 * @throws InterruptedException 
	 */
	public static int detectUrl(OpenAPIDetector detector, String url, String md5, int timeout_ms, boolean wait_if_queuefull, IDetectResultCallback callback) throws InterruptedException {
		if (null == detector || null == url || null == md5 || null == callback) return ERR_CODE.ERR_INIT.value();
		int result = ERR_CODE.ERR_INIT.value();
		if (wait_if_queuefull) {
			final IDetectResultCallback real_callback = callback;
			callback = new IDetectResultCallback() {
				public void onScanResult(int seq, String file_path, DetectResult callback_res) {
					if (callback_res.error_code == ERR_CODE.ERR_DETECT_QUEUE_FULL) return;
					real_callback.onScanResult(seq, file_path, callback_res);
				}
			};
		}
		while(true) {
			result = detector.detectUrl(url, md5, timeout_ms, callback);
			if (result != ERR_CODE.ERR_DETECT_QUEUE_FULL.value()) break;
			if (!wait_if_queuefull) break;
			detector.waitQueueAvailable(-1);
		}
		return result;
	}
	
	/**
	 * Format the detection result.
	 * @param result The detection result.
	 * @return The formatted string.
	 */
	public static String formatDetectResult(DetectResult result) {
		if (result.isSucc()) {
			DetectResult.DetectResultInfo info = result.getDetectResultInfo();
			String msg = String.format("[DETECT RESULT] [SUCCEED] %s", formatDetectResultInfo(info));
			if (info.compresslist != null) {
				int idx = 1;
				for (DetectResult.CompressFileDetectResultInfo comp_res : info.compresslist) {
					msg += String.format("\n\t\t\t [COMPRESS FILE] [IDX:%d] %s", idx++, formatCompressFileDetectResultInfo(comp_res));
				}
			}
			return msg;
		} 
		DetectResult.ErrorInfo info = result.getErrorInfo();
		return String.format("[DETECT RESULT] [FAIL] md5: %s, time: %d, error_code: %s, error_message: %s"
				, info.md5, info.time, info.error_code.name(), info.error_string);
	}
	
	private static String formatDetectResultInfo(DetectResult.DetectResultInfo info) {
		String msg = String.format("MD5: %s, TIME: %d, RESULT: %s, SCORE: %d", info.md5, info.time, info.result.name(), info.score);
		if (info.compresslist != null) {
			msg += String.format(", COMPRESS_FILES: %d", info.compresslist.size());
		}
		DetectResult.VirusInfo vinfo = info.getVirusInfo();
		if (vinfo != null) {
			msg += String.format(", VIRUS_TYPE: %s, EXT_INFO: %s", vinfo.virus_type, vinfo.ext_info);
		}
		return msg;
	}
	private static String formatCompressFileDetectResultInfo(DetectResult.CompressFileDetectResultInfo info) {
		String msg = String.format("PATH: %s, \t\t RESULT: %s, SCORE: %d", info.path, info.result.name(), info.score);
		DetectResult.VirusInfo vinfo = info.getVirusInfo();
		if (vinfo != null) {
			msg += String.format(", VIRUS_TYPE: %s, EXT_INFO: %s", vinfo.virus_type, vinfo.ext_info);
		}
		return msg;
	}
	
	/**
	 * Synchronous detection of directories or files.
	 * @param detector The detector object.
	 * @param path The path. You can specify a file or directory. Directories are recursively traversed.
	 * @param timeout_ms The timeout period for each file. Unit: milliseconds.
	 * @param result_map The map that stores the detection result of each file. The key is the absolute path of the file.
	 * @throws InterruptedException 
	 */
	public static void detectDirOrFileSync(OpenAPIDetector detector, String path, int timeout_ms, Map<String, DetectResult> result_map) throws InterruptedException {
		File file = new File(path);
		String abs_path = file.getAbsolutePath();
		if (file.isDirectory()) {
			String[] ss = file.list();
	        if (ss == null) return;
	        for (String s : ss) {
	        	String subpath = abs_path + File.separator + s;
	        	detectDirOrFileSync(detector, subpath, timeout_ms, result_map);
	        }
			return;
		}

    	System.out.println(String.format("[detectFileSync] [BEGIN] queueSize: %d, path: %s, timeout: %d", detector.getQueueSize(), abs_path, timeout_ms));
		DetectResult res = detectFileSync(detector, abs_path, timeout_ms, true);
    	System.err.println(String.format("                 [ END ] %s", formatDetectResult(res)));
		result_map.put(abs_path, res);
	}
	
	/**
	 * Asynchronous detection of directories or files.
	 * @param detector The detector object.
	 * @param path The path. You can specify a file or directory. Directories are recursively traversed.
	 * @param timeout_ms The timeout period for each file. Unit: milliseconds.
	 * @param callback The callback function that receives the detection result of each file.
	 * @throws InterruptedException 
	 */
	public static void detectDirOrFile(OpenAPIDetector detector, String path, int timeout_ms, IDetectResultCallback callback) throws InterruptedException {
		File file = new File(path);
		String abs_path = file.getAbsolutePath();
		if (file.isDirectory()) {
			String[] ss = file.list();
	        if (ss == null) return;
	        for (String s : ss) {
	        	String subpath = abs_path + File.separator + s;
	        	detectDirOrFile(detector, subpath, timeout_ms, callback);
	        }
	        return;
		}

    	
		int seq = detectFile(detector, abs_path, timeout_ms, true, callback);
		System.out.println(String.format("[detectFile] [BEGIN] seq: %d, queueSize: %d, path: %s, timeout: %d", seq, detector.getQueueSize(), abs_path, timeout_ms));
	}
	
	/**
	 * Start the detection on files or directories.
	 * @param detector The detector object.
	 * @param path The path. You can specify a file or directory. Directories are recursively traversed.
	 * @param detect_timeout_ms The timeout period for each file. Unit: milliseconds.
	 * @param is_sync Specify whether to use a synchronous operation. We recommend that you set the value to false. Valid values:  true: synchronous. false: asynchronous.
	 * @throws InterruptedException 
	 */
	public static void scan(final OpenAPIDetector detector, String path, int detect_timeout_ms, boolean is_sync) throws InterruptedException {
		System.out.println(String.format("[SCAN] [START] path: %s, detect_timeout_ms: %d, is_sync: %b", path, detect_timeout_ms, is_sync));
		long start_time = System.currentTimeMillis();
		final Map<String, DetectResult> result_map = new HashMap<>();
		if (is_sync) {
			detectDirOrFileSync(detector, path, detect_timeout_ms, result_map);
		} else {
			detectDirOrFile(detector, path, detect_timeout_ms, new IDetectResultCallback() {
				public void onScanResult(int seq, String file_path, DetectResult callback_res) {
			    	System.err.println(String.format("[detectFile] [ END ] seq: %d, queueSize: %d, %s", seq, detector.getQueueSize(), formatDetectResult(callback_res)));
					result_map.put(file_path, callback_res);
				}
			});
			// Wait until the task is complete.
			detector.waitQueueEmpty(-1);
		}
		long used_time = System.currentTimeMillis() - start_time;
		System.out.println(String.format("[SCAN] [ END ] used_time: %d, files: %d", used_time, result_map.size()));
		
		int fail_count = 0;
		int white_count = 0;
		int black_count = 0;
		for (Map.Entry<String, DetectResult> entry : result_map.entrySet()) {
			DetectResult res = entry.getValue();
			if (res.isSucc()) {
				if (res.getDetectResultInfo().result == DetectResult.RESULT.RES_BLACK) {
					black_count ++;
				} else {
					white_count ++;
				}
			} else {
				fail_count ++;
			}
		}
		System.out.println(String.format("             fail_count: %d, white_count: %d, black_count: %d"
				, fail_count, white_count, black_count));
	}
	
    public static void main(String[] args_) throws Exception {
    	// Obtain the detector instance.
    	OpenAPIDetector detector = OpenAPIDetector.getInstance();
    	
    	// Initialize the SDK.
    	ERR_CODE init_ret = detector.init(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"), System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
    	System.out.println("INIT RET: " + init_ret.name());
    	
    	// Configure the decompression parameters. These parameters are optional. By default, packages are not decompressed.
    	Boolean decompress=true; // Specify whether to recognize and decompress packages. Default value: false.
    	int decompressMaxLayer = 5; // The maximum decompression levels. This parameter is valid only when the decompress parameter is set to true.
    	int decompressMaxFileCount=1000; // The maximum number of files that can be extracted. This parameter is valid only when the decompress parameter is set to true.
    	ERR_CODE initdec_ret = detector.initDecompress(decompress, decompressMaxLayer, decompressMaxFileCount);
    	System.out.println("INIT_DECOMPRESS RET: " + initdec_ret.name());
    	
    	if (true) {
    		// Example 1: Scan an on-premises directory or file.
    		boolean is_sync_scan=false; // Specify whether to perform an asynchronous detection or a synchronous detection. An asynchronous detection provides better performance. The value false indicates an asynchronous detection.
        	int timeout_ms=500000; // The timeout period for detecting a sample. Unit: milliseconds.
        	String path = "test2.php"; // The file or directory that you want to scan.
        	// Start the scan and wait until the scan is complete.
        	scan(detector, path, timeout_ms, is_sync_scan);
    	}
    	
    	if (true) {
    		// Example 2: Scan a URL file.
        	int timeout_ms=500000; // The timeout period for detecting a sample. Unit: milliseconds.
        	String url = "https://xxxxxxxx.oss-cn-hangzhou-1.aliyuncs.com/xxxxx/xxxxxxxxxxxxxx?Expires=1*****25&OSSAccessKeyId=xxx"; // The URL file that you want to scan.
        	String md5 = "a767f*************6e21d000000"; // The MD5 hash value of the URL file that you want to scan.
        	// Synchronously scan the URL file. To asynchronously scan the URL file, call the detectUrl operation.
        	System.out.println(String.format("[detectUrlSync] [BEGIN] URL: %s, MD5: %s, TIMEOUT: %d", url, md5, timeout_ms));
        	DetectResult result = detectUrlSync(detector, url, md5, timeout_ms, true);
        	System.err.println(String.format("[detectUrlSync] [ END ] %s", formatDetectResult(result)));
    	}
    	
    	
		// Deinitialize the SDK.
		System.out.println("Over.");
    	detector.uninit();
    }
}

SDK for Python

# -*- coding: utf-8 -*-
import os
import sys
from typing import List
import threading
import time
import traceback

from alibabacloud_filedetect.OpenAPIDetector import OpenAPIDetector
from alibabacloud_filedetect.IDetectResultCallback import IDetectResultCallback
from alibabacloud_filedetect.ERR_CODE import ERR_CODE
from alibabacloud_filedetect.DetectResult import DetectResult

class Sample(object):
    
    def __init__(self):
        pass


    """
    Synchronous file detection operation.
    @param detector The detector object.
    @param path The path to the file that you want to detect.
    @param timeout_ms The timeout period. Unit: milliseconds.
    @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value False indicates that the system directly returns an error, and the value True indicates the system waits until the queue has space available.
    """
    def detectFileSync(self, detector, path, timeout_ms, wait_if_queuefull):
        if detector is None or path is None:
            return None
        result = None
        while True:
            result = detector.detectSync(path, timeout_ms)
            if result is None:
                break
            if result.error_code != ERR_CODE.ERR_DETECT_QUEUE_FULL:
                break
            if wait_if_queuefull is False:
                break
            detector.waitQueueAvailable(-1)
        return result


    """
    Asynchronous file detection operation.
    @param detector The detector object.
    @param path The path to the file that you want to detect.
    @param timeout_ms The timeout period. Unit: milliseconds.
    @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value False indicates that the system directly returns an error, and the value True indicates the system waits until the queue has space available.
    @param callback The callback function.
    """
    def detectFile(self, detector, path, timeout_ms, wait_if_queuefull, callback):
        if detector is None or path is None or callback is None:
            return ERR_CODE.ERR_INIT.value
        result = ERR_CODE.ERR_INIT.value
        if wait_if_queuefull:
            real_callback = callback
            class AsyncTaskCallback(IDetectResultCallback):
                def onScanResult(self, seq, file_path, callback_res):
                    if callback_res.error_code == ERR_CODE.ERR_DETECT_QUEUE_FULL:
                        return
                    real_callback.onScanResult(seq, file_path, callback_res)
            callback = AsyncTaskCallback()
        while True:
            result = detector.detect(path, timeout_ms, callback)
            if result != ERR_CODE.ERR_DETECT_QUEUE_FULL.value:
                break
            if wait_if_queuefull is False:
                break
            detector.waitQueueAvailable(-1)
        return result
    

    """
    Synchronous URL file detection operation.
    @param detector The detector object.
    @param url The URL file that you want to detect.
    @param md5 The MD5 hash value of the file that you want to detect.
    @param timeout_ms The timeout period. Unit: milliseconds.
    @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value False indicates that the system directly returns an error, and the value True indicates the system waits until the queue has space available.
    """
    def detectUrlSync(self, detector, url, md5, timeout_ms, wait_if_queuefull):
        if detector is None or url is None or md5 is None:
            return None
        result = None
        while True:
            result = detector.detectUrlSync(url, md5, timeout_ms)
            if result is None:
                break
            if result.error_code != ERR_CODE.ERR_DETECT_QUEUE_FULL:
                break
            if wait_if_queuefull is False:
                break
            detector.waitQueueAvailable(-1)
        return result


    """
    Asynchronous URL file detection operation.
    @param detector The detector object.
    @param url The URL file that you want to detect.
    @param md5 The MD5 hash value of the file that you want to detect.
    @param timeout_ms The timeout period. Unit: milliseconds.
    @param wait_if_queuefull Specify the operation that is performed when the queue is full. The value False indicates that the system directly returns an error, and the value True indicates the system waits until the queue has space available.
    @param callback The callback function.
    """
    def detectUrl(self, detector, url, md5, timeout_ms, wait_if_queuefull, callback):
        if detector is None or url is None or md5 is None or callback is None:
            return ERR_CODE.ERR_INIT.value
        result = ERR_CODE.ERR_INIT.value
        if wait_if_queuefull:
            real_callback = callback
            class AsyncTaskCallback(IDetectResultCallback):
                def onScanResult(self, seq, file_path, callback_res):
                    if callback_res.error_code == ERR_CODE.ERR_DETECT_QUEUE_FULL:
                        return
                    real_callback.onScanResult(seq, file_path, callback_res)
            callback = AsyncTaskCallback()
        while True:
            result = detector.detectUrl(url, md5, timeout_ms, callback)
            if result != ERR_CODE.ERR_DETECT_QUEUE_FULL.value:
                break
            if wait_if_queuefull is False:
                break
            detector.waitQueueAvailable(-1)
        return result


    """
    Format the detection result.
    @param result The detection result.
    @return The formatted string.
    """
    @staticmethod
    def formatDetectResult(result):
        msg = ""
        if result.isSucc():
            info = result.getDetectResultInfo()
            msg = "[DETECT RESULT] [SUCCEED] {}".format(Sample.formatDetectResultInfo(info))
            if info.compresslist is not None:
                idx = 1
                for comp_res in info.compresslist:
                    msg += "\n\t\t\t [COMPRESS FILE] [IDX:{}] {}".format(idx, Sample.formatCompressFileDetectResultInfo(comp_res))
                    idx += 1
        else:
            info = result.getErrorInfo()
            msg = "[DETECT RESULT] [FAIL] md5: {}, time: {}, error_code: {}, error_message: {}".format(info.md5,
                info.time, info.error_code.name, info.error_string)
        return msg


    @staticmethod
    def formatDetectResultInfo(info):
        msg = "MD5: {}, TIME: {}, RESULT: {}, SCORE: {}".format(info.md5, info.time, info.result.name, info.score)
        if info.compresslist is not None:
            msg += ", COMPRESS_FILES: {}".format(len(info.compresslist))
        vinfo = info.getVirusInfo()
        if vinfo is not None:
            msg += ", VIRUS_TYPE: {}, EXT_INFO: {}".format(vinfo.virus_type, vinfo.ext_info)
        return msg


    @staticmethod
    def formatCompressFileDetectResultInfo(info):
        msg = "PATH: {}, \t\t RESULT: {}, SCORE: {}".format(info.path, info.result.name, info.score)
        vinfo = info.getVirusInfo()
        if vinfo is not None:
            msg += ", VIRUS_TYPE: {}, EXT_INFO: {}".format(vinfo.virus_type, vinfo.ext_info)
        return msg


    """
    Synchronous detection of directories or files.
    @param detector The detector object.
    @param path The path. You can specify a file or directory. Directories are recursively traversed.
    @param timeout_ms The timeout period for each file. Unit: milliseconds.
    @param result_map The dictionary that stores the detection result of each file, keyed by absolute path.
    """
    def detectDirOrFileSync(self, detector, path, timeout_ms, result_map):
        abs_path = os.path.abspath(path)
        if os.path.isdir(abs_path):
            sub_files = os.listdir(abs_path)
            if len(sub_files) == 0:
                return
            for sub_file in sub_files:
                sub_path = os.path.join(abs_path, sub_file)
                self.detectDirOrFileSync(detector, sub_path, timeout_ms, result_map)
            return
        
        print("[detectFileSync] [BEGIN] queueSize: {}, path: {}, timeout: {}".format(
            detector.getQueueSize(), abs_path, timeout_ms))
        res = self.detectFileSync(detector, abs_path, timeout_ms, True)
        print("                 [ END ] {}".format(Sample.formatDetectResult(res)))
        result_map[abs_path] = res
        return


    """
    Asynchronous detection of directories or files.
    @param detector The detector object.
    @param path The path. You can specify a file or directory. Directories are recursively traversed.
    @param timeout_ms The timeout period for each file. Unit: milliseconds.
    @param callback The callback object that receives the detection result of each file.
    """
    def detectDirOrFile(self, detector, path, timeout_ms, callback):
        abs_path = os.path.abspath(path)
        if os.path.isdir(abs_path):
            sub_files = os.listdir(abs_path)
            if len(sub_files) == 0:
                return
            for sub_file in sub_files:
                sub_path = os.path.join(abs_path, sub_file)
                self.detectDirOrFile(detector, sub_path, timeout_ms, callback)
            return
        
        seq = self.detectFile(detector, abs_path, timeout_ms, True, callback)
        print("[detectFile] [BEGIN] seq: {}, queueSize: {}, path: {}, timeout: {}".format(
            seq, detector.getQueueSize(), abs_path, timeout_ms))   
        return

    
    """
    Start the detection on files or directories.
    @param detector The detector object.
    @param path The path. You can specify a file or directory. Directories are recursively traversed.
    @param detect_timeout_ms The timeout period for each file. Unit: milliseconds.
    @param is_sync Specify whether to use a synchronous operation. We recommend that you set the value to False. Valid values: True: synchronous. False: asynchronous.
    """
    def scan(self, detector, path, detect_timeout_ms, is_sync):
        try:
            print("[SCAN] [START] path: {}, detect_timeout_ms: {}, is_sync: {}".format(path, detect_timeout_ms, is_sync))
            start_time = time.time()
            result_map = {}
            if is_sync:
                self.detectDirOrFileSync(detector, path, detect_timeout_ms, result_map)
            else:
                class AsyncTaskCallback(IDetectResultCallback):
                    def onScanResult(self, seq, file_path, callback_res):
                        print("[detectFile] [ END ] seq: {}, queueSize: {}, {}".format(seq,
                            detector.getQueueSize(), Sample.formatDetectResult(callback_res)))
                        result_map[file_path] = callback_res
                self.detectDirOrFile(detector, path, detect_timeout_ms, AsyncTaskCallback())
                # Wait until the task is complete.
                detector.waitQueueEmpty(-1)
            
            used_time_ms = (time.time() - start_time) * 1000 
            print("[SCAN] [ END ] used_time: {}, files: {}".format(int(used_time_ms), len(result_map)))
            
            failed_count = 0
            white_count = 0
            black_count = 0
            for file_path, res in result_map.items():
                if res.isSucc():
                    if res.getDetectResultInfo().result == DetectResult.RESULT.RES_BLACK:
                        black_count += 1
                    else:
                        white_count += 1
                else:
                    failed_count += 1
            
            print("               fail_count: {}, white_count: {}, black_count: {}".format(
                failed_count, white_count, black_count))

        except Exception as e:
            print(traceback.format_exc(), file=sys.stderr)


    def main(self):

        # Obtain the detector instance.
        detector = OpenAPIDetector.get_instance()

        # Obtain the AccessKey ID and AccessKey secret in environment variables.
        access_key_id = os.getenv('ALIBABA_CLOUD_ACCESS_KEY_ID')
        access_key_secret = os.getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET')

        # Initialize the SDK.
        init_ret = detector.init(access_key_id, access_key_secret)
        print("INIT RET: {}".format(init_ret.name))

        # Configure the decompression parameters. These parameters are optional. By default, packages are not decompressed.
        decompress = True # Specify whether to recognize and decompress packages. Default value: false.
        decompressMaxLayer = 5 # The maximum decompression levels. This parameter is valid only when the decompress parameter is set to true.
        decompressMaxFileCount = 1000 # The maximum number of packages that can be decompressed. This parameter is valid only when the decompress parameter is set to true.
        initdec_ret = detector.initDecompress(decompress, decompressMaxLayer, decompressMaxFileCount)
        print("INIT_DECOMPRESS RET: {}".format(initdec_ret.name))

        if True:
            # Example 1: Scan an on-premises directory or file.
            is_sync_scan = False # Specify whether the detection is an asynchronous detection or a synchronous detection. An asynchronous detection provides better performance. The value False indicates an asynchronous detection.
            timeout_ms = 500000 # The detection timeout period for each sample. Unit: milliseconds.
            path = "test.bin" # The file or directory that you want to scan.
            # Start the scan and wait until the scan is complete.
            self.scan(detector, path, timeout_ms, is_sync_scan)

        if True:
            # Example 2: Scan a URL file.
            timeout_ms = 500000
            url = "https://xxxxxxxx.oss-cn-hangzhou-1.aliyuncs.com/xxxxx/xxxxxxxxxxxxxx?Expires=1671448125&OSSAccessKeyId=xxx" # The URL file that you want to scan.
            md5 = "a767ffc59d93125c7505b6e21d000000"
            # Synchronously scan the URL file. To asynchronously scan the URL file, call the detectUrl operation.
            print("[detectUrlSync] [BEGIN] URL: {}, MD5: {}, TIMEOUT: {}".format(url, md5, timeout_ms))
            result = self.detectUrlSync(detector, url, md5, timeout_ms, True)
            print("[detectUrlSync] [ END ] {}".format(Sample.formatDetectResult(result)))

        # Deinitialize the SDK.
        print("Over.")
        detector.uninit()
        

if __name__ == "__main__":
    sample = Sample()
    sample.main()

Returned results

After you call the SDK, the detection results are not synchronized to the Security Center console. You can view the results only in the results returned by the SDK call. In the Security Center console, you can view the remaining quota for SDK for malicious file detection. The SDK for Malicious File Detection page displays detection statistics. For example, Total Files on the At-risk File Overview tab shows the total number of scanned files.

struct DetectResult {
    std::string md5; // The MD5 hash value of the sample.
    long time = 0; // The time that is required to process this request. Unit: milliseconds.

    ERR_CODE error_code;    // The error code.
    std::string error_string;     // The extended error message.

    enum RESULT {
        RES_WHITE = 0,       // The file is secure.
        RES_BLACK = 1,       // The file is suspicious.
        RES_PENDING = 3      // The file is still being detected.
    };
    RESULT result;          // The detection result.

    int score;                // The detection score. Valid values: 0 to 100.
    std::string virus_type;    // The virus type. Examples: WebShell, MalScript, and Hacktool.
    std::string ext_info;    // The extended information. The value of this parameter is a JSON string.

    struct CompressFileDetectResultInfo {
        std::string path;          // The path of the file in the package.
        RESULT result;             // The detection result.
        int score;                 // The detection score. Valid values: 0 to 100.
        std::string virus_type;    // The virus type. Examples: WebShell, MalScript, and Hacktool.
        std::string ext_info;      // The extended information. The value of this parameter is a JSON string.
    };
    std::list<struct CompressFileDetectResultInfo> compresslist = null; // If the detected file is a package and the decompress parameter is set to true, this parameter contains the detection results of the files in the package.
    
};
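Because ext_info is delivered as a JSON string, callers usually need to decode it before use. The following Python sketch shows one defensive way to do that. The helper name parse_ext_info is hypothetical and not part of the SDK, and the key used in the usage note is invented for illustration, because this topic does not list the fields that ext_info contains.

```python
import json


def parse_ext_info(ext_info):
    """Decode an ext_info JSON string into a dict.

    Returns an empty dict when the string is empty or is not valid JSON,
    so that callers can read fields without extra error handling.
    """
    if not ext_info:
        return {}
    try:
        parsed = json.loads(ext_info)
    except json.JSONDecodeError:
        return {}
    # Wrap non-object JSON (for example, a list) so the return type stays a dict.
    return parsed if isinstance(parsed, dict) else {"value": parsed}
```

For example, parse_ext_info('{"example_key": "x"}') returns a dict with the single hypothetical key example_key, and parse_ext_info("") returns an empty dict.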

The following error codes may be returned.

enum ERR_CODE {
    ERR_INIT = -100,              // Initialization or re-initialization is required.
    ERR_FILE_NOT_FOUND = -99,     // The file is not found.
    ERR_DETECT_QUEUE_FULL = -98,  // The detection queue is full.
    ERR_CALL_API = -97,           // An error occurred when the API was called.
    ERR_TIMEOUT = -96,            // The operation timed out.
    ERR_UPLOAD = -95,             // The file failed to be uploaded. You can reinitiate the detection and try again.
    ERR_ABORT = -94,              // The program exited, and the sample was not detected.
    ERR_TIMEOUT_QUEUE = -93,      // The queue timed out. You initiated detections too frequently, or the timeout period that you set is too short.
    ERR_MD5 = -92,                // The MD5 hash value is invalid.
    ERR_URL = -91,                // The URL format is invalid.

    ERR_SUCC = 0                  // The operation is successful.
};
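As a rough illustration of how a caller might act on these codes, the following Python sketch mirrors the values listed above and groups the codes that are plausibly worth retrying. The grouping is an assumption made for this example, except for ERR_UPLOAD, whose description suggests that you try again. The names ErrCode, RETRYABLE, and should_retry are hypothetical and are not part of the SDK.

```python
from enum import IntEnum


class ErrCode(IntEnum):
    """Local mirror of the ERR_CODE values listed above (not the SDK's own class)."""
    ERR_INIT = -100
    ERR_FILE_NOT_FOUND = -99
    ERR_DETECT_QUEUE_FULL = -98
    ERR_CALL_API = -97
    ERR_TIMEOUT = -96
    ERR_UPLOAD = -95
    ERR_ABORT = -94
    ERR_TIMEOUT_QUEUE = -93
    ERR_MD5 = -92
    ERR_URL = -91
    ERR_SUCC = 0


# Assumption: these conditions are transient, so a retry may succeed.
RETRYABLE = frozenset({
    ErrCode.ERR_DETECT_QUEUE_FULL,
    ErrCode.ERR_UPLOAD,
    ErrCode.ERR_TIMEOUT_QUEUE,
})


def should_retry(code):
    """Return True if the numeric error code belongs to the retryable group."""
    try:
        return ErrCode(code) in RETRYABLE
    except ValueError:
        # Unknown code: do not retry blindly.
        return False
```

Codes such as ERR_MD5 and ERR_URL indicate invalid input, so this sketch treats them as non-retryable until the input is corrected.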

A higher detection score indicates a higher risk level. The following table describes the mapping between detection scores and risk levels.

Score range    Risk level

0 to 60        Secure

61 to 70       At-risk

71 to 80       Suspicious

81 to 100      Malicious
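
The mapping between scores and risk levels can be expressed as a small lookup. The following Python sketch is one way to write it. The function name risk_level is hypothetical and not part of the SDK, and the function assumes that the score is an integer from 0 to 100, as described for the score field.

```python
def risk_level(score):
    """Map a detection score (0 to 100) to the risk level described above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be in the range 0 to 100")
    if score <= 60:
        return "Secure"
    if score <= 70:
        return "At-risk"
    if score <= 80:
        return "Suspicious"
    return "Malicious"
```

For example, risk_level(61) returns "At-risk" and risk_level(81) returns "Malicious".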

Perform detection on buckets in the Security Center console

  1. Log on to the Security Center console. In the top navigation bar, select China as the region of the asset that you want to manage.

  2. In the left-side navigation pane, choose Risk Governance > SDK for Malicious File Detection.

  3. Click the OSS File Check tab, select a detection method, and then start detection.

    If your bucket is not displayed in the list on the OSS File Check tab, you can click Synchronize Buckets to obtain the latest list of buckets.

    Detection method

    Description

    Procedure

    Manual full detection

    Check all objects in one or more buckets.

    1. On the OSS File Check tab, find a bucket and click Check in the Actions column, or select multiple buckets and click Batch Check.

    2. In the Check dialog box, configure the parameters, including File Check Type, Scope, and Scan Path. The following list describes the parameters:

      • Decompression Level: If a compressed package type is specified for the File Check Type parameter, you must configure the Decompression Level parameter. Valid values are Level 1 to Level 5 and Do Not Decompress. If you set this parameter to a value other than Do Not Decompress, you must also configure the Extraction Limit parameter, which specifies the number of files that can be extracted from a package. The maximum value is 1000.

      • File Decryption Type: The default value is Do Not Decrypt. If you want to check objects that are encrypted by using SSE-KMS or SSE-OSS, you must specify a decryption method. The value OSS specifies the decryption method for OSS objects that are encrypted by using SSE-OSS. The value KMS specifies the decryption method for OSS objects that are encrypted by using SSE-KMS.

      • Scope: This parameter is optional. You can specify a point in time. The system checks objects that are updated later than the specified point in time.

      • Scan Path: You can select Match by Prefix or Configure for Entire Bucket. The value Match by Prefix indicates that the system checks only objects that are named with the specified prefix. The value Configure for Entire Bucket indicates that the system checks all objects in the bucket.

    3. Click OK.

    Manual incremental detection

    Check only newly added objects in a bucket that has been checked.

    1. On the OSS File Check tab, find the required bucket and click Incremental Check in the Actions column.

    2. In the Incremental Check dialog box, configure the parameters, including File Check Type, File Decryption Type, Scope, and Scan Path. If a compressed package type is specified for the File Check Type parameter, you must also configure the Decompression Level parameter. Valid values of the Scan Path parameter are Match by Prefix and Configure for Entire Bucket.

    3. Click OK.

    Auto detection

    Enable periodic automatic detection for specified buckets based on configured scan policies. Take note of the following items before you configure a scan policy:

    • A bucket can be specified only in one policy.

    • Automatic detection applies only to objects that are newly added to OSS. An object is not checked more than once.

    1. On the OSS File Check tab, click Policy Configuration below Policy Management.

    2. In the Policy Management panel, click Create Policy.

      If an existing policy meets your business requirements, you can click Edit in the Actions column of the policy. In the Edit Policy panel, select the bucket for which you want to enable automatic detection and click OK.

    3. In the Create Policy panel, configure the parameters, including Policy Name, Policy Status, Scan Path, Scope, Detection Cycle, File Check Time, File Check Type, Decompression Level, File Decryption Type, and Effective Bucket. If a compressed package type is specified for the File Check Type parameter, you must also configure the Decompression Level parameter. Valid values of the Scan Path parameter are Match by Prefix and Configure for Entire Bucket. By default, the switch for the Policy Status parameter is turned on.

    4. Click OK.

Optional: Configure a DingTalk chatbot

You can configure a DingTalk chatbot to receive notifications of alerts triggered by malicious files that are detected by Security Center in the specified DingTalk group in real time. For more information, see Configure notification settings on the DingTalk Chatbot tab.

View and export detection results

At-risk File Overview

On the At-risk File Overview tab:

  • View statistics information of risk files

    • Risk levels (High-risk, Medium-risk, Low-risk) are indicated by different colors: High-risk (red), Medium-risk (orange), Low-risk (gray).

    • In the risk file list, you can filter and view malicious file information detected through the SDK (API) or the console (OSS) by selecting from the Detection Scenario column. Use the dropdown box in the upper left corner to filter by Risk Level, File Name, Threat Tag, MD5, or Latest Detection Time.

      For compressed package files, click the expand icon to view the list of risk files within. The package file's risk level reflects the highest risk level among its contents.

  • View details of targeted risk files

    • In the risk file list, find the target risk file and click Details in the Actions column. The details panel shows information such as File Details and Event Description, which includes a description of the malicious file and handling suggestions.

      The risk details panel for compressed package files also shows the number of risk files in the package, the total number of files detected post-decompression, and a list of risk files.

  • Export the list of all risk files

    Click the export icon. After the export is complete, click Download in the notification box.

OSS File Check

On the OSS File Check tab:

  • View statistics information of risk files from the OSS Bucket perspective

    Risk levels (High-risk, Medium-risk, Low-risk) are represented by distinct colors: High-risk (red), Medium-risk (orange), Low-risk (gray).

  • View detection details for a specific OSS Bucket

    In the bucket list, find the target bucket and click Details in the Actions column to view the bucket's basic information and risk file details.

    On the File Detection Details page, you can also see the decompression status of the configuration file, the total number of files post-decompression, the number of detected files, and a list of scanned and unscanned risk files.

  • Export the list of all risk files by Bucket dimension

    On the OSS File Check tab, click the export icon. After the export is complete, click Download in the notification box.

Handle alerts for at-risk files

After at-risk files are detected, assess and handle them based on their risk levels and on the event descriptions and suggestions provided by Security Center.

  • High-risk events: These events are marked by a low rate of false positives and a high degree of maliciousness, necessitating immediate action.

  • Medium and low-risk events: These events are common in business operations. It is important to evaluate their impact on your business before taking action.

    If you confirm that the files are normal business files or false positives, you can safely ignore the corresponding alerts. If your business application calls the SDK for file detection, consider adding logic to the application to skip or whitelist such files.

References

You can call the following API operations to use SDK for malicious file detection: