Before you can configure anti-crawler rules for web applications, you must integrate the Web SDK into the web applications. This topic describes how to integrate the Web SDK into web applications. In this topic, the Web SDK is referred to as the SDK.
Components
The SDK includes a web collector and an asynchronous API response component.
Web collector
The web collector feeds the characteristics of client browsers into the overall anti-bot attack and defense system. This allows the system to identify attacks that cannot be identified from network characteristics alone and improves protection capabilities.
After obfuscating and encrypting the collected characteristics, the web collector writes them to cookies that are set on the root domain name, and the browser sends the cookies with requests. This reduces the impact on browser performance.
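As a rough illustration of what a cookie scoped to a root domain name looks like, the following sketch builds a cookie string with a Domain attribute. It is not the SDK's actual logic: the function names, the cookie name hypo_token, and the rule of keeping the last two host labels are all assumptions made for this example, and the obfuscation and encryption steps are omitted.

```javascript
// Hypothetical helper: derive a root domain by keeping the last two labels.
// This is a simplification; it is wrong for suffixes such as "co.uk".
function naiveRootDomain(hostname) {
  const labels = hostname.split(".");
  return labels.length <= 2 ? hostname : labels.slice(-2).join(".");
}

// Build a cookie string scoped to the root domain so that the browser
// attaches the cookie to requests for every subdomain.
function buildRootDomainCookie(name, value, hostname) {
  const domain = naiveRootDomain(hostname);
  return `${name}=${encodeURIComponent(value)}; Domain=.${domain}; Path=/`;
}

// In a browser, the resulting string would be assigned to document.cookie.
const cookie = buildRootDomainCookie("hypo_token", "abc 123", "shop.example.com");
console.log(cookie); // hypo_token=abc%20123; Domain=.example.com; Path=/
```

Because the cookie travels with requests that the page already sends, no extra reporting request is needed in this sketch, which is consistent with the performance point above.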
The web collector collects the following information:
Browser information, including the browser type and version, screen resolution, time zone, and timestamp
Specific attack and defense probes, including the probes for common browser-level bot scripts and drivers
User operations, including keyboard, mouse, and touch events
Note: Keyboard events include only the points in time when keys are pressed, not which keys are pressed. This helps protect your privacy.
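The kinds of information listed above can be sketched as follows. This is a minimal illustration under assumptions, not the SDK's collector: the function names and the returned field names are invented for this example. The key-timing recorder shows the privacy property from the note by storing only the event timestamp and never reading which key was pressed.

```javascript
// Summarize browser information from injected navigator-like and
// screen-like objects, so the function can also run outside a browser.
function collectBrowserInfo(nav, scr, now = new Date()) {
  return {
    userAgent: nav.userAgent,
    screenResolution: `${scr.width}x${scr.height}`,
    timeZoneOffsetMinutes: now.getTimezoneOffset(),
    timestamp: now.getTime(),
  };
}

// Record only when a key was pressed; the key itself is never read.
function createKeyTimingRecorder() {
  const pressTimes = [];
  return {
    handle(event) {
      pressTimes.push(event.timeStamp);
    },
    snapshot() {
      return pressTimes.slice();
    },
  };
}

const recorder = createKeyTimingRecorder();
recorder.handle({ timeStamp: 12.5, key: "a" }); // "a" is ignored
recorder.handle({ timeStamp: 48, key: "b" }); // "b" is ignored
console.log(recorder.snapshot()); // [ 12.5, 48 ]
```

In a browser, the recorder could be attached with document.addEventListener("keydown", (e) => recorder.handle(e)), and collectBrowserInfo could be called with navigator and screen.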
Asynchronous API response component
The asynchronous API response component allows web applications to handle the challenges that the anti-bot attack and defense system returns for API requests. The challenges include JavaScript validation and CAPTCHA verification. After the component is integrated, it responds automatically when it detects that Web Application Firewall (WAF) has returned a challenge response for an API request.
The asynchronous API response component does not provide security features or perform data collection or reporting.
The following list describes how the asynchronous API response component works:
The component globally rewrites the common API request objects that are available on the requested page, such as XMLHttpRequest (XHR), Fetch, and Form, and wraps them in an additional layer of code. This does not affect the functionality of the original objects.
The component, rather than the JavaScript code of the requested page, receives each response first and checks whether the response is returned by WAF. WAF can send a challenge response to instruct the client to perform JavaScript validation or CAPTCHA verification.
If the response is returned by the origin server, the component does not modify the response and passes it to the JavaScript code of the requested page. If the response is returned by WAF, the component parses the response, performs the required calculations by using JavaScript, and then resends the request with the JavaScript validation signature to WAF. WAF verifies the signature and forwards the request to the origin server.
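The wrap, check, and resend flow described above can be sketched for a Fetch-style function as follows. This is only an illustration under stated assumptions: the way WAF marks a challenge response and the way the signature is attached are not described in this topic, so the two header names, the solveChallenge callback, and the fake network used for the demonstration are all hypothetical.

```javascript
// ASSUMPTION: a challenge response is marked by this made-up header, and
// the signature is returned in another made-up header.
const CHALLENGE_HEADER = "x-hypothetical-waf-challenge";
const SIGNATURE_HEADER = "x-hypothetical-signature";

// Wrap a fetch-like function. Origin responses pass through untouched;
// challenge responses are solved and the request is resent once.
function wrapFetch(originalFetch, solveChallenge) {
  return async function wrappedFetch(url, options = {}) {
    const response = await originalFetch(url, options);
    const challenge = response.headers.get(CHALLENGE_HEADER);
    if (challenge == null) {
      return response; // Returned by the origin server: hand it to the page.
    }
    const signature = await solveChallenge(challenge);
    const headers = { ...(options.headers || {}), [SIGNATURE_HEADER]: signature };
    return originalFetch(url, { ...options, headers });
  };
}

// Fake network for demonstration: challenges any request without a signature.
async function fakeFetch(url, options = {}) {
  const signed = Boolean(options.headers) && options.headers[SIGNATURE_HEADER] === "solved:42";
  return signed
    ? { body: "origin data", headers: new Map() }
    : { body: "", headers: new Map([[CHALLENGE_HEADER, "42"]]) };
}

const guardedFetch = wrapFetch(fakeFetch, async (c) => `solved:${c}`);
guardedFetch("/api/items").then((res) => console.log(res.body)); // origin data
```

The page code calls guardedFetch exactly as it would call the original function and sees only the final origin response. In a browser, a global rewrite in this style could look like window.fetch = wrapFetch(window.fetch.bind(window), solver), where solver is again a hypothetical function.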
Compatibility
Environment compatibility: The SDK is compatible with browsers that use the rendering engine in Internet Explorer 8 or a later version.
Compatibility dependency: Client browsers must support cookies. If client browsers do not support cookies, the web collector cannot work as expected.
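Before relying on the web collector, a page can check whether the browser reports cookie support. The following minimal sketch uses the standard navigator.cookieEnabled property. The function name is an assumption made for this example, and the navigator-like object is passed in so that the check can also run outside a browser.

```javascript
// Returns true when the navigator-like object reports that cookies are enabled.
function cookiesSupported(nav) {
  return Boolean(nav && nav.cookieEnabled);
}

console.log(cookiesSupported({ cookieEnabled: true })); // true
console.log(cookiesSupported({ cookieEnabled: false })); // false
```

In a browser, the call would be cookiesSupported(navigator).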
Hook compatibility: In specific services, existing hooks on the native objects that are used in synchronous API requests, including XHR, Form, and Fetch, conflict with the asynchronous API response component.
Deployment methods
Automatic integration
You can use automatic integration only if requests for the HTML pages pass through WAF. Automatic integration eliminates the need to modify the code of the HTML pages. The HTML pages can be updated at runtime without the need for a full refresh.
If you select Automatic Integration in the Configure Scenarios step when you create a scenario-specific protection template, the bot management module modifies the responses of HTML pages as it handles them. The module parses the HTML data, inserts the web collector and the asynchronous API response component into the Document Object Model (DOM), and then returns the modified responses to clients. For more information, see Configure anti-crawler rules for websites.
If you want to integrate the SDK into compressed HTML pages, note that automatic integration supports only gzip compression (Content-Encoding: gzip). Brotli compression and deflate compression are not supported.
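The compression constraint can be expressed as a small check on the Content-Encoding value of an HTML response. This is a sketch, not part of the SDK or WAF: the function name is hypothetical, and treating a missing header or the identity value as eligible is an assumption based on the constraint applying only to compressed pages.

```javascript
// Decide whether a page's Content-Encoding is compatible with automatic
// integration. ASSUMPTION: uncompressed responses (no header or "identity")
// are eligible; among compression methods, only gzip is supported.
function canAutoIntegrate(contentEncoding) {
  if (contentEncoding == null || contentEncoding.trim() === "") {
    return true; // No compression applied.
  }
  const encoding = contentEncoding.trim().toLowerCase();
  return encoding === "gzip" || encoding === "identity";
}

console.log(canAutoIntegrate("gzip")); // true
console.log(canAutoIntegrate("br")); // false
console.log(canAutoIntegrate("deflate")); // false
```

A page for which this check returns false would need manual integration instead.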
Manual integration
Manual integration is suitable for scenarios in which automatic integration is not supported. For example, if requests that are sent to HTML pages do not pass through WAF or the specified compression methods are not supported, you cannot use automatic integration. To perform manual integration, make sure that the following conditions are met:
The SDK is integrated into the HTML pages on which asynchronous calls are initiated.
The SDK is obtained. To obtain the required scripts, perform the following steps: Go to the Bot Management page. On the Scenario-specific Protection tab, click Create Template. In the Configure Scenarios step, set the Web SDK Integration parameter to Manual Integration. Then, click Obtain SDK.
If you want to enable dynamic token-based authentication, place the following <script> tag before all other <script> tags to ensure that it is loaded first:
<script src="//g.alicdn.com/frontend-lib/frontend-lib/2.3.66/jquery_240828.min.js"></script>
If you do not want to enable dynamic token-based authentication, place the following <script> tags before all other <script> tags to ensure that they are loaded first:
<script src="//g.alicdn.com/frontend-lib/frontend-lib/2.3.59/interfaceacting240527.js"></script>
<script src="//g.alicdn.com/frontend-lib/frontend-lib/2.3.59/antidom_240527.js"></script>
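To confirm that the SDK scripts really come before other scripts after manual integration, a simple check over the page source can help. The following sketch is an assumption-laden illustration: the function names and the /static/app.js path are hypothetical, the regular expression handles only straightforward quoted src attributes, and inline scripts without a src attribute are not considered.

```javascript
// Return the src values of all external <script> tags, in document order.
function scriptSources(html) {
  const pattern = /<script\b[^>]*\ssrc\s*=\s*["']([^"']+)["'][^>]*>/gi;
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

// True when the first external scripts on the page are exactly the SDK
// scripts, in the given order.
function sdkLoadsFirst(html, sdkSrcs) {
  const found = scriptSources(html);
  return sdkSrcs.every((src, index) => found[index] === src);
}

const sdk = [
  "//g.alicdn.com/frontend-lib/frontend-lib/2.3.59/interfaceacting240527.js",
  "//g.alicdn.com/frontend-lib/frontend-lib/2.3.59/antidom_240527.js",
];
const page = `<head>
<script src="${sdk[0]}"></script>
<script src="${sdk[1]}"></script>
<script src="/static/app.js"></script>
</head>`;
console.log(sdkLoadsFirst(page, sdk)); // true
```

The same check applies to the dynamic token-based authentication case by passing a one-element list that contains the corresponding script URL.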
On-premises deployment
This deployment method is suitable only for specific cases, such as when Content Security Policy (CSP) does not allow resources to be loaded from alicdn or all components must be deployed in on-premises environments. We recommend that you do not use this method.
To deploy the SDK in an on-premises environment, create two JavaScript resources in the on-premises environment and copy all the JavaScript code of the preceding scripts from Alibaba Cloud CDN into them. Then, place these on-premises JavaScript resources before other resources to ensure that they are loaded first.