Bot management rules protect your websites and your native iOS and Android apps against crawlers. To use the anti-crawler feature in native iOS and Android apps, you must integrate the Anti-Bot SDK. You can create different anti-crawler rules for requests that have different characteristics. You can also use the built-in crawler libraries and features, such as the search engine crawler library, AI protection, the bot threat intelligence library, the data center blacklist, and the fake spider list. These are maintained for you, so you do not need to manually update or analyze crawler characteristics.
Create a bot management rule set
Log on to the ESA console.
In the left-side navigation pane, click Websites.
On the Websites page, find the website that you want to manage, and click the website name or View Details in the Actions column.
In the left-side navigation tree, choose Security > Bots.
On the Bots page, click Create Rule Set.
Select Browsers or Apps for the Service Type parameter, configure other parameters as needed, and then click OK. For more information, see Configure anti-crawler rules for websites and Configure anti-crawler rules for apps.
Configure anti-crawler rules for websites
If your web pages, HTML5 pages, or HTML5 apps are accessible from browsers, you can configure anti-crawler rules for the websites to protect your services from malicious crawlers.
Section | Parameter | Description |
Global Settings | Rule Set Name | The name of the rule set. The name can contain letters, digits, and underscores (_). |
Global Settings | Service Type | Select Browsers. The rule set then applies to web pages, HTML5 pages, and HTML5 apps. |
Global Settings | SDK Integration | |
Global Settings | Cross-origin Request | If you select Automatic Integration (Recommended) for multiple websites that can access each other, you must select a different domain name for this parameter to prevent duplicate JavaScript code. For example, if users log on to Website A from Website B, specify the domain name of Website B for this parameter. |
If requests match... | | Specify the conditions for matching incoming requests. For more information, see WAF. |
Then execute... | Legitimate Bot Management | The search engine crawler whitelist contains the crawler IP addresses of major search engines, including Google, Baidu, Sogou, 360, Bing, and Yandex. The whitelist is dynamically updated. After you select a search engine crawler whitelist, requests sent from the crawler IP addresses of that search engine are allowed, and the bot management module no longer checks them. |
Then execute... | Bot Characteristic Detection | |
Then execute... | Bot Behavior Detection | After you enable AI Protection, the intelligent protection engine analyzes access traffic and performs machine learning. A blacklist or a protection rule is then generated based on the analysis results and learned patterns. |
Then execute... | Custom Throttling | |
Then execute... | Bot Threat Intelligence Library | The library contains the IP addresses of attackers that have sent multiple requests to crawl content from Alibaba Cloud users over a specific period of time. You can set Action to Monitor or Slider CAPTCHA. |
Then execute... | Data Center Blacklist | After you enable this feature, the IP addresses in the selected IP address libraries of data centers are blocked. If legitimate requests reach the protected website from the IP addresses of public clouds or data centers, you must add those IP addresses to the whitelist. For example, add the callback IP addresses of Alipay or WeChat and the IP addresses of monitoring applications to the whitelist. The data center blacklist supports the following IP address libraries: IP Address Library of Data Center-Alibaba Cloud, IP Address Library of Data Center-21Vianet, IP Address Library of Data Center-Meituan Open Services, IP Address Library of Data Center-Tencent Cloud, and IP Address Library of Data Center-Other. You can set the Action parameter to Monitor, Slider CAPTCHA, or Block. |
Then execute... | Fake Spider Blocking | After you enable this feature, WAF blocks requests that use the User-Agent headers of the search engines specified in the Legitimate Bot Management section. If the source IP address of such a request is verified as a valid IP address of the search engine, WAF allows the request. |
Effective Time | | By default, rules take effect immediately and permanently after they are created. You can configure specific time ranges or cycles in which rules take effect. |
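The way Legitimate Bot Management and Fake Spider Blocking work together, as described in the table above, can be sketched in a few lines of Python. This is only an illustration of the decision order, not the ESA implementation: the names `SPIDER_WHITELIST` and `classify_request` are hypothetical, the IP ranges are RFC 5737 documentation addresses rather than real crawler IP addresses, and the User-Agent match is a simple substring test.

```python
import ipaddress

# Hypothetical whitelist: User-Agent token -> crawler IP ranges.
# The ranges are RFC 5737 placeholders, not real search engine addresses.
SPIDER_WHITELIST = {
    "googlebot": [ipaddress.ip_network("192.0.2.0/24")],
    "bingbot": [ipaddress.ip_network("198.51.100.0/24")],
}


def classify_request(user_agent: str, client_ip: str) -> str:
    """Return 'allow', 'block', or 'continue' for a single request."""
    ip = ipaddress.ip_address(client_ip)

    # Legitimate Bot Management: a whitelisted crawler IP is allowed
    # and skips all further bot checks.
    if any(ip in net for nets in SPIDER_WHITELIST.values() for net in nets):
        return "allow"

    # Fake Spider Blocking: the User-Agent claims to be a search engine,
    # but the source IP address is not a known crawler address.
    ua = user_agent.lower()
    if any(token in ua for token in SPIDER_WHITELIST):
        return "block"

    # Anything else is handed to the remaining bot management rules.
    return "continue"
```

In this sketch, a request that presents a Googlebot User-Agent from `192.0.2.10` is allowed, the same User-Agent from `203.0.113.5` is blocked, and an ordinary browser User-Agent from `203.0.113.5` continues to the other rules.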
Configure anti-crawler rules for apps
You can configure anti-crawler rules for your native iOS or Android apps to protect your services against crawlers. HTML5 apps are not native iOS or Android apps. To protect HTML5 apps, configure anti-crawler rules for websites instead.
Section | Parameter | Description |
Global Settings | Rule Set Name | The name of the rule set. The name can contain letters, digits, and underscores (_). |
Global Settings | Service Type | Select Apps to configure anti-crawler rules for native iOS and Android apps. HTML5 apps are not native iOS or Android apps. |
Global Settings | SDK Integration | To obtain the SDK package, click Obtain and Copy AppKey and then submit a ticket. For more information, see Integrate the Anti-Bot SDK into Android apps and Integrate the Anti-Bot SDK into iOS apps. After the Anti-Bot SDK is integrated, it collects the risk characteristics of clients and adds security signatures to requests. WAF uses the signatures to identify and block unsafe requests. |
If requests match... | | Specify the conditions for matching incoming requests. For more information, see WAF. |
Then execute... | Bot Characteristic Detection | |
Then execute... | Bot Throttling | |
Then execute... | Bot Threat Intelligence Library | The library contains the IP addresses of attackers that have sent multiple requests to crawl content from Alibaba Cloud users over a specific period of time. |
Then execute... | Data Center Blacklist | After you enable this feature, the IP addresses in the selected IP address libraries of data centers are blocked. If legitimate requests reach the protected service from the IP addresses of public clouds or data centers, you must add those IP addresses to the whitelist. For example, add the callback IP addresses of Alipay or WeChat and the IP addresses of monitoring applications to the whitelist. The data center blacklist supports the following IP address libraries: IP Address Library of Data Center-Alibaba Cloud, IP Address Library of Data Center-21Vianet, IP Address Library of Data Center-Meituan Open Services, IP Address Library of Data Center-Tencent Cloud, and IP Address Library of Data Center-Other. |
Effective Time | | By default, rules take effect immediately and permanently after they are created. You can configure specific time ranges or cycles in which rules take effect. |
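The Data Center Blacklist row above says that trusted addresses, such as payment callbacks or monitoring probes, must be whitelisted before the blacklist is enabled. The following Python sketch shows why the whitelist has to be evaluated first. It is an assumption-laden illustration, not ESA code: `DATA_CENTER_LIBRARY`, `WHITELIST`, and `data_center_action` are hypothetical names, and the ranges are RFC 5737 documentation addresses that stand in for a real data center IP address library.

```python
import ipaddress

# Hypothetical stand-in for one data center IP address library.
DATA_CENTER_LIBRARY = [ipaddress.ip_network("198.51.100.0/24")]

# Hypothetical trusted addresses hosted in that data center,
# for example a payment callback server or a monitoring probe.
WHITELIST = [ipaddress.ip_network("198.51.100.7/32")]


def data_center_action(client_ip: str) -> str:
    """Return 'allow', 'block', or 'continue' for a source IP address."""
    ip = ipaddress.ip_address(client_ip)

    # The whitelist is checked first, so trusted callbacks are never blocked.
    if any(ip in net for net in WHITELIST):
        return "allow"

    # Any other address from the selected data center library is blocked.
    if any(ip in net for net in DATA_CENTER_LIBRARY):
        return "block"

    # Addresses outside the library are left to the remaining rules.
    return "continue"
```

With these placeholder values, `198.51.100.7` is allowed, any other address in `198.51.100.0/24` is blocked, and an address outside the library continues to the other rules. Without the whitelist entry, the callback server would be blocked together with the rest of its data center range.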
Feature availability
Feature | Basic | Standard | Advanced | Enterprise |
Bot management rule sets | No | No | No | 10 |