Elasticsearch: Use Filebeat to collect Apache log data

Last Updated: Nov 25, 2024

If you want to view and analyze Apache log data, you can use Filebeat to collect the data, use Alibaba Cloud Logstash to filter the data, and then transfer the processed data to an Alibaba Cloud Elasticsearch cluster for analytics. This topic describes how to use Filebeat to collect Apache log data.

Procedure

  1. Step 1: Make preparations

  2. Step 2: Configure and install a Filebeat shipper

  3. Step 3: Configure a Logstash pipeline to filter and synchronize data

  4. Step 4: View the collected data

Step 1: Make preparations

  1. Create an Elasticsearch cluster and a Logstash cluster that are of the same version and are deployed in the same virtual private cloud (VPC).

  2. Enable the Auto Indexing feature for the Elasticsearch cluster.

    For security purposes, Alibaba Cloud Elasticsearch disables the Auto Indexing feature by default. However, Beats depends on this feature. If you select Elasticsearch for Output when you create a shipper, you must enable the Auto Indexing feature. For more information, see Configure the YML file.
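
    For reference, the Auto Indexing feature maps to the action.auto_create_index setting in the YML configuration of the Elasticsearch cluster. The following is a minimal sketch of the setting, assuming that you want to allow all indexes to be created automatically:

    # In the YML configuration of the Elasticsearch cluster:
    # allow indexes to be created automatically on the first write.
    action.auto_create_index: true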

  3. Create an Alibaba Cloud Elastic Compute Service (ECS) instance in the same VPC as the Elasticsearch cluster and Logstash cluster.

    For more information, see Create an instance on the Custom Launch tab.

    Important
    • Beats supports only Alibaba Cloud Linux (Alinux), Red Hat Enterprise Linux (RHEL), and CentOS.

    • Alibaba Cloud Filebeat can collect logs only from an ECS instance that resides in the same region and VPC as your Alibaba Cloud Elasticsearch cluster and Alibaba Cloud Logstash cluster. Alibaba Cloud Filebeat cannot collect logs from sources on the Internet.

  4. Install HTTP Daemon (HTTPd) on the ECS instance.

    To facilitate the analysis and display of Apache log data in a visualization tool, we recommend that you define JSON as the log format in the httpd.conf file. For more information, see Step 1: Install and configure Apache HTTP Server. In this example, the following configurations are used:

    LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"client_ip\":\"%{X-Forwa rded-For}i\",\"direct_ip\": \"%a\",\"request_time\":%T,\"status\":%>s,\"url\":\"%U%q\",\"method\":\"%m\",\"http_host\":\"%{Host}i\",\"server_ip\":\"%A\",\"http_referer\":\"%{Referer}i\",\"http_user_agent\":\"%{User-agent}i\",\"body_bytes_sent\":\"%B\",\"total_bytes_sent\":\"%O\"}"  access_log_json
    # Change the original CustomLog configuration to CustomLog "logs/access_log" access_log_json.
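
    With this format, each request is logged as a single JSON object on one line. The following is a hypothetical example of one resulting log entry; all field values are illustrative:

    {"@timestamp":"2024-11-25T10:15:30+0800","client_ip":"-","direct_ip":"192.0.2.10","request_time":0,"status":200,"url":"/index.html","method":"GET","http_host":"example.com","server_ip":"192.0.2.1","http_referer":"-","http_user_agent":"curl/7.61.1","body_bytes_sent":"45","total_bytes_sent":"289"}
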
  5. Install Cloud Assistant and Docker on the ECS instance.

    For more information, see Install Cloud Assistant Agent and Install Docker.

Step 2: Configure and install a Filebeat shipper

  1. Log on to the Alibaba Cloud Elasticsearch console.

  2. Navigate to the Beats Data Shippers page.

    1. In the top navigation bar, select a region.

    2. In the left-side navigation pane, click Beats Data Shippers.

    3. Optional: If this is the first time you go to the Beats Data Shippers page, read the message that appears and click OK to authorize the system to create a service-linked role for your account.

      Note

      Beats depends on the service-linked role and the rules specified for the role to collect data from various data sources. Do not delete the service-linked role. Otherwise, Beats cannot work as expected. For more information, see Elasticsearch service-linked roles.

  3. In the Create Shipper section, move the pointer over Filebeat and click ECS Logs.

  4. Configure and install a shipper.

    For more information, see Collect the logs of an ECS instance and Prepare a YML configuration file for a shipper. The following notes describe the key configurations that are used in this example.


    Note
    • You must select Logstash for Output and select the ID of your Logstash cluster. Because the output is specified in this way, you do not need to specify an output in Shipper YML Configuration.

    • You must set Filebeat File Path to the path where the log data is stored. In addition, you must enable log collection and configure the same log path in Shipper YML Configuration, as shown in the sketch below.
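
    The input portion of Shipper YML Configuration might look like the following minimal sketch. The log path is an assumption that matches the default HTTPd configuration; replace it with the path in your CustomLog setting. Because Logstash is selected for Output, no output section is configured here:

    filebeat.inputs:
    - type: log
      # Enable this input so that the shipper collects the Apache logs.
      enabled: true
      paths:
        # Assumed default HTTPd access log path. Use the path in your CustomLog setting.
        - /var/log/httpd/access_log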

  5. Click Next.

  6. In the Install Shipper step, select the ECS instance on which you want to install the shipper.

    Note

    The selected ECS instance must meet the preceding prerequisites.

  7. Start the shipper and check whether the shipper is installed.

    1. Click Start.

      Then, the Start Shipper message appears.

    2. Click Back to Beats Shippers. In the Manage Shippers section of the Beats Data Shippers page, view the installed shipper.

    3. After the state of the shipper changes to Enabled 1/1, click View Instances in the Actions column.

    4. In the View Instances panel, check whether the shipper is installed on the ECS instance. If the value of Installed Shippers is Normal Heartbeat, the shipper is installed.

Step 3: Configure a Logstash pipeline to filter and synchronize data

  1. In the left-side navigation pane of the Alibaba Cloud Elasticsearch console, click Logstash Clusters.

  2. On the page that appears, find your Logstash cluster and click Manage Pipeline in the Actions column.

  3. On the Pipelines page, click Create Pipeline.

  4. Configure a pipeline.

    For more information, see Use configuration files to manage pipelines. The following configurations are used in this example:

    input {
      beats {
        port => 8000
      }
    }
    filter {
      json {
        source => "message"
        remove_field => ["@version", "prospector", "beat", "source", "input", "offset", "fields", "host", "message"]
      }
    }
    output {
      elasticsearch {
        hosts => ["http://es-cn-mp91cbxsm00******.elasticsearch.aliyuncs.com:9200"]
        user => "elastic"
        password => "<your_password>"
        index => "<your_index>"
      }
    }

    The following list describes the parameters:

    • input: Receives the data that is collected by the shipper.

    • filter: Filters the collected data. The json plug-in is used to decode the message field. The remove_field parameter specifies the fields to remove.

      Note
      The configurations in the filter part apply only to the current testing scenario. You can configure the filter part based on your business requirements. For information about supported filter plug-ins, see Filter plugins.

    • output: Transfers data to your Elasticsearch cluster. The following parameters are involved:

      • hosts: Set this parameter to the endpoint of your Elasticsearch cluster. You can obtain the endpoint on the Basic Information page of the cluster. For more information, see View the basic information of a cluster.

      • <your_password>: Replace <your_password> with the password that is used to access your Elasticsearch cluster.

      • <your_index>: Replace <your_index> with the name of the index to which the data is transferred.

Step 4: View the collected data

  1. Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted.

    For more information about how to log on to the Kibana console, see Log on to the Kibana console.

    Note

    In this example, an Elasticsearch V6.7.0 cluster is used. Operations on clusters of other versions may differ. The actual operations in the console prevail.

  2. In the left-side navigation pane of the page that appears, click Dev Tools.

  3. On the Console tab of the page that appears, run the following command to view the collected data:

    GET <your_index>/_search

    Note

    Replace <your_index> with the index name that you configured in the output part of the Logstash pipeline.
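
    If data has been collected, the response contains the indexed documents. The following is a trimmed, illustrative sketch of a response; all field values are hypothetical:

    {
      "took": 3,
      "timed_out": false,
      "hits": {
        "total": 1,
        "hits": [
          {
            "_index": "<your_index>",
            "_source": {
              "@timestamp": "2024-11-25T10:15:30+0800",
              "status": 200,
              "url": "/index.html",
              "method": "GET"
            }
          }
        ]
      }
    }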

  4. In the left-side navigation pane, click Discover. On the page that appears, specify a period in the upper-right corner. Then, view the details of the collected data within the specified period.


    Note

    Before you view the collected data, make sure that an index pattern is created for the index specified by <your_index>. To create an index pattern in the Kibana console, click Management in the left-side navigation pane. On the page that appears, click Index Patterns in the Kibana section, and then click Create index pattern. Follow the instructions to create the index pattern.
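
    Optionally, you can confirm that the index exists before you create the index pattern. To do so, run the following command on the Console tab of the Dev Tools page:

    GET _cat/indices/<your_index>?v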