Simple Log Service: Transform text data in specific formats

Last updated: Jun 30, 2024

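The practical cases in this document are mainly derived from ticket requests encountered in real work. For each case, this document describes the requirement and shows how to orchestrate LOG DSL rules to meet it.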

Convert a non-standard JSON object to a JSON object and expand it

The collected dict data needs to be expanded across its nested levels. First convert the dict data to JSON, then expand it with the e_json function.

  • Raw log

    content: {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
  • Transformation statements

    1. Replace the single quotation marks in the content value with double quotation marks so that the value becomes JSON-formatted data.

      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))

      The processed log is:

      content: {
        'referer': '-',
        'request': 'GET /phpMyAdmin',
        'status': 404,
        'data-1': {
          'aaa': 'Mozilla',
          'bbb': 'asde'
        },
        'data-2': {
          'up_adde': '-',
          'up_host': '-'
        }
      }
      content_json:  {
        "referer": "-",
        "request": "GET /phpMyAdmin",
        "status": 404,
        "data-1": {
          "aaa": "Mozilla",
          "bbb": "asde"
        },
        "data-2": {
          "up_adde": "-",
          "up_host": "-"
        }
      }
    2. Expand the standardized content_json data. For example, to expand only the first level, set the depth parameter of e_json to 1.

      e_json("content_json",depth=1,fmt='full')

      The expanded log is:

      content_json.data-1.data-1:  {"aaa": "Mozilla", "bbb": "asde"}
      content_json.data-2.data-2:  {"up_adde": "-", "up_host": "-"}
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404

      If depth is set to 2, the expanded log is:

      content_json.data-1.aaa:  Mozilla
      content_json.data-1.bbb:  asde
      content_json.data-2.up_adde:  -
      content_json.data-2.up_host:  -
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404
    3. Taken together, the LOG DSL rules are as follows:

      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
      e_json("content_json",depth=2,fmt='full')
  • Transformed data

    The transformed data below was produced with depth=2:

    content:  {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
    content_json:  {
      "referer": "-",
      "request": "GET /phpMyAdmin",
      "status": 404,
      "data-1": {
        "aaa": "Mozilla",
        "bbb": "asde"
      },
      "data-2": {
        "up_adde": "-",
        "up_host": "-"
      }
    }
    content_json.data-1.aaa:  Mozilla
    content_json.data-1.bbb:  asde
    content_json.data-2.up_adde:  -
    content_json.data-2.up_host:  -
    content_json.referer:  -
    content_json.request:  GET /phpMyAdmin
    content_json.status:  404
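
The two steps above can be tried out locally with a small Python sketch. This is not LOG DSL and does not claim to reproduce e_json exactly: `single_to_double_quotes` and `flatten` are hypothetical helpers written for illustration, and the key naming for objects left unexpanded at the depth limit follows this sketch rather than the service. Note that a blanket quote replacement assumes no value contains an apostrophe; a value such as 'it's' would yield invalid JSON.

```python
import json


def single_to_double_quotes(text: str) -> str:
    """Naive quote swap, mirroring str_replace(..., "'", '"')."""
    return text.replace("'", '"')


def flatten(obj: dict, prefix: str, depth: int, sep: str = ".") -> dict:
    """Flatten nested dicts into {path: value}, descending at most `depth` levels.

    Dicts below the depth limit are kept as compact JSON strings.
    """
    out = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}"
        if isinstance(value, dict) and depth > 1:
            out.update(flatten(value, path, depth - 1, sep))
        elif isinstance(value, dict):
            out[path] = json.dumps(value)
        else:
            out[path] = str(value)
    return out


raw = (
    "{'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, "
    "'data-1': {'aaa': 'Mozilla', 'bbb': 'asde'}, "
    "'data-2': {'up_adde': '-', 'up_host': '-'}}"
)

content_json = single_to_double_quotes(raw)
fields = flatten(json.loads(content_json), "content_json", depth=2)

for name, value in sorted(fields.items()):
    print(f"{name}: {value}")
```

Calling `flatten` with `depth=1` instead leaves `data-1` and `data-2` as JSON strings, which is the same trade-off the depth parameter controls in the rules above.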

Convert text in other formats to JSON and expand it

Data in some non-standard JSON formats can be expanded by combining rules.

  • Raw log

    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
  • Transformation statements

    1. First convert the log to JSON by using the str_logtash_config_normalize function:

      e_set("normalize_data",str_logtash_config_normalize(v("content")))
    2. Expand the result with the e_json function:

      e_json("normalize_data",depth=1,fmt='full')
    3. Taken together, the LOG DSL rules are as follows:

      e_set("normalize_data",str_logtash_config_normalize(v("content")))
      e_json("normalize_data",depth=1,fmt='full')
  • Transformed data

    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
    normalize_data:  {
      "pod": {
        "name": "crm-learning-follow-7bc48f8b6b-m6kgb"
      },
      "node": {
        "name": "tw5"
      },
      "labels": {
        "pod-template-hash": "7bc48f8b6b",
        "app": "crm-learning-follow"
      },
      "container": {
        "name": "crm-learning-follow"
      },
      "namespace": "testing1"
    }
    normalize_data.container.container:  {"name": "crm-learning-follow"}
    normalize_data.labels.labels:  {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
    normalize_data.namespace:  testing1
    normalize_data.node.node:  {"name": "tw5"}
    normalize_data.pod.pod:  {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}
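
To see why the normalization step is needed before expansion, the following Python sketch approximates it outside the service. It is only an illustration: `arrows_to_json` is a hypothetical helper that replaces every "=>" with ":", which is an assumption about the input shape and not a description of how str_logtash_config_normalize works internally. It would break if "=>" appeared inside a quoted key or value.

```python
import json


def arrows_to_json(text: str) -> str:
    """Naively turn Logstash-style `"key" => value` pairs into JSON pairs.

    Assumes "=>" never occurs inside a quoted string.
    """
    return text.replace("=>", ":")


raw = (
    '{"pod" => {"name" => "crm-learning-follow-7bc48f8b6b-m6kgb"}, '
    '"node" => {"name" => "tw5"}, '
    '"labels" => {"pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"}, '
    '"container" => {"name" => "crm-learning-follow"}, '
    '"namespace" => "testing1"}'
)

# Once the arrows are replaced, the text parses as ordinary JSON.
normalized = json.loads(arrows_to_json(raw))
print(json.dumps(normalized, indent=2))
```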

Convert specially encoded text

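Logs sometimes contain hexadecimal escape sequences that must be decoded before they can be read. The str_hex_escape_encode function converts such hexadecimal escape sequences into readable characters.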

  • Raw log

    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
  • LOG DSL orchestration

    e_set("hex_encode",str_hex_escape_encode(v("content")))
  • Transformed data

    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
    hex_encode : "你好"
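
The conversion shown above amounts to reading each run of \xNN escapes as UTF-8 bytes. A minimal Python sketch of that idea follows; `decode_hex_escapes` is a hypothetical helper, and the choice of UTF-8 is an assumption based on the sample, not a statement about the encodings that str_hex_escape_encode supports.

```python
import re

# One or more consecutive \xNN escapes, written literally in the text.
_HEX_RUN = re.compile(r"(?:\\x[0-9a-fA-F]{2})+")


def decode_hex_escapes(text: str) -> str:
    """Replace each run of literal \\xNN escapes with its UTF-8 decoding."""

    def _decode(match: re.Match) -> str:
        hex_digits = match.group(0).replace("\\x", "")
        # Invalid byte sequences are replaced rather than raising an error.
        return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")

    return _HEX_RUN.sub(_decode, text)


content = r"\xe4\xbd\xa0\xe5\xa5\xbd"
hex_encode = decode_hex_escapes(content)
print(hex_encode)
```

Text outside the escape runs is passed through unchanged, so mixed content such as a log prefix followed by escaped bytes is handled as well.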

Expand XML fields

Logs can carry many types of data, such as XML. To expand XML data, use the xml_to_json function.

  • Test log

    str : <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
  • LOG DSL orchestration

    e_set("str_json",xml_to_json(v("str")))
  • Transformed log

    str : <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
    str_json : {
      "data": {
        "country": [{
          "@name": "Liechtenstein",
          "rank": "1",
          "year": "2008",
          "gdppc": "141100",
          "neighbor": [{
            "@name": "Austria",
            "@direction": "E"
          }, {
            "@name": "Switzerland",
            "@direction": "W"
          }]
        }, {
          "@name": "Singapore",
          "rank": "4",
          "year": "2011",
          "gdppc": "59900",
          "neighbor": {
            "@name": "Malaysia",
            "@direction": "N"
          }
        }, {
          "@name": "Panama",
          "rank": "68",
          "year": "2011",
          "gdppc": "13600",
          "neighbor": [{
            "@name": "Costa Rica",
            "@direction": "W"
          }, {
            "@name": "Colombia",
            "@direction": "E"
          }]
        }]
      }
    }
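
The output shape above (attributes prefixed with "@", repeated child elements collected into arrays, a single child kept as an object) can be approximated with Python's standard library. The sketch below is an illustration of that convention only; `element_to_dict` and `xml_to_json_sketch` are hypothetical helpers, the "#text" key for mixed content is an assumption of this sketch, and the real xml_to_json function may differ in edge cases.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem: ET.Element):
    """Convert an element to a dict; leaf elements become their text."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag: promote the existing entry to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # no attributes and no children
    if text:
        node["#text"] = text
    return node


def xml_to_json_sketch(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)})


sample = (
    '<?xml version="1.0"?>'
    "<data>"
    '<country name="Liechtenstein"><rank>1</rank>'
    '<neighbor name="Austria" direction="E"/>'
    '<neighbor name="Switzerland" direction="W"/></country>'
    '<country name="Singapore"><rank>4</rank>'
    '<neighbor name="Malaysia" direction="N"/></country>'
    "</data>"
)

result = json.loads(xml_to_json_sketch(sample))
print(json.dumps(result, indent=2))
```

As in the transformed log above, "neighbor" is an array for Liechtenstein but a single object for Singapore, so downstream rules that read this field should be prepared for both shapes.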