Transform logs in specific text formats - Simple Log Service

The cases in this topic are based on actual data transformation requirements submitted in tickets. This topic describes how to use the LOG domain-specific language (DSL) to transform logs to meet those requirements.

Convert non-standard JSON objects to JSON data and spread the objects

Assume that the collected dictionary data contains nested objects and that you want to spread them into separate fields. First convert the dictionary data to standard JSON, and then use the e_json function to spread the data.

Raw log

content: {
  'referer': '-',
  'request': 'GET /phpMyAdmin',
  'status': 404,
  'data-1': {
    'aaa': 'Mozilla',
    'bbb': 'asde'
  },
  'data-2': {
    'up_adde': '-',
    'up_host': '-'
  }
}

Data transformation statement

Replace the single quotation marks in the content field with double quotation marks so that the value becomes valid JSON, and store the result in the content_json field.

e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))

The log after processing is as follows:

content: {
  'referer': '-',
  'request': 'GET /phpMyAdmin',
  'status': 404,
  'data-1': {
    'aaa': 'Mozilla',
    'bbb': 'asde'
  },
  'data-2': {
    'up_adde': '-',
    'up_host': '-'
  }
}
content_json:  {
  "referer": "-",
  "request": "GET /phpMyAdmin",
  "status": 404,
  "data-1": {
    "aaa": "Mozilla",
    "bbb": "asde"
  },
  "data-2": {
    "up_adde": "-",
    "up_host": "-"
  }
}
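
The quote replacement can be checked outside Simple Log Service with a short Python sketch. It is an illustration only, not the LOG DSL implementation: the names `raw`, `tricky`, and `naive_ok` are chosen for this example, and the second log value is hypothetical. The sketch also shows a limitation to keep in mind: a plain replacement yields valid JSON only when no key or value contains a quotation mark of its own.

```python
import ast
import json

# The sample log value, written on one line.
raw = ("{'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, "
       "'data-1': {'aaa': 'Mozilla', 'bbb': 'asde'}, "
       "'data-2': {'up_adde': '-', 'up_host': '-'}}")

# Same idea as str_replace(..., "'", '"'): swap every single quote.
content_json = raw.replace("'", '"')
parsed = json.loads(content_json)

# Hypothetical value that contains an apostrophe: the naive swap breaks it.
tricky = """{'msg': "it's fine"}"""
try:
    json.loads(tricky.replace("'", '"'))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# In plain Python, a dict literal like this can be parsed safely instead.
recovered = ast.literal_eval(tricky)
```

If your logs can contain apostrophes inside values, verify the converted field on sample data before you spread it.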

Spread the standardized content_json field generated by the preceding step. For example, set the depth parameter to 1 to spread only the first layer of the JSON data.

e_json("content_json",depth=1,fmt='full')

The log after spreading is as follows:

content_json.data-1.data-1:  {"aaa": "Mozilla", "bbb": "asde"}
content_json.data-2.data-2:  {"up_adde": "-", "up_host": "-"}
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404

If you set depth to 2, the log after spreading is as follows:

content_json.data-1.aaa:  Mozilla
content_json.data-1.bbb:  asde
content_json.data-2.up_adde:  -
content_json.data-2.up_host:  -
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404
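
To make the effect of depth concrete, the following Python sketch flattens a nested dictionary into dotted keys and stops descending after a given number of layers. It is an assumption-laden emulation, not the e_json implementation: the `flatten` helper is hypothetical, and at the depth limit it names the unexpanded object by its plain path, which differs from the key naming that e_json produces in the depth=1 output above.

```python
import json

def flatten(obj, prefix, depth):
    """Spread nested dicts into dotted keys, descending at most `depth` layers."""
    out = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict) and depth > 1:
            # Still allowed to descend: expand the nested object.
            out.update(flatten(value, path, depth - 1))
        elif isinstance(value, (dict, list)):
            # Depth limit reached: keep the nested value as a JSON string.
            out[path] = json.dumps(value)
        else:
            out[path] = str(value)
    return out

doc = {
    "referer": "-",
    "request": "GET /phpMyAdmin",
    "status": 404,
    "data-1": {"aaa": "Mozilla", "bbb": "asde"},
    "data-2": {"up_adde": "-", "up_host": "-"},
}

one_layer = flatten(doc, "content_json", 1)   # nested objects stay as strings
two_layers = flatten(doc, "content_json", 2)  # nested objects are expanded
```

With two layers, the keys match the depth=2 listing above, for example `content_json.data-1.aaa`.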

To sum up, use the following LOG DSL rules:

e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
e_json("content_json",depth=2,fmt='full')

Log after transformation

After the log is transformed by setting depth to 2, the following log is generated:

content:  {
  'referer': '-',
  'request': 'GET /phpMyAdmin',
  'status': 404,
  'data-1': {
    'aaa': 'Mozilla',
    'bbb': 'asde'
  },
  'data-2': {
    'up_adde': '-',
    'up_host': '-'
  }
}
content_json:  {
  "referer": "-",
  "request": "GET /phpMyAdmin",
  "status": 404,
  "data-1": {
    "aaa": "Mozilla",
    "bbb": "asde"
  },
  "data-2": {
    "up_adde": "-",
    "up_host": "-"
  }
}
content_json.data-1.aaa:  Mozilla
content_json.data-1.bbb:  asde
content_json.data-2.up_adde:  -
content_json.data-2.up_host:  -
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404

Convert logs in other text formats to JSON and spread the data

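Some logs resemble JSON but use a different syntax, such as the Logstash configuration style in which keys and values are separated by =>. To spread such data, combine a normalization function with the e_json function.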

Raw log

content : {
  "pod" => {
    "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
  }, "node" => {
    "name" => "tw5"
  }, "labels" => {
    "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
  }, "container" => {
    "name" => "crm-learning-follow"
  }, "namespace" => "testing1"
}

Data transformation statement

Convert the log to the JSON format by using the str_logstash_config_normalize function.
```
e_set("normalize_data",str_logstash_config_normalize(v("content")))
```
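
For input as regular as this sample, the effect of the normalization can be approximated in Python by replacing each `=>` with `:`. This is only a rough stand-in written for illustration. It does not reproduce the DSL function, and it would corrupt any value that itself contains `=>`.

```python
import json

content = """{
  "pod" => {
    "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
  }, "node" => {
    "name" => "tw5"
  }, "labels" => {
    "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
  }, "container" => {
    "name" => "crm-learning-follow"
  }, "namespace" => "testing1"
}"""

# Assumption: "=>" appears only as a key/value separator in this sample.
normalize_data = json.loads(content.replace("=>", ":"))
```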

Use the e_json function to spread the data.

e_json("normalize_data",depth=1,fmt='full')

To sum up, use the following LOG DSL rules:

e_set("normalize_data",str_logstash_config_normalize(v("content")))
e_json("normalize_data",depth=1,fmt='full')

Log after transformation

content : {
  "pod" => {
    "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
  }, "node" => {
    "name" => "tw5"
  }, "labels" => {
    "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
  }, "container" => {
    "name" => "crm-learning-follow"
  }, "namespace" => "testing1"
}
normalize_data:  {
  "pod": {
    "name": "crm-learning-follow-7bc48f8b6b-m6kgb"
  },
  "node": {
    "name": "tw5"
  },
  "labels": {
    "pod-template-hash": "7bc48f8b6b",
    "app": "crm-learning-follow"
  },
  "container": {
    "name": "crm-learning-follow"
  },
  "namespace": "testing1"
}
normalize_data.container.container:  {"name": "crm-learning-follow"}
normalize_data.labels.labels:  {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
normalize_data.namespace:  testing1
normalize_data.node.node:  {"name": "tw5"}
normalize_data.pod.pod:  {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}

Convert text written in special encoding formats

Text that is recorded as hexadecimal escape sequences must be decoded before it can be read. Use the str_hex_escape_encode function to convert the escape sequences back to readable characters.

Raw log
```
content : "\xe4\xbd\xa0\xe5\xa5\xbd"
```

Data transformation statement

e_set("hex_encode",str_hex_escape_encode(v("content")))

Log after transformation

content : "\xe4\xbd\xa0\xe5\xa5\xbd"
hex_encode : "你好"
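
The six escaped bytes in this sample are the UTF-8 encoding of two Chinese characters. The following Python sketch shows the decoding step by hand. It is an illustration under the assumption that the field holds literal backslash sequences encoded in UTF-8, and it is not how the DSL function is implemented.

```python
import re

# The field value as literal text: backslash, "x", and two hex digits per byte.
content = r"\xe4\xbd\xa0\xe5\xa5\xbd"

# Collect the hex pairs, rebuild the raw bytes, and decode them as UTF-8.
hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", content)
decoded = bytes(int(pair, 16) for pair in hex_pairs).decode("utf-8")
```

Here `decoded` is the two-character string 你好, which means "hello".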

Spread XML fields

You may encounter various types of data during your daily work, such as XML data. To convert XML data to JSON, use the xml_to_json function.

Raw log

str : <?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

Data transformation statement
```
e_set("str_json",xml_to_json(v("str")))
```

Log after transformation

str : <?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
str_json : {
  "data": {
    "country": [{
      "@name": "Liechtenstein",
      "rank": "1",
      "year": "2008",
      "gdppc": "141100",
      "neighbor": [{
        "@name": "Austria",
        "@direction": "E"
      }, {
        "@name": "Switzerland",
        "@direction": "W"
      }]
    }, {
      "@name": "Singapore",
      "rank": "4",
      "year": "2011",
      "gdppc": "59900",
      "neighbor": {
        "@name": "Malaysia",
        "@direction": "N"
      }
    }, {
      "@name": "Panama",
      "rank": "68",
      "year": "2011",
      "gdppc": "13600",
      "neighbor": [{
        "@name": "Costa Rica",
        "@direction": "W"
      }, {
        "@name": "Colombia",
        "@direction": "E"
      }]
    }]
  }
}
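
In the output above, attributes become keys prefixed with @, a tag that repeats under one parent becomes a list, and a tag that occurs once stays a single object (compare the neighbor field of Singapore with that of Panama). The following Python sketch reproduces that shape with the standard library. It is a hypothetical helper written for illustration, not the xml_to_json implementation, and it ignores text inside elements that also carry attributes or children.

```python
import xml.etree.ElementTree as ET

def element_to_value(elem):
    """Convert an element to a dict ("@attr" keys plus child tags) or to its text."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_value(child)
        if child.tag in node:
            # A repeated tag is collected into a list.
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                node[child.tag] = [existing, value]
        else:
            # A tag seen once stays a single value.
            node[child.tag] = value
    if not node:
        # Leaf element without attributes: keep its text as a string.
        return (elem.text or "").strip()
    return node

xml_text = """<data>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(xml_text)
result = {root.tag: element_to_value(root)}
```

Because the same field can hold either an object or a list, downstream rules that read neighbor should handle both shapes.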