The cases in this topic are based on actual data transformation requirements that users submitted in tickets. Each case shows how to combine LOG domain-specific language (DSL) functions to transform logs so that they meet the requirement.
Convert non-standard JSON objects to JSON data and spread the objects
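Assume that the collected log contains dictionary data with a second level of nesting, and you want to spread the nested data into separate fields. The dictionary uses single quotation marks, so it is not valid JSON. Convert the dictionary data to JSON data first, and then use the e_json function to spread the data.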
Raw log
content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } }
Data transformation statement
Convert the single quotation marks in the content field to double quotation marks so that the value becomes valid JSON data.
e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
The log after processing is as follows:
content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } }
content_json: { "referer": "-", "request": "GET /phpMyAdmin", "status": 404, "data-1": { "aaa": "Mozilla", "bbb": "asde" }, "data-2": { "up_adde": "-", "up_host": "-" } }
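To see why the quotation marks must be replaced before the data can be spread, the step can be reproduced in plain Python. This is a minimal sketch and not the DSL itself: is_valid_json is a hypothetical helper, and Python's str.replace and json.loads stand in for str_replace and for the JSON parsing that e_json relies on.

```python
import json

raw = ("{ 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, "
       "'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, "
       "'data-2': { 'up_adde': '-', 'up_host': '-' } }")


def is_valid_json(text):
    """Return True if the text parses as JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# JSON requires double quotation marks, so the raw value is rejected
print(is_valid_json(raw))

# Same idea as str_replace(..., "'", '"') in the transformation statement
normalized = raw.replace("'", '"')
parsed = json.loads(normalized)
print(parsed["data-1"]["aaa"])
```

A blanket replacement like this assumes that no value contains an apostrophe. A value such as it's would produce invalid JSON and would need separate handling.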
Spread the standardized content_json data that is generated in the preceding step. For example, set depth to 1 to spread only the first layer of the JSON data.
e_json("content_json",depth=1,fmt='full')
The log after spreading is as follows:
content_json.data-1.data-1: {"aaa": "Mozilla", "bbb": "asde"}
content_json.data-2.data-2: {"up_adde": "-", "up_host": "-"}
content_json.referer: -
content_json.request: GET /phpMyAdmin
content_json.status: 404
If you set depth to 2, the log after spreading is as follows:
content_json.data-1.aaa: Mozilla
content_json.data-1.bbb: asde
content_json.data-2.up_adde: -
content_json.data-2.up_host: -
content_json.referer: -
content_json.request: GET /phpMyAdmin
content_json.status: 404
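The effect of depth can be approximated in plain Python. The following sketch is an illustration under stated assumptions, not the e_json implementation: flatten is a hypothetical helper that expands nested dictionaries up to the given depth and joins keys with a period, similar to the full-path field names shown above. For an object that is left unexpanded, the sketch uses the plain path as the field name, which differs from the e_json output shown above for depth 1.

```python
import json


def flatten(value, prefix, depth, sep="."):
    """Expand nested dicts up to `depth` levels; deeper values stay as JSON text."""
    if not isinstance(value, dict) or depth == 0:
        # Leaf value, or depth budget used up: emit a single field
        text = value if isinstance(value, str) else json.dumps(value)
        return {prefix: text}
    fields = {}
    for key, child in value.items():
        fields.update(flatten(child, prefix + sep + key, depth - 1, sep))
    return fields


doc = json.loads(
    '{"referer": "-", "request": "GET /phpMyAdmin", "status": 404, '
    '"data-1": {"aaa": "Mozilla", "bbb": "asde"}, '
    '"data-2": {"up_adde": "-", "up_host": "-"}}'
)

# depth=2 reaches the values inside data-1 and data-2
for name, text in sorted(flatten(doc, "content_json", depth=2).items()):
    print(f"{name}: {text}")
```

With depth=1, the same helper stops at data-1 and data-2 and keeps each of them as one field that holds JSON text.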
To sum up, use the following LOG DSL rules:
e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
e_json("content_json",depth=2,fmt='full')
Log after transformation
After the log is transformed with depth set to 2, the following log is generated:
content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } }
content_json: { "referer": "-", "request": "GET /phpMyAdmin", "status": 404, "data-1": { "aaa": "Mozilla", "bbb": "asde" }, "data-2": { "up_adde": "-", "up_host": "-" } }
content_json.data-1.aaa: Mozilla
content_json.data-1.bbb: asde
content_json.data-2.up_adde: -
content_json.data-2.up_host: -
content_json.referer: -
content_json.request: GET /phpMyAdmin
content_json.status: 404
Convert logs in other text formats to JSON and spread the data
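Some text is close to JSON but uses a different syntax, such as => between keys and values. To spread such data, combine rules: first normalize the text to JSON, and then spread the JSON data.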
Raw log
content : { "pod" => { "name" => "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node" => { "name" => "tw5" }, "labels" => { "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow" }, "container" => { "name" => "crm-learning-follow" }, "namespace" => "testing1" }
Data transformation statement
Convert the log to the JSON format by using the str_logstash_config_normalize function.
e_set("normalize_data",str_logstash_config_normalize(v("content")))
Use a JSON function to spread the data.
e_json("normalize_data",depth=1,fmt='full')
To sum up, use the following LOG DSL rules:
e_set("normalize_data",str_logstash_config_normalize(v("content")))
e_json("normalize_data",depth=1,fmt='full')
Log after transformation
content : { "pod" => { "name" => "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node" => { "name" => "tw5" }, "labels" => { "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow" }, "container" => { "name" => "crm-learning-follow" }, "namespace" => "testing1" }
normalize_data: { "pod": { "name": "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node": { "name": "tw5" }, "labels": { "pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow" }, "container": { "name": "crm-learning-follow" }, "namespace": "testing1" }
normalize_data.container.container: {"name": "crm-learning-follow"}
normalize_data.labels.labels: {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
normalize_data.namespace: testing1
normalize_data.node.node: {"name": "tw5"}
normalize_data.pod.pod: {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}
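The normalization step can be pictured with a short Python sketch. It covers only the one difference visible in this raw log, the => separator, and it is an assumption about what needs to change, not a description of how the DSL function works internally. The name hash_rockets_to_json is hypothetical.

```python
import json
import re

# Matches either a double-quoted string (captured) or a bare "=>" separator
TOKEN = re.compile(r'("(?:[^"\\]|\\.)*")|=>')


def hash_rockets_to_json(text):
    """Replace '=>' with ':' everywhere except inside quoted strings."""
    return TOKEN.sub(lambda m: m.group(1) or ":", text)


content = ('{ "pod" => { "name" => "crm-learning-follow-7bc48f8b6b-m6kgb" }, '
           '"node" => { "name" => "tw5" }, "namespace" => "testing1" }')

normalized = hash_rockets_to_json(content)
parsed = json.loads(normalized)
print(parsed["pod"]["name"])
```

Matching quoted strings first keeps a => that appears inside a value untouched, which a plain text replacement would not do.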
Convert text written in special encoding formats
Text that is recorded as hexadecimal escape sequences must be converted before it can be read. Use the str_hex_escape_encode function to convert the escape sequences to readable characters.
Raw log
content : "\xe4\xbd\xa0\xe5\xa5\xbd"
Data transformation statement
e_set("hex_encode",str_hex_escape_encode(v("content")))
Log after transformation
content : "\xe4\xbd\xa0\xe5\xa5\xbd"
hex_encode : "你好"
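The raw value stores each byte of UTF-8 encoded text as a literal \xHH sequence. As an illustration only, the following Python sketch performs an equivalent conversion. The name decode_hex_escapes is a hypothetical helper, not part of the DSL, and the sketch assumes that the escaped bytes are valid UTF-8.

```python
import re

# One or more consecutive \xHH sequences, written as literal text
HEX_RUN = re.compile(r"(?:\\x[0-9a-fA-F]{2})+")


def decode_hex_escapes(text):
    """Replace each run of \\xHH escapes with the UTF-8 text that it encodes."""
    def to_text(match):
        hex_digits = match.group(0).replace("\\x", "")
        return bytes.fromhex(hex_digits).decode("utf-8")
    return HEX_RUN.sub(to_text, text)


# A raw string keeps the backslashes as literal characters, as in the log
content = r"\xe4\xbd\xa0\xe5\xa5\xbd"
print(decode_hex_escapes(content))
```

Text outside the escape sequences is passed through unchanged, so the helper can also be applied to values that mix plain characters and escapes.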
Spread XML fields
Logs may contain data in other formats, such as XML. To convert XML data to JSON, use the xml_to_json function.
Test log
str : <?xml version="1.0"?> <data> <country name="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor name="Austria" direction="E"/> <neighbor name="Switzerland" direction="W"/> </country> <country name="Singapore"> <rank>4</rank> <year>2011</year> <gdppc>59900</gdppc> <neighbor name="Malaysia" direction="N"/> </country> <country name="Panama"> <rank>68</rank> <year>2011</year> <gdppc>13600</gdppc> <neighbor name="Costa Rica" direction="W"/> <neighbor name="Colombia" direction="E"/> </country> </data>
Data transformation statement
e_set("str_json",xml_to_json(v("str")))
Log after transformation
str : <?xml version="1.0"?> <data> <country name="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighbor name="Austria" direction="E"/> <neighbor name="Switzerland" direction="W"/> </country> <country name="Singapore"> <rank>4</rank> <year>2011</year> <gdppc>59900</gdppc> <neighbor name="Malaysia" direction="N"/> </country> <country name="Panama"> <rank>68</rank> <year>2011</year> <gdppc>13600</gdppc> <neighbor name="Costa Rica" direction="W"/> <neighbor name="Colombia" direction="E"/> </country> </data>
str_json : { "data": { "country": [{ "@name": "Liechtenstein", "rank": "1", "year": "2008", "gdppc": "141100", "neighbor": [{ "@name": "Austria", "@direction": "E" }, { "@name": "Switzerland", "@direction": "W" }] }, { "@name": "Singapore", "rank": "4", "year": "2011", "gdppc": "59900", "neighbor": { "@name": "Malaysia", "@direction": "N" } }, { "@name": "Panama", "rank": "68", "year": "2011", "gdppc": "13600", "neighbor": [{ "@name": "Costa Rica", "@direction": "W" }, { "@name": "Colombia", "@direction": "E" }] }] } }
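The output follows a common mapping convention: attributes become keys prefixed with @, and a tag that repeats under the same parent becomes a list. The following Python sketch, built on the standard xml.etree.ElementTree module, imitates that convention on a shortened version of the test log. The name element_to_dict is hypothetical, and the sketch is an approximation of the mapping seen in the output, not the implementation of xml_to_json.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Map an element to a dict (or plain text), prefixing attributes with '@'."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag under the same parent: collect the values in a list
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # element with no attributes and no children
    if text:
        node["#text"] = text  # assumption: key name for text next to attributes
    return node


xml_text = (
    '<data>'
    '<country name="Liechtenstein"><rank>1</rank>'
    '<neighbor name="Austria" direction="E"/>'
    '<neighbor name="Switzerland" direction="W"/></country>'
    '<country name="Singapore"><rank>4</rank>'
    '<neighbor name="Malaysia" direction="N"/></country>'
    '</data>'
)

root = ET.fromstring(xml_text)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```

As in the transformed log, a country with one neighbor yields a single object for neighbor, and a country with two neighbors yields a list. Code that reads the result needs to handle both shapes.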