More complex log in json format processing
I needed to summarize number of bugs per week per user
JSON log entry looks like:{"timestamp":"2024-06-07 00:10:56.525 CEST","user":"backend","dbname":"prd","pid":3557301,"remote_host":"199.16.203.8", "remote_port":31315,"session_id":"666229de.3647b5","line_num":3,"ps":"INSERT","session_start":"2024-06-06 23:27:58 CEST","vxid":"67/48252790", "txid":2033150293,"error_severity":"ERROR","state_code":"23505","message":"duplicate key value violates unique constraint \"payment_from_account_trid_key\"", "detail":"Key (trid)=(440607272834519) already exists.","statement":"insert into ... values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10)\nRETURNING *","application_name":"PostgreSQL JDBC Driver", "backend_type":"client backend","query_id":8536052635720301574}
cat postgresql-*.json | \ jq -r 'select(.error_severity=="ERROR") | select (.application_name | (startswith("pgAdmin") or startswith("IntelliJ"))) ' | \ jq -rs 'group_by (.user, .message) | map({user: .[0].user, message: .[0].message, count: length}) | .[] | [.count, .user, .message] | @tsv'I had to learn so grouping inside
jq
is allowed only on json array (and result is an array again), so I needed to merge
input records to array by using -s
, --slurp
option. For final processing, I need to translate result array just
to set of records by syntax .[]
. Transformation to tsv (tab separated data) are used for better readability than csv.
result in tsv format can looks like:
1 lukas relation "item_image" does not exist 1 lukas relation "item_image_share" does not exist 3 petr UNION types "char" and text cannot be matched 2 petr canceling statement due to user request 6 petr current transaction is aborted, commands ignored until end of transaction block 1 petr duplicate key value violates unique constraint "idx_op" 1 prd_jenkins_flyway relation "item_shipping" does not exist
jq
language is powerful and strange. It is strange to use |
pipe symbol inside an expression - like
filter on application_name
looks:
.application_name | startswith("xxx") or startswith("yyyy")Unfortunately, I didn't find a documentation named "gentle introduction to jq for people who knows SQL and C", so using jq language looks scary , but it is working pretty well.