Friday, September 6, 2024

How to get information about relationships between system tables?

One of my customers asked me: "which tables hold references to the table pg_roles?"

The answer is simple - none. pg_roles is a view, but this view is built on the very important table pg_authid. For user-defined tables we can use a query:

SELECT conname, conrelid::pg_catalog.regclass AS ontable,
       pg_catalog.pg_get_constraintdef(oid, true) AS condef
  FROM pg_catalog.pg_constraint c
 WHERE confrelid = 'a'::pg_catalog.regclass
       AND contype = 'f'
ORDER BY conname;
┌─────────────┬─────────┬─────────────────────────────────────┐
│   conname   │ ontable │               condef                │
╞═════════════╪═════════╪═════════════════════════════════════╡
│ b_a_id_fkey │ b       │ FOREIGN KEY (a_id) REFERENCES a(id) │
└─────────────┴─────────┴─────────────────────────────────────┘
(1 row)
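For completeness, a minimal schema that would produce the output above (hypothetical tables a and b, matching the query's filter on confrelid):

```sql
-- parent table referenced by a foreign key
CREATE TABLE a (id int PRIMARY KEY);

-- child table with a foreign key to a(id);
-- the implicit constraint name is b_a_id_fkey
CREATE TABLE b (a_id int REFERENCES a(id));
```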
But this method doesn't work for system tables, because they don't use explicitly defined foreign keys. We have to use a different method. The system function pg_get_catalog_foreign_keys returns all references between system tables, and we can filter the result:
SELECT * 
 FROM pg_get_catalog_foreign_keys()
WHERE pktable = 'pg_authid'::regclass;

┌─────────────────────────┬──────────────┬───────────┬────────┬──────────┬────────┐
│         fktable         │    fkcols    │  pktable  │ pkcols │ is_array │ is_opt │
╞═════════════════════════╪══════════════╪═══════════╪════════╪══════════╪════════╡
│ pg_proc                 │ {proowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_type                 │ {typowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_class                │ {relowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_operator             │ {oprowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_opfamily             │ {opfowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_opclass              │ {opcowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_language             │ {lanowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_largeobject_metadata │ {lomowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_statistic_ext        │ {stxowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_event_trigger        │ {evtowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_namespace            │ {nspowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_conversion           │ {conowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_database             │ {datdba}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_db_role_setting      │ {setrole}    │ pg_authid │ {oid}  │ f        │ t      │
│ pg_tablespace           │ {spcowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {roleid}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {member}     │ pg_authid │ {oid}  │ f        │ f      │
│ pg_auth_members         │ {grantor}    │ pg_authid │ {oid}  │ f        │ f      │
│ pg_ts_config            │ {cfgowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_ts_dict              │ {dictowner}  │ pg_authid │ {oid}  │ f        │ f      │
│ pg_extension            │ {extowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_foreign_data_wrapper │ {fdwowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_foreign_server       │ {srvowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_user_mapping         │ {umuser}     │ pg_authid │ {oid}  │ f        │ t      │
│ pg_policy               │ {polroles}   │ pg_authid │ {oid}  │ t        │ t      │
│ pg_default_acl          │ {defaclrole} │ pg_authid │ {oid}  │ f        │ f      │
│ pg_collation            │ {collowner}  │ pg_authid │ {oid}  │ f        │ f      │
│ pg_publication          │ {pubowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_subscription         │ {subowner}   │ pg_authid │ {oid}  │ f        │ f      │
│ pg_variable             │ {varowner}   │ pg_authid │ {oid}  │ f        │ f      │
└─────────────────────────┴──────────────┴───────────┴────────┴──────────┴────────┘
(30 rows)

Thursday, June 6, 2024

More complex processing of logs in JSON format

I needed to summarize the number of errors per week per user.

A JSON log entry looks like:
{"timestamp":"2024-06-07 00:10:56.525 CEST","user":"backend","dbname":"prd","pid":3557301,"remote_host":"199.16.203.8",
"remote_port":31315,"session_id":"666229de.3647b5","line_num":3,"ps":"INSERT","session_start":"2024-06-06 23:27:58 CEST","vxid":"67/48252790",
"txid":2033150293,"error_severity":"ERROR","state_code":"23505","message":"duplicate key value violates unique constraint \"payment_from_account_trid_key\"",
"detail":"Key (trid)=(440607272834519) already exists.","statement":"insert into ... values ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10)\nRETURNING *","application_name":"PostgreSQL JDBC Driver",
"backend_type":"client backend","query_id":8536052635720301574}
cat postgresql-*.json | \
jq -r 'select(.error_severity=="ERROR") | select (.application_name | (startswith("pgAdmin") or startswith("IntelliJ"))) ' | \
jq -rs 'group_by (.user, .message) | map({user: .[0].user, message: .[0].message, count: length}) | .[] | [.count, .user, .message] | @tsv'
I had to learn that grouping inside jq works only on a JSON array (and the result is an array again), so I needed to merge the input records into an array using the -s, --slurp option. For the final processing, I had to translate the result array back into a set of records with the .[] syntax. Transformation to TSV (tab-separated values) is used because it is more readable than CSV. The result in TSV format can look like:
1	lukas	relation "item_image" does not exist
1	lukas	relation "item_image_share" does not exist
3	petr	UNION types "char" and text cannot be matched
2	petr	canceling statement due to user request
6	petr	current transaction is aborted, commands ignored until end of transaction block
1	petr	duplicate key value violates unique constraint "idx_op"
1	prd_jenkins_flyway	relation "item_shipping" does not exist
The jq language is powerful and strange. It is unusual to use the | pipe symbol inside an expression - for example, the filter on application_name looks like:
.application_name | startswith("xxx") or startswith("yyyy")
Unfortunately, I didn't find any documentation named "a gentle introduction to jq for people who know SQL and C", so the jq language looks scary, but it works pretty well.
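For readers more comfortable with a general-purpose language, the same group-and-count logic from the jq pipeline above can be sketched in Python (standard library only; the sample records here are illustrative, not taken from a real log):

```python
from collections import Counter

# sample records, as jq's first stage would emit them
# (only the fields used for grouping are shown)
records = [
    {"user": "petr", "message": "canceling statement due to user request"},
    {"user": "petr", "message": "canceling statement due to user request"},
    {"user": "lukas", "message": 'relation "item_image" does not exist'},
]

# group by (user, message) and count, like jq's group_by + length
counts = Counter((r["user"], r["message"]) for r in records)

# emit tab-separated lines: count, user, message
for (user, message), count in sorted(counts.items()):
    print(f"{count}\t{user}\t{message}")
```

The Counter over (user, message) tuples plays the role of jq's group_by followed by length, and the final loop corresponds to the .[] plus @tsv step.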

Wednesday, March 20, 2024

compiled Windows x64 DLLs of orafce-4.9.3 and plpgsql_check-2.7.4 for PostgreSQL 15 and 16

I compiled and uploaded zip files with the latest orafce and plpgsql_check for PostgreSQL 15 and PostgreSQL 16.

Setup:

  1. download orafce-4.9.3-x64.zip or plpgsql_check-2.7.4-x64.zip and extract the files
  2. copy the related dll file to the PostgreSQL lib directory (NN is the PostgreSQL major version number)
    orafce-NN.dll -> "c:\Program Files\PostgreSQL\NN\lib"
  3. rename the dll file to remove the "-15" or "-16" suffix
    orafce-NN.dll -> orafce.dll
  4. copy the *.sql and *.control files to the extension directory
    *.sql, *.control -> "c:\Program Files\PostgreSQL\NN\share\extension"
  5. execute the SQL command CREATE EXTENSION with superuser rights
    CREATE EXTENSION orafce;
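A quick sanity check after installation might look like this (a sketch; it assumes orafce's default placement of functions in the oracle schema):

```sql
-- verify the extension is installed and callable
CREATE EXTENSION IF NOT EXISTS orafce;
SELECT oracle.add_months(date '2024-01-31', 1);
```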

Tuesday, March 5, 2024

How fast is plpgsql?

Ten years ago I did some speed tests of plpgsql. I ran the same tests on the same computer, and now plpgsql is about 3-4 times faster. There is still significant overhead compared to a native query (without a plpgsql function), but it has been reduced (a little bit). I also wanted to compare it with Python again:

create table foo(a int, b int, c int, d int, e int);
insert into foo select random()*100, random()*100, random()*100, random()*100, random()*100 from generate_series(1,100000);

CREATE OR REPLACE FUNCTION public.myleast1(VARIADIC integer[])
 RETURNS integer LANGUAGE plpgsql IMMUTABLE STRICT
AS $function$
declare result int;
a int;
begin
  foreach a in array $1
  loop
    if result is null then 
      result := a; 
    elseif a < result then
      result := a;
    end if;
  end loop;
  return result;
end;
$function$;

CREATE OR REPLACE FUNCTION public.myleast6(VARIADIC a integer[])
RETURNS integer LANGUAGE plpython3u IMMUTABLE STRICT
AS $function$
r = None
for x in a:
   if r is None or x < r:
      r = x
return r
$function$;

(2024-03-05 12:25:10) postgres=# select count(*) filter (where a = least(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 15,265 ms
(2024-03-05 12:25:53) postgres=# select count(*) filter (where a = myleast1(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 221,726 ms
(2024-03-05 12:25:59) postgres=# select count(*) filter (where a = myleast6(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 282,732 ms

Ten years ago plpgsql was about 30% slower than Python; now it is about 20% faster.

Attention - this benchmark is pretty unrealistic - the biggest benefit of stored procedures comes when they are used as glue for SQL statements. On the other hand, it can be interesting to know the overhead of stored procedures against a native C implementation - this is close to the worst case.

Sunday, March 3, 2024

using jq for processing PostgreSQL logs in json format

PostgreSQL supports logging in JSON format. From my perspective, JSON logs are hard to read directly, but they allow machine processing, and with good tools it is beautifully simple.

There are several tools for JSON processing - I use jq.

For a simple analysis of errors in the log, I can use a sequence of commands for transformation to CSV and viewing it in pspg:

cat postgresql-Sun.json | \
jq -r 'select(.error_severity=="ERROR") | [.timestamp, .user, .ps, .error_severity, .message ] | @csv' \
| pspg --csv

With these tools, working with the log is "almost" effective and friendly (pspg supports sorting, searching, and the clipboard).


Thursday, February 8, 2024

new extension pgmeminfo

I wrote a very simple extension pgmeminfo for PostgreSQL 12+, which should help with the investigation of memory usage by Postgres. The function pgmeminfo returns glibc information about memory usage provided by the function mallinfo(). It is much more precise than using top or htop.

The second function, pgmeminfo_contexts, returns data about memory contexts - PostgreSQL's internal memory management. The same data can be shown by the view pg_backend_memory_contexts. The function pgmeminfo_contexts can easily accumulate data, and can limit the depth of iteration over memory contexts. It is a little bit faster than a recursive query over the view (but that is not too important).
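For comparison, a rough equivalent of the fully accumulated mode can be obtained from the view itself (a sketch; column names as in the pg_backend_memory_contexts view, PostgreSQL 14+):

```sql
-- sum memory over all contexts of the current backend
SELECT sum(total_bytes) AS total_bytes,
       sum(free_bytes)  AS free_bytes,
       sum(used_bytes)  AS used_bytes
  FROM pg_backend_memory_contexts;
```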


(2024-02-08 19:16:12) postgres=# SELECT * FROM pgmeminfo();
┌─────────┬─────────┬────────┬───────┬────────┬─────────┬─────────┬──────────┬──────────┬──────────┐
│  arena  │ ordblks │ smblks │ hblks │ hblkhd │ usmblks │ fsmblks │ uordblks │ fordblks │ keepcost │
╞═════════╪═════════╪════════╪═══════╪════════╪═════════╪═════════╪══════════╪══════════╪══════════╡
│ 1118208 │       7 │      0 │     2 │ 401408 │       0 │       0 │  1056160 │    62048 │    56368 │
└─────────┴─────────┴────────┴───────┴────────┴─────────┴─────────┴──────────┴──────────┴──────────┘
(1 row)
(2024-02-08 19:18:44) postgres=# select * from pgmeminfo_contexts(deep => 1, accum_mode=>'off');
┌──────────────────────────────┬──────────────────────────┬──────────────────┬───────┬─────────────┬───────────────┬────────────┬────────────┐
│             name             │          ident           │      parent      │ level │ total_bytes │ total_nblocks │ free_bytes │ used_bytes │
╞══════════════════════════════╪══════════════════════════╪══════════════════╪═══════╪═════════════╪═══════════════╪════════════╪════════════╡
│ TopMemoryContext             │                          │                  │     0 │       97696 │             5 │       9840 │      87856 │
│ TopTransactionContext        │                          │ TopMemoryContext │     1 │        8192 │             1 │       7760 │        432 │
│ dynahash                     │ CFuncHash                │ TopMemoryContext │     1 │        8192 │             1 │        576 │       7616 │
│ dynahash                     │ Record information cache │ TopMemoryContext │     1 │        8192 │             1 │       1600 │       6592 │
│ dynahash                     │ TableSpace cache         │ TopMemoryContext │     1 │        8192 │             1 │       2112 │       6080 │
│ RegexpCacheMemoryContext     │                          │ TopMemoryContext │     1 │        1024 │             1 │        784 │        240 │
│ dynahash                     │ Type information cache   │ TopMemoryContext │     1 │       24384 │             2 │       2640 │      21744 │
│ dynahash                     │ Operator lookup cache    │ TopMemoryContext │     1 │       24576 │             2 │      10776 │      13800 │
│ RowDescriptionContext        │                          │ TopMemoryContext │     1 │        8192 │             1 │       6912 │       1280 │
│ MessageContext               │                          │ TopMemoryContext │     1 │       32768 │             3 │      10392 │      22376 │
│ search_path processing cache │                          │ TopMemoryContext │     1 │        8192 │             1 │       5616 │       2576 │
│ dynahash                     │ Operator class cache     │ TopMemoryContext │     1 │        8192 │             1 │        576 │       7616 │
│ dynahash                     │ smgr relation table      │ TopMemoryContext │     1 │       32768 │             3 │      16848 │      15920 │
│ PgStat Shared Ref Hash       │                          │ TopMemoryContext │     1 │        7232 │             2 │        704 │       6528 │
│ PgStat Shared Ref            │                          │ TopMemoryContext │     1 │        8192 │             4 │       2952 │       5240 │
│ PgStat Pending               │                          │ TopMemoryContext │     1 │       16384 │             5 │      15984 │        400 │
│ TransactionAbortContext      │                          │ TopMemoryContext │     1 │       32768 │             1 │      32528 │        240 │
│ dynahash                     │ Portal hash              │ TopMemoryContext │     1 │        8192 │             1 │        576 │       7616 │
│ TopPortalContext             │                          │ TopMemoryContext │     1 │        8192 │             1 │       7680 │        512 │
│ dynahash                     │ Relcache by OID          │ TopMemoryContext │     1 │       16384 │             2 │       3528 │      12856 │
│ CacheMemoryContext           │                          │ TopMemoryContext │     1 │     1048576 │             8 │     399064 │     649512 │
│ WAL record construction      │                          │ TopMemoryContext │     1 │       49976 │             2 │       6384 │      43592 │
│ dynahash                     │ PrivateRefCount          │ TopMemoryContext │     1 │        8192 │             1 │       2640 │       5552 │
│ MdSmgr                       │                          │ TopMemoryContext │     1 │        8192 │             1 │       7016 │       1176 │
│ dynahash                     │ LOCALLOCK hash           │ TopMemoryContext │     1 │       16384 │             2 │       4616 │      11768 │
│ GUCMemoryContext             │                          │ TopMemoryContext │     1 │       24576 │             2 │      12000 │      12576 │
│ dynahash                     │ Timezones                │ TopMemoryContext │     1 │      104128 │             2 │       2640 │     101488 │
│ ErrorContext                 │                          │ TopMemoryContext │     1 │        8192 │             1 │       7952 │        240 │
└──────────────────────────────┴──────────────────────────┴──────────────────┴───────┴─────────────┴───────────────┴────────────┴────────────┘
(28 rows)

(2024-02-08 19:18:46) postgres=# select * from pgmeminfo_contexts();
┌──────────────────┬───────┬────────┬───────┬─────────────┬───────────────┬────────────┬────────────┐
│       name       │ ident │ parent │ level │ total_bytes │ total_nblocks │ free_bytes │ used_bytes │
╞══════════════════╪═══════╪════════╪═══════╪═════════════╪═══════════════╪════════════╪════════════╡
│ TopMemoryContext │       │        │     0 │     1945816 │           248 │     690856 │    1254960 │
└──────────────────┴───────┴────────┴───────┴─────────────┴───────────────┴────────────┴────────────┘
(1 row)

Friday, December 29, 2023

2023 resume

  • rewriting dbms_pipe and dbms_alert in orafce. Now condition variables are used for synchronization instead of timeouts, and the lags in the communication are significantly reduced,
  • fixing pspg and PDCursesMod so pspg can be used with curses implementations other than ncurses. Originally, PDCurses didn't support stream redirection. Still, only the VT environment is supported (a distant goal is support for pspg in the new Microsoft Windows terminal),
  • rewriting the internals of the profiler and tracer routines in plpgsql_check. Now the related code is significantly more readable and more robust. I introduced a new pldbgapi2 - it is based on the old plpgsql API and an fmgr hook. The advantage over the old plpgsql API is the possibility to simply handle an exception,
  • tracing constants in plpgsql_check (for some simple cases) - this reduces some false alarms related to dynamic SQL, plus related assert pragmas: assert-schema, assert-table and assert-column,
  • reorganization and minor refactoring of long patches for Postgres - mainly for support of session variables,
  • my patch adding the `filter` option for pg_dump was committed upstream,
  • detection of open cursors in plpgsql_check.