Tuesday, March 5, 2024

How fast is plpgsql?

Ten years ago I did some speed tests of plpgsql. I did same tests on same computer, and now, the plpgsql is about 3-4 times faster. There is still significant overhead against native (without plpgsql function) query, but it is reduced (little bit). I would to compare again with Python

create table foo(a int, b int, c int, d int, e int);
insert into foo select random()*100, random()*100, random()*100, random()*100, random()*100 from generate_series(1,100000);

CREATE OR REPLACE FUNCTION public.myleast1(VARIADIC integer[])
 RETURNS integer LANGUAGE plpgsql IMMUTABLE STRICT
AS $function$
declare result int;
a int;
begin
  foreach a in array $1
  loop
    if result is null then 
      result := a; 
    elseif a < result then
      result := a;
    end if;
  end loop;
  return result;
end;
$function$;

CREATE OR REPLACE FUNCTION public.myleast6(VARIADIC a integer[])
RETURNS integer LANGUAGE plpython3u IMMUTABLE STRICT
AS $function$
r = None
for x in a:
   if r is None or x < r:
      r = x
return r
$function$;

(2024-03-05 12:25:10) postgres=# select count(*) filter (where a = least(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 15,265 ms
(2024-03-05 12:25:53) postgres=# select count(*) filter (where a = myleast1(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 221,726 ms
(2024-03-05 12:25:59) postgres=# select count(*) filter (where a = myleast6(a,b,c,d,e)) from foo;
┌───────┐
│ count │
╞═══════╡
│ 20544 │
└───────┘
(1 row)

Time: 282,732 ms

Ten years ago plpgsql was about 30% slower, now it is about 20% faster.

Attention - this benchmark is pretty unrealistic - the best benefit of stored procedures is when they are used like glue of SQL statements. On second hand, can be interesting to know the overhead of stored procedures against native C implementation - this is like worst case.

Sunday, March 3, 2024

using jq for processing PostgreSQL logs in json format

PostgreSQL supports logging in json format. From my perspective json logs are badly readable, but allows machine processing, and with good tools, it is beautifully simple.

There are more tools for json processing - I use jq.

For simple analyze of errors in log, I can use sequence of commands:

For transformation to csv and viewing it in pspg:

cat postgresql-Sun.json | \
jq -r 'select(.error_severity=="ERROR") | [.timestamp, .user, .ps, .error_severity, .message ] | @csv' \
| pspg --csv

With these tools the work with log is "almost" effective and friendly (pspg supports sorting, searching, clipboard).

 

Thursday, February 8, 2024

new extension pgmeminfo

I wrote very simple extension pgmeminfo for PostgreSQL12+, that should to help with investigation of memory usage by Postgres. Function pgmeminfo returns glibc information about memory usage provided by function mallinfo(). It is much more precious than usage top or htop.

Second function pgmeminfo_contexts returns data about memory contexts - this is PostgreSQL internal memory management. Same data can be showed by view pg_get_backend_memory_contexts. The function pgmeminfo_contexts can simply accumulate data, and can simply limit deep of iteration over memory contexts. It is little bit faster than recursive query over view (but it is not too important).


(2024-02-08 19:16:12) postgres=# SELECT * FROM pgmeminfo();
┌─────────┬─────────┬────────┬───────┬────────┬─────────┬─────────┬──────────┬──────────┬──────────┐
│  arena  │ ordblks │ smblks │ hblks │ hblkhd │ usmblks │ fsmblks │ uordblks │ fordblks │ keepcost │
╞═════════╪═════════╪════════╪═══════╪════════╪═════════╪═════════╪══════════╪══════════╪══════════╡
│ 1118208 │       7 │      0 │     2 │ 401408 │       0 │       0 │  1056160 │    62048 │    56368 │
└─────────┴─────────┴────────┴───────┴────────┴─────────┴─────────┴──────────┴──────────┴──────────┘
(1 row)
(2024-02-08 19:18:44) postgres=# select * from pgmeminfo_contexts(deep => 1, accum_mode=>'off');
┌──────────────────────────────┬──────────────────────────┬──────────────────┬───────┬─────────────┬───────────────┬────────────┬────────────┐
│ name │ ident │ parent │ level │ total_bytes │ total_nblocks │ free_bytes │ used_bytes │
╞══════════════════════════════╪══════════════════════════╪══════════════════╪═══════╪═════════════╪═══════════════╪════════════╪════════════╡
│ TopMemoryContext │ │ │ 0 │ 97696 │ 5 │ 9840 │ 87856 │
│ TopTransactionContext │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 7760 │ 432 │
│ dynahash │ CFuncHash │ TopMemoryContext │ 1 │ 8192 │ 1 │ 576 │ 7616 │
│ dynahash │ Record information cache │ TopMemoryContext │ 1 │ 8192 │ 1 │ 1600 │ 6592 │
│ dynahash │ TableSpace cache │ TopMemoryContext │ 1 │ 8192 │ 1 │ 2112 │ 6080 │
│ RegexpCacheMemoryContext │ │ TopMemoryContext │ 1 │ 1024 │ 1 │ 784 │ 240 │
│ dynahash │ Type information cache │ TopMemoryContext │ 1 │ 24384 │ 2 │ 2640 │ 21744 │
│ dynahash │ Operator lookup cache │ TopMemoryContext │ 1 │ 24576 │ 2 │ 10776 │ 13800 │
│ RowDescriptionContext │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 6912 │ 1280 │
│ MessageContext │ │ TopMemoryContext │ 1 │ 32768 │ 3 │ 10392 │ 22376 │
│ search_path processing cache │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 5616 │ 2576 │
│ dynahash │ Operator class cache │ TopMemoryContext │ 1 │ 8192 │ 1 │ 576 │ 7616 │
│ dynahash │ smgr relation table │ TopMemoryContext │ 1 │ 32768 │ 3 │ 16848 │ 15920 │
│ PgStat Shared Ref Hash │ │ TopMemoryContext │ 1 │ 7232 │ 2 │ 704 │ 6528 │
│ PgStat Shared Ref │ │ TopMemoryContext │ 1 │ 8192 │ 4 │ 2952 │ 5240 │
│ PgStat Pending │ │ TopMemoryContext │ 1 │ 16384 │ 5 │ 15984 │ 400 │
│ TransactionAbortContext │ │ TopMemoryContext │ 1 │ 32768 │ 1 │ 32528 │ 240 │
│ dynahash │ Portal hash │ TopMemoryContext │ 1 │ 8192 │ 1 │ 576 │ 7616 │
│ TopPortalContext │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 7680 │ 512 │
│ dynahash │ Relcache by OID │ TopMemoryContext │ 1 │ 16384 │ 2 │ 3528 │ 12856 │
│ CacheMemoryContext │ │ TopMemoryContext │ 1 │ 1048576 │ 8 │ 399064 │ 649512 │
│ WAL record construction │ │ TopMemoryContext │ 1 │ 49976 │ 2 │ 6384 │ 43592 │
│ dynahash │ PrivateRefCount │ TopMemoryContext │ 1 │ 8192 │ 1 │ 2640 │ 5552 │
│ MdSmgr │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 7016 │ 1176 │
│ dynahash │ LOCALLOCK hash │ TopMemoryContext │ 1 │ 16384 │ 2 │ 4616 │ 11768 │
│ GUCMemoryContext │ │ TopMemoryContext │ 1 │ 24576 │ 2 │ 12000 │ 12576 │
│ dynahash │ Timezones │ TopMemoryContext │ 1 │ 104128 │ 2 │ 2640 │ 101488 │
│ ErrorContext │ │ TopMemoryContext │ 1 │ 8192 │ 1 │ 7952 │ 240 │
└──────────────────────────────┴──────────────────────────┴──────────────────┴───────┴─────────────┴───────────────┴────────────┴────────────┘
(28 rows)

(2024-02-08 19:18:46) postgres=# select * from pgmeminfo_contexts();
┌──────────────────┬───────┬────────┬───────┬─────────────┬───────────────┬────────────┬────────────┐
│ name │ ident │ parent │ level │ total_bytes │ total_nblocks │ free_bytes │ used_bytes │
╞══════════════════╪═══════╪════════╪═══════╪═════════════╪═══════════════╪════════════╪════════════╡
│ TopMemoryContext │ │ │ 0 │ 1945816 │ 248 │ 690856 │ 1254960 │
└──────────────────┴───────┴────────┴───────┴─────────────┴───────────────┴────────────┴────────────┘
(1 row)

Friday, December 29, 2023

2023 resume

  • rewriting dbms_pipe and dbms_alert in orafce. Now the conditions variables are used (for synchronization)  instead timeouts, and the lags in the communication are significantly reduced,
  • fix pspg and PDCursesMod so pspg can be used with other curses than ncurses. Originally pdcurses didn't support stream redirection. Still only VT environment is supported (far target is support pspg on new Microsoft Windows terminal),
  • Rewriting background of profiler, tracer routines in plpgsql_check. Now related code is significantly more readable and more robust. I introduced new pldbgapi2 - it is based on old plpgsqlapi and fmgr hook. The advantage against plpgsqlapi2 is possibility to simply handle an exception,
  • tracing constants in plpgsql_check (for some simple cases) - it reduces some false alarms related to dynamic SQL, and related assert pragmas: assert-schema, assert-table and assert-column,
  • reorganization and minor refactoring long patches for postgres - mainly for support of session variables,
  • my patch of `filter` option for pg_dump was committed to upstream,
  • detection of opened cursors in plpgsql_check 


Wednesday, December 6, 2023

Close cursors

One my customer reported very problematic memory issues of currently tested code freshly ported from Oracle's PL/SQL to PL/pgSQL. He rewrites very big application based on usage of thousands stored procedures and functions, and views. This application is ported to Postgres by ora2pg. It is working pretty well, but some patterns has different overhead, and some patterns can be fatal.

In your application he often (100 000x) call code (in long transaction):

OPEN qNAJUPOSPL FOR EXECUTE mSqNAJUPOSPL;
LOOP
  FETCH qNAJUPOSPL INTO mID_NAJVUPOSPL , mID_NAJDATSPLT , mID_PREDPIS;
  EXIT WHEN NOT FOUND; /* apply on qNAJUPOSPL */
END LOOP;

The problem is undisclosed cursor there. PLpgSQL cursors are just references to SQL cursors. The lifecycle of cursor variable is defined by scope, but lifecycle of SQL cursor is limited by transaction. Any active cursor can have allocated lot of resources (it is "snapshot" of active query). Unfortunately, after leaving of scope of cursor's variable, the related cursor is not closed automatically (and there is not some garbage collector too).  Inside large application is not easy to find this issue, mainly when you don't know what you should to find. So, I wrote new warning to plpgsql_check (2.7.0). This warning is enabled by default. 

Default check is designed to check previously opened cursor when cursor is opening.

CREATE OR REPLACE FUNCTION public.test()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare
  c refcursor;
  q text;
  r record;
begin
  q := 'select * from pg_class limit 10';
  open c for execute q;
  loop
    fetch c into r;
    exit when not found;
    raise notice '%', r;
  end loop;
end;
$function$

do $$                            
begin             
  perform test();
  perform test();
end;
$$;
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
DO
When plpgsql_check is loaded, then the warning is displayed:
(2023-12-06 09:07:32) postgres=# load 'plpgsql_check';
LOAD
(2023-12-06 09:07:51) postgres=# do $$                
begin
  perform test();
  perform test();
end;
$$;
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
WARNING:  cursor is not closed
DETAIL:  PL/pgSQL function test() line 8 at OPEN
SQL statement "SELECT test()"
PL/pgSQL function inline_code_block line 4 at PERFORM
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
DO
There is possible to set strict mode, and then the cursors are checked at the function's exit:
(2023-12-06 09:07:52) postgres=# set plpgsql_check.strict_cursors_leaks to on;
SET
(2023-12-06 09:09:56) postgres=# do $$                                        
begin
  perform test();
  perform test();
end;
$$;
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
WARNING:  cursor is not closed
DETAIL:  PL/pgSQL function test() during function exit
SQL statement "SELECT test()"
PL/pgSQL function inline_code_block line 3 at PERFORM
NOTICE:  (2619,pg_statistic,11,10029,0,10,2,2619,0,19,410,19,2840,t,f,p,r,31,0,f,f,f,f,f,t,n,f,0,728,1,{postgres=arwdDxt/postgres},,)
WARNING:  cursor is not closed
DETAIL:  PL/pgSQL function test() during function exit
SQL statement "SELECT test()"
PL/pgSQL function inline_code_block line 4 at PERFORM
DO

Friday, October 13, 2023

compiled dll of plpgsql_check 2.5.4 and Orafce 4.7.0 for PostgreSQL 15 and 16

I compiled and uploaded zip files with latest orafce and plpgsql_check for PostgreSQL 15 and PostgreSQL 16.

Setup:

  1. download orafce-4.7.0-x64.zip or plpgsql_check-2.5.4-x64.zip and extract files
  2. copy related dll file to PostgreSQL lib directory (NN is number of pg release)
    orafce-NN.dll -> "c:\Program Files\PostgreSQL\NN\lib"
  3. remove suffix "-15" or "-16" from dll file
    orafce-NN.dll -> orafce.dll
  4. copy *.sql and *.control files to extension directory
    *.sql, *.control -> "c:\Program Files\PostgreSQL\NN\share\extension"
  5. execute with super user rights SQL command CREATE EXTENSION
    CREATE EXTENSION orafce;

Friday, December 2, 2022

prepared dll of orafce 4.0.1 and plpgsql_check 2.2.5 for PostgreSQL 14 and PostgreSQL 15

I compiled and uploaded zip files with latest orafce and plpgsql_check for PostgreSQL 14 and PostgreSQL 15.

Setup:

  1. download orafce-4.0.1.zip or plpgsql_check-2.2.5.zip and extract files
  2. copy related dll file to PostgreSQL lib directory (NN is number of pg release)
    orafce-NN.dll -> "c:\Program Files\PostgreSQL\NN\lib"
  3. remove suffix "-14" or "-15" from dll file
    orafce-NN.dll -> orafce.dll
  4. copy *.sql and *.control files to extension directory
    *.sql, *.control -> "c:\Program Files\PostgreSQL\NN\share\extension"
  5. execute with super user rights SQL command CREATE EXTENSION
    CREATE EXTENSION orafce;