Monday, May 18, 2015

faster plpgsql in PostgreSQL 9.5 (second part)

Tom Lane introduced new state for TOAST types - expanded state. The side effect of this change together with some changes in plpgsql internals has big impact to speed of array operation in plpgsql.
I have two plpgsql block with cycles. First enforces a array fields update, second a array append.
DO $$ 
DECLARE a int[] = array_fill(10, ARRAY[10000]);
BEGIN
FOR i IN 1 .. 10000 LOOP
  a[i] := 0;
END LOOP;
END;
$$;

DO $$ 
DECLARE a int[] = '{}';
BEGIN
FOR i IN 1 .. 10000 LOOP
  a := a || 10;
END LOOP;
END;
$$;
You can try this code on PostgreSQL 9.4 - and you can get time 450 and 220ms. Same code needs 6 and 5 ms only on PostgreSQL 9.5! It is more than one order speedup.

Sunday, May 10, 2015

faster plpgsql in PostgreSQL 9.5 (first part)

The PL/pgSQL is important part of PostgreSQL. A implementation of plpgsql is terrible simple, what has two impacts (one positive, one negative).
  • We are able to implement new features very simply - implementation of some new functionality like ASSER or RAISE WITH CONTEXT needs only few hours.
  • Some usual features from other languages are missing or it its implementation is suboptimal. But there are significant progress in prepared PostgreSQL 9.5.
One less known issue of PLpgSQL is casting. PLpgSQL uses IO casting - every time, when type of variables or typmod isnot identical, plpgsql runtime transforms values from source type to text and back to target type. This IO casting is very simple for implementation, but it should be very hidden performance killer. plpgsql_check can raise warnings about it. You can see a impact of this IO cast on following test:
-- tested on PostgreSQL 9.4 (synthetic test, worst case) 
-- IO cast
postgres=# do $$declare s numeric = 0; begin for i in 1..10000000 loop s := i; end loop; end $$;
DO
Time: 6708.727 ms
-- enforced binary casts
postgres=# do $$declare s numeric = 0; begin for i in 1..10000000 loop s := i::numeric; end loop; end $$;
DO
Time: 4738.093 ms
    Tom Lane wrote and commited patch, that enforce binary casting everywhere it is possible. When it is impossible, then IO casting is used as before.
    -- tested on PostgreSQL 9.5
    postgres=# do $$declare s numeric = 0; begin for i in 1..10000000 loop s := i; end loop; end $$;
    DO
    Time: 3992.551 ms
    postgres=# do $$declare s numeric = 0; begin for i in 1..10000000 loop s := i::numeric; end loop; end $$;
    DO
    Time: 3693.739 ms
    
    There is still some overhead, but it is significantly less 30% versus 7.5%. Other positive - PostgreSQL is in this test about 30% faster (it is synthetic test, so reality should be different, but, the expectation so 9.5 will be faster than 9.4 is valid).

    Saturday, March 14, 2015

    long term monitoring communication between PostgreSQL client and PostgreSQL server

    We have some issue with our application and pgbouncer. We detect some new errors with very low frequency. One possibility how to detect a reason of these issues is monitoring the communication between our application and Postgres. I found a great tool pgShark. But I had to solve two issues.
    1. I have to reduce logged content - lot of messages are unimportant for my purpose or generate lot of content.  pgs-debug hasn't any option for it, so I had to modify source code. You can comment unwanted method. I disabled: Bind, BindComplete, CopyData, DataRow, Describe, Parse, ParseComplete, RowDescription, Sync. After this change the compressed log was few GB per day.
    2. I had a output (log) with attached timestamp. I can do it simply in bash:
      | while read line; do echo `date +"%T.%3N"` $line; done | 
    I wrote a line:
    unbuffer ./pgs-debug --host 127.0.0.1 -i lo --port 6432 | while read line; do echo `date +"%d %T.%3N"` $line; done | gzip > /mnt/large/pgsharklog.gz

    It does what I need:
    12:55:13.407 P=1425556513.403313, s=288765600856048 type=SSLRequest, F -> B
    12:55:13.408 SSL REQUEST
    12:55:13.409
    12:55:13.411 P=1425556513.403392, s=288765600856048 type=SSLAnswer, B -> F
    12:55:13.412 SSL BACKEND ANSWER: N
    12:55:13.414
    12:55:13.415 P=1425556513.403486, s=288765600856048 type=StartupMessage, F -> B
    12:55:13.416 STARTUP MESSAGE version: 3
    12:55:13.418 database=db_lc3hfmn22q8vdt6mhopr2wj4zskyaous
    12:55:13.419 application_name=starjoin
    12:55:13.420 user=beard
    12:55:13.421
    12:55:13.423 P=1425556513.403526, s=288765600856048 type=AuthenticationMD5Password, B -> F
    12:55:13.424 AUTHENTIFICATION REQUEST code=5 (MD5 salt='fe45f1a1')
    12:55:13.425
    12:55:13.426 P=1425556513.403577, s=288765600856048 type=PasswordMessage, F -> B
    12:55:13.428 PASSWORD MESSAGE password=md5a0cd0711e0e191467bca6e94c03fb50f
    12:55:13.429
    12:55:13.430 P=1425556513.403614, s=288765600856048 type=AuthenticationOk, B -> F
    12:55:13.431 AUTHENTIFICATION REQUEST code=0 (SUCCESS)
    12:55:13.433
    12:55:13.434 P=1425556513.403614, s=288765600856048 type=ParameterStatus, B -> F
    12:55:13.435 PARAMETER STATUS name='integer_datetimes', value='on'
    12:55:13.436
    12:55:13.437 P=1425556513.403614, s=288765600856048 type=ParameterStatus, B -> F
    12:55:13.439 PARAMETER STATUS name='IntervalStyle', value='postgres'
    12:55:13.440
    

    Saturday, February 7, 2015

    plpgsql_check is available for Microsoft Windows

    I compiled the plpgsql_check by Visual Studio 2010 Express. The dll for PostgreSQL 9.2, 9.3, 9.4 and x86 and x64 platform are available in zip archive http://pgsql.cz/files/plpgsql_check-1.0.1-1-mswin.zip.

    Installation

    1. Download, unzip and choose related dll file
    2. rename to plpgsql_check.dll and copy to PostgreSQL's lib directory (Program Files/PostgreSQL/9.2/lib)
    3. copy plpgsql_check-1.0.sql and plpgsql_check.control to PostgreSQL's share/extension directory (PostgreSQL/9.2/share/extension).

    It can needed installed a Microsoft Visual C++ 2010 SP1 Redistributable Package http://www.microsoft.com/en-us/download/details.aspx?id=8328.

    Please, check it.

    Wednesday, February 4, 2015

    Simple multicolumn ouput in psql

    There are interesting idea on Depesz's blog .

    But some possibility has a Linux itself. There is simple pager column.

    You can try (blogger engine break formatting):
    postgres=# \pset tuples_only
    postgres=# \setenv PAGER column
    
    postgres=# select typname from pg_type limit 100;
     bool   pg_type  line   _bool   _varchar  _inet   _numeric
     bytea   pg_attribute  _line   _bytea   _int8   _cidr   timetz
     char   pg_proc  float4   _char   _point   _cstring  _timetz
     name   pg_class  float8   _name   _lseg   bpchar   bit
     int8   json   abstime  _int2   _path   varchar  _bit
     int2   xml   reltime  _int2vector  _box   date   varbit
     int2vector  _xml   tinterval  _int4   _float4  time   _varbit
     int4   _json   unknown  _regproc  _float8  timestamp  numeric
     regproc  pg_node_tree  circle   _text   _abstime  _timestamp  refcursor
     text   smgr   _circle  _oid   _reltime  _date   _refcursor
     oid   point   money   _tid   _tinterval  _time
     tid   lseg   _money   _xid   _polygon  timestamptz
     xid   path   macaddr  _cid   aclitem  _timestamptz
     cid   box   inet   _oidvector  _aclitem  interval
     oidvector  polygon  cidr   _bpchar  _macaddr  _interval
    
    postgres=# 
    
    It works together with less
    postgres=# \setenv PAGER '(column | less)'
    postgres=# select typname from pg_type;
     ...
    

    Do you know some other nice pagers?

    Monday, February 2, 2015

    template_fdw

    I wrote a template foreign data wrapper. It is very simple FDW, that doesn't allow any DML and SELECT operation over table. It is based on Andrew Dunstan's blackhole FDW. What is benefit of this strange data wrapper? I wrote this for possibility to check plpgsql code that uses temporary tables. plpgsql_check cannot do a static validation of functions that uses temporary tables.

    I have a function test:
    CREATE OR REPLACE FUNCTION public.test()
     RETURNS void
     LANGUAGE plpgsql
    AS $function$
    DECLARE r record;
    BEGIN
      BEGIN
        DELETE FROM foo; -- temp table
      EXCEPTION WHEN OTHERS THEN
        CREATE TEMP TABLE foo(a int, b int);
      END;
      INSERT INTO foo VALUES(10,20);
      FOR r IN SELECT * FROM foo
      LOOP
        RAISE NOTICE '% %', r.a, r.b;
      END LOOP;
    END;
    $function$
    
    This code I cannot to verify with plpgsql_check due dependency on temp table foo:
    postgres=# select plpgsql_check_function('test()', fatal_errors := false);
                               plpgsql_check_function                           
    ----------------------------------------------------------------------------
     error:42P01:5:SQL statement:relation "foo" does not exist
     Query: DELETE FROM foo
     --                 ^
     error:42P01:9:SQL statement:relation "foo" does not exist
     Query: INSERT INTO foo VALUES(10,20)
     --                 ^
     error:42P01:10:FOR over SELECT rows:relation "foo" does not exist
     Query: SELECT * FROM foo
     --                   ^
     error:55000:12:RAISE:record "r" is not assigned yet
     Detail: The tuple structure of a not-yet-assigned record is indeterminate.
     Context: SQL statement "SELECT r.a"
     error:55000:12:RAISE:record "r" is not assigned yet
     Detail: The tuple structure of a not-yet-assigned record is indeterminate.
     Context: SQL statement "SELECT r.b"
    (15 rows)
    

    I can create persistent table foo. But if I forgot to drop this table, I can have lot of problems, but some one can be invisible. So I created "template storage" that disallow any DML or SELECT. This decrease a risk and I can have these fake tables persistent:
    CREATE SERVER template FOREIGN DATA WRAPPER template_fdw;
    CREATE FOREIGN TABLE foo(a int, b int) SERVER template;
    
    postgres=# SELECT * FROM foo;
    ERROR:  cannot read from table "public.foo"
    DETAIL:  Table is template.
    HINT:  Create temp table by statement "CREATE TEMP TABLE foo(LIKE public.foo INCLUDING ALL);"
    
    -- but EXPLAIN is valid
    postgres=# EXPLAIN SELECT * FROM foo;
                          QUERY PLAN                       
    -------------------------------------------------------
     Foreign Scan on foo  (cost=0.00..0.00 rows=0 width=8)
    (1 row)
    
    -- and now I can do static validation
    postgres=# select plpgsql_check_function('test()', fatal_errors := false);
     plpgsql_check_function 
    ------------------------
    (0 rows)
    

    Sunday, January 18, 2015

    how to push parameters to DO statement from command line

    PostgreSQL DO statement doesn't support parametrization. But with psql variables we are able to "inject" do statement safely and we can do it:
    bash-4.1$ cat test.sh 
    #!/bin/bash
    
    echo "
    set myvars.msgcount TO :'msgcount'; 
    DO \$\$ 
    BEGIN 
      FOR i IN 1..current_setting('myvars.msgcount')::int LOOP 
        RAISE NOTICE 'Hello';
      END LOOP; 
    END \$\$" | psql postgres -v msgcount=$1
    

    Usage:
    bash-4.1$ ./test.sh 3
    SET
    Time: 0.386 ms
    NOTICE:  Hello
    NOTICE:  Hello
    NOTICE:  Hello
    DO
    Time: 1.849 ms