Sunday, May 17, 2020

pspg - two new features

I added two new features to pspg.

The first feature is the option --skip-columns-like. Currently it can be used only when browsing CSV or TSV documents. When this option is used, the specified columns (matched by a substring of the column name) are not displayed. This can be useful for PostGIS data in psql; often it makes no sense to display geometry data. If you have a wide table with lots of columns, it can be tedious to write the column list every time. Example:

export PSPG="--skip-columns-like='geom$'"

Then, when you use psql, you can write a query like SELECT * FROM roads, and all columns like roads_geom will be removed from the presented table. You have to switch the output format to CSV with \pset format csv (available since PostgreSQL 12).
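A minimal psql session with this setting might look like the following sketch (the roads table is only illustrative, and the pager of course has to be pspg):

```sql
-- in psql 12 or newer, with the PSPG variable above exported
\setenv PAGER pspg
\pset format csv
select * from roads;  -- pspg hides every column whose name matches 'geom$'
```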

The second new feature is good for wide columns too. If a column is wide but its name is not, you may see only white space, and you have to scroll left or right to see the column name. Since the last commit, the column name is visible almost all the time - whenever it is possible.

Wednesday, May 13, 2020

dbms_sql

Like a lot of other people, I had different plans for this spring. Instead of a walk to Santiago, I sit at home. On the other hand, it is time to finish some repairs on my house and write some code.

I finished a prototype of the dbms_sql extension. This extension implements part of the interface of the dbms_sql package. The design of this package reflects the requirements and possibilities of 30 years ago. Today there is no need for similar functionality in Postgres, and probably not in Oracle either. But the goal of this extension is similar to the goal of the Orafce extension - to reduce the work necessary for porting applications from Oracle to Postgres. Only part of the interface is implemented, but some interesting parts, like bulk DML operations, are included.

On the other hand, this extension can be a good example of implementation patterns for PostgreSQL extensions. Because the implemented interface is relatively rich, it uses a lot of PostgreSQL's internal API.

This extension will be included in Orafce in the future. For now it is a separate extension, for two reasons:

  • the code is not mature (I just wrote a prototype), but it should be good enough for testing and usage (the code is not too complex).
  • it requires PostgreSQL 11 or higher, while Orafce requires 9.4 or higher. So I'll wait until the end of support for PostgreSQL 10.

    Examples:
    do $$
    declare
      c int;
      a int[];
      b varchar[];
      ca numeric[];
    begin
      c := dbms_sql.open_cursor();
      call dbms_sql.parse(c, 'insert into foo values(:a, :b, :c)');
      a := ARRAY[1, 2, 3, 4, 5];
      b := ARRAY['Ahoj', 'Nazdar', 'Bazar'];
      ca := ARRAY[3.14, 2.22, 3.8, 4];
    
      call dbms_sql.bind_array(c, 'a', a, 2, 3);
      call dbms_sql.bind_array(c, 'b', b, 3, 4);
      call dbms_sql.bind_array(c, 'c', ca);
      raise notice 'inserted rows %', dbms_sql.execute(c);
      call dbms_sql.close_cursor(c);
    end;
    $$;
    
    
    do $$
    declare
      c int;
      a int[];
      b varchar[];
      ca numeric[];
    begin
      c := dbms_sql.open_cursor();
      call dbms_sql.parse(c, 'select i, ''Ahoj'' || i, i + 0.003 from generate_series(1, 35) g(i)');
      call dbms_sql.define_array(c, 1, a, 10, 1);
      call dbms_sql.define_array(c, 2, b, 10, 1);
      call dbms_sql.define_array(c, 3, ca, 10, 1);
    
      perform dbms_sql.execute(c);
      while dbms_sql.fetch_rows(c) > 0
      loop
        call dbms_sql.column_value(c, 1, a);
        call dbms_sql.column_value(c, 2, b);
        call dbms_sql.column_value(c, 3, ca);
        raise notice 'a = %', a;
        raise notice 'b = %', b;
        raise notice 'c = %', ca;
      end loop;
      call dbms_sql.close_cursor(c);
    end;
    $$;
    

Friday, April 24, 2020

pgcli integrated support for pspg

pgcli is a nice TUI client for Postgres. Compared to psql, it has a significantly more modern look and more aggressive autocomplete support.

When I tested this client, I found one significant issue (compared to psql). It was pretty slow whenever anything larger than a small result had to be displayed. The slowest operation is formatting the result into tabular form. This is a hard task for an interpreted language, and pgcli is written in Python. I proposed moving this operation to pspg. Yesterday the necessary patch was committed.

It requires some pgcli configuration - see the ~/.config/pgcli/config file:
    pager = pspg --csv --rr=2 --quit-if-one-screen --ignore-case --csv-header=on --double-header
    table_format = csv
    row_limit = 0
    

Now the result is processed at a speed similar to psql, and working with pgcli is much more comfortable. Thanks to all the authors of pgcli.

Thursday, April 16, 2020

split string to table

Sometimes life can be funny. This week there was a question about the XMLTABLE function on a Postgres mailing list. It was related to a migration from Oracle to Postgres.

    CREATE TABLE user_pool_clean (
        fk_user_pool_pk bytea NOT NULL,
        user_id character varying(255) NOT NULL,
        email_address character varying(250),
        is_mil numeric,
        is_civ numeric,
        is_ctr numeric,
        is_gov numeric,
        is_edu numeric,
        role_id character varying(50),
        user_profile_id character varying(50),
        service_branch_id character varying(50),
        mil_pay_grade_id character varying(50),
        my_auds character varying(4000),
        my_orgs character varying(4000),
        processed character(1) DEFAULT 'N'::bpchar NOT NULL
    );
    
    insert into user_pool_clean
    values('995CECDC1881375DE05312A270C7CF56','10015706','noemail@fter.dbrefresh.mil',0,0,0,0,0,'1','4','700006','3','1706882','1707720','Y');
    
    insert into user_pool_clean
    values('995CECDC1905375DE05312A270C7CF56','10015848','noemail@fter.dbrefresh.mil',0,0,0,0,0,'1','3','700015','11','
    1705562,1708486','1710621','Y');
    
    
    SQL> SELECT upc.is_mil,TRIM(column_value) src
             FROM   user_pool_clean upc
                   ,xmltable(('"'|| REPLACE( upc.my_auds, ',', '","')|| '"'));
    
        IS_MIL SRC
    ---------- ---------------
             0 1705562        -- output from the Oracle database
             0 1708486
             0 1706882
    

It is hard to understand why somebody would use the xmltable function here. After some googling, I found that this is a workaround for generating rows from a string. The original input string is modified into XQuery sequence format. When xmltable has no other clauses and only one input parameter, that parameter is used as an XQuery expression. Unfortunately, nothing similar is possible in Postgres. Postgres uses the libxml2 library for XML support, and this library supports only the older XPath language; it doesn't support the more powerful but more complex XQuery language.

Fortunately, Postgres has better tools for splitting a string into rows (and they can be used cleanly, without workarounds).

The first possibility is using the string_to_array and unnest functions:

    postgres=# select is_mil, unnest(string_to_array(my_auds, ',')) from user_pool_clean ;
    ┌────────┬─────────┐
    │ is_mil │ unnest  │
    ╞════════╪═════════╡
    │      0 │ 1706882 │
    │      0 │        ↵│
    │        │ 1705562 │
    │      0 │ 1708486 │
    └────────┴─────────┘
    (3 rows)
    

The result is broken by a newline character (it is probably cleaned up by XQuery evaluation, because newlines are insignificant in XML), so we should clean this result:

    postgres=# select is_mil, trim(e' \n' from unnest(string_to_array(my_auds, ','))) from user_pool_clean ;
    ┌────────┬─────────┐
    │ is_mil │  btrim  │
    ╞════════╪═════════╡
    │      0 │ 1706882 │
    │      0 │ 1705562 │
    │      0 │ 1708486 │
    └────────┴─────────┘
    (3 rows)
    

Now the result is perfect. It is nice to see the power (and possibilities) of Postgres table functions. A form with a lateral join, like the original query, is possible too:
    postgres=# select is_mil, trim(e' \n' from  u) 
                 from user_pool_clean,
                      unnest(string_to_array(my_auds, ',')) u ;
    ┌────────┬─────────┐
    │ is_mil │  btrim  │
    ╞════════╪═════════╡
    │      0 │ 1706882 │
    │      0 │ 1705562 │
    │      0 │ 1708486 │
    └────────┴─────────┘
    (3 rows)
    

Another way, maybe a little bit slower (but with more possibilities), is using the regexp_split_to_table function:

    postgres=# select is_mil, trim(e' \n' from  u) 
                from user_pool_clean, 
                     regexp_split_to_table(my_auds, ',') u ;
    ┌────────┬─────────┐
    │ is_mil │  btrim  │
    ╞════════╪═════════╡
    │      0 │ 1706882 │
    │      0 │ 1705562 │
    │      0 │ 1708486 │
    └────────┴─────────┘
    (3 rows)
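One of those extra possibilities: because the separator of regexp_split_to_table is a regular expression, it can consume the whitespace around the commas itself, so only the whole string still needs a trim. A sketch against the same table:

```sql
-- split on commas together with any surrounding whitespace
select is_mil,
       regexp_split_to_table(trim(e' \n' from my_auds), '\s*,\s*') as src
  from user_pool_clean;
```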
    

Monday, April 6, 2020

new pspg - version 3.0.2 with named pipe support

Today I released version 3.0.2. The UI is the same as in older versions. What's new is the possibility to communicate with a client via a named pipe. Now pspg works well with redirected content (the \o command in psql), and psql's watch mode is supported too.

Example:
    #create named pipe
    mkfifo ~/pipe
    
    #use psql
    [pavel@nemesis ~]$ psql
    psql (13devel)
    Type "help" for help.
    
    postgres=# \o ~/pokus
    postgres=# select * from obce limit 10;
    postgres=# --redirect content to pipe
    postgres=# \o ~/pipe
    postgres=# select current_timestamp;
    postgres=# -- repeat it every 1sec
    postgres=# \watch 1
    

In another terminal, run pspg and read from the pipe:
    pspg -f ~/pipe
    

By default, pspg ends when the writer program closes the pipe. This behaviour can be changed by setting the option --hold-stream to 1 (reopen the pipe) or 2 (keep it open nonstop).




Thursday, April 2, 2020

simple monitoring of PostgreSQL incremental statistics

Everybody should know PostgreSQL's stats views: pg_stat_user_tables, pg_stat_database. There are a lot of interesting values in them. For better interpretation, we need the difference between these values over some time interval. It is impossible to get the previous value with plain SQL, but in Postgres we can use PL/pgSQL, and we can use custom configuration variables as session variables. I wrote a buffering function that returns the difference between the current value and the buffered one, and saves the current value into the buffer:

    CREATE OR REPLACE FUNCTION public.buffer_dif(text, bigint)
     RETURNS bigint
     LANGUAGE plpgsql
    AS $function$
    DECLARE r bigint;
    BEGIN
      -- $1 names a custom configuration variable used as the buffer
      r := $2 - current_setting($1, true)::bigint;
      -- remember the current value for the next call
      PERFORM set_config($1, $2::text, false);
      RETURN r;
    END;
    $function$
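Note that on the first call nothing is buffered yet - current_setting($1, true) returns NULL for an unset custom variable - so the first sample shows NULL differences. A possible variant (the name buffer_dif0 is mine) that reports 0 on the first call instead:

```sql
CREATE OR REPLACE FUNCTION public.buffer_dif0(text, bigint)
 RETURNS bigint
 LANGUAGE plpgsql
AS $function$
DECLARE r bigint;
BEGIN
  -- when no value is buffered yet, use the current value as the baseline
  r := $2 - coalesce(current_setting($1, true)::bigint, $2);
  PERFORM set_config($1, $2::text, false);
  RETURN r;
END;
$function$;
```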
    

Now I can prepare the query:
    SELECT datname,
           buffer_dif('x.xact_commit_' || datname, xact_commit) AS commits, 
           buffer_dif('x.tup_inserted_' || datname, tup_inserted) AS inserted
      FROM pg_stat_database
     WHERE datname IS NOT NULL;
    

I can repeat this query every second with the psql command \watch:

    postgres=# \watch 1
    Čt 2. dubna 2020, 21:33:59 (every 1s)
    
    ┌───────────┬─────────┬──────────┐
    │  datname  │ commits │ inserted │
    ╞═══════════╪═════════╪══════════╡
    │ postgres  │       1 │        0 │
    │ template1 │       0 │        0 │
    │ template0 │       0 │        0 │
    └───────────┴─────────┴──────────┘
    (3 rows)
    
    Čt 2. dubna 2020, 21:34:00 (every 1s)
    
    ┌───────────┬─────────┬──────────┐
    │  datname  │ commits │ inserted │
    ╞═══════════╪═════════╪══════════╡
    │ postgres  │       1 │        0 │
    │ template1 │       0 │        0 │
    │ template0 │       0 │        0 │
    └───────────┴─────────┴──────────┘
    (3 rows)
    
    Čt 2. dubna 2020, 21:34:01 (every 1s)
    
    ┌───────────┬─────────┬──────────┐
    │  datname  │ commits │ inserted │
    ╞═══════════╪═════════╪══════════╡
    │ postgres  │       1 │        0 │
    │ template1 │       0 │        0 │
    │ template0 │       0 │        0 │
    └───────────┴─────────┴──────────┘
    (3 rows)
    

Now I can see one-second differences of the incremental metrics without any other tools.

Tuesday, March 31, 2020

Postgres 13, new family of polymorphic types - anycompatible

One very interesting PostgreSQL feature is polymorphic types. It is interesting mainly for people who write extensions.

Example: I can write my own isnull function:
    CREATE OR REPLACE FUNCTION isnull(anyelement, anyelement)
    RETURNS anyelement AS $$
    SELECT coalesce($1, $2)
    $$ LANGUAGE sql;
    
    postgres=# SELECT public.isnull(NULL, 1);
    ┌────────┐
    │ isnull │
    ╞════════╡
    │      1 │
    └────────┘
    (1 row)
    
    postgres=# SELECT public.isnull(NULL, CURRENT_DATE);
    ┌────────────┐
    │   isnull   │
    ╞════════════╡
    │ 2020-03-31 │
    └────────────┘
    (1 row)
    
    -- but
    postgres=# SELECT public.isnull(1, 1.1);
    ERROR:  function public.isnull(integer, numeric) does not exist
    LINE 1: SELECT public.isnull(1, 1.1);
                   ^
    HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
    

It works for all supported types. One big limitation of the any* type family is its rigidity: it works for all types, but in the previous example the first and second parameters must have exactly the same type. Similar built-in functions are more tolerant - usually they require only the same type category. This limitation is removed by the new family of polymorphic types, anycompatible. When there are multiple arguments of this type, the common type is calculated and all anycompatible arguments are cast to this common type:

    CREATE OR REPLACE FUNCTION isnull(anycompatible, anycompatible)
    RETURNS anycompatible AS $$
    SELECT coalesce($1, $2)
    $$ LANGUAGE sql;
    
    postgres=# SELECT public.isnull(1, 1.1);
    ┌────────┐
    │ isnull │
    ╞════════╡
    │      1 │
    └────────┘
    (1 row)
    
    postgres=# SELECT public.isnull(NULL, 1.1);
    ┌────────┐
    │ isnull │
    ╞════════╡
    │    1.1 │
    └────────┘
    (1 row)
    

Now it works as expected.

This feature is interesting mainly for authors of extensions that have to emulate other databases (like Orafce and similar).