Monday, August 19, 2019

show csv files by pspg pager

Long time I am thinking about support CSV format inside pspg. Still I have some other task, and I have not a quite time for this job. But I found nice application miller, that can be used for formatting CSV to MySQL tables format, and this format is supported by pspg already.

Data:
id,okres_id,nazev,pocet_muzu,pocet_zen,vek_muzu,vek_zen
1,CZ0100,Praha,608316,640710,39.8,43.2
2,CZ0201,Benešov,7875,8507,39.2,41.9
3,CZ0201,Bernartice,108,115,45.9,43.3
4,CZ0201,Bílkovice,93,89,41.4,46.8
5,CZ0201,Blažejovice,52,48,44.6,50.8
6,CZ0201,Borovnice,39,37,45.6,49.5
7,CZ0201,Bukovany,364,372,38.1,38.8
8,CZ0201,Bystřice,2124,2096,38.5,41.1
9,CZ0201,Ctiboř,55,50,38.6,42.0

Command:
mlr --icsv --opprint --barred put '' obce.csv | pspg --force-uniborder

Result:

Itegration with mc
  1. copy file from mc.ext from /etc/mc to ~/.config/mc directory
  2. append
    #csv
    regex/\.csv
        View=mlr --icsv --opprint --barred put '' %f | pspg --force-uniborder
  3. restart mc
     

Wednesday, July 24, 2019

new releases of plpgsql_check and pspg

Here are plpgsql_check and pspg.

I fixed few significant errors in profiler integrated to plpgsql_check. I hope, so profiler should to work well now.

The bugs fixed in pspg are minor bugfixes.

Monday, May 20, 2019

new bugfix release of plpgsql_check

Last week I fixed some critical bugs on profiler integrated to plpgsql_check.

Now, integrated profiler is tested on real bigger project.

Saturday, May 18, 2019

how to don't emulate schema (global) variables

Postgres has not global variables for PLpgSQL. This week I worked for one customer who had emulation of global variables based on temporary tables.

Depends on usage, this implementation can be very slow - more due often using temporary tables, a pg_class, pg_attribute tables can bloat, and these functions are slower and slower. So don't use it. There are alternative, much better solution, based on custom configuration variables:

Original code:

        CREATE OR REPLACE FUNCTION public.setvar(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         create temporary table if not exists variables(varname text unique, value text);
4         insert into variables (varname, value) values(_varname, _value) on conflict (varname) do update set value = _value;
5       end;
6       $function$

        CREATE OR REPLACE FUNCTION public.getvar(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return (select value from variables where varname = _varname);
4       exception when others then
5         return null;
6       end;
7       $function$

There are more slow things: a) creating of table (first time in session can be slow (or very slow if system catalog is bloated), b) handling a exception (inside a safe point is created and dropped every time).

Better code:

        CREATE OR REPLACE FUNCTION public.setvar2(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         perform set_config('variables.' || _varname, _value, false);
4       end
5       $function$

        CREATE OR REPLACE FUNCTION public.getvar2(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return current_setting('variables.' || _varname, true);
4       end;
5       $function$

The differences can be measured by few synthetic benchmarks (attention - because it is tested on fresh postgresql instance, the result is best case for temporary tables solution, reality can be much more worse):

explain analyze select setvar('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.785..256.857 rows=10000 loops=1) |
| Planning Time: 0.169 ms                                                                                                 |
| Execution Time: 258.202 ms                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select getvar('x' || (random()*10)::int) from generate_series(1,10000);
+--------------------------------------------------------------------------------------------------------------------------+
|                                                        QUERY PLAN                                                        |
+--------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=3.898..214.040 rows=10000 loops=1) |
| Planning Time: 0.101 ms                                                                                                  |
| Execution Time: 215.405 ms                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select setvar2('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                       |
+------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.724..86.223 rows=10000 loops=1) |
| Planning Time: 0.140 ms                                                                                                |
| Execution Time: 87.590 ms                                                                                              |
+------------------------------------------------------------------------------------------------------------------------+
(3 řádky)
explain analyze select getvar2('x' || (random()*10)::int) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=1.648..33.300 rows=10000 loops=1) |
| Planning Time: 0.106 ms                                                                                                 |
| Execution Time: 34.639 ms                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

Everything depends on context. If you call function once per top query, miliseconds play nothing. When you call function once per row of bigger report (or inside some complex query), a miliseconds can play big game.


Thursday, May 2, 2019

pspg on Solaris

I fixed some issues and pspg can be used on Solaris too. I found some issues on Solaris side on utf8 support - but it is related just for subset of chars. Due this issues, don't use psql unicode borders.

Thursday, April 18, 2019

new release of plpgsql_check - possibility to check SQL injection issue

Yesterday I released next version of plpgsql_check.

With this release a developer can check some well known patterns of SQL injection vulnerabilities. The code of stored procedures of native languages like PL/SQL, T-SQL or PL/pgSQL is secure, and there is not a risk of SQL injection until dynamic SQL is used (the EXECUTE command in PL/pgSQL). The safe programming requires sanitization of all string variables. Anybody can use functions: quote_literal, quote_ident or format. This check can be slow, so it should be enabled by setting security_warnings parameter:

CREATE OR REPLACE FUNCTION public.foo1(a text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE result text;
BEGIN
  -- secure
  EXECUTE 'SELECT $1' INTO result USING a;
  -- secure
  EXECUTE 'SELECT ' || quote_literal(a) INTO result;
  -- secure
  EXECUTE format('SELECT %L', a) INTO result;
  -- unsecure
  EXECUTE 'SELECT ''' || a || '''' INTO result;
  -- unsecure
  EXECUTE format(e'SELECT \'%s\'', a) INTO result;
  RETURN result;
END;
$function$

postgres=# select * from plpgsql_check_function('foo1');
┌────────────────────────┐
│ plpgsql_check_function │
╞════════════════════════╡
└────────────────────────┘
(0 rows)

postgres=# select * from plpgsql_check_function('foo1', security_warnings => true);
┌─────────────────────────────────────────────────────────────────────────────┐
│                           plpgsql_check_function                            │
╞═════════════════════════════════════════════════════════════════════════════╡
│ security:00000:11:EXECUTE:text type variable is not sanitized               │
│ Query: SELECT 'SELECT ''' || a || ''''                                      │
│ --                           ^                                              │
│ Detail: The EXECUTE expression is SQL injection vulnerable.                 │
│ Hint: Use quote_ident, quote_literal or format function to secure variable. │
│ security:00000:13:EXECUTE:text type variable is not sanitized               │
│ Query: SELECT format(e'SELECT \'%s\'', a)                                   │
│ --                                     ^                                    │
│ Detail: The EXECUTE expression is SQL injection vulnerable.                 │
│ Hint: Use quote_ident, quote_literal or format function to secure variable. │
└─────────────────────────────────────────────────────────────────────────────┘
(10 rows)

Thanks to Adam Bartoszewicz's work, this extension compiled for MS Windows is available https://groups.google.com/d/msg/postgresql-extensions-hacking/U_ZJACaDz60/BYagJi9iDAAJ

Thursday, March 21, 2019

How to split string to array by individual characters?

Postgres has too many features, so sometimes is good to remind some.

Function string_to_array is well known. This function has two or three parameters. If second parameter (delimiter) is null, then input string is separated to array of characters.

postgres=# select string_to_array('Pavel Stěhule',null);
┌───────────────────────────────┐
│        string_to_array        │
╞═══════════════════════════════╡
│ {P,a,v,e,l," ",S,t,ě,h,u,l,e} │
└───────────────────────────────┘
(1 row)