Saturday, May 18, 2019

how to don't emulate schema (global) variables

Postgres has not global variables for PLpgSQL. This week I worked for one customer who had emulation of global variables based on temporary tables.

Depends on usage, this implementation can be very slow - more due often using temporary tables, a pg_class, pg_attribute tables can bloat, and these functions are slower and slower. So don't use it. There are alternative, much better solution, based on custom configuration variables:

Original code:

        CREATE OR REPLACE FUNCTION public.setvar(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         create temporary table if not exists variables(varname text unique, value text);
4         insert into variables (varname, value) values(_varname, _value) on conflict (varname) do update set value = _value;
5       end;
6       $function$

        CREATE OR REPLACE FUNCTION public.getvar(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return (select value from variables where varname = _varname);
4       exception when others then
5         return null;
6       end;
7       $function$

There are more slow things: a) creating of table (first time in session can be slow (or very slow if system catalog is bloated), b) handling a exception (inside a safe point is created and dropped every time).

Better code:

        CREATE OR REPLACE FUNCTION public.setvar2(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         perform set_config('variables.' || _varname, _value, false);
4       end
5       $function$

        CREATE OR REPLACE FUNCTION public.getvar2(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return current_setting('variables.' || _varname, true);
4       end;
5       $function$

The differences can be measured by few synthetic benchmarks (attention - because it is tested on fresh postgresql instance, the result is best case for temporary tables solution, reality can be much more worse):

explain analyze select setvar('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.785..256.857 rows=10000 loops=1) |
| Planning Time: 0.169 ms                                                                                                 |
| Execution Time: 258.202 ms                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select getvar('x' || (random()*10)::int) from generate_series(1,10000);
+--------------------------------------------------------------------------------------------------------------------------+
|                                                        QUERY PLAN                                                        |
+--------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=3.898..214.040 rows=10000 loops=1) |
| Planning Time: 0.101 ms                                                                                                  |
| Execution Time: 215.405 ms                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select setvar2('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                       |
+------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.724..86.223 rows=10000 loops=1) |
| Planning Time: 0.140 ms                                                                                                |
| Execution Time: 87.590 ms                                                                                              |
+------------------------------------------------------------------------------------------------------------------------+
(3 řádky)
explain analyze select getvar2('x' || (random()*10)::int) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=1.648..33.300 rows=10000 loops=1) |
| Planning Time: 0.106 ms                                                                                                 |
| Execution Time: 34.639 ms                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

Everything depends on context. If you call function once per top query, miliseconds play nothing. When you call function once per row of bigger report (or inside some complex query), a miliseconds can play big game.


Thursday, May 2, 2019

pspg on Solaris

I fixed some issues and pspg can be used on Solaris too. I found some issues on Solaris side on utf8 support - but it is related just for subset of chars. Due this issues, don't use psql unicode borders.

Thursday, April 18, 2019

new release of plpgsql_check - possibility to check SQL injection issue

Yesterday I released next version of plpgsql_check.

With this release a developer can check some well known patterns of SQL injection vulnerabilities. The code of stored procedures of native languages like PL/SQL, T-SQL or PL/pgSQL is secure, and there is not a risk of SQL injection until dynamic SQL is used (the EXECUTE command in PL/pgSQL). The safe programming requires sanitization of all string variables. Anybody can use functions: quote_literal, quote_ident or format. This check can be slow, so it should be enabled by setting security_warnings parameter:

CREATE OR REPLACE FUNCTION public.foo1(a text)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
DECLARE result text;
BEGIN
  -- secure
  EXECUTE 'SELECT $1' INTO result USING a;
  -- secure
  EXECUTE 'SELECT ' || quote_literal(a) INTO result;
  -- secure
  EXECUTE format('SELECT %L', a) INTO result;
  -- unsecure
  EXECUTE 'SELECT ''' || a || '''' INTO result;
  -- unsecure
  EXECUTE format(e'SELECT \'%s\'', a) INTO result;
  RETURN result;
END;
$function$

postgres=# select * from plpgsql_check_function('foo1');
┌────────────────────────┐
│ plpgsql_check_function │
╞════════════════════════╡
└────────────────────────┘
(0 rows)

postgres=# select * from plpgsql_check_function('foo1', security_warnings => true);
┌─────────────────────────────────────────────────────────────────────────────┐
│                           plpgsql_check_function                            │
╞═════════════════════════════════════════════════════════════════════════════╡
│ security:00000:11:EXECUTE:text type variable is not sanitized               │
│ Query: SELECT 'SELECT ''' || a || ''''                                      │
│ --                           ^                                              │
│ Detail: The EXECUTE expression is SQL injection vulnerable.                 │
│ Hint: Use quote_ident, quote_literal or format function to secure variable. │
│ security:00000:13:EXECUTE:text type variable is not sanitized               │
│ Query: SELECT format(e'SELECT \'%s\'', a)                                   │
│ --                                     ^                                    │
│ Detail: The EXECUTE expression is SQL injection vulnerable.                 │
│ Hint: Use quote_ident, quote_literal or format function to secure variable. │
└─────────────────────────────────────────────────────────────────────────────┘
(10 rows)

Thanks to Adam Bartoszewicz's work, this extension compiled for MS Windows is available https://groups.google.com/d/msg/postgresql-extensions-hacking/U_ZJACaDz60/BYagJi9iDAAJ

Thursday, March 21, 2019

How to split string to array by individual characters?

Postgres has too many features, so sometimes is good to remind some.

Function string_to_array is well known. This function has two or three parameters. If second parameter (delimiter) is null, then input string is separated to array of characters.

postgres=# select string_to_array('Pavel Stěhule',null);
┌───────────────────────────────┐
│        string_to_array        │
╞═══════════════════════════════╡
│ {P,a,v,e,l," ",S,t,ě,h,u,l,e} │
└───────────────────────────────┘
(1 row)

Saturday, March 2, 2019

compiled dll of plpgsql 1.6 for PostgreSQL 10, 11

Adam Bartoszewicz prepared dll. Please, read a message.

Thank you, Adam

pspg is available in Fedora 29, Fedora 30 repository

If you use fresh Fedora distribution, you can install a pspg pager very simply:

dnf install pspg

after this:
export PSQL_PAGER=pspg #for Postgres 11 
export PAGER=pspg
psql dbname

Thursday, February 28, 2019

don't use SQL keywords as PLpgSQL variable names

Yesterday I had a possibility to see some strange runtime error

CREATE OR REPLACE FUNCTION public.fx()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
DECLARE offset int DEFAULT 0;
BEGIN
  RETURN offset + 1;
END;
$function$

postgres=# SELECT fx();
ERROR:  query "SELECT offset + 1" returned 0 columns
CONTEXT:  PL/pgSQL function fx() line 4 at RETURN

What is problem? On first view, the RETURN returns 1 column, so error message is strange.

But any PLpgSQL expression is a SQL expression - more it is SQL SELECT statement. So SELECT OFFSET 1 really returns 0 columns.

The basic issue is on bad name of variable - it is same like SQL reserved keyword. OFFSET is unhappy word.

I wrote new check to plpgsql_check, that raises a warning in this situation.