Saturday, May 18, 2019

how to don't emulate schema (global) variables

Postgres has not global variables for PLpgSQL. This week I worked for one customer who had emulation of global variables based on temporary tables.

Depends on usage, this implementation can be very slow - more due often using temporary tables, a pg_class, pg_attribute tables can bloat, and these functions are slower and slower. So don't use it. There are alternative, much better solution, based on custom configuration variables:

Original code:

        CREATE OR REPLACE FUNCTION public.setvar(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         create temporary table if not exists variables(varname text unique, value text);
4         insert into variables (varname, value) values(_varname, _value) on conflict (varname) do update set value = _value;
5       end;
6       $function$

        CREATE OR REPLACE FUNCTION public.getvar(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return (select value from variables where varname = _varname);
4       exception when others then
5         return null;
6       end;
7       $function$

There are more slow things: a) creating of table (first time in session can be slow (or very slow if system catalog is bloated), b) handling a exception (inside a safe point is created and dropped every time).

Better code:

        CREATE OR REPLACE FUNCTION public.setvar2(_varname text, _value text)
         RETURNS void
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         perform set_config('variables.' || _varname, _value, false);
4       end
5       $function$

        CREATE OR REPLACE FUNCTION public.getvar2(_varname text)
         RETURNS text
         LANGUAGE plpgsql
1       AS $function$
2       begin
3         return current_setting('variables.' || _varname, true);
4       end;
5       $function$

The differences can be measured by few synthetic benchmarks (attention - because it is tested on fresh postgresql instance, the result is best case for temporary tables solution, reality can be much more worse):

explain analyze select setvar('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.785..256.857 rows=10000 loops=1) |
| Planning Time: 0.169 ms                                                                                                 |
| Execution Time: 258.202 ms                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select getvar('x' || (random()*10)::int) from generate_series(1,10000);
+--------------------------------------------------------------------------------------------------------------------------+
|                                                        QUERY PLAN                                                        |
+--------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=3.898..214.040 rows=10000 loops=1) |
| Planning Time: 0.101 ms                                                                                                  |
| Execution Time: 215.405 ms                                                                                               |
+--------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

explain analyze select setvar2('x' || (random()*10)::int, 'value' || random()*10000) from generate_series(1,10000);
+------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                       |
+------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..287.50 rows=1000 width=4) (actual time=1.724..86.223 rows=10000 loops=1) |
| Planning Time: 0.140 ms                                                                                                |
| Execution Time: 87.590 ms                                                                                              |
+------------------------------------------------------------------------------------------------------------------------+
(3 řádky)
explain analyze select getvar2('x' || (random()*10)::int) from generate_series(1,10000);
+-------------------------------------------------------------------------------------------------------------------------+
|                                                       QUERY PLAN                                                        |
+-------------------------------------------------------------------------------------------------------------------------+
| Function Scan on generate_series  (cost=0.00..275.00 rows=1000 width=32) (actual time=1.648..33.300 rows=10000 loops=1) |
| Planning Time: 0.106 ms                                                                                                 |
| Execution Time: 34.639 ms                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------+
(3 řádky)

Everything depends on context. If you call function once per top query, miliseconds play nothing. When you call function once per row of bigger report (or inside some complex query), a miliseconds can play big game.


5 Comments:

At May 19, 2019 at 3:22 AM , Blogger Carlos Blos said...

Hi, I have a similar function (wrong one) for session variables. Can I use your suggestion for function for session just sending True on last parameter (to be local)? Is there any problem using it this way?

 
At May 19, 2019 at 4:38 AM , Blogger Pavel Stěhule said...

When local (last parameter) is true, then a value is limited to current transaction.

 
At May 19, 2019 at 5:47 AM , Blogger Carlos Blos said...

I saw just now the documentation, the last parameter name is is_local, so I misunderstood. Thanks so much for replying.

 
At May 25, 2019 at 1:10 AM , Anonymous Anonymous said...

Is there any way to store data in shared memory without casting to text?

 
At May 25, 2019 at 4:56 AM , Blogger Pavel Stěhule said...

It is not possible without extensions.

 

Post a Comment

Subscribe to Post Comments [Atom]

<< Home