Pages

Sunday, August 29, 2010

performance issue of SQL functions

I though so sql functions is always faster than plpgsql function. But it's not true. Only when sql function is inlined, then function is faster, else plpgsql can be faster. Now I know, so sql functions where parameter is twice and more used are not inlined - because it means double parameter evaluation. you can see:
CREATE OR REPLACE FUNCTION is_empty(text) 
RETURNS bool AS $$ 
SELECT $1 IS NULL OR $1 = '' 
$$ LANGUAGE sql;
This function returns true when parameter is empty string or is null. The first parameter is used twice and then the body of this function isn't inlined.
postgres=# EXPLAIN VERBOSE SELECT count(is_empty(CASE WHEN random() < 0.5 THEN NULL ELSE 'x' END)) FROM generate_series(1,100000);
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12.50..12.77 rows=1 width=0)
   Output: count(is_empty(CASE WHEN (random() < 0.5::double precision) THEN NULL::text ELSE 'x'::text END))
   ->  Function Scan on pg_catalog.generate_series  (cost=0.00..10.00 rows=1000 width=0)
         Output: generate_series
         Function Call: generate_series(1, 100000)
(5 rows)
The execution time is 458ms. When I modify the body for one time paremeter using:
CREATE OR REPLACE FUNCTION is_empty(text) 
RETURNS bool AS $$
SELECT COALESCE($1,'') = '' 
$$ LANGUAGE sql;
Then function is inlined:
postgres=# EXPLAIN VERBOSE SELECT count(is_empty(CASE WHEN random() < 0.5 THEN NULL ELSE 'x' END)) FROM generate_series(1,100000);
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12.50..12.52 rows=1 width=0)
   Output: count((COALESCE(CASE WHEN (random() < 0.5::double precision) THEN NULL::text ELSE 'x'::text END, ''::text) = ''::text))
   ->  Function Scan on pg_catalog.generate_series  (cost=0.00..10.00 rows=1000 width=0)
         Output: generate_series
         Function Call: generate_series(1, 100000)
(5 rows)
and execution time is only 68ms (5 x faster than not inlined function). Attention: this note is valid for some more complex parameters - like volatile functions. The life isn't simple - for basic parameters (like columns or simple expression), the most simple SQL function (in my article function with OR operator and twice used parameter) can be better - wirking better with planner.

2 comments:

  1. Good point!
    Thank you!

    ReplyDelete
  2. Hm, is there any documentation, if when the inlining is done and when not?

    ReplyDelete