Pages

Tuesday, March 26, 2013

frequent mistakes in plpgsql design

Hello
last days I had to correct some typical errors in audited code. There are:
  • non-use coalesce:
    -- bad
    BEGIN
      IF x1 IS NOT NULL THEN
        s := s || x1 || ',';
      ELSE
        s := s || 'NULL, ';
      END IF;
      IF x2 IS NOT NULL THEN ...
    
    -- good
    BEGIN
      s := coalesce(x1 || ',', 'NULL,') || ...
    
  • using implicit cast date to text:
    -- bad
    BEGIN
      month = substring(current_date::text FROM 6 FOR 2)::int;
    
    -- good
    BEGIN
      month = substring(to_char(current_date,'YYYY-MM-DD') FROM 6 FOR 2)::int;
    
    -- but better
    BEGIN
      month = EXTRACT(month FROM current_date);
    
  • using EXECUTE instead PERFORM
  • using explicit cursor instead FOR IN SELECT
  • unsecured dynamic SQL:
    --bad
    EXECUTE 'SELECT ' || column_name || ' FROM ' || table_name
               || ' WHERE ' || column_name || e'=\'' || some_variable || e'\''
      INTO var;
    
    --good
    EXECUTE 'SELECT ' quote_ident(column_name) || ' FROM ' || quote_ident(table_name)
               || WHERE ' || quote_ident(column_name) || '=' || quote_literal(some_variable)
      INTO var;
    
    -- or on 8.4 and higher
    EXECUTE 'SELECT ' quote_ident(column_name) || ' FROM ' || table_name::regclass
               || WHERE ' || quote_ident(column_name) || '= $1'
      INTO var
      USING some_variable
    
    --bad
    CREATE OR REPLACE FUNCTION create_table(schemaname varchar, tablename varchar)
    RETURNS void AS $$
    BEGIN
      EXECUTE 'CREATE TABLE ' || coalesce(schemaname || '.','') || tablename;
      RETRUN;
    END;
    $$ LANGUAGE plpgsql;
    
    --good
    CREATE OR REPLACE FUNCTION create_table(schemaname varchar, tablename varchar)
    RETURNS void AS $$
    BEGIN
      EXECUTE 'CREATE TABLE ' || coalesce(quote_ident(schemaname) || '.','') || quote_ident(tablename);
      RETURN;
    END;
    $$ LANGUAGE plpgsql;
    
  • long lines - 100 chars per line is perfect
  • too short functions - Don't wrap any and only one SQL statement
  • Don't use PL/pgSQL for deep recursion calls
    Due internal design principles PL/pgSQL should not be well optimized for recursion calls. PL/pgSQL supports recursion, and for not too deeps calls can provide enough to satisfy sb performance, but it is very slow for deep recursion. Nice example is taken from presentation http://plv8-talk.herokuapp.com. It is perfect example how don't use PL/pgSQL ever.
    -- non recursion form
    CREATE OR REPLACE FUNCTION public.psqlfibnr(n integer)
     RETURNS integer
     LANGUAGE plpgsql
     IMMUTABLE STRICT
    AS $function$
    DECLARE 
      prev1 int = 0;
      prev2 int = 1;
      result int = 0;
    BEGIN
      FOR i IN 1..n
      LOOP
        result := prev1 + prev2;
        prev2 := prev1;
        prev1 := result;
      END LOOP;
      RETURN result;
    END;
    $function$
    
    -- recursion form
    CREATE OR REPLACE FUNCTION public.psqlfibr(n integer)
     RETURNS integer
     LANGUAGE plpgsql
     IMMUTABLE STRICT
    AS $function$
    BEGIN
      IF n < 2 THEN
        RETURN n;
      END IF;
      RETURN psqlfib(n-1) + psqlfib(n-2);
    END;
    $function$
    
    -- non recursive calls
    postgres=# select n, psqlfibnr(n)
                  from generate_series(0,35,5) as n;
     n  | psqlfibnr 
    ----+-----------
      0 |         0
      5 |         5
     10 |        55
     15 |       610
     20 |      6765
     25 |     75025
     30 |    832040
     35 |   9227465
    (8 rows)
    
    Time: 1.178 ms
    
    -- recursive calls
    postgres=# select n, psqlfib(n)
                  from generate_series(0,35,5) as n;
     n  | psqlfib
    ----+---------
      0 |       0
      5 |       5
     10 |      55
     15 |     610
     20 |    6765
     25 |   75025
     30 |  832040
     35 | 9227465
    (8 rows)
    
    Time: 282992.820 ms
    
    Speed of recursion calls in PL/pgSQL is not comparatable with recursion optimized languages like Javascript.

Don't use procedural code when you can use natural SQL tools. Use UNIQUE INDEX instead custom triggers for ensuring unequivocalness.
-- bad usage
CREATE OR REPLACE FUNCTION ensure_unique()
RETURNS TRIGGER AS $$
BEGIN
  IF EXISTS(SELECT * FROM tab WHERE id = NEW.id) THEN
    RAISE ERROR 'id is not unique';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- good (use unique index)
CREATE UNIQUE INDEX ON tab(id);

4 comments:

  1. That last example needs 1 or 2 corrections. The input parameter "schema" needs to be "schemaname". And could you use "schemaname varchar DEFAULT 'public')"? Or if not, default with a NULL schemaname so it falls back to the schema search path?

    ReplyDelete
  2. Actually, I can see it already deals with that with the coalesce. Ignore that last part.

    ReplyDelete
  3. 'to short function' in our system we wrap all SQL requests in DB procedure. Usually even very simple 'select' afer few month can be tranformed in something more complex. My opinion is that - DB procedures should by developed by DB people , typical PHP/Java developer knows to little to create optimal DB queries.

    ReplyDelete