Pages

Sunday, October 23, 2011

Change a unknown record's field in PL/pgSQL

Hello
Week ago was a request on stackoverflow about change of composite variable's field by dynamic SQL. I wrote a first very simply solution:
CREATE OR REPLACE FUNCTION public.setfield(a anyelement, text, text)
RETURNS anyelement
LANGUAGE plpgsql
AS $function$
begin
  create temp table aux as select $1.*;
  execute 'update aux set ' || quote_ident($2) || ' = ' || quote_literal($3);
  select into $1 * from aux;
  drop table aux;
  return $1;
end;
$function$
It's works, but it's slow and it consumes lot of shared memory (impracticable for repeated using in one transaction). Next version was better - it is faster and isn't hungry:
CREATE OR REPLACE FUNCTION x.setfield2(anyelement, text, text)
 RETURNS anyelement
 LANGUAGE plpgsql
AS $function$
DECLARE 
  _name text;
  _values text[];
  _value text;
  _attnum int;
BEGIN
  FOR _name, _attnum
     IN SELECT a.attname, a.attnum
           FROM pg_catalog.pg_attribute a 
          WHERE a.attrelid = (SELECT typrelid
                                 FROM pg_type
                                WHERE oid = pg_typeof($1)::oid) 
  LOOP
    IF _name = $2 THEN
      _value := $3;
    ELSE
      EXECUTE 'SELECT (($1).' || quote_ident(_name) || ')::text' INTO _value USING $1;
    END IF;
    _values[_attnum] :=  COALESCE('"' || replace(replace(_value, '"', '""'), '''', '''''') || '"', '');
  END LOOP;
  EXECUTE 'SELECT (' || pg_typeof($1)::text || '''(' || array_to_string(_values,',') || ')'').*' INTO $1; 
  RETURN $1;
END;
$function$;
There are a few slower points: EXECUTE in loop, array's update in loop. But Erwin Brandstetter found a probably best and most simply solution - there are no more ways on PL/pgSQL level.
CREATE OR REPLACE FUNCTION public.setfield3(anyelement, text, text)
RETURNS anyelement
AS $body$
DECLARE
 _list text;

BEGIN
_list := (
   SELECT string_agg(x.fld, ',')
   FROM   (
      SELECT CASE WHEN a.attname = $2
              THEN quote_literal($3)
              ELSE quote_ident(a.attname)
             END AS fld
      FROM   pg_catalog.pg_attribute a 
      WHERE  a.attrelid = (SELECT typrelid
                           FROM   pg_type
                           WHERE  oid = pg_typeof($1)::oid) 
      ORDER BY a.attnum
   ) x
);

EXECUTE '
SELECT ' || _list || '
FROM   (SELECT $1.*) x'
USING  $1
INTO   $1;

RETURN $1;
END;
$body$ LANGUAGE plpgsql;
There is no PL/pgSQL's loop and there is just one EXECUTE. More - this code is just simple - without lot of string (quoting) operations.

Update: last Ervin' superfast version
CREATE FUNCTION f_setfield(INOUT _comp_val anyelement, _field text, _val text)
  RETURNS anyelement AS
$func$
BEGIN

EXECUTE 'SELECT ' || array_to_string(ARRAY(
      SELECT CASE WHEN attname = _field
                THEN '$2'
                ELSE '($1).' || quote_ident(attname)
             END AS fld
      FROM   pg_catalog.pg_attribute
      WHERE  attrelid = pg_typeof(_comp_val)::text::regclass
      AND    attnum > 0
      AND    attisdropped = FALSE
      ORDER  BY attnum
      ), ',')
USING  _comp_val, _val
INTO   _comp_val;

END
$func$ LANGUAGE plpgsql;
Very nice, Erwin!

5 comments:

  1. You can also move the ORDER BY a.attnum into the string_agg directly, so will eliminate the subquery :D

    ReplyDelete
  2. @Valentine: Yeah, yet another small improvement. Nice catch!

    ReplyDelete
  3. @Valentine: Nice catch. :) However, while that would remove the subquery, it would hardly simplifies the syntax.

    More importantly, the query would be much slower. ~40 % slowdown in my tests with PostgreSQL 9.0. If you should come up with different results, I would love to know about it. Maybe post your results at http://stackoverflow.com?

    ReplyDelete
  4. @Ervin: Probably, there is significant overhead of additional "tuplesort" for aggregation on very small resultset.

    ReplyDelete
  5. Consider the new version I posted on Stackoverflow (at the linked location). It's substantially shorter and faster.

    ReplyDelete