Sunday, October 20, 2013

Using custom composite types in PostgreSQL

I like custom composite types - in PL/pgSQL. I never use composite types in tables. There is a risk of issues with composite types on the client side - client language drivers usually don't handle custom composite types well - so I use them only on the server side. This functionality allows me to build a simple OOP-like system that is strong enough to implement some non-trivial applications:

I declare a composite type and an auxiliary constructor:
CREATE TYPE foo_type AS (name text, size numeric, description text);

CREATE OR REPLACE FUNCTION public._foo_type(name text DEFAULT NULL::text, size numeric DEFAULT 0.0, description text DEFAULT NULL::text)
 RETURNS foo_type
 LANGUAGE plpgsql
AS $function$
DECLARE r foo_type;
BEGIN
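  -- parameters are referenced with the function-name qualification (_foo_type.name) to avoid any ambiguity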
  r.name = _foo_type.name;
  r.size = _foo_type.size;
  r.description = _foo_type.description;

  IF COALESCE(name, '') = '' OR size IS NULL THEN
    RAISE EXCEPTION 'mandatory field (name or size) is empty';
  END IF;

  IF size < 0 THEN
    RAISE EXCEPTION 'size is negative';
  END IF;

  RETURN r;
END;
$function$;
Custom constructors work around a functional limitation - composite types don't support default values or CHECK constraints. Both have to be implemented in the constructor.
postgres=# SELECT _foo_type(name := 'Trumpeta');
    _foo_type    
─────────────────
 (Trumpeta,0.0,)
(1 row)

Time: 3.820 ms
postgres=# SELECT _foo_type(size := 175, name := 'Trumpeta');
    _foo_type    
─────────────────
 (Trumpeta,175,)
(1 row)

Time: 1.014 ms
postgres=# SELECT _foo_type(size := -1, name := 'Trumpeta');
ERROR:  size is negative
Time: 0.877 ms
postgres=# SELECT _foo_type();
ERROR:  mandatory field (name or size) is empty
Time: 1.160 ms
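As an aside, the missing CHECK constraint can be partially emulated with a domain, because a domain's constraint is enforced on every assignment or cast to it (a domain's DEFAULT, on the other hand, is not applied inside a composite type, so a constructor is still needed for default values). A minimal sketch with hypothetical names:
CREATE DOMAIN nonnegative_numeric AS numeric
  CHECK (VALUE >= 0);

CREATE TYPE foo_type_dom AS (name text, size nonnegative_numeric, description text);

-- this should raise a check violation without any constructor code:
-- SELECT ROW('Trumpeta', -1, NULL)::foo_type_dom;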
I like named parameters - the code is much more self-documenting, and they remove the risk of swapped parameters. Later I implement custom casting methods (functions) - sometimes combined with a Postgres custom cast:
CREATE OR REPLACE FUNCTION public.cast_to_xml(foo_type)
 RETURNS xml
 LANGUAGE sql
AS $function$
  SELECT xmlforest($1.name AS "Name", $1.size AS "Size", $1.description AS "Description")
$function$;

CREATE CAST (foo_type AS xml) WITH FUNCTION cast_to_xml(foo_type);

postgres=# SELECT _foo_type(name := 'Trumpeta')::xml;
                    _foo_type                     
──────────────────────────────────────────────────
 <Name>Trumpeta</Name><Size>0.0</Size>
(1 row)
I don't like using custom types in tables - so I need functions for merging into a table type and casting from a table type:
CREATE TABLE boo (
  alfa text,
  beta text,
  name text NOT NULL CHECK(name <> ''),
  size numeric NOT NULL DEFAULT 0.0 CHECK (size >= 0),
  description text DEFAULT NULL
);

CREATE OR REPLACE FUNCTION public.merge(boo, foo_type)
 RETURNS boo
 LANGUAGE plpgsql
AS $function$
BEGIN
  $1.name := $2.name;
  $1.size := $2.size;
  $1.description := $2.description;
  RETURN $1;
END;
$function$;
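
A quick usage sketch (made-up values, not from the original post) - merge overwrites only the foo_type fields and leaves the rest of the row untouched:
SELECT merge(ROW('a','b','old',1,NULL)::boo,
             _foo_type(name := 'new', size := 2));
-- expected result: (a,b,new,2,)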

CREATE OR REPLACE FUNCTION public.cast_to_foo_type(boo)
 RETURNS foo_type
 LANGUAGE plpgsql
AS $function$
DECLARE r foo_type;
BEGIN
  r.name := $1.name;
  r.size := $1.size;
  r.description := $1.description;
  RETURN r;
END;
$function$;

CREATE CAST (boo AS foo_type) WITH FUNCTION cast_to_foo_type(boo);
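With the cast registered, the foo_type part of a row can be pulled out directly (a small usage sketch; CREATE CAST without AS ASSIGNMENT creates an explicit-only cast, which is enough here):
SELECT (b::foo_type).name, (b::foo_type).size FROM boo b;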

CREATE OR REPLACE FUNCTION public._boo(alfa text, beta text, foo foo_type)
 RETURNS boo
 LANGUAGE plpgsql
AS $function$
DECLARE r boo;
BEGIN
  r.alfa := _boo.alfa;
  r.beta := _boo.beta;
  r := merge(r, foo);
  RETURN r;
END;
$function$;
Now I can do the usual things with the table type boo - insert a value or transform it to XML:
CREATE OR REPLACE FUNCTION public.new_boo(b boo)
 RETURNS text
 LANGUAGE plpgsql
AS $function$
BEGIN
  -- (b).* expands the composite parameter into the table's column list
  INSERT INTO boo VALUES((b).*);
  RETURN b.alfa;
END;
$function$;

postgres=# SELECT new_boo(_boo(alfa := 'XSY',beta := 'JJS', foo := _foo_type(name:='Trumpeta')));
 new_boo 
─────────
 XSY
(1 row)

postgres=# SELECT * FROM boo;
 alfa │ beta │   name   │ size │ description 
──────┼──────┼──────────┼──────┼─────────────
 XSY  │ JJS  │ Trumpeta │  0.0 │ [null]
(1 row)

CREATE OR REPLACE FUNCTION cast_to_xml(boo)
RETURNS xml AS $$
  SELECT xmlelement(NAME "boo", xmlattributes($1.alfa AS "Alfa"),
            xmlelement(NAME "Beta", $1.beta),
            xmlelement(NAME "Foo", $1::foo_type::xml));
$$ LANGUAGE sql;

CREATE CAST (boo AS xml) WITH FUNCTION cast_to_xml(boo);

postgres=# SELECT boo::xml FROM boo;
                                          boo                                           
────────────────────────────────────────────────────────────────────────────────────────
 <boo Alfa="XSY"><Beta>JJS</Beta><Foo><Name>Trumpeta</Name><Size>0.0</Size></Foo></boo>
(1 row)
With the described conventions and patterns I was able to implement and easily manage a complex, stored-procedure-centric system that implements number portability between N telco operators.

A second advantage of this code: it is robust and resistant to the addition of any column anywhere. I never issue an INSERT statement from the application, so it is easy and safe to modify the database schema.
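To illustrate (a sketch, not from the original post): after a schema change, the row type boo grows together with the table, (b).* expands to the current column list, and the constructor-based API keeps working unchanged - the new column simply stays NULL:
ALTER TABLE boo ADD COLUMN gamma text;

-- the existing constructors and new_boo are unaffected by the new column
SELECT new_boo(_boo(alfa := 'ABC', beta := 'DEF', foo := _foo_type(name := 'Pozoun')));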

6 comments:

  1. It's a shame you have to avoid custom types just because client language bindings are often underpowered.

    I actually think it's a real benefit to be able to use abstract data types in your language, and to be able to take that all the way to your database (or the other way around, if you prefer). I've written up an example in Chicken Scheme on how to do this. I believe Python's Psycopg bindings are of equally high quality, which also allow you to do things like this.

    Don't let your programming language impose arbitrary limitations on your data model!

  2. Flattened tables are only one feature of the proposed solution - others are default values, checks and constraints for composite types (although some of that can be solved with some magic with domains).

    My system is open and potentially accessed from a lot of platforms and languages - another motivation for not pushing custom types to the client.

  3. That sounds like reasoning similar to "we don't use advanced Postgres features because we want to support multiple RDBMSes". I.e., because MySQL is sucky, you can't use recursive queries in your application.

    I understand the importance of portability, but it's an important trade-off that must be consciously made.

  4. No, I do use PostgreSQL features, and I wrote some of them myself - as a plpgsql hacker - but I don't have to use all of them when I am not sure that the benefits outweigh the costs.

  5. Pavel,

    Correct me if I am wrong, but I think that it is not necessary to avoid inserts (or encapsulate them into procedures) in fear of possible future addition of columns to affected tables. It is only necessary to avoid "blind" inserts, ie those without target columns specified. Example:

    insert into table values (val1, val2, val3);

    If the table "table" originally had 3 columns and later somebody adds another column, this insert will fail, of course.

    But inserts with exactly specified target columns cannot be endangered by adding new columns, and they will work exactly as intended, even if somebody adds new columns to the table in the future. Example:

    insert into table (col1, col2, col3) values (val1, val2, val3);

    Another story would be renaming of columns, but this would affect both simple inserts as well as those encapsulated in a stored procedure.

  6. @Roman Golis - No, you can still use INSERTs. The core of this story is different. I used custom composite types. These types are a little bit limited - they don't support default values. The next issue is casting - you have to write more code to ensure correct casting. The demonstrated custom functions can help with that.

    If you don't use custom types, then you don't need any of this - although I think it is preferable anyway (versus raw DML), because you create an API.
