Using cursors for generating cross tables
I am working on procedure support in PostgreSQL more than one year. Missing procedures is one disadvantage of PostgreSQL. Because procedures are not called from SELECT statement, then it couldn't respect some rules like functions. Procedures are mainly used for explicit transaction controlling and for generating mutable results (like dynamic record set or stacked record set). Dynamic record set is interesting feature when we nothing know about result set's columns. It's typical for cross tables. Because PostgreSQL doesn't support procedures, we cannot return dynamic (mutable number of columns) queries directly, but we can return dynamic cursor. This method of generating cross tables is inspired by Roland Bauman's http://rpbouman.blogspot.com blog.
Cross table is based on query's pattern:
SELECT shop, SUM(CASE gender WHEN 'f' THEN salary ELSE 0 END) AS f, SUM(CASE gender WHEN 'm' THEN salary ELSE 0 END) AS m, SUM(salary) AS total FROM employees INNER JOIN shops USING (shop_id) GROUP BY shop
for data:
CREATE TABLE employees ( id serial PRIMARY KEY, shop_id int, gender char(1), name varchar(32), salary int ); CREATE TABLE shops ( id serial PRIMARY KEY, shop varchar(32) ); INSERT INTO shops(shop) VALUES('Zurich'),('New York'),('London'); INSERT INTO employees (shop_id, gender, name, salary) VALUES (1, 'm', 'Jon Simpson', 4500), (1, 'f', 'Barbara Breitenmoser',4700), (2, 'f', 'KirstenRuegg',5600), (3, 'm', 'Ralp Teller',5100), (3, 'm', 'Peter Jonson',4700);I write function, that generate necessary SELECT statement and open dynamic cursor.
CREATE OR REPLACE FUNCTION do_cross_cursor(dimx_name varchar, dimx_source varchar, dimy_name varchar, dimy_source varchar, expr varchar) RETURNS refcursor AS $$ DECLARE col_list text[] := '{}'; query text; r RECORD; result refcursor := 'result'; BEGIN FOR r IN EXECUTE 'SELECT DISTINCT ' || dimx_name || '::text AS val ' || dimx_source LOOP col_list := array_append(col_list, 'SUM(CASE ' || dimx_name || ' WHEN ' || quote_literal(r.val) || ' THEN ' || expr || ' ELSE 0 END) AS ' || quote_ident(r.val) || ''); END LOOP; query := 'SELECT ' || dimy_name || ', ' || array_to_string(col_list, ',') || ', SUM(' || expr || ') AS Total ' || dimy_source || ' GROUP BY ' || dimy_name; OPEN result NO SCROLL FOR EXECUTE query; RETURN result; END; $$ LANGUAGE plpgsql STRICT;Because cursors should be used only in transaction, I have to use explicit transaction:
BEGIN; SELECT do_cross_cursor('gender', 'FROM employees','shop', 'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary'); FETCH ALL FROM result; END; BEGIN; SELECT do_cross_cursor('shop', 'FROM shops','gender', 'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary'); FETCH ALL FROM result; END;There is result:
postgres=# BEGIN; BEGIN postgres=# SELECT do_cross_cursor('gender', 'FROM employees', 'shop', 'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary'); do_cross_cursor ----------------- result (1 row) postgres=# FETCH ALL FROM result; shop | m | f | total ----------+------+------+------- New York | 0 | 5600 | 5600 Zurich | 4500 | 4700 | 9200 London | 9800 | 0 | 9800 (3 rows) postgres=# END;BEGIN; COMMIT BEGIN postgres=# SELECT do_cross_cursor('shop', 'FROM shops','gender', 'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary'); do_cross_cursor ----------------- result (1 row) postgres=# FETCH ALL FROM result; gender | New York | Zurich | London | total --------+----------+--------+--------+------- m | 0 | 4500 | 9800 | 14300 f | 5600 | 4700 | 0 | 10300 (2 rows) postgres=# END; COMMITA alternative for this solution is using a tablefuc extension:
postgres=# CREATE VIEW report AS SELECT shop, gender, salary FROM shops, employees WHERE shop_id = shops.id; CREATE VIEW postgres=# SELECT shop, gender, sum(salary) FROM report GROUP BY 1,2 UNION ALL SELECT shop, 'X', sum(salary) FROM report GROUP BY 1 UNION ALL SELECT NULL, gender, sum(salary) FROM report GROUP BY 2 UNION ALL SELECT NULL, 'X', sum(salary) FROM report ORDER BY 1; shop | gender | sum ----------+--------+------- London | m | 9800 London | X | 9800 New York | f | 5600 New York | X | 5600 Zurich | m | 4500 Zurich | X | 9200 Zurich | f | 4700 | m | 14300 | f | 10300 | X | 24600 (10 rows) postgres=# SELECT * FROM crosstab( 'SELECT shop, gender, sum(salary) FROM report GROUP BY 1,2 UNION ALL SELECT shop, ''X'', sum(salary) FROM report GROUP BY 1 UNION ALL SELECT NULL, gender, sum(salary) FROM report GROUP BY 2 UNION ALL SELECT NULL, ''X'', sum(salary) FROM report ORDER BY 1', 'VALUES(''f''),(''m''),(''X'')') AS ct(shop varchar, f bigint, m bigint, "total" bigint); shop | f | m | total ----------+-------+-------+------- London | | 9800 | 9800 New York | 5600 | | 5600 Zurich | 4700 | 4500 | 9200 | 10300 | 14300 | 24600 (4 rows)
15 Comments:
At least on Firefox, the SQL text in the page is truncated. May it be possible to fix the CSS or to wrap the text so it is readable ?
Thank you for your valuable work on postgreSQL
Hello,
I reformated article. Wrong style :(
Your article was very helpful to me, thanks a lot :-)
w
Hi!
Very good article! It's just what I've looking for.
I have one question: probably I'll use this technique in an application I'm developing and want to know about the impact in terms of performance with a large ammount of data. Is it fine?
Thank you!
To Gustavo
Hello
what I know, this method is the best on SQL level. But on very large tables this query should run long time. PostgreSQL missing OLAP or ROLAP support. Try it and you will see. This solution is very simple, but if it will be too slow you can use some general OLAP sw - Mondrian, Pentaho
regards
Pavel
How about implementing a new keyword in Postgres, like EXECUTE PROCEDURE 'function_name()' which returns "SETOF RECORD", and which is equivalent to SELECT * FROM 'function_name()', but doesn't give you an error?
regards,
davor
This is in my ToDo (when I meet some sponsor). It is standardized: CREATE PROCEDURE and CALL statement.
Brilliant article... it is simpler to do crosstab ;;)
Best regards,
Irman
I am a molecular biologist and have minimal Postgres knowledge, so my question is very naive. I pasted your code into pgAdmin sql window, and everything works fine, your function is being created. But when I execute your procedure, PG says "3 rows discarded" and shows nothing. I am sure that this is because of my ignorance - can you tell me how to do it right? My goal is to create simple dynamic pivot queries like with Access.
To Adriano: you can test it in console? What are you doing exactly? This is pgAdmin message or PostgreSQL message?
Dear Pawel
thank you for your help. When I paste the function into pgAdmin without wrapping it into the transaction, it works fine. Within the transaction block, it rejects the output. I will try to read more about transactions in order to understand this behavior.
I have a big question though. Could you write a very short (3-4 lines) documentation of the parameters of your function? What is the admissible parameter syntax, etc? A million thanks in advance!!!
well, I tried the following:
SELECT do_cross_cursor( 'varid',
'FROM loggingdb_ips_boolean As log',
'to_char(ipstimestamp, ''mon DD HH24h'')',
'FROM loggingdb_ips_boolean As log',
'COUNT(*)::integer');
FETCH ALL FROM result;
But it aborts with the following message:
ERROR: aggregate function calls cannot be nested
SQL state: 42803
Context: PL/pgSQL function "do_cross_cursor" line 19 at OPEN
Does this mean that one cannot use aggregate functions within this cursor?
To Adriano: You can use aggregate function with cursors - but you cannot nested aggregates - if you can understand code - it is based on dynamic SQL - then some query is dynamically created - and it use SUM as example. When you use COUNT(*), then it creates query with ..SUM( .. COUNT(*)) that is not valid. If you can use COUNT, then you have to modify source code. /parameters - it is column name and source, column name and source.
Hi Pavel Stěhule
its a beautiful code to create cross relation pivot table. i need to change all diagonal values to 'N/A'. i have been trying to make some changes but could not succeeded. can you plz tell me how can i do this.
hi there . your blog has helped me alot thank you very much . now i need some altretaion in this besause my requirmnts are slightly different . my scenario is i have three columns say
A(text) B(text) Value(text)
AA1 AA1 0
AA2 AA1 100
AA3 AA1 100
AA1 AA2 100
AA2 AA2 0
AA3 AA2 100
AA1 AA3 100
AA2 AA3 100
AA3 AA3 0
now i want a cross table that looks like
AA1 AA2 AA3
AA1 0 100 100
AA2 100 0 100
AA3 100 100 0
and then it should replace o with not applicable (N/A) so i am looking a cross table like
AA1 AA2 AA3
AA1 N/A 100 100
AA2 100 N/A 100
AA3 100 100 N/A
i have achieved the table with zeros but i am unable to place N/A value where row and column name is same i,e fro AA1 to AA1 or AA2 to AA2 . thans i will wait for your reply
Post a Comment
Subscribe to Post Comments [Atom]
<< Home