Using cursors for generating cross tables
I am working on procedure support in PostgreSQL more than one year. Missing procedures is one disadvantage of PostgreSQL. Because procedures are not called from SELECT statement, then it couldn't respect some rules like functions. Procedures are mainly used for explicit transaction controlling and for generating mutable results (like dynamic record set or stacked record set). Dynamic record set is interesting feature when we nothing know about result set's columns. It's typical for cross tables. Because PostgreSQL doesn't support procedures, we cannot return dynamic (mutable number of columns) queries directly, but we can return dynamic cursor. This method of generating cross tables is inspired by Roland Bauman's http://rpbouman.blogspot.com blog.
Cross table is based on query's pattern:
SELECT shop,
SUM(CASE gender WHEN 'f' THEN salary ELSE 0 END) AS f,
SUM(CASE gender WHEN 'm' THEN salary ELSE 0 END) AS m,
SUM(salary) AS total
FROM employees INNER JOIN shops USING (shop_id)
GROUP BY shop
for data:
CREATE TABLE employees (I write function, that generate necessary SELECT statement and open dynamic cursor.
id serial PRIMARY KEY,
shop_id int,
gender char(1),
name varchar(32),
salary int
);
CREATE TABLE shops (
id serial PRIMARY KEY,
shop varchar(32)
);
INSERT INTO shops(shop) VALUES('Zurich'),('New York'),('London');
INSERT INTO employees (shop_id, gender, name, salary)
VALUES
(1, 'm', 'Jon Simpson', 4500),
(1, 'f', 'Barbara Breitenmoser',4700),
(2, 'f', 'KirstenRuegg',5600),
(3, 'm', 'Ralp Teller',5100),
(3, 'm', 'Peter Jonson',4700);
CREATE OR REPLACE FUNCTION do_cross_cursor(dimx_name varchar,Because cursors should be used only in transaction, I have to use explicit transaction:
dimx_source varchar, dimy_name varchar,
dimy_source varchar, expr varchar)
RETURNS refcursor AS $$
DECLARE
col_list text[] := '{}';
query text;
r RECORD;
result refcursor := 'result';
BEGIN
FOR r IN EXECUTE 'SELECT DISTINCT '
|| dimx_name || '::text AS val ' || dimx_source
LOOP
col_list := array_append(col_list, 'SUM(CASE ' || dimx_name
|| ' WHEN ' || quote_literal(r.val) || ' THEN ' || expr
|| ' ELSE 0 END) AS ' || quote_ident(r.val) || '');
END LOOP;
query := 'SELECT ' || dimy_name || ', '
|| array_to_string(col_list, ',')
|| ', SUM(' || expr || ') AS Total '
|| dimy_source || ' GROUP BY ' || dimy_name;
OPEN result NO SCROLL FOR EXECUTE query;
RETURN result;
END;
$$ LANGUAGE plpgsql STRICT;
BEGIN;There is result:
SELECT do_cross_cursor('gender', 'FROM employees','shop',
'FROM employees e JOIN shops s ON s.id = e.shop_id',
'salary');
FETCH ALL FROM result;
END;
BEGIN;
SELECT do_cross_cursor('shop', 'FROM shops','gender',
'FROM employees e JOIN shops s ON s.id = e.shop_id',
'salary');
FETCH ALL FROM result;
END;
postgres=# BEGIN;
BEGIN
postgres=# SELECT do_cross_cursor('gender', 'FROM employees',
'shop',
'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary');
do_cross_cursor
-----------------
result
(1 row)
postgres=# FETCH ALL FROM result;
shop | m | f | total
----------+------+------+-------
New York | 0 | 5600 | 5600
Zurich | 4500 | 4700 | 9200
London | 9800 | 0 | 9800
(3 rows)
postgres=# END;BEGIN;
COMMIT
BEGIN
postgres=# SELECT do_cross_cursor('shop', 'FROM shops','gender',
'FROM employees e JOIN shops s ON s.id = e.shop_id', 'salary');
do_cross_cursor
-----------------
result
(1 row)
postgres=# FETCH ALL FROM result;
gender | New York | Zurich | London | total
--------+----------+--------+--------+-------
m | 0 | 4500 | 9800 | 14300
f | 5600 | 4700 | 0 | 10300
(2 rows)
postgres=# END;
COMMIT


5 Comments:
At least on Firefox, the SQL text in the page is truncated. May it be possible to fix the CSS or to wrap the text so it is readable ?
Thank you for your valuable work on postgreSQL
Hello,
I reformated article. Wrong style :(
Your article was very helpful to me, thanks a lot :-)
w
Hi!
Very good article! It's just what I've looking for.
I have one question: probably I'll use this technique in an application I'm developing and want to know about the impact in terms of performance with a large ammount of data. Is it fine?
Thank you!
To Gustavo
Hello
what I know, this method is the best on SQL level. But on very large tables this query should run long time. PostgreSQL missing OLAP or ROLAP support. Try it and you will see. This solution is very simple, but if it will be too slow you can use some general OLAP sw - Mondrian, Pentaho
regards
Pavel
Post a Comment
Subscribe to Post Comments [Atom]
<< Home