I found very strange PL/pgSQL code. PL/pgSQL has some specifics, and is good, if programmer has good knowledge about it. The basic rule - don't repeat string or array concat too often inside loop in function. So bad code:
CREATE OR REPLACE FUNCTION generate_uuid_v4() RETURNS uuid AS $$
DECLARE
value VARCHAR(36);
BEGIN
value = lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || '-';
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || '-';
value = value || lpad((to_hex((ceil(random() * 255)::int & 15) | 64)), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || '-';
value = value || lpad((to_hex((ceil(random() * 255)::int & 63) | 128)), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || '-';
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
value = value || lpad(to_hex(ceil(random() * 255)::int), 2, '0');
RETURN value::uuid;
END;
$$ LANGUAGE 'plpgsql';
postgres=# select generate_uuid_v4() from generate_series(1,10000);
generate_uuid_v4
--------------------------------------
9833de6f-4ba6-49f1-8d5f-d98d10d79346
340c6008-b8fe-4267-bcc5-96bcddb5f675
...
6f54c524-bd02-4efc-84b3-b56d1c93f3ec
96ee245e-2990-4842-9d36-45a66dab8adb
Time: 407,082 ms
This is good sample of wrong using PL/pgSQL language. PLPerl, maybe PLPython will do much better work. When we cannot use these languages, we have to use some tricks. With some modifications original code - we can have 2x faster code.
CREATE OR REPLACE FUNCTION _generate_uuid_v4() RETURNS uuid AS $$
BEGIN
RETURN (lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad((to_hex((ceil(random() * 255)::int & 15) | 64)), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad((to_hex((ceil(random() * 255)::int & 63) | 128)), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0'))::uuid;
END;
$$ LANGUAGE 'plpgsql'
postgres=# select _generate_uuid_v4() from generate_series(1,10000);
_generate_uuid_v4
--------------------------------------
5134fe69-5d2a-4419-bf84-d72dc290f72f
919e8663-b384-4eae-89d9-1026d6be4127
f23f8f4f-6914-47e8-983f-155ace0d8860
...
39951d8f-fbd1-48bf-9c5d-157eefbd5af6
73318784-2cfa-4bcc-ad42-307fc4c173fc
Time: 203,516 ms
Original needs 400ms for ten thousand values. Edited code needs only 200ms for same number of values.
This code should be inlined - when we use SQL language:
CREATE OR REPLACE FUNCTION _generate_uuid_v4() RETURNS uuid AS $$
SELECT (lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad((to_hex((ceil(random() * 255)::int & 15) | 64)), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad((to_hex((ceil(random() * 255)::int & 63) | 128)), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| '-'
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0')
|| lpad(to_hex(ceil(random() * 255)::int), 2, '0'))::uuid;
$$ LANGUAGE SQL;
postgres=# select _generate_uuid_v4() from generate_series(1,10000);
_generate_uuid_v4
--------------------------------------
e725eb84-67ad-485e-bec2-d73485db4e69
49f605d1-226e-4d9e-8457-30a925c98a0c
ed759055-23e7-4322-a98a-552f65a297ae
...
414e62c3-1b84-4bc9-8fd7-ba8bd0a72403
03e27594-42c0-4ec9-a493-636b6d130aad
Time: 148,184 ms
With SQL function we safe other 50ms.
PLpgSQL is good language - but it isn't C language or Fortran. It needs some different manners.
If you can use the pgcrypto contrib module, you can get a bit more performance out of it. This is the function I use for generating random uuids:
ReplyDeletecreate or replace function generate_uuid_v4() returns uuid as $$
select encode(gen_random_bytes(16), 'hex')::uuid
$$ language sql;
On my test system, this takes 88ms to generate 10,000 uuids vs 227ms for the SQL function in your post.
or you can use uuid-ossp contrib :).
ReplyDelete