Thursday, April 16, 2020

split string to table

Sometimes life can be funny. This week there was a question about XMLTABLE function in Postgres's maling list. It was related to migration from Oracle to Postgres.

CREATE TABLE user_pool_clean (
    fk_user_pool_pk bytea NOT NULL,
    user_id character varying(255) NOT NULL,
    email_address character varying(250),
    is_mil numeric,
    is_civ numeric,
    is_ctr numeric,
    is_gov numeric,
    is_edu numeric,
    role_id character varying(50),
    user_profile_id character varying(50),
    service_branch_id character varying(50),
    mil_pay_grade_id character varying(50),
    my_auds character varying(4000),
    my_orgs character varying(4000),
    processed character(1) DEFAULT 'N'::bpchar NOT NULL
);

insert into user_pool_clean
values('995CECDC1881375DE05312A270C7CF56','10015706','noemail@fter.dbrefresh.mil',0,0,0,0,0,'1','4','700006','3','1706882','1707720','Y');

insert into user_pool_clean
values('995CECDC1905375DE05312A270C7CF56','10015848','noemail@fter.dbrefresh.mil',0,0,0,0,0,'1','3','700015','11','
1705562,1708486','1710621','Y');


SQL> SELECT upc.is_mil,TRIM(column_value) src
         FROM   user_pool_clean upc
               ,xmltable(('"'|| REPLACE( upc.my_auds, ',', '","')|| '"'));

    IS_MIL SRC
---------- ---------------
     0       1705562              --------O/P from the oracle database
     0        1708486
    0        1706882

It is hard to understand, why somebody used a xmltable function. After some googling, I found, so this is a workaround for generating rows from some string. The original input string is modified to XQuery sequence format. When xmltable has not any other clauses with only one input parameter, then this parameter is used as XQuery expression. Unfortunately nothing similar is possible in Postgres. Postgres uses library libxml2 for xml support, and this library supports only older XPath language and doesn't support more powerful but more complex XQuery language.

Fortunately, Postgres has more tools for split string to rows (it can be used cleanly without workarounds)

First possibility is using string_to_array and unnest functions:

postgres=# select is_mil, unnest(string_to_array(my_auds, ',')) from user_pool_clean ;
┌────────┬─────────┐
│ is_mil │ unnest  │
╞════════╪═════════╡
│      0 │ 1706882 │
│      0 │        ↵│
│        │ 1705562 │
│      0 │ 1708486 │
└────────┴─────────┘
(3 rows)

The result is broken by new line char (probably it is cleaned by XQuery evaluation, because new line is ignored in XML), so we should to clean this result:

postgres=# select is_mil, trim(e' \n' from unnest(string_to_array(my_auds, ','))) from user_pool_clean ;
┌────────┬─────────┐
│ is_mil │  btrim  │
╞════════╪═════════╡
│      0 │ 1706882 │
│      0 │ 1705562 │
│      0 │ 1708486 │
└────────┴─────────┘
(3 rows)

Now, the result is perfect. Is nice to see the power of postgres table functions (and possibilities). The form with lateral join like original query is possible too:
postgres=# select is_mil, trim(e' \n' from  u) 
             from user_pool_clean,
                  unnest(string_to_array(my_auds, ',')) u ;
┌────────┬─────────┐
│ is_mil │  btrim  │
╞════════╪═════════╡
│      0 │ 1706882 │
│      0 │ 1705562 │
│      0 │ 1708486 │
└────────┴─────────┘
(3 rows)

Another way, but maybe little bit slower (but with more possibilities) is using regexp_split_to_table function:

postgres=# select is_mil, trim(e' \n' from  u) 
            from user_pool_clean, 
                 regexp_split_to_table(my_auds, ',') u ;
┌────────┬─────────┐
│ is_mil │  btrim  │
╞════════╪═════════╡
│      0 │ 1706882 │
│      0 │ 1705562 │
│      0 │ 1708486 │
└────────┴─────────┘
(3 rows)

0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home