Pages

Saturday, April 8, 2017

How to find unindexed foreign keys

Often performance issue is missing index on foreign keys. With few queries is very simple to find foreign keys without index:

  1. Create a demo:
    create table a(id serial primary key, v text);
    create table b(id serial primary key, id_a integer references a(id), v text);
    
  2. Create auxiliary functions:
    -- returns a subset of column names specified by their nums in array
    create or replace function attnames(oid, smallint[])
    returns name[] as $$
    select array_agg(attname )
      from (select attname
              from pg_attribute 
             where attrelid = $1 and attnum = any($2)
             order by attnum) s
    $$ language sql;
    
    -- ensure safe quoting of column names for usage in SQL
    create or replace function qnames(name[])
    returns text as $$
    select string_agg(v, ',')
      from (select quote_ident(n) v
              from unnest($1) g(n)) s
    $$ language sql;
    
  3. Following query returns foreign key constraint name, table name and related columns:
    select conname, conrelid::regclass, attnames(conrelid, conkey) 
      from pg_constraint
     where contype = 'f';
    
    ┌─────────────┬──────────┬──────────┐
    │   conname   │ conrelid │ attnames │
    ╞═════════════╪══════════╪══════════╡
    │ b_id_a_fkey │ b        │ {id_a}   │
    └─────────────┴──────────┴──────────┘
    (1 row)
    
  4. We can check if any attnames from this result are covered by some index:
    select conname, conrelid::regclass, attnames(conrelid, conkey)
      from pg_constraint
     where contype = 'f'
       and not exists (select *
                         from pg_index
                        where indrelid = conrelid
                          and attnames(conrelid,conkey) = attnames(conrelid, indkey::smallint[]));
    
    -- list of foreign keys without index
    ┌─────────────┬──────────┬──────────┐
    │   conname   │ conrelid │ attnames │
    ╞═════════════╪══════════╪══════════╡
    │ b_id_a_fkey │ b        │ {id_a}   │
    └─────────────┴──────────┴──────────┘
    (1 row)
    
  5. Next step can be generating CREATE INDEX commands:
    select format('create index on %s(%s)',
                   conrelid::regclass,
                   qnames(attnames(conrelid, conkey)))
      from pg_constraint
     where contype = 'f'
       and not exists (select *
                         from pg_index
                        where indrelid = conrelid
                          and attnames(conrelid,conkey) = attnames(conrelid, indkey::smallint[]));
    
    ┌─────────────────────────┐
    │         format          │
    ╞═════════════════════════╡
    │ create index on b(id_a) │
    └─────────────────────────┘
    (1 row)
    
  6. Inside modern psql we can execute last query with \gexec command, that ensures executing result too.
Small note about placeholders symbols in format function. We should to use %I for SQL identifiers to be result correctly escaped.
select format('(1):%I, (2):%s', 'BadTableName', 'BadTableName');
┌──────────────────────────────────────┐
│                format                │
╞══════════════════════════════════════╡
│ (1):"BadTableName", (2):BadTableName │
└──────────────────────────────────────┘
(1 row)
This case is exception, because I use casting from Oid to regclass, that enforce correct escaping. Two times escaping can produce wrong result. So %I cannot be used here.
create table "BadName"(a int);

select '"BadName"'::regclass::oid;
┌───────┐
│  oid  │
╞═══════╡
│ 16419 │
└───────┘
(1 row)

select 16419::regclass;
┌───────────┐
│ regclass  │
╞═══════════╡
│ "BadName" │
└───────────┘
(1 row)

postgres=# select format('%I, %s', 16419::regclass, 16419::regclass);
┌──────────────────────────┐
│          format          │
╞══════════════════════════╡
│ """BadName""", "BadName" │
└──────────────────────────┘
(1 row)

4 comments:

  1. much appreciated. used it with a large DB with hundreds of tables and some 150+ possible indices were missing. some parts of the app seem to respond faster, especially with large (6-12) table joins. kudos!

    ReplyDelete
  2. Just FYI, This fails to account for the case where you have a multi-column index covering the foreign key column(s). To fix that, the final query should use "<@" instead of "=" to compare the set of keys:

    select conname, conrelid::regclass, attnames(conrelid, conkey)
    from pg_constraint
    where contype = 'f'
    and not exists (select *
    from pg_index
    where indrelid = conrelid
    and attnames(conrelid,conkey) <@ attnames(conrelid, indkey::smallint[]));

    So in other words, only show entries where the foreign key columns are not "contained by" an index (instead of an index that contains exactly these columns).

    Of course you may desire to create a separate index for the foreign key columns, bug I'd argue that generally you do not want that.

    ReplyDelete
  3. In the previous comment, I meant to say "the case where the foreign key columns are a subset of a multi-column index".

    ReplyDelete
  4. I understand. I know this limit and your idea is one possibility. There is another risk - the subset is not 100% correct. On following example the subset operator returns true, although if we use columns names there, the result will be wrong. So '=' can enforce false alarm. Using '<@' can miss alarm. For this situation some new operator can be better - some like same value on same possition - but I was little but lazy to write it.

    postgres=# select ARRAY[1,2,3] <@ ARRAY[0,0,1,2,3,4,5];
    ┌──────────┐
    │ ?column? │
    ╞══════════╡
    │ t │
    └──────────┘
    (1 row)

    ReplyDelete