the longest common substring of two strings
Hello I had to solve one simple task - to find the longest common substring only in SQL: I have a two functions - one returns all substrings of entered string:
CREATE OR REPLACE FUNCTION substrings(text) RETURNS text[] AS $$ SELECT array(SELECT DISTINCT substring($1 from i for j) FROM generate_series(1, length($1)) g(i), generate_series(1, length($1)) h(j) ) $$ LANGUAGE sql; postgres=# SELECT substrings('pavel'); substrings ----------------------------------------------------- {e,vel,avel,p,pav,a,pavel,pa,av,ve,ave,l,pave,el,v} (1 row)Second function returns the longest common substrings:
CREATE OR REPLACE FUNCTION max_common_substring(text, text) RETURNS text[] AS $$ SELECT ARRAY(SELECT x FROM (SELECT $1, x, length(x), rank() OVER (ORDER BY length(x) DESC) FROM unnest(substrings($2)) g(x) WHERE strpos($1,x) > 0) xx WHERE rank = 1) $$ LANGUAGE sql; postgres=# SELECT max_common_substring('stěhula','stěhule'); max_common_substring ---------------------- {stěhul} (1 row)
2 Comments:
Using the prefix_range extension:
prefix=# select 'stěhula'::prefix_range | 'stěhule' as union;
union
-------------
stěhul[a-e]
(1 row)
Then in my use case taking apart the common prefix and the range limits was not expected, I could add it pretty easily though.
Very useful post. Thank you.
Post a Comment
Subscribe to Post Comments [Atom]
<< Home