Skip to content

Instantly share code, notes, and snippets.

@yozzz
Last active December 24, 2018 09:02
Show Gist options
  • Select an option

  • Save yozzz/da9023a322e660270b3c7fa2571da636 to your computer and use it in GitHub Desktop.

Select an option

Save yozzz/da9023a322e660270b3c7fa2571da636 to your computer and use it in GitHub Desktop.
Elasticsearch suggestions. Example of Postgresql function that split text into the separated phrases
-- If have to deal with elasticsearch suggestoins
-- there is nice article about them https://hackernoon.com/elasticsearch-using-completion-suggester-to-build-autocomplete-e9c120cf6d87
-- but you may face one tricky issue about them
-- e.g. you have completion field 'album' with value "The Prodigy: Reunion"
-- if you will search with suggestion "The Prodigy" it will return result for you
-- but if you will try to search by just "Prodigy" you will get no results
-- this is beacause copmpletion suggester start matchig only from the start of the string
-- https://hackernoon.com/elasticsearch-using-completion-suggester-to-build-autocomplete-e9c120cf6d87#2f5e
-- to fix this problem we can split string into the three separated phrases
-- 1) "The Prodigy: Reunion"
-- 2) "Prodigy: Reunion"
-- 3) "Reunion"
-- and put them inside completion field into index as an array ["The Prodigy Reunion", "Prodigy Reunion", "Reunion"]
CREATE OR REPLACE FUNCTION split_words (words CHAR)
RETURNS text[] AS $$
DECLARE
arr TEXT[] := regexp_split_to_array(words, E'\\s+');
total_words INTEGER := array_length(arr, 1);
arr2 TEXT[];
words TEXT;
iterator INTEGER := 0;
iterator2 INTEGER := 0;
step INTEGER := 1;
words_count INTEGER := 0;
BEGIN
FOR i IN 1 .. total_words
LOOP
words_count := total_words - iterator;
iterator2 := step;
iterator := iterator + 1;
words := NULL;
FOR j IN 1 .. words_count
LOOP
CASE
WHEN j < words_count
THEN words := CONCAT(words, arr[iterator2], ' ');
WHEN j = words_count
THEN words := CONCAT(words, arr[iterator2]);
END CASE;
iterator2 := iterator2 + 1;
END LOOP;
arr2 := array_append(arr2, words);
step := step + 1;
END LOOP ;
RETURN arr2 ;
END ;
$$ LANGUAGE plpgsql;
SELECT split_words('The Prodigy: Reunion');
-- {The Prodigy Reunion,Prodigy Reunion,Reunion}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment