@laugustyniak
Last active January 7, 2018 23:21
most_similar_words_spacy
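import spacy

# Vectors-only English model; it must be installed before this will load.
nlp = spacy.load('en_vectors_web_lg')

def most_similar(word):
    # Candidates: lexemes with the same casing as the query and log-probability >= -15.
    queries = [w for w in word.vocab if w.is_lower == word.is_lower and w.prob >= -15]
    # Rank candidates by vector similarity to the query, best match first.
    by_similarity = sorted(queries, key=lambda w: word.similarity(w), reverse=True)
    return by_similarity[:10]

word_to_lookup = ['insurance', 'extended', 'phone', 'coverage', 'device']
for word in word_to_lookup:
    print([w.lower_ for w in most_similar(nlp.vocab[word])])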
# ['insurance', 'mortgage', 'liability', 'loan', 'loans', 'auto', 'health', 'credit', 'mortgages', 'coverage']
# ['extended', 'extend', 'long', 'extension', 'longer', 'short', 'shorter', 'periods', 'limited', 'end']
# ['phone', 'phones', 'telephone', 'mobile', 'cell', 'call', 'calling', 'calls', 'email', 'touch']
# ['coverage', 'insurance', 'covering', 'covered', 'reporting', 'covers', 'cover', 'policy', 'liability', 'benefit']
# ['device', 'devices', 'interface', 'invention', 'controller', 'method', 'system', 'wireless', 'module', 'user']
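The ranking above relies on `similarity`, which in spaCy is the cosine similarity between two word vectors. As a rough illustration of that ranking step on its own, here is a minimal NumPy sketch. The helper name `top_k_cosine` and the toy 3-dimensional vectors are invented for this example and are not part of the gist or of spaCy.

```python
import numpy as np

def top_k_cosine(query, matrix, k=10):
    """Return the indices of the k rows of `matrix` most similar to `query`."""
    # Cosine similarity is the dot product divided by the product of the norms.
    row_norms = np.linalg.norm(matrix, axis=1)
    query_norm = np.linalg.norm(query)
    norms = row_norms * query_norm
    # Guard against zero vectors (e.g. words without a vector) to avoid dividing by zero.
    denom = np.where(norms == 0, 1.0, norms)
    scores = (matrix @ query) / denom
    # argsort is ascending, so negate the scores to rank best-first.
    return np.argsort(-scores, kind="stable")[:k]

# Toy 3-dimensional vectors, invented purely for illustration.
words = ["phone", "phones", "insurance", "banana"]
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

ranked = [words[i] for i in top_k_cosine(vectors[0], vectors, k=2)]
print(ranked)  # ['phone', 'phones']
```

As in the gist's output, the query word itself comes back first because its similarity to itself is 1. Scoring every candidate with one matrix product is usually much faster than calling `similarity` once per lexeme inside `sorted`, though the speedup depends on vocabulary size.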