@kristiyanto
Created July 3, 2024 14:30
NLP: Custom Lemma for Medium Article
# Refer to the Jupyter Notebook and article for the full set of package imports and the complete code.
import spacy
from spacy.language import Language

nlp = spacy.load("en_core_web_sm")


@Language.component("custom_lemma_component")
def custom_lemma_component(doc):
    # Map abbreviations (matched case-insensitively) to the lemmas they should resolve to.
    custom_lemmas = {
        "br": "bedroom",
        "apt": "apartment",
        "st": "street",
        "min": "minute",
        "w/": "with",
    }
    for token in doc:
        lower_text = token.text.lower()
        if lower_text in custom_lemmas:
            token.lemma_ = custom_lemmas[lower_text]
    return doc


# This nlp instance will be shared and used throughout the project.
nlp.add_pipe('custom_lemma_component', after='tagger')
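
A quick way to sanity-check a component like this is to run it on a short string and inspect `token.lemma_`. The sketch below is only an illustration, not part of the original gist: it assumes a blank English pipeline (`spacy.blank("en")`) so that no trained model is needed, and it registers the component under a hypothetical name, `custom_lemma_demo`, to avoid clashing with the registration above. The sample sentence is made up.

```python
import spacy
from spacy.language import Language


# Hypothetical demo name, chosen so it does not collide with the gist's component.
@Language.component("custom_lemma_demo")
def custom_lemma_demo(doc):
    # A subset of the abbreviation table, for illustration only.
    custom_lemmas = {"br": "bedroom", "apt": "apartment", "st": "street"}
    for token in doc:
        lemma = custom_lemmas.get(token.text.lower())
        if lemma is not None:
            token.lemma_ = lemma
    return doc


# Blank pipeline: tokenizer only, so there is no tagger to position against.
demo_nlp = spacy.blank("en")
demo_nlp.add_pipe("custom_lemma_demo")

doc = demo_nlp("Sunny 2 br apt on Main St")
lemmas = {token.text: token.lemma_ for token in doc}
print(lemmas)
```

With the full `en_core_web_sm` pipeline, it is worth printing `nlp.pipe_names` to confirm where the component landed, and spot-checking a few tokens to make sure the later built-in lemmatizer leaves the custom lemmas in place.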