@kristiyanto
Created July 3, 2024 14:41
NLP: LLM for text summarization Medium Article
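# Refer to the Jupyter Notebook and article for the remaining package imports and the complete code.
# `data` is assumed to be a pandas DataFrame with a text column named `description`.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"
tokenizer = T5Tokenizer.from_pretrained(model_name, legacy=False)
model = T5ForConditionalGeneration.from_pretrained(model_name)


def summarize_with_t5(text, max_length=80):
    # Return short text unchanged. Note that len(text) counts characters,
    # while max_length in generate() below is measured in tokens.
    if len(text) < max_length:
        return text
    # T5 selects the task through a text prefix.
    input_text = "summarize: " + text
    # truncation=True cuts the input down to the tokenizer's maximum length.
    input_ids = tokenizer.encode(input_text, return_tensors="pt", truncation=True)
    # Beam search with 4 beams; a length_penalty below 1.0 favors shorter summaries.
    summary_ids = model.generate(input_ids, max_length=max_length, min_length=10, length_penalty=0.5, num_beams=4, early_stopping=True)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


data['t5_summary'] = data.description.apply(summarize_with_t5)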
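The `apply` call above passes every value in the column straight to the summarizer, so a missing description (None or NaN) would raise on `len(text)`. A minimal sketch of a guard is shown here; `safe_summarize`, `min_chars`, and the stand-in `first_sentence` are hypothetical names introduced for illustration and are not part of the original gist.

```python
import math
from typing import Callable, Optional


def safe_summarize(text: Optional[object], summarize_fn: Callable[[str], str], min_chars: int = 80) -> str:
    """Hypothetical wrapper: guard a summarizer against missing or short input."""
    # pandas represents missing values as None or float NaN; neither supports len().
    if text is None or (isinstance(text, float) and math.isnan(text)):
        return ""
    text = str(text).strip()
    # Short text is returned unchanged, mirroring the early return in summarize_with_t5.
    if len(text) < min_chars:
        return text
    return summarize_fn(text)


# Stand-in summarizer for demonstration only; in practice pass summarize_with_t5.
def first_sentence(text: str) -> str:
    return text.split(". ")[0].rstrip(".") + "."
```

Assuming the names above, the column could then be filled with something like `data.description.apply(lambda t: safe_summarize(t, summarize_with_t5))`, which keeps the model call itself unchanged.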