@SandeepGusain
Forked from j4mie/normalise.py
Created October 18, 2019 18:52

Revisions

  1. @j4mie j4mie created this gist Aug 30, 2010.
    11 changes: 11 additions & 0 deletions normalise.py
    @@ -0,0 +1,11 @@
    # -*- coding: utf-8 -*-
    """Normalise (normalize) Unicode data in Python to remove umlauts, accents etc."""

    import unicodedata

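    data = 'naïve café'
    # NFKD decomposes accented characters; ASCII-encoding with 'ignore' drops the combining marks.
    normal = unicodedata.normalize('NFKD', data).encode('ascii', 'ignore').decode('ascii')
    print(normal)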

    # prints "naive cafe"
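
The snippet above can be wrapped in reusable helpers. The following is a minimal sketch, not part of the original gist: the function names `to_ascii` and `strip_accents` are hypothetical, chosen for illustration. It also shows one caveat of the ASCII approach: characters with no Unicode decomposition (such as `ß`) are dropped entirely by `'ignore'`, whereas filtering on `unicodedata.combining` removes only the accent marks and leaves such characters in place.

```python
import unicodedata


def to_ascii(text):
    """Same approach as the gist: decompose, then discard every non-ASCII code point."""
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


def strip_accents(text):
    """Hypothetical variant: remove only combining marks, keep other non-ASCII characters."""
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(to_ascii('naïve café'))
print(to_ascii('Straße'))
print(strip_accents('Straße'))
```

Which helper fits depends on whether the output must be pure ASCII or only needs its diacritics removed.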