@ambv
Created April 3, 2012 14:10

Revisions

  1. ambv revised this gist Apr 3, 2012. No changes.
  2. ambv revised this gist Apr 3, 2012. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion choices.rst
    Original file line number Diff line number Diff line change
    @@ -455,7 +455,7 @@ Why not use an existing Django enum implementation?

    The most popular are:

    - `django-choices <http://pypi.python.org/pypi/django-choices`_ by Jason Webb.
    - `django-choices <http://pypi.python.org/pypi/django-choices>`_ by Jason Webb.
    The home page link is dead, the implementation lacks grouping support,
    internationalization support and automatic integer ID incrementation. Also,
    I find the reversed syntax a bit unnatural. There are some tests.
  3. ambv revised this gist Apr 3, 2012. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions choices.rst
    Original file line number Diff line number Diff line change
    @@ -191,7 +191,7 @@ format can be customized, for instance for ``CharField`` usage::
    >>> Gender(item=lambda c: (c.name, c.desc))
    [(u'male', u'male'), (u'female', u'female'), (u'not_specified', u'not specified')]

    But that will probably make more visible sense with a foreign translation::
    But that will probably make more sense with a foreign translation::

    >>> from django.utils.translation import activate
    >>> activate('pl')
    @@ -216,7 +216,7 @@ It contains a bunch of attributes, e.g. its name::
    >>> Gender.female.name
    u'female'

    which probably makes more visible sense if access from a database model (in the
    which probably makes more sense if accessed from a database model (in the
    following example, using a ``ChoiceField``)::

    >>> user.gender.name
  4. ambv revised this gist Apr 3, 2012. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion choices.rst
    Original file line number Diff line number Diff line change
    @@ -209,7 +209,7 @@ A single Choice
    Every ``Choice`` is an object::

    >>> Gender.female
    <Choice: emale (id: 2, name: female)>
    <Choice: female (id: 2, name: female)>

    It contains a bunch of attributes, e.g. its name::

  5. ambv revised this gist Apr 3, 2012. 1 changed file with 8 additions and 5 deletions.
    13 changes: 8 additions & 5 deletions choices.rst
    Original file line number Diff line number Diff line change
    @@ -52,9 +52,9 @@ Use ``get_FIELD_display``::
    else:
    return 'Hey there, user!'

    This will fail once you start translating the choice descriptions, if you rename
    them and is generally brittle. So we have to improve on it by using the numeric
    identifiers.
    This will fail once you start translating the choice descriptions or if you
    rename them and is generally brittle. So we have to improve on it by using the
    numeric identifiers.

    Way 2
    ~~~~~
    @@ -148,7 +148,7 @@ and lets you write the above example as::
    else:
    return 'Hey there, user!'

    or when using the provided ``ChoiceField`` (fully compatible with
    or using the provided ``ChoiceField`` (fully compatible with
    ``IntegerFields``)::

    from dj.choices import Choices, Choice
    @@ -170,7 +170,10 @@ or when using the provided ``ChoiceField`` (fully compatible with
    return 'Hello, girl.'
    else:
    return 'Hey there, user!'

    BTW, the reason choices need names as arguments is so they can be picked up by
    ``makemessages`` and translated.

    But it's much more than that so read on.

    The Choices class
  6. ambv revised this gist Apr 3, 2012. 1 changed file with 6 additions and 2 deletions.
    8 changes: 6 additions & 2 deletions choices.rst
    Original file line number Diff line number Diff line change
    @@ -72,8 +72,12 @@ Use numeric IDs::
    default=2)

    def greet(self):
    if self.gender == 0: return 'Hi, boy.' elif self.gender == 1: return
    'Hello, girl.' else: return 'Hey there, user!'
    if self.gender == 0:
    return 'Hi, boy.'
    elif self.gender == 1:
    return 'Hello, girl.'
    else:
    return 'Hey there, user!'

    This is just as bad because once the identifiers change, it's not trivial to
    grep for existing usage, let alone 3rd party usage. This looks less wrong when
  7. ambv created this gist Apr 3, 2012.
    505 changes: 505 additions & 0 deletions choices.rst
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,505 @@
    ==========================================
    Upgrading the choices machinery for Django
    ==========================================

    Specifying choices for form fields and models currently does not do much justice
    to the `DRY
    <https://docs.djangoproject.com/en/1.4/misc/design-philosophies/#don-t-repeat-yourself-dry>`_
    philosophy Django is famous for. Everybody seems to either have their own way of
    working around it or live with the suboptimal tuple-based pairs. This Django
    enhancement proposal presents a comprehensive solution based on an existing
    implementation, explaining reasons behind API decisions and their implications
    on the framework in general.

    Current status
    --------------

    The current way of specifying choices for a field is as follows::

    GENDER_CHOICES = (
        (0, 'male'),
        (1, 'female'),
        (2, 'not specified'),
    )

    class User(models.Model):
        gender = models.IntegerField(choices=GENDER_CHOICES)

    When I then want to implement behaviour which depends on the ``User.gender``
    value, I have a couple of possibilities. I've seen all of them in production
    code, which makes it all the more scary.

    Way 1
    ~~~~~

    Use ``get_FIELD_display``::

    GENDER_CHOICES = (
        (0, 'male'),
        (1, 'female'),
        (2, 'not specified'),
    )

    class User(models.Model):
        gender = models.IntegerField(choices=GENDER_CHOICES)

        def greet(self):
            gender = self.get_gender_display()
            if gender == 'male':
                return 'Hi, boy.'
            elif gender == 'female':
                return 'Hello, girl.'
            else:
                return 'Hey there, user!'

    This will fail once you start translating the choice descriptions, if you rename
    them and is generally brittle. So we have to improve on it by using the numeric
    identifiers.
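
    The failure mode can be reproduced without Django. The following is a
    minimal sketch in plain Python: ``TRANSLATIONS`` and the module-level
    ``get_gender_display()`` are hypothetical stand-ins for Django's translation
    catalogue and the ``get_FIELD_display`` method, not real Django API.

```python
# Plain-Python illustration of "Way 1"; nothing here is Django API.
GENDER_CHOICES = (
    (0, 'male'),
    (1, 'female'),
    (2, 'not specified'),
)

# Assumed translation catalogue, for illustration only.
TRANSLATIONS = {
    'pl': {
        'male': 'mężczyzna',
        'female': 'kobieta',
        'not specified': 'nie podano',
    },
}


def get_gender_display(gender_id, language=None):
    """Mimic get_FIELD_display(): return the (possibly translated) label."""
    label = dict(GENDER_CHOICES)[gender_id]
    return TRANSLATIONS.get(language, {}).get(label, label)


def greet(gender_id, language=None):
    """Branch on the display string, exactly as Way 1 does."""
    gender = get_gender_display(gender_id, language)
    if gender == 'male':
        return 'Hi, boy.'
    elif gender == 'female':
        return 'Hello, girl.'
    else:
        return 'Hey there, user!'


print(greet(0))        # Hi, boy.
print(greet(0, 'pl'))  # Hey there, user!
```

    With a translation active, the comparison against ``'male'`` silently stops
    matching and every user falls through to the generic greeting.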

    Way 2
    ~~~~~

    Use numeric IDs::

    GENDER_CHOICES = (
    (0, _('male')),
    (1, _('female')),
    (2, _('not specified')),
    )

    class User(models.Model):
    gender = models.IntegerField(choices=GENDER_CHOICES,
    default=2)

    def greet(self):
    if self.gender == 0: return 'Hi, boy.' elif self.gender == 1: return
    'Hello, girl.' else: return 'Hey there, user!'

    This is just as bad because once the identifiers change, it's not trivial to
    grep for existing usage, let alone 3rd party usage. This looks less wrong when
    using ``CharFields`` instead of ``IntegerFields`` but the problem stays the
    same. So we have to improve on it by explicitly naming the options.

    Way 3
    ~~~~~

    Explicit choice values::

    GENDER_MALE = 0
    GENDER_FEMALE = 1
    GENDER_NOT_SPECIFIED = 2

    GENDER_CHOICES = (
        (GENDER_MALE, _('male')),
        (GENDER_FEMALE, _('female')),
        (GENDER_NOT_SPECIFIED, _('not specified')),
    )

    class User(models.Model):
        gender = models.IntegerField(choices=GENDER_CHOICES,
                                     default=GENDER_NOT_SPECIFIED)

        def greet(self):
            if self.gender == GENDER_MALE:
                return 'Hi, boy.'
            elif self.gender == GENDER_FEMALE:
                return 'Hello, girl.'
            else:
                return 'Hey there, user!'

    This is a saner way but it starts getting overly verbose and redundant. You can
    improve encapsulation by moving the choices into the ``User`` class, but that
    in turn hurts reusability.

    The real problem however is that there is no `One Obvious Way To Do It
    <http://www.python.org/dev/peps/pep-0020/>`_ and newcomers are likely to choose
    poorly.

    tl;dr or The Gist of It
    -----------------------

    My proposal suggests embracing a solution that is already implemented and tested
    with 100% statement coverage. It's easily installable::

    pip install dj.choices

    and lets you write the above example as::

    from dj.choices import Choices, Choice

    class Gender(Choices):
        male = Choice("male")
        female = Choice("female")
        not_specified = Choice("not specified")

    class User(models.Model):
        gender = models.IntegerField(choices=Gender(),
                                     default=Gender.not_specified.id)

        def greet(self):
            gender = Gender.from_id(self.gender)
            if gender == Gender.male:
                return 'Hi, boy.'
            elif gender == Gender.female:
                return 'Hello, girl.'
            else:
                return 'Hey there, user!'

    or when using the provided ``ChoiceField`` (fully compatible with
    ``IntegerFields``)::

    from dj.choices import Choices, Choice
    from dj.choices.fields import ChoiceField

    class Gender(Choices):
        male = Choice("male")
        female = Choice("female")
        not_specified = Choice("not specified")

    class User(models.Model):
        gender = ChoiceField(choices=Gender,
                             default=Gender.not_specified)

        def greet(self):
            if self.gender == Gender.male:
                return 'Hi, boy.'
            elif self.gender == Gender.female:
                return 'Hello, girl.'
            else:
                return 'Hey there, user!'

    But it's much more than that so read on.

    The Choices class
    -----------------

    By default the ``Gender`` class has its choices enumerated similarly to how
    Django models order their fields. This can be seen by instantiating it::

    >>> Gender()
    [(1, u'male'), (2, u'female'), (3, u'not specified')]
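
    To make the ordering behaviour concrete, here is a minimal sketch of how
    definition-order numbering could work in plain Python 3, without Django. The
    ``ChoicesMeta`` metaclass and the creation counter are assumptions made for
    illustration, modelled on how Django orders model fields; they are not taken
    from the ``dj.choices`` source.

```python
import itertools


class Choice:
    """A single choice that remembers the order in which it was created."""

    _counter = itertools.count()

    def __init__(self, desc):
        self.desc = desc
        self.order = next(Choice._counter)
        self.id = None    # assigned by the metaclass
        self.name = None  # assigned by the metaclass


class ChoicesMeta(type):
    """Collect Choice attributes and number them in definition order."""

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        found = [(attr, value) for attr, value in namespace.items()
                 if isinstance(value, Choice)]
        found.sort(key=lambda pair: pair[1].order)
        for index, (attr, choice) in enumerate(found, start=1):
            choice.name = attr
            choice.id = index
        cls._choices = [choice for _, choice in found]
        return cls


class Choices(metaclass=ChoicesMeta):
    def __new__(cls):
        # "Instantiating" the class yields plain (id, description) pairs.
        return [(c.id, c.desc) for c in cls._choices]


class Gender(Choices):
    male = Choice("male")
    female = Choice("female")
    not_specified = Choice("not specified")


print(Gender())  # [(1, 'male'), (2, 'female'), (3, 'not specified')]
```

    All the bookkeeping happens once, at class creation; calling ``Gender()``
    afterwards only builds a list.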

    By default an item contains the numeric ID and the localized description. The
    format can be customized, for instance for ``CharField`` usage::

    >>> Gender(item=lambda c: (c.name, c.desc))
    [(u'male', u'male'), (u'female', u'female'), (u'not_specified', u'not specified')]

    But that will probably make more visible sense with a foreign translation::

    >>> from django.utils.translation import activate
    >>> activate('pl')
    >>> Gender(item=lambda c: (c.name, c.desc))
    [(u'male', u'm\u0119\u017cczyzna'), (u'female', u'kobieta'), (u'not_specified', u'nie podano')]

    It sometimes makes sense to provide only a subset of the defined choices::

    >>> Gender(filter=('female', 'not_specified'), item=lambda c: (c.name, c.desc))
    [(u'female', u'kobieta'), (u'not_specified', u'nie podano')]
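
    The two keyword arguments compose in a straightforward way. The sketch below
    shows one possible reading of them as a standalone function. The ``as_list``
    helper and the simplified ``Choice`` dataclass are hypothetical and only
    mirror the ``item`` and ``filter`` arguments shown above; translation is left
    out, so descriptions stay in English.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """Simplified stand-in for a declared choice."""
    id: int
    name: str
    desc: str


GENDER = [
    Choice(1, 'male', 'male'),
    Choice(2, 'female', 'female'),
    Choice(3, 'not_specified', 'not specified'),
]


def as_list(choices, item=lambda c: (c.id, c.desc), filter=None):
    """Format each choice with `item`, keeping only names listed in `filter`.

    The parameter names mirror the keyword arguments shown above; `filter`
    deliberately shadows the built-in inside this function.
    """
    kept = [c for c in choices if filter is None or c.name in filter]
    return [item(c) for c in kept]


print(as_list(GENDER))
print(as_list(GENDER, filter=('female', 'not_specified'),
              item=lambda c: (c.name, c.desc)))
```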

    A single Choice
    ---------------

    Every ``Choice`` is an object::

    >>> Gender.female
    <Choice: emale (id: 2, name: female)>

    It contains a bunch of attributes, e.g. its name::

    >>> Gender.female.name
    u'female'

    which probably makes more visible sense if access from a database model (in the
    following example, using a ``ChoiceField``)::

    >>> user.gender.name
    u'female'

    Other attributes include the numeric ID, the localized and the raw description
    (the latter is the string as present before the translation)::

    >>> Gender.female.id
    2
    >>> Gender.female.desc
    u'kobieta'
    >>> Gender.female.raw
    'female'

    Within a Python process, choices can be compared using identity comparison::

    >>> u.gender
    <Choice: male (id: 1, name: male)>
    >>> u.gender is Gender.male
    True

    Across processes the serializable value (either ``id`` or ``name``) should be
    used. Then a choice can be retrieved using a class-level getter::

    >>> Gender.from_id(3)
    <Choice: not specified (id: 3, name: not_specified)>
    >>> Gender.from_name('male')
    <Choice: male (id: 1, name: male)>
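
    Identity comparison works as long as the getters hand back the very objects
    declared on the class. A minimal sketch of that idea, assuming simple
    dictionary registries (the ``_by_id`` and ``_by_name`` attributes are
    hypothetical and not part of the ``dj.choices`` API):

```python
class Choice:
    """Simplified stand-in for a declared choice."""

    def __init__(self, id, name, desc):
        self.id = id
        self.name = name
        self.desc = desc


class Gender:
    male = Choice(1, 'male', 'male')
    female = Choice(2, 'female', 'female')
    not_specified = Choice(3, 'not_specified', 'not specified')

    # Hypothetical registries mapping serializable values to the singletons.
    _all = (male, female, not_specified)
    _by_id = {c.id: c for c in _all}
    _by_name = {c.name: c for c in _all}

    @classmethod
    def from_id(cls, id):
        return cls._by_id[id]

    @classmethod
    def from_name(cls, name):
        return cls._by_name[name]


# A value that crossed a process boundary (e.g. read from the database)
# resolves to the same object, so `is` comparisons hold.
print(Gender.from_id(3) is Gender.not_specified)  # True
print(Gender.from_name('male') is Gender.male)    # True
```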

    Grouping choices
    ----------------

    One of the problems with specifying choice lists is their weak extensibility.
    For instance, an application defines a group of possible choices like this::

    >>> class License(Choices):
    ...     gpl = Choice("GPL")
    ...     bsd = Choice("BSD")
    ...     proprietary = Choice("Proprietary")
    ...
    >>> License()
    [(1, u'GPL'), (2, u'BSD'), (3, u'Proprietary')]

    All is well until the application goes live and, after a while, the developer
    wants to include LGPL. The natural choice would be to add it after ``gpl``, but
    doing that would shift the IDs of every choice that follows. On the other hand,
    adding the new entry at the end of the definition looks ugly and sorts the
    resulting combo boxes in the UI in a counter-intuitive way. Grouping lets us
    solve this problem by explicitly defining the structure within a class of
    choices::

    >>> class License(Choices):
    ...     COPYLEFT = Choices.Group(0)
    ...     gpl = Choice("GPL")
    ...
    ...     PUBLIC_DOMAIN = Choices.Group(100)
    ...     bsd = Choice("BSD")
    ...
    ...     OSS = Choices.Group(200)
    ...     apache2 = Choice("Apache 2")
    ...
    ...     COMMERCIAL = Choices.Group(300)
    ...     proprietary = Choice("Proprietary")
    ...
    >>> License()
    [(1, u'GPL'), (101, u'BSD'), (201, u'Apache 2'), (301, u'Proprietary')]

    This enables the developer to include more licenses of each group later on::

    >>> class License(Choices):
    ...     COPYLEFT = Choices.Group(0)
    ...     gpl_any = Choice("GPL, any")
    ...     gpl2 = Choice("GPL 2")
    ...     gpl3 = Choice("GPL 3")
    ...     lgpl = Choice("LGPL")
    ...     agpl = Choice("Affero GPL")
    ...
    ...     PUBLIC_DOMAIN = Choices.Group(100)
    ...     bsd = Choice("BSD")
    ...     public_domain = Choice("Public domain")
    ...
    ...     OSS = Choices.Group(200)
    ...     apache2 = Choice("Apache 2")
    ...     mozilla = Choice("MPL")
    ...
    ...     COMMERCIAL = Choices.Group(300)
    ...     proprietary = Choice("Proprietary")
    ...
    >>> License()
    [(1, u'GPL, any'), (2, u'GPL 2'), (3, u'GPL 3'), (4, u'LGPL'),
    (5, u'Affero GPL'), (101, u'BSD'), (102, u'Public domain'),
    (201, u'Apache 2'), (202, u'MPL'), (301, u'Proprietary')]

    The behaviour in the example above was as follows:

    - the developer renamed the GPL choice but its meaning and ID remained stable

    - BSD, Apache and proprietary choices have their IDs unchanged

    - the resulting class is self-descriptive, readable and extensible
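
    The ID stability described above follows from a simple numbering rule: each
    group sets a base, and choices are counted from there. A minimal sketch of
    that rule, assuming the declarations are available as an ordered list (the
    ``assign_ids`` function and this ``Group`` class are hypothetical and only
    imitate the behaviour shown in the examples):

```python
class Group:
    """Marks the start of a group and carries its base ID."""

    def __init__(self, base):
        self.base = base


def assign_ids(declarations):
    """Map choice names to IDs from an ordered list of (name, value) pairs.

    A Group resets the running offset; every following choice gets
    base + its position within the group (starting at 1).
    """
    ids = {}
    base, offset = 0, 0
    for name, value in declarations:
        if isinstance(value, Group):
            base, offset = value.base, 0
        else:
            offset += 1
            ids[name] = base + offset
    return ids


before = assign_ids([
    ('COPYLEFT', Group(0)), ('gpl', 'GPL'),
    ('PUBLIC_DOMAIN', Group(100)), ('bsd', 'BSD'),
])
after = assign_ids([
    ('COPYLEFT', Group(0)), ('gpl', 'GPL'), ('lgpl', 'LGPL'),
    ('PUBLIC_DOMAIN', Group(100)), ('bsd', 'BSD'),
])
print(before)  # {'gpl': 1, 'bsd': 101}
print(after)   # {'gpl': 1, 'lgpl': 2, 'bsd': 101}
```

    Adding ``lgpl`` inside the first group leaves the ID of ``bsd`` untouched,
    which is the whole point of leaving gaps between group bases.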

    The explicitly specified groups can be used as another means of filtering, etc.::

    >>> License.COPYLEFT
    <ChoiceGroup: COPYLEFT (id: 0)>
    >>> License.gpl2 in License.COPYLEFT.choices
    True
    >>> [(c.id, c.desc) for c in License.COPYLEFT.choices]
    [(1, u'GPL, any'), (2, u'GPL 2'), (3, u'GPL 3'), (4, u'LGPL'),
    (5, u'Affero GPL')]

    Pushing polymorphism to the limit - extra attributes
    ----------------------------------------------------

    Let's see our original example once again::

    from dj.choices import Choices, Choice
    from dj.choices.fields import ChoiceField

    class Gender(Choices):
        male = Choice("male")
        female = Choice("female")
        not_specified = Choice("not specified")

    class User(models.Model):
        gender = ChoiceField(choices=Gender,
                             default=Gender.not_specified)

        def greet(self):
            if self.gender == Gender.male:
                return 'Hi, boy.'
            elif self.gender == Gender.female:
                return 'Hello, girl.'
            else:
                return 'Hey there, user!'


    If you treat DRY really seriously, you'll notice that the separation between
    the choices and the greetings supported by each of them may actually be
    a violation of the "Every distinct concept and piece of data should live in one,
    and only one, place" rule. You might want to move this information up::

    from dj.choices import Choices, Choice
    from dj.choices.fields import ChoiceField

    class Gender(Choices):
        male = Choice("male").extra(
            hello='Hi, boy.')
        female = Choice("female").extra(
            hello='Hello, girl.')
        not_specified = Choice("not specified").extra(
            hello='Hey there, user!')

    class User(models.Model):
        gender = ChoiceField(choices=Gender,
                             default=Gender.not_specified)

        def greet(self):
            return self.gender.hello

    As you see, the ``User.greet()`` method is now virtually gone. Moreover, it
    already supports whatever new choice you will want to introduce in the future.
    This way the choices class starts to be the canonical place for the concept it
    describes. Getting rid of chains of ``if`` - ``elif`` statements is now just
    a nice side effect.

    .. note::

    I'm aware this is advanced functionality. This is not a solution for every
    case but when it is needed, it's priceless.
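
    The chaining syntax used above needs very little machinery. A minimal sketch,
    assuming ``extra()`` simply stores its keyword arguments as attributes and
    returns the same object (this simplified ``Choice`` is a stand-in, not the
    ``dj.choices`` implementation):

```python
class Choice:
    """Simplified stand-in for a declared choice."""

    def __init__(self, desc):
        self.desc = desc

    def extra(self, **attrs):
        """Attach arbitrary attributes and return self so calls can be chained."""
        for key, value in attrs.items():
            setattr(self, key, value)
        return self


class Gender:
    male = Choice("male").extra(hello='Hi, boy.')
    female = Choice("female").extra(hello='Hello, girl.')
    not_specified = Choice("not specified").extra(hello='Hey there, user!')


def greet(gender):
    # No if/elif chain: the data travels with the choice itself.
    return gender.hello


print(greet(Gender.female))  # Hello, girl.
```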


    Advantages of merging this solution
    -----------------------------------

    1. Having a single source of information, whether it is a list of languages,
    genders or states in the USA, is DRY incarnate.

    2. If necessary, this source can later be filtered and reformatted upon
    instantiation.

    3. Using ``ChoiceFields`` or explicit ``from_id()`` class methods on choices
    enables cleaner user code with less redundancy and dependency on hard-coded
    values.

    4. Upgrading the framework's choices to use class-based choices will increase
    its DRYness and enable better future extensibility.

    5. Bikeshedding on the subject will hopefully stop.

    Disadvantages
    -------------

    1. A new concept in the framework increases the vocabulary necessary to
    understand what's going on.

    2. Because of backwards compatibility, in order to provide a *One Obvious Way To
    Do It* we actually add another way to do it. We can, however, endorse it as the
    recommended way.

    Performance considerations
    --------------------------

    Creation of the proper ``Choices`` subclass uses a metaclass to figure out
    choice order, group membership, etc. This is done once, when the module
    defining the class is first imported.

    After instantiation the resulting object is little more than a raw list of
    pairs.

    Using ``ChoiceFields`` instead of raw ``IntegerFields`` introduces automatic
    choice unboxing and moves the field ID checks up the stack, introducing
    negligible performance costs. I don't personally believe in microbenchmarks, so
    I didn't try any, but I can do so if requested.

    FAQ
    ---

    Is it used anywhere already?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Yes, it is. It grew out of a couple of projects I did over the years, for
    instance `allplay.pl <http://allplay.pl/>`_ or `spiralear.com
    <http://spiralear.com/en/>`_. Various versions of the library are also used by
    my former and current employers. It has 100% statement coverage in unit tests.

    Has anyone evaluated it yet?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    I have shown this implementation to various people during PyCon US 2012 and it
    gathered some enthusiasm. Jannis Leidel, Carl Meyer and Julien Phalip seemed
    interested in the idea at the very least.

    Why not use an existing Django enum implementation?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    The most popular are:

    - `django-choices <http://pypi.python.org/pypi/django-choices`_ by Jason Webb.
    The home page link is dead, the implementation lacks grouping support,
    internationalization support and automatic integer ID incrementation. Also,
    I find the reversed syntax a bit unnatural. There are some tests.

    - `django-enum <http://pypi.python.org/pypi/django-enum>`_ by Jacob Smullyan.
    There is neither documentation nor tests, and the API is based on a syntax
    similar to namedtuples with a single string of space-separated choices. It
    supports neither groups nor internationalization.

    Why not use a generic enum implementation?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    The most popular are:

    - `enum <http://pypi.python.org/pypi/enum>`_ by Ben Finney (also a rejected `PEP
    <http://www.python.org/dev/peps/pep-0354/>`_) - uses the "object
    instantiation" approach which I find less readable and extensible.

    - `flufl.enum <https://launchpad.net/flufl.enum>`_ by Barry Warsaw - uses the
    familiar "class definition" approach but doesn't support internationalization
    and grouping.

    Naturally, none of the generic implementations provide shortcuts for Django
    forms and models.

    Why not use namedtuples?
    ~~~~~~~~~~~~~~~~~~~~~~~~

    This doesn't solve anything because a ``namedtuple`` defines a type that is
    later populated with data on a per-instance basis. Defining a type first and
    then instantiating it is already clumsy and redundant.
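
    For comparison, this is roughly what the ``namedtuple`` route looks like. It
    is a sketch using only the standard library; the names ``GenderChoices`` and
    ``GENDER`` are made up for the example.

```python
from collections import namedtuple

# Step 1: define the type, naming every choice once...
GenderChoices = namedtuple('GenderChoices',
                           ['male', 'female', 'not_specified'])

# Step 2: ...then instantiate it, naming every choice a second time.
GENDER = GenderChoices(male=0, female=1, not_specified=2)

# Step 3: the descriptions still have to be listed separately.
GENDER_CHOICES = (
    (GENDER.male, 'male'),
    (GENDER.female, 'female'),
    (GENDER.not_specified, 'not specified'),
)

print(GENDER.female)      # 1
print(GENDER_CHOICES[2])  # (2, 'not specified')
```

    Every choice is spelled out three times, which is more repetition than the
    explicit constants of Way 3.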

    Do you have a field like ``ChoiceField`` but holding characters?
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Not yet but this is planned.

    I don't like having to write ``Choice()`` all the time.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    You can use a trick::

    class Gender(Choices):
        _ = Choices.Choice

        male = _("male")
        female = _("female")
        not_specified = _("not specified")

    Current documentation for the project uses that because outside of Django core
    this is necessary for ``makemessages`` to pick up the string for translation.
    Once merged this will not be necessary so **this is not a part of the
    proposal**. You're free to use whatever you wish, like importing ``Choice`` as
    ``C``, etc.