An "enum" for Python 3
Designing an enumeration type (i.e. "enum") for a language may seem like a
straightforward exercise, but the recently "completed" discussions over
Python's PEP 435
show that it has a few wrinkles. The discussion spanned several long
threads in two mailing lists
(python-ideas, python-dev) going back to January in this particular
iteration, but the
idea is far older than that. A different approach was suggested in PEP 354, which was
proposed in
2005 but rejected at that time, largely due to lack of widespread interest.
A 2010
discussion also led nowhere (at least in terms of the standard
library), but the most recent discussions finally bore fruit: Guido van
Rossum accepted PEP 435 on May 9.
The basic idea is to have a class that implements an enum, which, in
Python, might look a lot like:
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3
That would allow using Color.green (and the others) as a constant,
effectively.
Not only would Color.blue have a value, but it would also have a
name ('blue') and an order (based on the declaration order). Enums can
also be iterated over, so that:
for color in Color:
    print(color, color.name, color.value)
gives:
Color.red red 1
Color.green green 2
Color.blue blue 3
Along the way, there were several different enum proposals made. Ethan
Furman offered one that incorporated
multiple types of enum, including ones for bit flags, string-valued enums,
and automatically numbered sequences. Alex Stewart came up with a different syntax for defining
enums to avoid the requirement to specify each numeric value. Neither made
it to the PEP
stage, though pieces of both were adopted
into the first draft of PEP 435, which was
authored by Eli Bendersky and Barry Warsaw.
There are a couple of fairly obvious motivations for adding enums, which
were laid out in the PEP. An immutable set of related, constant values
is a useful construct. Making them their own type, rather than just using
sequences of some other basic type (like integer or string) means that
error checking can be done (i.e. no duplicates) and that nonsensical
operations can raise errors (e.g. Color.blue * 42).
Finally, it is convenient to be able to declare enum members once but to
still be able to get a string representation of the member name
(i.e. without some kind of overt assignment like: green.name='green').
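As a rough illustration of those motivations, the sketch below uses the enum module as it eventually shipped in the standard library. The helper raises_type_error() and the Broken class are hypothetical names introduced only for this example, and "duplicates" is read here as a member name reused within one enum.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

def raises_type_error(action):
    """Return True if calling action() raises TypeError."""
    try:
        action()
    except TypeError:
        return True
    return False

def define_duplicate_name():
    # Reusing a member name within one enum is rejected when the class is built.
    class Broken(Enum):
        red = 1
        red = 2

print(raises_type_error(lambda: Color.blue * 42))  # True
print(raises_type_error(define_duplicate_name))    # True
print(Color.green.name)                            # green
```

The member name comes for free from the declaration, so no separate string assignment is needed.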
Some of the use cases mentioned early in
the discussion of the PEP are for values like stdin and
stdout, the
flags for socket() or seek() calls, HTTP error codes,
opcodes from the
dis (Python bytecode disassembly) module, and so forth. One of
the questions that was
immediately raised about the original
version of the PEP was its insistence that "Enums are not
integers!", so ordered comparisons like:
Color.red < Color.green
would raise an exception, though equality tests would not:
print(Color.green == 2)
True
To some, that seemed to run directly counter to the whole idea of an enum
type, but allowing ordered comparisons has some unexpected consequences, as Warsaw
described. Two different enums could be compared with potentially
nonsensical results:
print(MyAnimal.cat == YourAnimal.dog)
True
In general, the belief is that "named integers" is a small subset of the
use cases for enums, and that most uses do not need ordered comparisons.
But, the final accepted PEP does have an IntEnum variant
that provides the ordering desired by some. IntEnum members are also a
subclass of int, so they can be used to replace user-facing
constants in the
standard library that are already treated as integers (e.g. HTTP error codes,
socket() and seek() flags, etc.).
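A minimal sketch of what the IntEnum variant allows, using the module as it shipped in the standard library. The Whence and Shape classes are hypothetical and not taken from the PEP; the Whence values simply mirror the integers that seek() already accepts.

```python
import io
from enum import IntEnum

class Whence(IntEnum):
    # Hypothetical names; the values mirror the integers seek() accepts.
    start = 0
    current = 1
    end = 2

class Shape(IntEnum):
    circle = 1

print(Whence.start < Whence.end)     # True
print(isinstance(Whence.end, int))   # True

# An IntEnum member can be passed wherever a plain int is expected.
buffer = io.BytesIO(b"abcdef")
print(buffer.seek(-2, Whence.end))   # 4

# Members of unrelated IntEnums compare equal when their values match.
print(Whence.current == Shape.circle)  # True
```

The last line is the price of integer behavior: type safety between different IntEnums is given up in exchange for ordering and drop-in compatibility with existing integer constants.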
A second revision of the PEP was posted in
April, after lengthy discussion both in python-dev and python-ideas.
Furman offered up another proposal, this
time as an unnumbered
PEP with four separate classes for different types of enums. Two
different views of enums
arose in the discussion, as Furman summarized:
There seems to be two basic camps: those that think an enum
should be valueless, and have nothing to do with an integer besides using
it to select the appropriate enumerator [...] and those for whom the
integer is an integral part
of the enumeration, whether for sorting, comparing, selecting an index, or
whatever.
The critical aspect of using or not using an integer as the base type is:
what happens when an enumerator from one class is compared to an enumerator
from another class? If the base type is int and they both have the same
value, they'll be equal -- so much for type safety; if the base type is
object, they won't be equal, but then you lose your easy to use int aspect,
your sorting, etc.
Worse, if you have the base type be an int, but check for enumeration
membership such that Color.red == 1 == Fruit.apple, but Color.red !=
Fruit.apple, you open a big can of worms because you just broke equality
transitivity (or whatever it's called). We don't want that.
Furman's proposal looked overly complex to Bendersky and others commenting
on a fairly short python-ideas thread. Meanwhile, in python-dev, another
monster thread was spinning up. The first objection to the revised PEP was
that it raised a
NotImplementedError when doing ordered comparisons of enum
members. That was quickly dispatched with a recognition that
TypeError made far more sense. Other issues, such as the ordered
comparison issue that was handled with IntEnum in the final version, did
not resolve quite as
quickly.
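For plain Enum members, the behavior that resulted can be sketched as follows, assuming the enum module as it shipped in the standard library. The Fruit class is a hypothetical second enum added for the comparison. Note that, unlike the early draft described above, the shipped module also treats a comparison between a plain Enum member and a bare integer as unequal.

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2

class Fruit(Enum):
    apple = 1

# Ordered comparison of plain Enum members is refused with TypeError.
try:
    Color.red < Color.green
    outcome = "compared"
except TypeError:
    outcome = "TypeError"

print(outcome)                   # TypeError
print(Color.red == Fruit.apple)  # False
print(Color.red == 1)            # False

# Sorting is still possible by naming the key explicitly.
ordered = sorted(Color, key=lambda member: member.value, reverse=True)
print([member.name for member in ordered])  # ['green', 'red']
```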
One question, originally raised by Antoine
Pitrou, concerned the type of the enum members. The early
PEP revisions considered Color.red to not be an instance of the
Color class, and Warsaw strongly defended that view. At some
level, that makes sense (since the members
are actually attributes of the class), but it is confusing in other ways.
In a sub-thread, Van Rossum, Warsaw, and
others looked at the pros and cons of the types of enum members, as well as
implementation details of various options. In the end, Van Rossum made some pronouncements on various
features, including the question of member type, so:
isinstance(Color.blue, Color)
True
is now an official part of the specification.
As Python's benevolent dictator for life (BDFL), an only-semi-joking title, Van Rossum can put an end to arguments and/or "bikeshedding"
about language features. In the same thread, he made some further pronouncements (along with a plea for
a halt to the bikeshedding). It is a privilege
that he exercises infrequently, but it is clearly useful to the project to
have someone in that role. Much like Linus Torvalds for the kernel, it can
be quite helpful to have someone who can stop a seemingly endless thread.
Van Rossum's edicts came after Furman summarized the outstanding issues (after a
summary request from Georg Brandl). That is a fairly common occurrence in
long-running Python threads: someone will try to boil down the differences
into a concise list of outstanding issues. Another nice feature of Python
discussions is their tone, which is generally respectful and flame-free.
Participants certainly disagree, sometimes strenuously, but the tone
is refreshingly different from many other projects' mailing lists.
Not everyone is happy with the end result for enums, however. Andrew Cooke
is particularly sad
about the outcome. He points out that several expected behaviors for
enums are not present in PEP 435:
class Color(Enum):
    red = 1
    green = 1
is not an error; Color.green is an alias for Color.red
(a dubious "feature", he noted with a bit of venom).
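The alias behavior can be seen in a short sketch against the enum module as it shipped in the standard library. The unique decorator used at the end is part of that module, though this article does not say whether it was discussed in these threads; the Strict class is a hypothetical name.

```python
from enum import Enum, unique

class Color(Enum):
    red = 1
    green = 1   # same value as red, so green becomes an alias

print(Color.green is Color.red)        # True
print([c.name for c in Color])         # ['red']
print(list(Color.__members__))         # ['red', 'green']

# Opting in to strictness: @unique rejects duplicate values.
try:
    @unique
    class Strict(Enum):
        red = 1
        green = 1
    rejected = False
except ValueError:
    rejected = True

print(rejected)                        # True
```

Iteration skips aliases, while the __members__ mapping still lists every declared name.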
In addition, there is a way to avoid having to assign values for each enum
member (auto-numbering, essentially), but its syntax is clunky:
Color = Enum('Color', 'red green blue')
Beyond having to repeat the class name as a string (which violates the
"don't repeat yourself" (DRY) principle), it starts the numbering from one,
rather than zero. Nick Coghlan responded
to Cooke's complaints by more or less agreeing with the criticism. There
is still room for improvement in Python enums, but PEP 435 represents a
solid step forward, according to Coghlan.
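A small sketch of that functional form, assuming the enum module as it shipped in the standard library. The ZeroColor name is hypothetical; passing (name, value) pairs is one possible way to get zero-based values while keeping the one-line style.

```python
from enum import Enum

# Names only: values are assigned automatically, starting at 1.
Color = Enum('Color', 'red green blue')
print([(c.name, c.value) for c in Color])
# [('red', 1), ('green', 2), ('blue', 3)]

# Passing (name, value) pairs keeps the one-line form but starts at 0.
names = 'red green blue'.split()
ZeroColor = Enum('ZeroColor', [(name, i) for i, name in enumerate(names)])
print([(c.name, c.value) for c in ZeroColor])
# [('red', 0), ('green', 1), ('blue', 2)]
```

Either way the class name still has to be repeated as a string, which is the DRY complaint Cooke raised.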
It is instructive to watch the design of a language feature play out in
public as it does for Python (and other languages). Enums are something
that the developers will
have to live with for a long time, so it is not surprising that there would
be lots of participation and examination of the feature from many different
angles. While PEP 435 probably didn't completely satisfy anyone's full set
of requirements, there is still room for more features, both in the
standard library and elsewhere, as Coghlan pointed out. The story of enums
in Python likely does not end here.
Actually, enums are a hack for people that have never heard of constructed types and patterns. For example in OCaml:
type color = Black | White | Color of r * g * b
and r = float and g = float and b = float

let string_of_color = function
  | Black -> "black"
  | White -> "white"
  | Color (1.0, 0.0, 0.0) -> "red"
  | Color (r, g, b) when r = g && g = b -> "grey"
  | Color (r, g, b) -> Printf.sprintf "(%g, %g, %g)" r g b
If you want to see some more serious use of matching:
https://code.google.com/p/bitstring/
The string representation is likely to change. The symbol should preferably not be renamed every time.
Black -> "svartur" or Black -> "noir"