Posts/2011/02

Status quo wins a stalemate

Sometimes language design arguments can reach a point of stalemate. The status quo is only arguably flawed, and there are also perceived flaws in any or all of the proposed alternatives. An appropriate shared design principle can help identify when this point has been reached, and let the discussion die a natural death rather than endlessly rehashing the same points without anyone changing their opinion.

Every time we (python-dev) change anything significant, no matter how positive the end result, it can create a lot of churn in the community. Books need to be rewritten, other implementations modified, advice, recipes and examples updated, questions clarified as to which version they relate to, and version compatibility issues need to be monitored closely for projects that need to cope with older execution environments.

So, before any significant changes are made, we want to be fairly certain that the gain in clarity for future Python programs is worth the inevitable near term costs as the update ripples across the Python ecosystem. Sometimes newcomers have some interesting ideas, but still fail to clear this hurdle. The simple "it's not worth the hassle" response they're likely to receive may then come across as stodgy developers rejecting an outsider's ideas without adequate consideration.

This was something that came up fairly often during the Python 3000 mailing list discussions, to the point where I posted a message explaining why the principle of "Status quo wins a stalemate" is a very practical way to avoid meaningless churn in the language design and to cut short design discussions that obviously aren't going anywhere productive.

Python 3000 was already going to have a lot of major changes (most notably, finally improving the non-ASCII text handling story, in a way that means most Python 3 libraries and applications will be more likely to get it right). We needed to ride close herd on the design discussions to try to make sure that gratuitous changes with insufficient long term benefits were avoided.

So, lambda eventually stayed and map() and filter() were retained as builtins, while the attractive nuisance that is reduce() was merely banished to the functools module rather than being dropped entirely as was originally proposed. PEP 348 was rejected and replaced by the far less ambitious PEP 352. str.format() was still added, but as a complement to the legacy percent formatting mechanism rather than as a wholesale replacement.

Untold numbers of ideas on the mailing lists and the tracker were dropped with "too much pain for not enough benefit" as the rationale. More recently, PEP 3003 was instituted to enforce a moratorium on core language changes for Python 3.2 in order to give the rest of the community more time to catch up to Python 2.7 and the 3.x series, even though we knew it meant delaying good ideas like the improved generator refactoring capabilities provided by PEP 380.

The fact that Python 3 migration support tools like 2to3, 3to2 and the six module work as well as they do is probably due to this principle of language design as much as it is to any other factor (not to take anything away from the fine work that has gone into implementing them, of course!).

Posting code and syntax highlighting

Before publishing the previous post, I looked into recommendations for syntax highlighting in coding-oriented blogs. In a quick search, syntaxhighlighter showed up repeatedly as the preferred choice, so that's what I went with.

It looks like I'm not the only one that isn't entirely happy with that solution (although by using the "pre" tags rather than "script", my code should at least appear in the RSS feed).

Working with ReST would certainly be easier than the semi-HTML I'm currently using. Still, I think I have plenty to learn about Blogger's formatting tools before I abandon them entirely in favour of preformatted posts (which have their own drawbacks).

Justifying Python language changes

A few years back, I chipped in on python-dev with a review of syntax change proposals that had made it into the language over the years. With Python 3.3 development starting and the language moratorium being lifted, I thought it would be a good time to tidy that up and republish it as a blog post.

Generally speaking, syntactic sugar (or a new builtin) needs to take a construct in idiomatic Python that is fairly obvious to an experienced Python user and make it obvious to even new users, or else take an idiom that is easy to get wrong when writing (or miss when reading) and make it trivial to use correctly.

Providing significant performance improvements (usually in the form of reduced memory usage or increased speed) also counts heavily in favour of new constructs.

I strongly suggest browsing through past PEPs (both accepted and rejected ones) before proposing syntax changes, but here are some examples of syntactic sugar proposals that were accepted.

List/set/dict comprehensions
(and the reduction builtins any(), all(), min(), max(), sum())

target = [op(x) for x in source]
instead of:
target = []
for x in source:
    target.append(op(x))
The transformation (`op(x)`) is far more prominent in the comprehension version, as is the fact that all the loop does is produce a new list. I include the various reduction builtins here, since they serve exactly the same purpose of taking an idiomatic looping construct and turning it into a single expression.
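As a rough illustration of the point above, the sketch below runs the loop and comprehension forms side by side, along with set and dict comprehensions and a few reduction builtins. The `op` function and the sample `source` data are placeholders invented for this example, not anything from the original discussion.

```python
def op(x):
    # Hypothetical transformation, chosen only for illustration
    return x * 2

source = [1, 2, 3, 4]  # Assumed sample data

# Explicit loop: the transformation is buried in the loop body
loop_target = []
for x in source:
    loop_target.append(op(x))

# List comprehension: the transformation comes first
comp_target = [op(x) for x in source]

# Set and dict comprehensions follow the same pattern
remainders = {x % 2 for x in source}
squares_by_value = {x: x * x for x in source}

# Reduction builtins collapse a looping idiom into a single expression
has_large_value = any(x > 3 for x in source)
all_positive = all(x > 0 for x in source)
largest = max(op(x) for x in source)
```

Both list-building forms produce the same result; the difference is purely in how quickly a reader can see what is being built.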

Generator expressions
total = sum(x*x for x in source)
instead of:
def _g(source):
    for x in source:
        yield x*x
total = sum(_g(source))
or:
total = sum([x*x for x in source])
Here, the GE version has obvious readability gains over the generator function version (as with comprehensions, it brings the operation being applied to each element front and centre instead of burying it in the middle of the code, as well as allowing reduction operations like sum() to retain their prominence), but doesn't actually improve readability significantly over the second LC-based version. The gain over the latter, of course, is that the GE-based version needs a lot less memory than the LC version. As it consumes the source data incrementally, it can also work on source iterators of arbitrary (even infinite) length, and can cope with source iterators with large time gaps between items (e.g. reading from a socket), as each item will be returned as it becomes available. Obviously, the latter two features aren't useful in conjunction with reduction operations like sum(), but they can be helpful in other contexts.
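A minimal sketch of the incremental consumption described above, assuming an infinite counter from itertools as the source (the variable names here are made up for the example). The list comprehension equivalent of the first expression would never finish, since it would try to build the whole list up front.

```python
import itertools

# A generator expression over an infinite source: nothing is computed yet
squares = (x * x for x in itertools.count(1))

# Items are produced one at a time, only as they are requested
first_five = list(itertools.islice(squares, 5))

# For a finite source, the GE and LC forms give the same total;
# the GE form just never materialises the intermediate list
source = range(1000)
total_ge = sum(x * x for x in source)
total_lc = sum([x * x for x in source])
```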

With statements
with lock:
    # perform synchronised operations
instead of:
lock.acquire()
try:
    # perform synchronised operations
finally:
    lock.release()
This change was a gain for both readability and writability - there were plenty of ways to get this kind of code wrong (e.g. leave out the try-finally altogether, acquire the resource inside the try block instead of before it, call the wrong method or spell the variable name wrong when attempting to release the resource in the finally block), and it wasn't easy to audit because the resource acquisition and release could be separated by an arbitrary number of lines of code. By combining all of that into a single line of code at the beginning of the block, the with statement eliminated a lot of those issues, making the code much easier to write correctly in the first place, and also easier to audit for correctness later (just make sure the code is using the correct context manager for the task at hand).
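To make the comparison concrete, here is a small runnable sketch using threading.Lock. The `state` dictionary and the function names are hypothetical, invented for this example. The final part checks that the lock is released even when the body of the with statement raises.

```python
import threading

lock = threading.Lock()
state = {"counter": 0}  # Hypothetical shared state for the example

def increment_old_style():
    # Acquire before the try block, release in the finally clause
    lock.acquire()
    try:
        state["counter"] += 1
    finally:
        lock.release()

def increment_with_statement():
    # Acquisition and release are both handled by the single 'with' line
    with lock:
        state["counter"] += 1

increment_old_style()
increment_with_statement()

# The lock is still released when the body raises an exception
try:
    with lock:
        raise ValueError("simulated failure")
except ValueError:
    pass

lock_was_released = not lock.locked()
```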

Function decorators
@classmethod
def f(cls):
    # Method body
instead of:
def f(cls):
    # Method body
f = classmethod(f)
Easier to write (function name only written once instead of three times), and easier to read (decorator names up top with the function signature instead of buried after the function body). Some folks still dislike the use of the @ symbol, but compared to the drawbacks of the old approach, the dedicated function decorator syntax is a huge improvement.
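The two spellings can be compared directly in a short sketch like the one below. The class and method names are placeholders for illustration; both classes end up with an equivalent class method.

```python
class OldStyle:
    def create(cls):
        return cls()
    # The wrapping happens after the body, and the name appears three times
    create = classmethod(create)

class NewStyle:
    # The decorator sits with the signature, and the name appears once
    @classmethod
    def create(cls):
        return cls()

old_instance = OldStyle.create()
new_instance = NewStyle.create()
```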

Conditional expressions
x = A if C else B
instead of:
x = C and A or B
The addition of conditional expressions arguably wasn't a particularly big win for readability, but it was a big win for correctness. The and/or based workaround for the lack of a true conditional expression was not only hard to read if you weren't already familiar with the construct, but using it was also a potential source of bugs if A could ever be false in a boolean context (e.g. 0, None or an empty string) while C was true (in such cases, B would be returned from the expression instead of A).
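The failure mode is easy to reproduce. In this sketch the helper names are invented for the example, and 0 stands in for any value of A that is false in a boolean context.

```python
def pick_and_or(C, A, B):
    # Legacy workaround: breaks whenever A is falsy
    return C and A or B

def pick_conditional(C, A, B):
    # True conditional expression: only the truth of C matters
    return A if C else B

# With a truthy A, the two forms agree
agree = pick_and_or(True, "yes", "no") == pick_conditional(True, "yes", "no")

# With a falsy A, the workaround silently falls through to B
buggy = pick_and_or(True, 0, 99)         # 99, even though C is true
correct = pick_conditional(True, 0, 99)  # 0
```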

Except clause
except Exception as ex:
instead of:
except Exception, ex:
Another example of changing the syntax to reduce the potential for non-obvious bugs (in this case, except clauses like `except TypeError, AttributeError:`, which would never actually catch AttributeError, and would instead bind the caught TypeError instance to the local name AttributeError).
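With the `as` form, catching several exception types and binding the caught instance are visually distinct: the types go in a parenthesised tuple and the name follows `as`. A minimal sketch, with hypothetical helper functions invented to trigger each exception:

```python
def describe_failure(func):
    # The tuple lists the types to catch; 'as ex' binds the caught instance
    try:
        func()
    except (TypeError, AttributeError) as ex:
        return type(ex).__name__
    return "no error"

def raises_type_error():
    return len(5)  # int has no len()

def raises_attribute_error():
    return None.missing  # None has no such attribute

def succeeds():
    return 42

results = [
    describe_failure(raises_type_error),
    describe_failure(raises_attribute_error),
    describe_failure(succeeds),
]
```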

Bye-bye Blogilo

OK, when a blogging app can't figure out my blog identity automatically and crashes every time I submit a post (but after submitting the post to blogger), 'tis clearly not the app for me.

I'm just happy the first 3 posts didn't properly include the 'python' tag either, so at least Planet Python shouldn't have been spammed with any noise.

Back to the in-browser editor for now...

To Pycon and beyond...

All these Planet Python posts about interesting talks and info at Pycon finally tipped me over the edge into making the trek across the Pacific to meet some of these people I've been working with online for the past half-dozen years or so.

With 3.3 still 18-24 months away, we should be able to get a pretty good road map thrashed out for ideas we want to explore for possible inclusion. Some face-to-face discussions will be especially handy for me, given the things I'd like to see sorted out: module aliasing to clean up __main__ handling once and for all, bringing back implicit context managers now we have more collective experience with explicit ones, an alternative to PEP 377 that will allow context managers to do some additional setup inside the scope of the try block, and clarifying the semantic questions raised by discrepancies between the PEP 3118 buffer API spec and its implementation.

I still have some paperwork to sort out once my renewed passport arrives, but aside from that, the trip is good to go. I did stuff my travel dates up a bit and will have a day to kill in Atlanta on the 9th, but I'm sure I'll be able to figure out something interesting to do :)

Linking sites in blog posts

Call me paranoid, but the idea of trusting a blogging app with my Google account details really doesn't appeal to me. So, "BlogThis!" on the links bar it is.

It would be nice if BlogThis! popped up the full Blogger editor instead of a partial one (missing features like editing the post tags), but using it to save pre-linked drafts should be more than adequate for those occasions when I'm commenting on a link rather than writing something from scratch.

Test: editing an existing post...