Python VM Summit - Somewhat Coherent Thoughts

Yay, sleep :)

Last night I just dumped my relatively raw notes into a post. This review is more about distilling what was discussed over the day into a few key themes.

Speed Good

One major point was the question "How do we make Python fast?". Dave Mandelin (Mozilla JavaScript dev) was asking how open CPython was to people tinkering with JIT and other technologies to try to speed up execution, and it was acknowledged that python-dev's reaction to such proposals is rarely more than lukewarm. A large part of that resistance comes from the fact that CPython is generally portable to many more architectures than the real speed hacks (which are generally x86 + x86-64 + ARM at best, and sometimes not even all 3 of those). Unladen Swallow also lost a lot of steam, as so much of their effort was going into tasks not directly related to "make CPython faster" (e.g. fixing LLVM upstream bugs, getting benchmarks working on new versions).

Instead, we tend to push people more towards PyPy if they're really interested in that kind of thing. Armin decided years ago (when switching his efforts from psyco to PyPy) that "we can't get there from here", and it's hard to argue with him, especially given the recent results from the benchmarks executed by speed.pypy.org.

There was definitely interest in expanding the speed.pypy.org effort to cover more versions of more interpreters. We don't actually have any solid data in CPython regarding the performance differences between 2.x and 3.x (aside from an expectation that 3.x is slower for many workloads due to the loss of optimised 32-bit integers, additional encoding/decoding overhead when working with ASCII text, the new IO stack, etc). We aren't even sure of the performance changes within the 2.x series.

That last point is the most amenable to resolution in the near term - the benchmarks run by speed.pypy.org are all 2.x applications, so the creation of a speed.python.org for the 2.x series could use the benchmarks as is. Covering 3.x as well would probably be possible with a subset of the benchmarks, but others would require a major porting effort (especially the ones that depend on Twisted).
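As a rough illustration of the kind of measurement involved, here is a minimal sketch of a timing script that can be run unchanged under different interpreters and versions and then compared by hand. It is not the speed.pypy.org benchmark suite: the `workload` and `run_benchmark` names and the toy workload itself are hypothetical, chosen only because integer arithmetic and ASCII encoding are two of the areas mentioned above.

```python
import platform
import sys
import timeit


def workload():
    # Hypothetical stand-in workload (an assumption, not a real benchmark):
    # integer arithmetic plus encoding a text string to ASCII.
    total = 0
    for i in range(1000):
        total += i * i
    text = u"x" * 1000
    return total + len(text.encode("ascii"))


def run_benchmark(repeat=3, number=100):
    # Keep the best of several repeats to reduce noise from other processes.
    timings = timeit.repeat(workload, repeat=repeat, number=number)
    return {
        "implementation": platform.python_implementation(),
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "best_seconds": min(timings),
    }


if __name__ == "__main__":
    result = run_benchmark()
    print("%(implementation)s %(version)s: %(best_seconds).4f s" % result)
```

Running the same file under each interpreter of interest gives one line per run. A toy loop like this says very little about real applications, which is exactly why a shared suite of application-level benchmarks is the more interesting goal.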

Champions and specific points of contact for this idea aren't particularly obvious at this stage. Jesse is definitely a fan of the idea, but has plenty on his plate already, so it isn't clear how that will work out from a time point of view. There'll likely need to be some self-organisation from folks that are both interested in the project and aren't already devoting their Python-related energies to something else.

The Python Software Foundation, not the CPython Software Foundation

The second major point was the PSF (as represented by Jesse Noller from the board, and several other PSF members, including me, from multiple VMs) wanting to do more to support and promote implementations other than CPython. We are definitely at the point where all 4 big implementations are an excellent choice depending on the target environment:

  • CPython: the venerable granddaddy, compatible with the most C extensions and target environments, most amenable to "stripping" (i.e. cutting it down to a minimal core), likely the easiest sell in a corporate environment (due to age and historically closest ties to the PSF)
  • Jython: the obvious choice when using Python as a glue language for Java components, or as a scripting language embedded in a Java environment
  • IronPython: ditto for .NET components and applications
  • PyPy: now at the point where deployments on standard server and desktop environments should seriously consider it as an alternative to CPython. It's not really appropriate for embedded environments, but when sufficient resources are available to let it shine, it will run most workloads significantly faster than CPython. It even has some support for C extensions, although big ticket items like full NumPy support are still a work in progress. However, if you're talking about something like a Django-based web app, then "CPython or PyPy" is now becoming a question that should be asked.

It didn't actually come up yesterday, but Stackless probably deserves a prominent mention as well, given the benefits that folks such as CCP are able to glean from the microthreading architecture.

Currently, however, python.org is still very much the CPython website. It will require a lot of work to get it to a place where the other implementations are given appropriate recognition. It also isn't clear whether or not the existing pydotorg membership will go along with a plan to modernise the website design to something that employs more modern web technologies, and better provides information on the various Python implementations and the PSF. While the current site is better than what preceded it, a lot of pydotorg members are still gun shy due to the issues in managing that last transition (even the recent migration of the development process docs over to a developer-maintained system on docs.python.org encountered some resistance). However, when the broader Python community includes some of the best web developers on the planet, we can and should do better. (A personal suggestion that I didn't think of until this morning: perhaps a way forward on this would be to first build a new site as "beta.python.org", without making a firm commitment to switch until after the results are available for all to see. It's a pretty common way for organisations to experiment with major site revamps, after all, and would also give the pydotorg folks a chance to see what they think of the back-end architecture.)

Standardising the Standard Library

Finally, with the hg transition now essentially done, efforts to better consolidate development effort on the standard library (especially the pure Python sections) and the associated documentation will start to gather steam again. As a preliminary step, commit rights (now more accurately called "push rights") to the main CPython repository are again being offered to maintainers from the other major interpreter implementations so they can push fixes upstream, rather than needing to maintain them as deltas in their own repositories and/or submit patches via the CPython tracker.
