Python Language Summit - Rough Notes

Same drill, different day, more people, more notes :)

Still just my interpretation, though. Will probably highlight a few things I find particularly interesting again tomorrow (as I did for the VM summit).

PSF Communications (Doug Hellmann)
- currently writing about PSF funding and similar activities
- would like to include more technical material summarising python-dev discussions
- how best to go about that
- new blog, not existing PSF blog
- existing PSF board not in a position to do it
- Guido: core devs already do a lot via PEPs and mailing list, likely not keen to write blog as well
- may be better to get others to do it, willing to follow discussions of interest
- posts may be primarily pointers to other resources (e.g. PEPs, mailing list posts)
- all implementations
- major new releases should go on python.org as NEWS items

Warnings for things that may cause problems on different implementations
- ResourceWarning helps to pick up reliance on CPython specific refcounting
- CompatibilityWarning for reliance on non-strings in namespaces (e.g. classes)
- update Language Spec to clarify ambiguous situations
- like ResourceWarning, silence CompatibilityWarning by default
- what to do about builtin functions that currently aren't descriptors (i.e. don't change behaviour when retrieved from a class)
- e.g. make staticmethod objects directly callable
- big gray area in language spec - callables may not be descriptors
- perhaps change CPython builtin descriptors to warn about this situation
- another use case for CompatibilityWarning
- Guido not convinced the builtin function problem can be handled in general
- a callable variant of staticmethod may be a better option, allowing the default descriptor behaviour to be easily stripped from any function
- doesn't want to require that all builtin functions follow descriptor protocol, since it is already the case that many callables don't behave like methods
- a better staticmethod would allow the descriptor protocol to be stripped, ensuring such functions can be safely stored in classes without changing behaviour
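
To make the descriptor gray area concrete, here is a minimal sketch of how it shows up on CPython. The names `py_len` and `Demo` are made up for illustration. The result for `Demo.builtin` is exactly the implementation-specific part: another implementation would be free to bind it like a normal function.

```python
def py_len(obj):
    """Pure Python function (hypothetical): follows the descriptor protocol."""
    return len(obj)


class Demo:
    builtin = len                   # CPython builtin: has no __get__, never bound
    plain = py_len                  # Python function: bound to the instance on lookup
    wrapped = staticmethod(py_len)  # descriptor behaviour explicitly stripped


d = Demo()

# On CPython the builtin is returned unchanged, so this is just len([1, 2, 3])
builtin_result = d.builtin([1, 2, 3])

# staticmethod hands back the underlying function without binding it
wrapped_result = d.wrapped([1, 2, 3])

# The plain function becomes a bound method, so the instance is passed as
# the first argument and the call fails with a TypeError
try:
    d.plain([1, 2, 3])
except TypeError as exc:
    plain_error = exc
else:
    plain_error = None

print(builtin_result, wrapped_result, type(plain_error).__name__)
```

Wrapping in `staticmethod` is the only one of the three spellings that should behave the same regardless of whether the wrapped callable is a descriptor.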

Standard Library separation
- see VM summit notes
- over time, migrate over to separate repository for standard lib, update
- need Python and C modules to stay in sync
- buildbots for standard library
- challenge of maintaining compatibility as standard lib adopts new language changes
- need a PEP to provide guarantees that C accelerators are kept in sync (Brett Cannon volunteered to write test)
- bringing back pure Python alternatives to C standard library is encouraged, but both need to be tested
- accelerator modules should be subsets of the Python API
- Brett will resurrect standard library PEP once importlib is done
- full consolidation unlikely to be possible for 2.7 (due to CPython maintenance freeze)
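
A rough sketch of what "both need to be tested" can look like in practice, using `heapq` as the example. The helper `load_without_accelerator` is a hypothetical name of mine, and the sketch assumes the module falls back to its pure Python code when the accelerator import fails (as `heapq` does in CPython). CPython's own test suite has a helper along similar lines.

```python
import importlib
import sys


def load_without_accelerator(name, accelerator):
    """Import a fresh copy of `name` with the `accelerator` module blocked."""
    missing = object()
    saved = {n: sys.modules.pop(n, missing) for n in (name, accelerator)}
    # A None entry in sys.modules makes "import accelerator" raise ImportError
    sys.modules[accelerator] = None
    try:
        return importlib.import_module(name)
    finally:
        # Put sys.modules back the way we found it
        for n, mod in saved.items():
            if mod is missing:
                sys.modules.pop(n, None)
            else:
                sys.modules[n] = mod


def drain(mod, data):
    """Run the same heap operations against whichever implementation is given."""
    heap = list(data)
    mod.heapify(heap)
    return [mod.heappop(heap) for _ in range(len(heap))]


import heapq as default_heapq  # uses the C accelerator when it is available

py_heapq = load_without_accelerator("heapq", "_heapq")

data = [5, 1, 4, 1, 3]
py_result = drain(py_heapq, data)
default_result = drain(default_heapq, data)
print(py_result == default_result == sorted(data))
```

The same `drain` function is run against both copies, which is the point: one set of tests, two implementations.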

Speed Benchmarking
- see VM summit notes
- really good for tracking performance changes across versions
- common set of benchmarks
- OSU OSL are willing to host it
- backend currently only compares two versions
- first step is to get up and running with Linux comparisons, look at other OS comparisons later
- hypervisors mess with performance benchmarks, hence need real machines
- should set up some infrastructure on python.org (benchmark SIG mailing list, hg repository)
- eventually, redirect speed.pypy.org to new speed.python.org
- longer term, may add new benchmarks

Exception data
- need to eliminate need to parse error strings to get info from exceptions
- should be careful that checks of message content aren't overly restrictive
- PEP 3151 to improve IO error handling? (Guido still has some reservations)
- ImportError needs to name module
- KeyError, IndexError, ValueError?
- need to be careful when it comes to creating reference loops
- exception creation API also an issue, since structured data needs to be provided
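
As an illustration of reading structured data instead of parsing messages, here is a small sketch. It assumes Python 3.3 or later, where `ImportError` carries a `name` attribute and the PEP 3151 hierarchy is in place. The module name and function names are hypothetical.

```python
def missing_module_name():
    """Return the module name carried by the ImportError, not parsed from text."""
    try:
        import module_that_does_not_exist_for_demo  # hypothetical module name
    except ImportError as exc:
        return exc.name


def missing_key():
    """KeyError keeps the offending key as its first argument."""
    try:
        {}["spam"]
    except KeyError as exc:
        return exc.args[0]


def open_error_code(path):
    """OSError exposes the numeric error code as `errno`."""
    try:
        open(path).close()
    except OSError as exc:
        return exc.errno
    return None


print(missing_module_name(), missing_key())
```

None of these helpers look at `str(exc)`, so they keep working if the wording of the message changes.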

Contributor Licensing Agreements
- Jesse and Van looking to get electronic CLAs set up
- will ensure adequately covers non-US jurisdictions

Google Summer of Code
- encouraging proposals under the PSF umbrella

Packaging
- distutils2 should land in 3.3 during the sprints
- namespace packages (PEP 382) will land in 3.3
- external name for backports should be different from internal name
- too late to introduce a standard top level parent for stdlib packages
- external backports for use in older versions is OK
- external maintenance is bad
- hence fast development cycles incompatible with stdlib
- want to give distutils2 a new name in stdlib for 3.3, so future backports based on version in 3.4 won't conflict with the standard version in 3.3
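
One consequence of giving a backport a different external name is that code spanning several versions ends up with an import fallback. A minimal sketch, where `import_first_available` is a hypothetical helper and the module names passed to it are placeholders rather than the names actually chosen for distutils2:

```python
import importlib


def import_first_available(*names):
    """Return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: " + ", ".join(names))


# Placeholder names: the first stands in for a stdlib module that is missing
# on an older Python, the second for the separately installed backport.
mod = import_first_available("module_that_does_not_exist_for_demo", "json")
print(mod.__name__)
```

Listing the stdlib name first means the backport is only used where the standard version is unavailable.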

Python 3 adoption
- py3ksupport.appspot.com (Brett Cannon)
- supplements inadequate trove classifier data on PyPI with manual additions
- Georg Brandl has graphical tracker of classification data on PyPI over time
- Allison Randal/Barry Warsaw have been doing similar dependency tracking and migration info for Ubuntu
- giant wiki page for Fedora Python app packaging
- good dependency info would provide a good ranking system for effectively targeting grants
- 3.python.org? getpython3.com? need to choose an official URL
- funding may help with PyPy migration
- IronPython will be looking at 3.x support once 2.7 is available (this week/next week timeframe)
- Jython focused on 2.6 now, may go direct to 3.x after that (haven't decided yet)
- PSF funding needs a specific proposal with specific developer resources with the necessary expertise and available time
- CObject->Capsule change is a compatibility issue for C extension modules
- Django targeting Python 3 support by the end of summer
- zc.buildout is a dependency of note that hasn't been ported yet (PyCon sprint topic)
- other migration projects being tackled at PyCon sprints (WebOb?)

Python upstream and distro packaging
- PEP 394 - recommendations for symlink practices
- PEP 3147 and 3149 were heavily targeted at helping distros share directories across versions
- namespace packages (PEP 382)
- PEP 384 stable ABI (done for 3.2)
- better tools needed to help with migration to stable ABI
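
For anyone who hasn't looked at what PEP 3147 and PEP 3149 do to file names, this sketch prints the tags for the running interpreter. It assumes Python 3.3 or later (for `sys.implementation`), and `spam.py` is a made-up file name.

```python
import importlib.util
import sys
import sysconfig

# PEP 3147: cached bytecode file names carry an implementation/version tag,
# so several interpreters can share one source directory
cache_tag = sys.implementation.cache_tag
pyc_path = importlib.util.cache_from_source("spam.py")

# PEP 3149: extension module file names carry an ABI tag in their suffix
# (the exact value, or whether it is set at all, depends on the build)
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

print(cache_tag)
print(pyc_path)
print(ext_suffix)
```

The tags are what let a distro install one copy of a pure Python package and have each interpreter version keep its own compiled files alongside it.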

Baseline Python distro installs
- system python varies in terms of what is installed
- challenging to target, as available modules vary
- "build from source" is only a partial answer as some build dependencies are optional
- distros make some changes to support differences in directory layouts
- some changes affect Python app dependencies (e.g. leaving out distutils)
- conflict between "system Python" use case of what is needed to run distro utilities and "arbitrary app target" for running third party apps
- distributing separate Python under app control is not ideal, due to security patch management issues
- specific problems are caused by removal of stuff from base install (e.g. distutils)
- other problem is when distro uses old versions of packages (but virtualenv can help with that)
- may help if a "python-minimal" was used for the essential core, with "python" installing all the extras (including distutils, tkinter, etc)
- then have a further python-extras (or equivalent) that adds everything else the distro needs for its own purposes
- distros tend to work by taking a CPython build and then splitting it up into various distro packages
- to handle additions, would be good to be able to skip site-packages inclusion in sys.path (à la virtualenv)
- "-S" turns off too much (skips site.py entirely, not just adding site-packages to sys.path)
- "-s" only turns off user site-packages, not system site-packages
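
Since the available modules vary between distro installs, an app can at least probe for the optional pieces before relying on them. A small sketch, assuming Python 3.4 or later for `importlib.util.find_spec`. The list of module names is only an example, and `available` is a hypothetical helper.

```python
import importlib.util
import sys


def available(names):
    """Map each top-level module name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Modules that distros sometimes split out of the base install (example list)
report = available(["json", "tkinter", "distutils", "venv"])

# The two command line switches discussed above, as seen from inside Python
flags = {
    "-S (site.py skipped entirely)": bool(sys.flags.no_site),
    "-s (user site-packages skipped)": bool(sys.flags.no_user_site),
}

for name, found in sorted(report.items()):
    print(name, "available" if found else "missing")
for label, value in flags.items():
    print(label, value)
```

This only reports what the current interpreter can see; it does nothing to fix a trimmed-down install.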

Python 3.3 proposed changes to strings to reduce typical memory usage
- PEP 393 changes to internal string representation (implementation as GSoC project)
- Unicode memory layout currently split in order to more easily support resizing and subclassing in C
- need to build and measure to see speed and memory impacts
- alternative idea may be to explore multiple implementation techniques (similar to PyPy)
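
On the "build and measure" point, a very rough way to see per-character storage on CPython is to difference `sys.getsizeof` for two string lengths. This is only a sketch: `bytes_per_char` is a hypothetical helper, `sys.getsizeof` is CPython-specific in what it reports, and the numbers depend on the interpreter version and build. On a CPython with the PEP 393 representation I would expect the cost to grow with the widest character in the string.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate marginal storage per character by differencing two sizes."""
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) / n


samples = {
    "ASCII": "a",
    "BMP (Greek delta)": "\u0394",
    "astral (emoji)": "\U0001F600",
}

per_char = {label: bytes_per_char(ch) for label, ch in samples.items()}

for label, cost in per_char.items():
    print(label, cost)
```

Differencing two sizes cancels out the fixed object header, so what is left is the cost of the character data itself.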

Speed (again!)
- Unladen Swallow dormant. Major maintainers moved on to other things, fair bit of work in picking it up
- even trying to glean piecemeal upgrades (e.g. to cPickle) is a challenge
- interest in speeding up Python has really shifted to PyPy
- for CPython, gains would need to be really substantial to justify additional complexity
- really need to get the macro benchmarks available on 3.x
- Guido: lesson from the pickle speedup experience is to be cautious, even when the speed gains are large
- speed hack attempts on CPython are still of interest, especially educational ones
- speeding up overall is a very hard problem, but fixing specific bottlenecks is good
- stable ABI will help
- PyPy far more sensitive to refcounting bugs than CPython
- static analysis to pick up refcounting bugs could help a great deal
- "Here there be dragons": Unladen Swallow shows that overall speedups are not easy to come by

Regex engine upgrade
- new regex library proposed
- added many new features, including the Unicode categories needed to select out Python 3.x identifiers
- potentially big hassle for other implementations since re module includes a lot of C
- IronPython currently translates to .NET compatible regexes, but could rewrite more custom code
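
To show why the Unicode categories matter for identifiers: without them, the existing `re` module can only approximate the rule. The pattern below is my own approximation (a word character that is not a digit, followed by word characters), not the real identifier definition, so it will disagree with `str.isidentifier()` on some edge cases. It assumes Python 3.4 or later for `fullmatch`.

```python
import re

# Approximation only: "\w minus digits" to start, then any "\w" characters
APPROX_IDENTIFIER = re.compile(r"[^\W\d]\w*")


def looks_like_identifier(text):
    """Approximate identifier check using only what the re module offers."""
    return APPROX_IDENTIFIER.fullmatch(text) is not None


samples = ["spam", "_x", "x1", "1x", "a-b", "na\u00efve", "donn\u00e9es", ""]

# Compare the regex approximation with the language's own check
comparison = {s: (looks_like_identifier(s), s.isidentifier()) for s in samples}

for text, (approx, exact) in comparison.items():
    print(repr(text), approx, exact)
```

With proper Unicode property support in the regex engine, the pattern could name the identifier start and continue categories directly instead of leaning on `\w`.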

GUI Library
- Guido: GUI libraries are nearly as complicated as the rest of Python put together and just aren't a good fit with the release cycle of the standard lib
- Don't want to add another one, but don't want to remove Tcl/Tk support either

twisted.reactor/deferred style APIs in the standard library
- asyncore/asynchat still has users
- would like to have an alternative in the stdlib that offers a better migration path to Twisted
- deferred could be added, such that asyncore based apps can benefit from it
- reactor model separates transport/protocol concerns far more cleanly than asyncore
- protocol level API and transport level API for asyncore may be a better option
- would allow asyncore based applications to more easily migrate to other async loops
- defining in a PEP would allow this to be the "WSGI" for async frameworks ("asyncref", anyone?) (Jesse suggested concurrent.eventloop instead)
- still need someone to step up to write the PEP and integrate the feedback from the Twisted team and the other async frameworks
- plenty of async programming folks able to help and provide feedback (including Glyph)
- having this standardised would help make event loop based programming more pluggable
- Guido still doesn't like the "deferred" name
- Glyph considers deferred to be less important than standardising the basic event loop interface
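
A toy sketch of the transport/protocol split, to show why it makes event loops more pluggable. All class and method names here are hypothetical (loosely Twisted-flavoured) and not a proposed API. The transport is an in-memory stand-in so the example runs without any sockets.

```python
class ShoutProtocol:
    """Protocol: knows what to do with bytes, not how they arrive."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        # Application logic only: reply with the upper-cased payload
        self.transport.write(data.upper())


class MemoryTransport:
    """Transport: knows how bytes move; here they are just kept in a list."""

    def __init__(self, protocol):
        self.sent = []
        self.protocol = protocol
        protocol.connection_made(self)

    def write(self, data):
        self.sent.append(data)

    def feed(self, data):
        # Stand-in for an event loop noticing that data has arrived
        self.protocol.data_received(data)


transport = MemoryTransport(ShoutProtocol())
transport.feed(b"ping")
print(transport.sent)
```

Because `ShoutProtocol` only talks to `transport.write`, the same class could be driven by any loop that provides an object with that method, which is the migration benefit being discussed.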
