Posts/2011/07

Sharing vs broadcasting

Even since I started playing with Google+ and its Circle mechanic, I've been trying to figure out why I don't like it. I thought it sounded great when I heard about it, but as soon as I started using it... meh :P

I still haven't quite figured it out (although I mostly think it's the neither-chicken-nor-fowl aspect, along with it adding back ~500 useless "suggestions" from random Python mailing list contacts that I had already purged once), but it's certainly helped me see the way I communicate online in a different light.

For a long time, my only real online presence was regular postings to Python mailing lists, commenting on various sites and the very occasional post here on B&L.

After enough of my friends joined, Facebook was added to the mix, but I do everything I can to keep that locked down so only acquaintances can see most things I post there.

After going to PyconAU last year, and in the leadup to PyCon US earlier this year, I started up my @ncoghlan_dev account on Twitter, got the "python" category tag on here added to Planet Python, and started to develop a bit more of an online public presence.

Here on the blog, I can, and do, tag each post according to areas of interest. As far as I know, the 'python' tag is the only one with any significant readership (due to the aggregation via Planet Python), but people could subscribe to the philosophy or metablogging tags if they really wanted to.

When it comes to sharing information about myself, there's really only a few levels based on how much I trust the people involved: Friends&Family, Acquaintances, General Public pretty much covers it. Currently I handle that via a locked down FB for Friends, Family & Acquaintances (with a "Limited Access" list for people that I exclude from some things, like tagged photos) and completely public material on Twitter and Blogger.

The public stuff is mostly Python related, since that's my main online presence, but I'm fairly open about political and philosophical matters as well. FB, by contrast, rarely sees any mention of Python at all (and I'm often a little more restrained on the political and philosophical front).

Where I think Circles goes wrong is that it conflates Access Control with Topic Tagging. When I publish Python stuff, I'm quite happy for it to be public. However, I'd also be happy to tag it with "python", just as I do here on the blog, to make it easier for my friends to decide which of my updates they want to see.

This is classic Publish/Subscribe architecture thinking. When Publishing, I normally want to be able to decide who *can* access stuff. That is limited by closeness, but typically unrelated to the topic. Having, tagging as a service to my subscribers, I am quite happy to do. When Subscribing, I want to be able to filter by topic.

If I publish something to, say, my Pythonistas circle, than that does more than I want. Yes, it publishes it to them, but it also locks out everybody else. The ways I know people and the degree to which I trust them do not align well with the topics I like to talk about. I've already seen quite a few cases where the reshared versions of public statuses I have received have been limited access.

The more I think about it, the more I think I'm going to stick with my current policy of using it as a public micro-blog and pretty much ignore the fact that the Circles page exists.

Effective communication, brain hacking and diversity

Jesse Noller recently posted an interesting quote on Twitter:
"One who feels hurt while listening to harsh language may lose his mindfulness and not hear what the other person is really saying."
When you think about it, human language is a truly awe inspiring tool. By the simple act of creating certain vibrations in the air, marks on a page or electromagnetic patterns in a storage system, we're able to project our thoughts and feelings across space and time, using them to shape the thoughts and feelings of others.

While this ability to communicate is so thoroughly natural to most of us as humans that we typically take it for granted, it is actually an amazing world shaping capability deserving of our respect and attention.

And once we start giving it the attention it deserves, then we realise that we can judge the effectiveness of our own communication by looking at the communication that is subsequently reflected back at us. How well do those reflections mirror the thoughts and feelings that our own words were intended to create? It's the linguistic equivalent of running our code and seeing if it does what we wanted it to do.

All communication is a form of brain hacking, even when the only target is ourselves. We write lists of goals - formulating for our own benefit concrete plans of action that we can then tackle one step at a time. We write polemics, trying to engender in others some dim sense of the joy or outrage we feel with respect to certain topics. Sometimes we succeed, sometimes we fail. Sometimes we assume certain shared beliefs and understanding, so the point completely fails to come across to those without that common background.

For those closest to us, those with the most shared history, we have a rich tapestry of common knowledge to draw from. Movies we've all seen, books we've all read, events we all attended, discussions we were all part of - outsiders attempting to follow a transcript of our conversations would likely soon be utterly lost due to the shared subtext that isn't explicitly articulated (and the same is true for any group of close friends).

As groups get larger, the amount of truly common knowledge decreases, but there's still plenty of unwritten subtext that backs up whatever is explicitly articulated. In a certain sense, that unwritten subtext can be seen as the very definition of culture - it's the things you don't have to say because they're assumed. My past musings on the culture of python-dev are an example of this.

And that brings us to the point of considering the opening quote, diversity and questions of common courtesy. When speaking to friends, I can share truly awful jokes without offence because of the shared background information as to what is and isn't acceptable (and what things should be taken seriously). As the group being addressed gets larger, then the valid assumptions I can make about shared views of the world become fewer and fewer, so I have to start explicitly articulating things I would otherwise assume, and simply not say some things because I know (or at least strongly suspect) that they won't come across correctly to the audience I'm attempting to reach. Sometimes even addressing similar groups of people in a different context can change the assumptions as to what is a reasonable way to phrase things.

If a member of my target audience gets hung up on my wording or my choice of examples to the point where they miss the underlying message, then to a large degree, the responsibility lies with me as the originator of the communication. Now, I'm not a saint and make no pretence of being one. The rich fields of metaphors in English include many relating to subjects that are truly quite horrific or otherwise offensive to various groups of people. Sometimes I'm going to use that kind of phrasing without thinking about it, especially when talking rather than writing (my own innate tact filter is definitely set up to filter incoming communication, so applying tact in the outwards direction is a conscious process rather than something I do automatically). If such a miscommunication happens and someone points it out, then the onus is on me to admit that yes, my choice of words was poor and obscured my meaning rather than illuminating it. That's life, I make mistakes, and hopefully we can move on.

It's not entirely a one way street, though. Just as we apply contextual analysis to our understanding of historical writings, so it can be useful to apply the same approach to things that are said by current figures. Richard Dawkins recently made some ill-advised comments in relation to Skepchick's advice to men to avoid certain actions that make them look creepy (that's all she said, "Don't do this, it's creepy" and she copped flack for it, as if she'd said people doing it should be sent to prison or castrated or something equally extreme). Does the fact that Dawkins clearly didn't get why he was in the wrong make him a horrible human being or devalue his extensive contributions to our understanding of evolutionary biology*? No, it doesn't, any more than Isaac Newton's obsession with alchemy devalued his contributions to physics and mathematics. It just makes him a product of the culture that raised him. Hopefully he'll eventually realise this and publicly apologise for failing to give the matter due consideration before weighing in.

However, what really surprised me is the number of people that indicate they're shocked by his words, or questioning their support for his other activities just because he so vividly demonstrated his cluelessness on this particular topic. The world is a complicated place and the social dynamics of privilege, cultural blindspots and effectively encouraging diversity aren't one of the easiest pieces to comprehend. Hell, as a middle-class, 30-something, white, English-speaking, straight, cisgendered male living in Australia I'm quite certain that my own grasp of the topic is heavily coloured by the fact that on pretty much any of the typical grounds for discrimination I'm in the favoured majority (being an atheist is arguably the only exception, but that's far less of a problem here in Australia than it is in the US. Our Prime Minister is an acknowledged atheist and even the Murdoch media machine didn't really try to make much of an issue out of that before the last election). I do my best to understand the topic of diversity based on the experiences of those that actually have to deal with it on a daily basis, but it's still a far cry from seeing things first hand.

So, since I don't believe I can speak credibly to the topic of diversity directly, I instead prefer to encourage people to reflect on the value and nature of communication and community in general. Martin Fowler wrote an excellent piece about the challenge of creating communities that are welcoming to a diverse audience without rendering them bland and humourless (as the benign violations of expectations and assumptions that are at the heart of most humour often depend on the shared context that welcoming communities can't necessarily assume). This excellent video highlights the importance of focusing on actions (e.g. "This thing you said was inappropriate and you should consider apologising for saying it") rather than attributes (e.g. "You are a racist/misogynist/whatever"). If the latter is actually true, you're unlikely to change their mind and if it *isn't* true, you're likely to miss an opportunity to educate them as they get defensive and stop listening (refer back to that opening quote!).

I don't pretend to have all (or even any of) the answers, I just believe the entire topic of effective communication and all it entails is one worthy of our collective consideration, since effective communication is almost always a necessary precursor to taking effective action (e.g. on matters such as mitigating and coping with climate change).

In many respects though, the entire topic is really quite simple. To quote Abe Lincoln in one of my all time favourite movies:
Be excellent to each other.

----
* Seriously, read the popular science books on biology that Dawkins has written, especially "The Greatest Show on Earth". They're orders of magnitudes better than "The God Delusion", which is far too laden with angry and aggressive undertones to be an effective tool for communicating with anyone that doesn't already agree with the thesis of the book. In his biology books, his obvious love and passion for the subject matter comes to the fore and they're by far the better for it.

Note for Planet Python: even though this post is about communication rather than code, I have included the python tag since a couple of different diversity related issues have come up recently on python-dev and psf-members. It is no coincidence that "communication" and "community" share a common(!) root in "communis".

Sure it's surprising, but what's the alternative?

Armin Ronacher (aka @mitsuhiko) did a really nice job of explaining some of the behaviours of Python that are often confusing to developers coming from other languages.

However, in that article, he also commented on some of the behaviours of Python that he still considers surprising and questions whether or not he would retain them in the hypothetical event of designing a new Python-inspired language that had the chance to do something different.

My perspective on two of the behaviours he listed is that they're items that are affected by fundamental underlying concepts that we really don't explain well even to current Python users. This reply is intended to be a small step towards correcting that. I mostly agree with him on the third, but I really don't know what could be done as an alternative.

The dual life of "."


Addressing the case where I mostly agree with Armin first, the period has two main uses in Python: as part of floating point and complex number literals and as the identifier separator for attribute access. These two uses collide when it comes to accessing attributes directly on integer literals. Historically that wasn't an issue, since integers didn't really have any interesting attributes anyway.

However, with the addition of the "bit_length" method and the introduction of the standardised numeric tower (and the associated methods and attributes inherited from the Complex and Rational ABCs), integers now have some public attributes in addition to the special method implementations they have always provided. That means we'll sometimes see things like:
1000000 .bit_length()
to avoid this confusing error:
>>> 1000000.bit_length()
File "stdin", line 1
1000000.bit_length()
^
SyntaxError: invalid syntax
This could definitely be avoided by abandoning Guido's rule that parsing Python shall not require anything more sophisticated than an LL(1) parser and require the parser to backtrack when float parsing fails and reinterpret the operation as attribute access instead. (That said, looking at the token stream, I'm now wondering if it may even be possible to fix this within the constraints of LL(1) - the tokenizer emits two tokens for "1.bit_length", but only one for something like "1.e16". I'm not sure the concept can be expressed in the Grammar in a way that the CPython parser generator would understand, though)

Decorators vs decorator factories


This is the simpler case of the two where I think we have a documentation and education problem rather than a fundamental design flaw, and it stems largely from a bit of careless terminology: the word "decorator" is used widely to refer not only to actual decorators but also to decorator factories. Decorator expressions (the bit after a "@" on a line preceding a function header) are required to produce callables that accept a single argument (typically the function being defined) and return a result that will be bound to the name used in the function header line (usually either the original function or else some kind of wrapper around it). These expressions typically take one of two forms: they either reference an actual decorator by name, or else they will call a decorator factory to create an appropriate decorator at function definition time.

And that's where the sloppy terminology catches up with us: because we've loosely used the term "decorator" for both actual decorators and decorator factories since the early days of PEP 318, decorator implementators get surprised at how difficult the transition can be from "simple decorator" to "decorator with arguments". In reality, it is just as hard as any transition from providing a single instance of an object to instead providing a factory function that creates such instances, but the loose terminology obscures that.

I actually find this case to be somewhat analogous to the case of first class functions. Many developers coming to Python from languages with implicit call semantics (i.e. parentheses optional) get frustrated by the fact that Python demands they always supply the (to them) redundant parentheses. Of course, experienced Python programmers know that, due to the first class nature of functions in Python, "f" just refers to the function itself and "f()" is needed to actually call it.

The situation with decorator factories is similar. @classmethod is an actual decorator, so no parentheses are needed and we can just refer to it directly. Something like @functools.wraps on the other hand, is a decorator factory, so we need to call it if we want it to create a real decorator for us to use.

Evaluating default parameters at function definition time


This is another case where I think we have an underlying education and documentation problem and the confusion over mutable default arguments is just a symptom of that. To make this one extra special, it lies at the intersection of two basic points of confusion, only one of which is well publicised.

The immutable vs mutable confusion is well documented (and, indeed, Armin pointed it out in his article in the context of handling of ordinary function arguments) so I'm not going to repeat it here. The second, less well documented point of confusion is the lack of a clear explanation in the official documentation of the differences between compilation time (syntax checks), function definition time (decorators, default argument evaluation) and function execution time (argument binding, execution of function body). (Generators actually split that last part up even further)

However, Armin clearly understands both of those distinctions, so I can't chalk the objection in his particular case up to that explanation. Instead, I'm going to consider the question of "Well, what if Python didn't work that way?".

If default arguments aren't evaluated at definition time in the scope defining the function, what are the alternatives? The only alternative that readily presents itself is to keep the code objects around as distinct closures. As a point of history, Python had default arguments long before it had closures, so that provides a very practical reason why deferred evaluation of default argument expressions really wasn't an option. However, this is a hypothetical discussion, so we'll continue.

Now we get to the first serious objection: the performance hit. Instead of just moving a few object references around, in the absence of some fancy optimisation in the compiler, deferred evaluation of even basic default arguments like "x=1, y=2" is going to require multiple function calls to actually run the code in the closures. That may be feasible if you've got a sophisticated toolchain like PyPy backing you up but is a serious concern for simpler toolchains. Evaluating some expressions and stashing the results on the function object? That's easy. Delaying the evaluation and redoing it every time it's needed? Probably not too hard (as long as closures are already available). Making the latter as fast as the former for the simple, common cases (like immutable constants)? Damn hard.

But, let's further suppose we've done the work to handle the cases that allow constant folding nicely and we still cache those on the function object so we're not getting a big speed hit. What happens to our name lookup semantics from default argument expressions when we have deferred evaluation? Why, we get closure semantics of course, and those are simple and natural and never confused anybody, right? (if you believe I actually mean that, I have an Opera House to sell you...)

While Python's default argument handling is the way it is at least partially due to history (i.e. the lack of closures when it was implemented meant that storing the expressions results on the function object was the only viable way to do it), the additional runtime overhead and complexity in the implementation and semantics involved in delaying the evaluation makes me suspect that going the runtime evaluation path wouldn't necessarily be the clear win that Armin suggests it would be.