It’s a trap! On Python reference semantics

I guess this is a common trap. I have been coding in Python for years now, and I still run into it from time to time. Because the syntax is so concise, you might sometimes be tempted to give default values for members directly in a class definition. Code says more than a thousand words. This works fine:

class Foo(object):
  x = 7

f1 = Foo()
f2 = Foo()

f2.x = 42

print(f1.x)  # prints 7
print(f2.x)  # prints 42

Now say x holds something else: most of the time it will contain a list, so by default it should be the empty list:

class Foo(object):
  x = []

  def append(self, y): self.x.append(y)

f1 = Foo()
f2 = Foo()

f1.append("Take evasive action!")

print(f1.x)  # prints ['Take evasive action!']
print(f2.x)  # prints ['Take evasive action!'] <- WTF?

The somewhat experienced programmer immediately sees what happens. Let’s look at this step by step.

What does python do when you access f1.x?

It first checks whether the object itself has a member x. If it cannot find one, it falls back to the class member x.
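
The lookup order can be observed directly by comparing what the instance stores itself with what the class stores. This is a minimal sketch using the built-in vars() to inspect each namespace; the full lookup rules also involve base classes and descriptors, which are ignored here.

```python
class Foo(object):
    x = 7

f1 = Foo()

# The instance has no attributes of its own yet ...
instance_has_x = "x" in vars(f1)
# ... but the class does, so f1.x falls back to Foo.x.
class_has_x = "x" in vars(Foo)

print(instance_has_x)  # False
print(class_has_x)     # True
print(f1.x)            # 7
```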

So what exactly happens in the first example?

Directly after construction, neither f1 nor f2 has a member x of its own, so f1.x and f2.x both resolve to Foo.x, that is, to the same x. Then, with f2.x = 42, we add a member x to the object f2, which leaves us with the following situation:

Foo.x points to the value 7
f1 does not have a member x, so access to f1.x will be redirected to Foo.x
f2.x points to the value 42

Note that you never “change” an integer in Python; you can only make a name reference a different integer, i.e. integers are immutable.
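
The situation described above can be checked in the interpreter. The following sketch again uses vars() to show which object actually owns an x; the del at the end is only there to illustrate that the class member was hidden, not replaced.

```python
class Foo(object):
    x = 7

f1 = Foo()
f2 = Foo()
f2.x = 42  # creates a new member on f2; Foo.x is untouched

print(vars(f1))  # {}
print(vars(f2))  # {'x': 42}
print(Foo.x)     # 7

# Deleting the instance member makes the class member visible again.
del f2.x
print(f2.x)      # 7
```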

What exactly happens in the second example?

As in the first example, after construction neither f1 nor f2 has a member x; they just redirect to Foo.x. What’s different here is that we never create an x member in either of the objects!

In contrast to integers, lists are mutable. In our append call, we actually change the list that Foo.x refers to instead of creating a new member. As both f1 and f2 still pass through to Foo.x, we see the change on both objects.
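
A quick way to confirm this is to compare object identities with the is operator. The sketch below also contrasts mutation with assignment; the second message string is made up for illustration.

```python
class Foo(object):
    x = []

    def append(self, y):
        self.x.append(y)  # mutates the list that Foo.x refers to

f1 = Foo()
f2 = Foo()
f1.append("Take evasive action!")

print(vars(f1))       # {} (still no instance member)
print(f1.x is Foo.x)  # True
print(f2.x is Foo.x)  # True

# Assignment, by contrast, creates an instance member on f1 only.
f1.x = f1.x + ["Shields up!"]
print(vars(f1))  # {'x': ['Take evasive action!', 'Shields up!']}
print(f2.x)      # ['Take evasive action!']
```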

Yet another trap!

Now you might think “Okay, then I’ll do it like this:”

class Foo(object):
  def __init__(self, x=[]):
    self.x = x

f1 = Foo()
f2 = Foo()

But this is basically the same problem, with a slightly different background. The default value for x is created only once, when the def statement is executed, not on every call. So this does create separate members f1.x and f2.x, but they still point to the same list object, which leads to the same unwanted result.
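
The shared default can be made visible through the function's __defaults__ attribute, which holds the default values that were evaluated at definition time. A small sketch, written for Python 3:

```python
class Foo(object):
    def __init__(self, x=[]):  # the [] is evaluated once, at definition time
        self.x = x

f1 = Foo()
f2 = Foo()
f1.x.append("Take evasive action!")

print(f1.x is f2.x)  # True
print(f2.x)          # ['Take evasive action!']

# The shared default lives on the function object itself.
print(Foo.__init__.__defaults__)  # (['Take evasive action!'],)
```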

So what should I do?

Don’t use either of these as a way to give default values to object members. Instead, do something like:

class Foo(object):
  def __init__(self, **kwargs):
    self.x = kwargs.get('x', [])

This way the default value is created inside __init__ every time it runs, which guarantees that each object initialized with the default gets its own “[]”.
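
Here is a short check that the objects really are independent, plus one drawback of this approach that seems worth knowing about: because **kwargs accepts anything, a misspelled keyword is silently ignored. The keyword y below is a deliberate typo for illustration.

```python
class Foo(object):
    def __init__(self, **kwargs):
        # The [] literal is evaluated on every call, so each object gets its own list.
        self.x = kwargs.get('x', [])

f1 = Foo()
f2 = Foo()
f1.x.append("Take evasive action!")

print(f1.x)  # ['Take evasive action!']
print(f2.x)  # []

# Drawback: a misspelled keyword raises no error and the default is used.
f3 = Foo(y=[1, 2, 3])
print(f3.x)  # []
```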

Depending on your actual needs you might also do something like:

class Foo(object):
  def __init__(self, x=None):
    self.x = x
    if self.x is None: self.x = []
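
The following sketch shows what this variant does and does not protect against. Objects that use the default get independent lists, but a list passed in explicitly is stored as-is and therefore stays shared with the caller. The name orders and the string "Red alert!" are made up for illustration.

```python
class Foo(object):
    def __init__(self, x=None):
        self.x = x
        if self.x is None:
            self.x = []  # fresh list for every object that used the default

f1 = Foo()
f2 = Foo()
f1.x.append("Take evasive action!")
print(f2.x)  # []

# An explicitly passed list is not copied, so the caller sees later changes.
orders = ["Red alert!"]
f3 = Foo(orders)
f3.x.append("Take evasive action!")
print(orders)  # ['Red alert!', 'Take evasive action!']
```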

Or you could copy every received value:

class Foo(object):
  def __init__(self, x=[]):
    self.x = x[:] # assumes the passed x is always a list!
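
With the copying variant, the mutable default is harmless because it is only ever copied, never modified. The sketch below shows that the caller's list is left alone, and also that x[:] is only a shallow copy, so nested lists are still shared. The names orders and nested are hypothetical.

```python
class Foo(object):
    def __init__(self, x=[]):
        self.x = x[:]  # shallow copy; the shared default is never mutated

orders = ["Red alert!"]
f1 = Foo(orders)
f1.x.append("Take evasive action!")
print(orders)  # ['Red alert!']
print(f1.x)    # ['Red alert!', 'Take evasive action!']

# The copy is shallow: nested mutable objects are still shared.
nested = [["Red alert!"]]
f2 = Foo(nested)
f2.x[0].append("Take evasive action!")
print(nested)  # [['Red alert!', 'Take evasive action!']]
```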

For those of you who are still very new to Python: try to always keep one eye on what references what. Be aware that everything in Python has reference semantics, although for immutable values this can’t lead to the problem above (and in lots of situations it “feels” like value semantics). Also note that my approaches are far from perfect, too. And: don’t use any of that code if you do not fully understand it. Play around with everything in the Python interpreter; you can learn a lot of things there.
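
As a starting point for such experiments, here is a tiny sketch of the difference between mutating a shared object and rebinding a name. The variable names are arbitrary.

```python
a = [1, 2, 3]
b = a          # b refers to the same list object as a
b.append(4)    # mutation is visible through both names
print(a)       # [1, 2, 3, 4]
print(a is b)  # True

n = 7
m = n          # m refers to the same int object as n
m = m + 1      # rebinds m to a different int; n is untouched
print(n)       # 7
print(m)       # 8
```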
