
It's a trap - Multilingual web applications

Posted: Saturday, 2009-11-21 10:26 | Tags: ItsATrap, Programming, Web

Say you're designing a multilingual web application. Where would you put the language information?

  • Carry it over with each request (generate hidden inputs)
  • Into the user's session
  • Into the URL

The first approach is obviously quite tedious and is more or less included just for completeness. You probably don't want to do this except for a really small app or under some rather unusual constraints (no cookies, or a framework that puts hard constraints on the URL scheme).

The second one looks like a really good idea:

  • You don't need to work out the user's language on each request with URL parsing or similar things.
  • You can freely choose your URLs according to whatever scheme you like; the language doesn't have to be part of it.
  • Your URLs can be exchanged between users of different languages, and each user will see the page in their own favourite language.

So where's the TRAP? Here's a hint: Think of search engines.

Still don't see it?

Well, it took me some time, too. The problem is: if a search engine from some country indexes your page, its bot will most probably not emulate/have cookie support. So all search engines in all countries will see, and thus index, the page in its default language (the one that appears when a user has no cookies enabled). This way the search engines won't find your site when users enter keywords in their own language. What crap.

So the answer is obvious: put that information into your URL. Or, if you want, detect it from the domain name (if you have one domain per country or language; distinguish carefully between the two, by the way!).
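A minimal sketch of the URL variant, assuming the language code is the first path segment (as in /de/articles/42). The names language_from_path, SUPPORTED and DEFAULT are made up for this illustration and don't belong to any framework:

```python
# Hypothetical helper: read the language from the first URL path segment.
SUPPORTED = {"en", "de", "fr"}  # assumed set of available languages
DEFAULT = "en"                  # assumed fallback language


def language_from_path(path):
    """Return (language, remaining_path) for a path like '/de/articles/42'."""
    segments = path.lstrip("/").split("/", 1)
    first = segments[0].lower()
    if first in SUPPORTED:
        rest = "/" + (segments[1] if len(segments) > 1 else "")
        return first, rest
    # No recognised language prefix: use the default, leave the path alone.
    return DEFAULT, path


print(language_from_path("/de/articles/42"))  # ('de', '/articles/42')
print(language_from_path("/articles/42"))     # ('en', '/articles/42')
```

Since the language is part of the URL, a bot without cookies that follows a /de/... link gets the German page, which is the whole point.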

I guess this is a common trap. I have personally been a Python coder for years now and still, from time to time, I run into it. Because the syntax is so concise, you might sometimes be tempted to give default values for members in the class definition. Code says more than a thousand words. This works fine:

class Foo(object):
  x = 7

f1 = Foo()
f2 = Foo()

f2.x = 42

print(f1.x)  # prints 7
print(f2.x)  # prints 42

Now say we have something else for x, say it will contain a list most of the time and thus by default it should be the empty list:

class Foo(object):
  x = []

  def append(self, y): self.x.append(y)

f1 = Foo()
f2 = Foo()

f1.append("Take evasive action!")

print(f1.x)  # prints ['Take evasive action!']
print(f2.x)  # prints ['Take evasive action!'] <- WTF?

The somewhat experienced programmer immediately sees what happens. Let's look at this step by step.

What does Python do when you access f1.x?

It first checks whether the object itself has a member x; if it cannot find one, it falls back to the class member x.

So what exactly happens in the first example?

Directly after construction, neither f1 nor f2 has a member x of its own, so f1.x and f2.x both resolve to Foo.x, that is, to the same x. Then, with f2.x = 42, we add a member x to the object f2, so that we have a situation like the following:

  • Foo.x points to the value 7
  • f1 does not have a member x, so access to f1.x will be redirected to Foo.x
  • f2.x points to the value 42

Note that you never "change" an integer in Python; you can only reference a different integer, i.e. integers are immutable.
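If you want to see this for yourself, every object keeps its own members in __dict__. A small sketch replaying the first example (written with print() so it also runs on Python 3):

```python
class Foo(object):
    x = 7


f1 = Foo()
f2 = Foo()
f2.x = 42  # creates a member x on f2 only

print(f1.__dict__)  # {}          f1 has no x of its own
print(f2.__dict__)  # {'x': 42}   f2 now shadows Foo.x
print(Foo.x)        # 7           the class member is untouched
print(f1.x)         # 7           found via the class
```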

What exactly happens in the second example?

As in the first example, after construction neither f1 nor f2 has a member x; they just redirect to Foo.x. What's different here is that in this example we never create an x member in either of the objects!

Unlike integers, lists are mutable. In our append call, we actually change Foo.x instead of creating a new member. As both f1 and f2 still pass through to Foo.x, we see the change on both objects.
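The same can be checked with "is", which tells you whether two names point to the very same object. Again just a sketch replaying the second example:

```python
class Foo(object):
    x = []

    def append(self, y):
        self.x.append(y)  # mutates the list found via Foo.x


f1 = Foo()
f2 = Foo()
f1.append("Take evasive action!")

print(f1.x is f2.x is Foo.x)  # True: one single list object
print(f1.__dict__)            # {}: append never created a member on f1
```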

Yet another trap!

Now you might think "Okay, then I do it like this:"

class Foo(object):
  def __init__(self, x=[]):
    self.x = x

f1 = Foo()
f2 = Foo()

But this is basically the same problem, with a slightly different background. The default value for x is created only once, when the def statement is executed, not on every call. So this does create members f1.x and f2.x, but both still point to the same list object, which leads to the same unwanted result.
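A short sketch that makes the shared default visible. It assumes Python 3, where the function stores its default values in __defaults__:

```python
class Foo(object):
    def __init__(self, x=[]):
        self.x = x


f1 = Foo()
f2 = Foo()
f1.x.append("Take evasive action!")

print(f2.x)          # ['Take evasive action!']
print(f1.x is f2.x)  # True: both members point to the one default list
print(Foo.__init__.__defaults__)  # (['Take evasive action!'],)
```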

So what should I do?

Don't use this as a way of giving default values to object members. Instead do something like:

class Foo(object):
  def __init__(self, **kwargs):
    self.x = kwargs.get('x', [])

This way the default value is created inside __init__ each time it runs, which guarantees that each default-initialized object gets a different "[]".

Depending on your actual needs you might also do something like:

class Foo(object):
  def __init__(self, x=None):
    self.x = x
    if self.x is None: self.x = []

Or you could copy every received value:

class Foo(object):
  def __init__(self, x=[]):
    self.x = x[:] # assumes the passed x is always a list!
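The last two variants differ in what happens to a list the caller passes in. A small sketch of the difference; the class names FooNone and FooCopy and the list shared are made up for this comparison:

```python
class FooNone(object):
    def __init__(self, x=None):
        self.x = x
        if self.x is None:
            self.x = []  # fresh list per object, but a passed list is kept as-is


class FooCopy(object):
    def __init__(self, x=[]):
        self.x = x[:]  # shallow copy, so the shared default is never handed out


shared = [1, 2]
a = FooNone(shared)
b = FooCopy(shared)
shared.append(3)

print(a.x)  # [1, 2, 3]  a still references the caller's list
print(b.x)  # [1, 2]     b has its own copy

n1, n2 = FooNone(), FooNone()
c1, c2 = FooCopy(), FooCopy()
print(n1.x is n2.x)  # False
print(c1.x is c2.x)  # False
```

Note that x[:] is only a shallow copy: mutable objects inside the list are still shared.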

For those of you who are still very new to Python: try to always keep one eye on what references what. Be aware that everything in Python has reference semantics, although with immutable values this can't lead to the problem above (and in lots of situations it "feels" like value semantics). Also note that my approaches are far from perfect, too. And: don't use any of this code if you do not fully understand it. Play around with everything in the Python interpreter; you can learn a lot of things there.

It's a trap!

Posted: Saturday, 2009-11-21 10:16 | Tags: ItsATrap, Site

I'm hereby creating a new kind of posting on this site: I want to post about common programming traps. You know, the kind of stuff where at first you think "Obviously it works this way" and later on you feel... trapped!

For general reference to this topic see:

Admiral Ackbar, Sheldon and/or Captain Picard

Traps ahead, be prepared!