Failing In So Many Ways

Icon

Liang Nuren – Failing In So Many Ways

There Is No Excuse For Bad Programming

I’d like to take a moment to discuss a common problem in software development. It comes up everywhere. That problem is the problem of what to do. No really, hear me out on this. Suppose we want to get a value based on the input of a function. We might build a function that looks like this:

def if_block_func(v):
if v == 'some string1': x = 1
elif v == 'some string2': x = 2
elif v == 'some string3': x = 3
elif v == 'some string4': x = 4
elif v == 'some string5': x = 5
elif v == 'some string6': x = 6
elif v == 'some string7': x = 7
else: x = 8

return x

Of course, our piece of code works and it’s relatively fast. However, it’s kinda ugly and painful to maintain. I’m pretty sure we can come up with a more maintainable piece of code, even if it’s slightly slower. Seeing as how we’re using Python for this example, we might come up with something clever like this:


def recreate_dict_func(v):
d = {
'some string1' : 1,
'some string2' : 2,
'some string3' : 3,
'some string4' : 4,
'some string5' : 5,
'some string6' : 6,
'some string7' : 7,
}
return d.get(v, 8)

Of course, this only works if there’s really simple requirements and a single value is a single value. What if we needed something slightly more complicated? Maybe the return value for ‘some string4’ depends on several other variables. Well, that’s certainly a lot stronger case for if/else. However, we might be able to adjust the dictionary based approach by doing something more clever like this:


def recreate_lambda_dict_func(v):
dl = {
'some string1' : lambda: 1,
'some string2' : lambda: 2,
'some string3' : lambda: 3,
'some string4' : lambda: 10 if random.choice(range(5)) % 3 == 0 else 4,
'some string5' : lambda: 5,
'some string6' : lambda: 6,
'some string7' : lambda: 7,
}
return dl.get(v, lambda: 8)()

Of course, the penalty of working in a higher level language is generally a performance penalty, and it’s pretty easy to put together a benchmark that shows that penalty here. However, sometimes programmer time might be worth that penalty. But sometimes it may not be. Maybe there’s a way we could improve these functions to be even faster? Maybe if we instantiated less objects the method would be faster.


external_dict = {
'some string1' : 1,
'some string2' : 2,
'some string3' : 3,
'some string4' : 4,
'some string5' : 5,
'some string6' : 6,
'some string7' : 7,
}
def external_dict_func(v):
global external_dict
return external_dict.get(v, 8)

external_lambda_dict = {
‘some string1’ : lambda: 1,
‘some string2’ : lambda: 2,
‘some string3’ : lambda: 3,
‘some string4’ : lambda: 4,
‘some string5’ : lambda: 5,
‘some string6’ : lambda: 6,
‘some string7’ : lambda: 7,
}

def external_lambda_dict_func(v):
global external_lambda_dict
return external_lambda_dict.get(v, lambda: 8)()

So let’s put together this benchmark that shows how bad the performance penalty is. However to keep things on the even, let’s just have the lambda function always return 4 (just because the rest of the code exhibits that behavior). Let’s throw a @benchmark decorator on those methods and run this. So here’s the results:


~ $ python dict_vs_ifs.py
+---------------------------+--------------+---------+
| Function | Sum Duration | Calls |
+===========================+==============+=========+
| external_dict_func | 0.541 | 1000000 |
+---------------------------+--------------+---------+
| if_block_func | 0.682 | 1000000 |
+---------------------------+--------------+---------+
| external_lambda_dict_func | 0.727 | 1000000 |
+---------------------------+--------------+---------+
| recreate_dict_func | 1.258 | 1000000 |
+---------------------------+--------------+---------+
| recreate_lambda_dict_func | 1.746 | 1000000 |
+---------------------------+--------------+---------+

Ok, that’s pretty interesting. The external dictionary function is actually the fastest (almost 25% faster!) – while being the most maintainable. That’s absolutely wild. I know, I know – you’re going to tell me that this doesn’t represent “the real world”. After all, we’re frequently working with classes and you’ve got to do the instance dereference to get the dictionary and .. ok. Sure. Let’s modify the code such that we have that layer of dereferencing. It’s certain to add some complexity and maybe our naive if/else block will be the most performant option again. While we’re at it, we can run a variant where the variable we’re switching on is an instance variable.

Class attribute lookups

Benchmark Results
+---------------------------+--------------+---------+
| Function | Sum Duration | Calls |
+===========================+==============+=========+
| external_dict_func | 0.609 | 1000000 |
+---------------------------+--------------+---------+
| if_block_func | 0.707 | 1000000 |
+---------------------------+--------------+---------+
| external_lambda_dict_func | 0.807 | 1000000 |
+---------------------------+--------------+---------+
| recreate_dict_func | 1.314 | 1000000 |
+---------------------------+--------------+---------+
| recreate_lambda_dict_func | 1.749 | 1000000 |
+---------------------------+--------------+---------+

Instance Variable

Benchmark Results
+---------------------------+--------------+---------+
| Function | Sum Duration | Calls |
+===========================+==============+=========+
| external_dict_func | 0.645 | 1000000 |
+---------------------------+--------------+---------+
| if_block_func | 0.851 | 1000000 |
+---------------------------+--------------+---------+
| external_lambda_dict_func | 0.854 | 1000000 |
+---------------------------+--------------+---------+
| recreate_dict_func | 1.342 | 1000000 |
+---------------------------+--------------+---------+
| recreate_lambda_dict_func | 1.821 | 1000000 |
+---------------------------+--------------+---------+

As we can see, the dictionary based approach continues to simultaneously excel at both readability and performance. As situations come closer to real life, the lambda approach begins to also look better than if/else blocks. Basically, there is no excuse for writing bad code.

Note: I apologize for the shitty code formatting.  Wordpress has deleted like 12 times and I’m done fighting it.  Maybe tomorrow.

Advertisements

Filed under: Software Development,

4 Responses

  1. Druur Monakh says:

    Now, if Python had a ‘switch’ statement, which would combine the clarity of the recreate_dict source with the speed of the external_dict execution…

  2. omouse says:

    You could try using gist.github.com for posting code snippets. I think that wordpress.com supports as a shortcode

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: