Failing In So Many Ways


Liang Nuren – Failing In So Many Ways

Persistent Configurations

Persistent configuration is the idea of keeping track of configuration changes between restarts of an application: things like window locations and which theme you’re using.  The obvious way to store this information is in a flat file, such as an ini or pickle file.

Here’s what you might expect to see:

import pickle, time

def log_call(fn):
    def _decorator(*args, **kw):
        print "%.02f: %s Called" % (time.time(), fn.__name__)
        val = fn(*args, **kw)
        print "%.02f: %s Finished" % (time.time(), fn.__name__)
        return val
    return _decorator

class App:
    def __init__(self):
        self.option_filename = "filename.opt"

    @log_call
    def initialize_options(self):
        self.options = {}

        for x in range(100000):
            self.options["Option %d" % x] = "Value %d" % x

    @log_call
    def load_options(self):
        try:
            with open(self.option_filename, 'r') as fp:
                self.options = pickle.load(fp)
        except Exception, e:
            # Missing or unreadable options file: rebuild the defaults
            self.initialize_options()

    @log_call
    def save_options(self):
        with open(self.option_filename, 'w') as fp:
            pickle.dump(self.options, fp)

app = App()
app.load_options()
app.save_options()

It even works as expected:

$ python
1323829572.75: load_options Called
1323829572.75: initialize_options Called
1323829572.89: initialize_options Finished
1323829572.89: load_options Finished
1323829572.89: save_options Called
1323829573.60: save_options Finished
$ wc -l filename.opt
400001 filename.opt

Now, that looks like a totally reasonable way to do things, and it’s just a few lines of code to save and load your configuration.  But what happens if someone does this to you:

$ python
1323829586.04: load_options Called
1323829586.81: load_options Finished
1323829586.81: save_options Called
^CTraceback (most recent call last):
File “”, line 38, in <module>

$ wc -l filename.opt
186599 filename.opt

$ python
1323831195.15: load_options Called
1323831195.50: initialize_options Called
1323831195.64: initialize_options Finished
1323831195.64: load_options Finished
1323831195.64: save_options Called
1323831196.35: save_options Finished
$ wc -l filename.opt
400001 filename.opt

Well, it looks like we lost our configuration file and had to rebuild it from scratch.  While that is merely inconvenient when dealing with user preferences and application options, it can be devastating if you’re keeping track of something that’s really important.  From here we should probably improve our infrastructure by writing first to a tmp file, calling fsync(), making sure we can read the new version, and finally replacing the old version.  And then we need to remember proper exception handling!  Oh my, what a headache!
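
For the curious, the tmp-file dance described above might look roughly like this.  It’s a minimal sketch rather than a hardened implementation: atomic_save is a made-up helper name, it’s written for Python 3 (unlike the Python 2 listings in this post), and it leans on os.replace() swapping the file atomically, which is a POSIX guarantee.

```python
import os
import pickle
import tempfile


def atomic_save(options, filename):
    """Hypothetical helper: replace filename with a pickle of options, all or nothing."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temp file next to the target so the final rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(options, fp)
            fp.flush()
            os.fsync(fp.fileno())  # push the bytes to disk before renaming
        # Make sure the new version can be read back before trusting it.
        with open(tmp_path, "rb") as fp:
            if pickle.load(fp) != options:
                raise IOError("temp file did not round-trip")
        os.replace(tmp_path, filename)  # swap the old version for the new one
    except BaseException:
        # BaseException also covers KeyboardInterrupt (^C): drop the partial
        # temp file and leave the previous version untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling atomic_save(app.options, app.option_filename) in place of the plain pickle.dump() would mean an interrupted save leaves the old file as it was.  It’s also already longer than the original save_options, which is rather the point.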

On the flip side, we could just use something that’s designed to do exactly what we want.  What we’re really after here is the ability to Atomically change an attribute or option, maintain file Consistency and integrity no matter what, and have Durability such that once we’ve updated the option we know it isn’t going to switch back because we forgot to fsync()!  If we stop and think about it, the design requirements are almost exactly ACID [Wikipedia].  Fortunately, not all databases are heavyweight, and we can use something like SQLite.  Python (like many other popular platforms, HTML5 included) has it built right in.

import sqlite3

class App:
    def __init__(self):
        self.option_filename = "filename.db"
        try:
            self.conn = sqlite3.connect(self.option_filename)
        except Exception, e:
            print "Unable to open options db ", e

    def initialize_options(self):
        with self.conn:
            self.conn.execute("create table options (key text, value text, primary key(key))")
            for x in range(100000):
                self.set_option("Option %d" % x, "Value %d" % x) # executemany would be better here

    def get_option(self, option_name):
        with self.conn:
            value = self.conn.execute("select value from options where key = ?", (option_name,)).fetchone()[0]
        return value

    def set_option(self, option_name, value):
        with self.conn:
            # insert or replace, so the option is stored even if it has no row yet
            self.conn.execute("insert or replace into options (key, value) values (?, ?)", (option_name, value))

app = App()
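
As the comment in initialize_options admits, executemany would be better for the bulk load.  Here is a rough sketch of what that could look like, written for Python 3 and as a standalone function rather than a method.  The count parameter and the in-memory database are just for illustration and aren’t part of the App class above.

```python
import sqlite3


def initialize_options(conn, count=100000):
    """Create the options table, then insert all rows in one transaction."""
    rows = (("Option %d" % x, "Value %d" % x) for x in range(count))
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "create table if not exists options "
            "(key text, value text, primary key(key))"
        )
        # One statement, many parameter tuples, one commit at the end.
        conn.executemany(
            "insert or replace into options (key, value) values (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
initialize_options(conn, count=1000)
row_count = conn.execute("select count(*) from options").fetchone()[0]
print(row_count)
```

The main difference from calling set_option in a loop is that the inserts are committed once at the end instead of once per option.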

Now what happens if someone ^Cs your application while it’s writing its options?  Well, they lose the updates to the options that haven’t been written yet, and that’s a totally acceptable state of affairs.
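
To see the rollback in action, here is a small Python 3 sketch.  It fakes the ^C by raising KeyboardInterrupt inside the transaction, and it uses an in-memory database and a made-up "theme" option, so it only demonstrates the rollback on an exception.  Surviving a hard kill in the middle of a write is the job of SQLite’s on-disk journal, which this doesn’t exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("create table options (key text, value text, primary key(key))")

# A committed option: this is the state we expect to survive.
with conn:
    conn.execute("insert into options values (?, ?)", ("theme", "dark"))

try:
    with conn:
        conn.execute(
            "update options set value = ? where key = ?", ("light", "theme")
        )
        raise KeyboardInterrupt  # stand-in for a real ^C arriving mid-update
except KeyboardInterrupt:
    pass  # the with block has already rolled the update back

value = conn.execute(
    "select value from options where key = ?", ("theme",)
).fetchone()[0]
print(value)  # prints "dark": the interrupted update never took effect
```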


Filed under: Software Development

3 Responses

  1. Mara Rinn says:

    +1 was talking to the guys at the office about this today. Customer-facing Asterisk server had a kernel oops half way through a rewrite of the config file, 2000 customers lost their phone numbers for three hours.

    We don’t do excellence, word from management is that mediocrity is more profitable.

    I wish I was making this up.

    • Liang Nuren says:

      Oh man, that sounds really brutal. The sad thing about it is that there’s almost no excuse for it too – the code and effort required to do it right is arguably less than trying to do it wrong!

      • Mara Rinn says:

        Then there’s the pain of watching my colleagues manually testing web applications because Selenium “is just another thing to go wrong”


        In the meantime I am messing about with unit testing in my Perl and Python apps, handing the package over and saying, “here, it works” but they won’t use my code because they weren’t involved in writing it, and they spend the next six months reinventing the stuff I did, getting it wrong, and using their failure as proof that I don’t know what I am doing and that my way wouldn’t have worked anyway.

        I am not long for this world :/
