Failing In So Many Ways


Liang Nuren – Failing In So Many Ways

Finding Unused Functions in a Python Source Tree

For the last few months I have been crunching up a storm to get the analytics out the door for our latest game.  I just finished some cool new features and had to do some major refactors (one step at a time, of course).  I began to suspect that I’d left some stray code hanging around that wasn’t being used anymore.  I figured a good way to find it was to take each function name and grep the source tree to see whether it appeared anywhere besides its own definition.  First, let me introduce you to ack-grep (aliased as ‘ack’ below).  The best thing about ack-grep is that it lets you filter certain file types out of your results, and despite being written in Perl it’s frequently faster in practice than a plain recursive grep, since it skips version-control directories and other non-source files by default.

Now I’ll go over the evolution of a shell command.  Some of the intermediate steps are filled in after the fact, because I already knew where I was going, but I’ll include them so you can see how things evolved.  The code base in question is about 10k lines over 140 files.

$ ack 'def'

… bunch of functions and a bunch of other stuff …

$ ack 'def '

… function definitions …

Now let’s get the function name itself.  It should look like ‘def “function name”(arguments):’ so the first order of business is to cook up a regular expression to filter the line.  I’m better with Perl than with Sed, so I did it like this:

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/'
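If Perl isn’t your thing, here is a rough Python sketch of the same extraction step.  It assumes the input lines look like ack output, and the sample lines are invented for illustration.  One difference from the Perl version: perl -pe prints non-matching lines unchanged, while this sketch simply skips them.

```python
import re

# Same idea as the Perl substitution: capture the identifier after "def ".
DEF_PATTERN = re.compile(r"def (\w+)\(")


def extract_function_names(lines):
    """Return the function name from each line that contains a definition."""
    names = []
    for line in lines:
        match = DEF_PATTERN.search(line)
        if match:
            names.append(match.group(1))
    return names


# Hypothetical ack output, made up for this example.
sample = [
    "stats/report.py:12:def build_report(rows):",
    "stats/report.py:40:    def __init__(self, name):",
    "stats/util.py:3:# default settings",
]

names = extract_function_names(sample)
print(names)
```

The third sample line contains the letters “def” but no definition, so it is dropped.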

I scrolled through the output looking for any obvious errors but didn’t find any.  Now comes the magic sauce.  xargs -I{} runs a new command for every input line, substituting that line wherever {} appears.  Basically, this creates a for-each loop in the shell.

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | xargs -I{} bash -c 'ack "{}"'

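One thing I saw was that a function was being credited with every line that merely *contained* its name, including longer names that have it as a substring.  Wrapping the pattern in \b (word boundary) anchors fixes that: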

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | xargs -I{} bash -c 'ack "\b{}\b"'
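To see what the word boundaries buy you, here is a small Python sketch comparing a plain substring search with a \b-anchored one.  The function name and the source lines are hypothetical, and Python’s re module stands in for ack’s Perl regexes; \b behaves the same way in both for plain identifiers like these.

```python
import re


def count_substring_lines(name, lines):
    """Count lines that contain `name` anywhere, even inside a longer word."""
    return sum(1 for line in lines if name in line)


def count_whole_word_lines(name, lines):
    """Count lines where `name` appears as a whole word."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    return sum(1 for line in lines if pattern.search(line))


# Hypothetical source lines, made up for this example.
lines = [
    "def load(path):",
    "def load_all(paths):",
    "    return [load(p) for p in paths]",
    "reload(module)",
]

loose = count_substring_lines("load", lines)
strict = count_whole_word_lines("load", lines)
print(loose, strict)
```

The substring search counts all four lines, while the anchored search only counts the definition and the real call, because “_” and “e” are word characters and so there is no boundary inside load_all or reload.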

That’s better, but functions that are defined multiple times (like __init__) are coming up a lot…

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'ack "\b{}\b"'

Ok, but now the text is flying by too fast to read, and some functions are used hundreds of times while others are used very few…

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'ack "\b{}\b" | wc -l'

Ok, now I know how many times something appears but not what it is…

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'echo {} && ack "\b{}\b" | wc -l'

Ok, that’s great.  Now I have the name of the function and how many lines that name appears on in the source tree.  It’s kinda unwieldy, though, because the name is on a different line than the count, so I can’t grep the results.

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'echo -n "{} " && ack "\b{}\b" | wc -l'

Fantastic.  Now I can look at just the functions that rarely get used!  Oh look, there are so many tests coming up… :-/

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'echo -n "{} " && ack "\b{}\b" | wc -l' | egrep -v '^test_'

Ok, now let’s limit it to things appearing once…

$ ack 'def ' | perl -pe 's/.*def (\w+)\(.*/\1/' | sort | uniq | xargs -I{} bash -c 'echo -n "{} " && ack "\b{}\b" | wc -l' | egrep -v '^test_' | egrep ' 1$'

Fantastic.  That’s exactly what I was looking for.
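For anyone who would rather keep this around as a script, here is a rough Python sketch of the whole pipeline.  It is not what I ran, just the same idea under a few assumptions: it only looks at .py files (ack searches more than that), it counts matching lines the way ack piped into wc -l does, and the names find_single_mention_functions and read_python_lines are made up for this example.  The demo builds a tiny throwaway tree so you can see it work.

```python
import os
import re
import tempfile

DEF_PATTERN = re.compile(r"def (\w+)\(")


def read_python_lines(root):
    """Collect every line from every .py file under root."""
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".py"):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    lines.extend(handle.read().splitlines())
    return lines


def find_single_mention_functions(root):
    """Return non-test function names that appear on exactly one line."""
    lines = read_python_lines(root)
    # Equivalent of: ack 'def ' | perl ... | sort | uniq
    names = sorted({m.group(1) for line in lines for m in DEF_PATTERN.finditer(line)})
    candidates = []
    for name in names:
        if name.startswith("test_"):  # egrep -v '^test_'
            continue
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        count = sum(1 for line in lines if pattern.search(line))
        if count == 1:  # egrep ' 1$': only the definition itself matched
            candidates.append(name)
    return candidates


# Demo on a throwaway source tree, invented for illustration.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "app.py"), "w", encoding="utf-8") as f:
        f.write("def used():\n    return 1\n\ndef orphan():\n    return 2\n\nprint(used())\n")
    with open(os.path.join(root, "test_app.py"), "w", encoding="utf-8") as f:
        f.write("from app import used\n\ndef test_used():\n    assert used() == 1\n")
    result = find_single_mention_functions(root)

print(result)
```

Like the shell version, this is a text search, so treat the output as a list of candidates to inspect rather than proof: a “def” in a comment or string gets picked up too, and a function that is only called dynamically will look unused.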


Filed under: Software Development