Mutation testing, reinventing the wheel, and python bytecode

Sun, 18 Jun 2006 19:18:54 +0000
tech article python

So I recently stumbled across Jester, which is a mutation testing tool.Mutation testing is really a way of testing your tests. The basic idea is:

  1. Ensure your tests work on the code under test.
  2. Generate a set of mutant pieces of code, which are copies of the original code with some small changes, or mutations.
  3. For each mutant, run you test cases. If one of these tests doesn't fail then you are missing test case.

So in testing your code you should first aim for complete coverage, that is ensure that your test cases exercise each line of code. Once you get here, then mutation testing can help you find other holes in your test cases.

So the interesting thing here is what mutation operations (or mutagens), you should do on the original code. Jester provides only some very simple mutations, (at least according to this paper):

Unfortunately it seems that such an approach is, well, way too simplistic, at least according Jeff Offutt, who has published papers in the area. Basically, any rudimentary testing finds these changes, and doesn't really find the holes in your test suite.

A paper on Mothra, a mutation system for FORTRAN, describes a large number of different mutation operations. (As a coincident, my friend has a bug tracking system called Mothra.)

Anyway, so while Jester has a python port, I didn't quite like its approach (especially since I needed to install Java, which I didn't really feel like doing). So I decided to explore how this could be done in Python.

So the first set of mutation I wanted to play with is changing some constants to different values. E.g: change 3 to (1 or 2). So this turns out to be reasonably easy to do in python. I took functions as the unit of test I wanted to play with. So, Python makes it easy to introspect function. You can get a list of conants on a function like so: function.func_code.co_consts. Now, you can't actually modify this, but what you can do is make a new copy of the method with a different set of constants. This is conveniant because in mutation testing we want to create mutants. So:

def f_new_consts(f, newconsts):
    """Return a copy of function f, but change its constants to newconsts."""
    co = f.func_code
    codeobj = type(co)(co.co_argcount, co.co_nlocals, co.co_stacksize,
                       co.co_flags, co.co_code, tuple(newconsts), co.co_names,
                       co.co_varnames, co.co_filename, co.co_name,
                       co.co_firstlineno, co.co_lnotab, co.co_freevars,
                       co.co_cellvars)
    new_function =  type(f)(codeobj, f.func_globals, f.func_name, f.func_defaults,
                           f.func_closure)
    return new_function

So that is the basic mechanism I'm using for creating mutant function with different constants. The other mutants I wanted to make where those where I change comparison operators. E.g: changing < to <= or >. This is a bit trickier, it means getting down and dirty with the python byte code, or at least this is my preferred approach. I guess you could also do syntactic changes and recompile. Anyway, you can get a string representation of a function's bytecode like so: function.func_code.co_code. To do something useful with this you really want convert it to a list of integers, which is easy as so: [ord(x) for x in function.func_code.co_code].

So far, so good, what you need to be able to do next is iterate through the different bytecode, this is a little tricky as they can be either 1 or 3 bytes long. A loop like so:

    from opcode import opname, HAVE_ARGUMENT
    i = 0
    opcodes = [ord(x) for x in co.co_code]
    while i < len(opcodes):
	print opname[opcode]

        i += 1
        if opcode >= HAVE_ARGUMENT:
           i+= 2

will iterate through the opcodes and printing each one. Basically all bytecodes larger than the constant HAVE_ARGUMENT, are three bytes long.

From here it is possible to find all the COMPARE_OP byte codes. The first argument byte specifies the type of compare operations, these can be decoded using the opcode.cmp_op tuple.

So, at this point I have some basic proof-of-concept code for creating some very simple mutants. The next step is to try and integrate this with my unit testing and coverage testing framework, and then my code will never have bugs in it again!

blog comments powered by Disqus