Musings on literate programming

Wed, 21 May 2008 15:35:14 +0000
tech article programming

You know that any blog post with musings in the title is going to be a lot of navel-gazing babble, so I don’t blame you if you skip out now. This is mostly for me to consolidate my thoughts on literate programming.

The idea behind literate programming is that you should write programs with a human reader as the primary audience and a compiler as the secondary audience. This means that you organise your program in a logical order that aids explanation of the program (think chapters and sections), rather than in a way that is oriented towards the compiler (think files and functions).

One outcome of writing your programs in this literate manner is that you think a lot more about how to explain things to another programmer (who is your audience) than you do when writing with the compiler as your audience. I’m quite interested in anything that can improve the quality of my own code, and of my team’s code, so I thought I’d try it out.

I first tried a somewhat traditional tool called noweb, using a fairly complex module of a kernel that I’m writing as the base. The output was some quite nice-looking LaTeX that I think did a good job of explaining the code, as well as some of the design decisions that might otherwise have been difficult to communicate to another programmer. I was able to structure my prose in a way that I thought was quite logical for presenting the ideas, but that ended up being quite different to the actual structure of the original code. It is no surprise that the tool that takes the literate source file and generates the source files used by the compiler is called tangle. Unfortunately I can’t really share the output of this experiment as the code is closed (at the moment).
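To make the tangling step concrete, here is a minimal sketch in Python of what such a tool does. It assumes noweb's basic chunk notation (`<<name>>=` opens a code chunk, a line starting with `@` returns to prose, and `<<name>>` inside a chunk refers to another chunk) and ignores everything else the real tool handles. The function names and the sample document are made up for illustration; this is not noweb's implementation.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def parse_chunks(source):
    """Collect named code chunks from a noweb-style document."""
    chunks = {}
    current = None  # name of the chunk being read, or None while in prose
    for line in source.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line == "@" or line.startswith("@ "):
            current = None  # a documentation chunk starts here
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand chunk `name` recursively, keeping each reference's indentation."""
    lines = []
    for line in chunks[name]:
        reference = CHUNK_REF.match(line)
        if reference:
            nested_indent = indent + reference.group(1)
            lines.extend(tangle(chunks, reference.group(2), nested_indent))
        else:
            lines.append(indent + line)
    return lines


# A made-up literate document: prose order, not compiler order.
DOCUMENT = """\
The loop is the interesting part, so it comes first.
<<main loop>>=
for item in items:
    <<handle one item>>
@
Handling an item is just printing it for now.
<<handle one item>>=
print(item)
@
Finally, the program as the interpreter wants to see it.
<<*>>=
items = [1, 2, 3]
<<main loop>>
@
"""

program = "\n".join(tangle(parse_chunks(DOCUMENT), "*"))
print(program)
```

The document explains the loop before the setup code it depends on, and the tangled output puts things back in the order the interpreter needs. That reordering is why the prose structure can drift so far from the structure of the code.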

While I liked the experience of using noweb, it seemed a lot like a case of writing the code, then writing the documentation, after which going back to modify the code would be a real nightmare. There is a lot of evidence (i.e., working bodies of code) that a body of code can be worked on by multiple people at once reasonably effectively. I’m yet to see a piece of literature that can be effectively modified by multiple parties. (And no, Wikipedia doesn’t count.)

One person who agrees is Zed Shaw. He agreed so much that he made his own tool, Idiopidae, which lets you keep your code as code and then write documentation describing that code, in a mostly literate manner, in a separate file. This seemed like a good alternative, and I tried it out when documenting simplefilemon. Here the documentation is separate, but the code has markers in it so that the documentation can refer to blocks of code, which goes some way to eliminating the problems of traditional literate programming. For a start, syntax highlighting actually worked! (Yes, Emacs has ways of doing dual major modes, but none of them really worked particularly well.)

I have to admit that this approach felt less literate. That is pretty wishy-washy, but I felt more like I was documenting the code than taking a holistic approach to explaining the program. That isn’t exactly a good explanation, but it definitely felt different to the other approach. Maybe I felt dirty because I wasn’t following the Knuth religion to the letter.

I think this approach probably has more legs, but I did end up with a lot of syntactic garbage in the source file, which made it more difficult to read than it should have been. I also couldn’t find a good way of summarising large chunks of code. For example, I couldn’t present the code for a loop with its body replaced by a reference to another block of code, which is one of the nice things I could do in noweb. Of course, that is probably something that can be added to the tool in the future, and it isn’t really the end of the world.
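The marker idea can be sketched in a few lines of Python. Everything here is hypothetical: the `# BEGIN name` / `# END` comments and the `{{name}}` references are invented for this sketch and are not Idiopidae's real syntax, and the function names are mine.

```python
import re

# Hypothetical marker syntax, invented for this sketch only.
BEGIN = re.compile(r"^\s*# BEGIN (\w+)\s*$")
END = re.compile(r"^\s*# END\s*$")


def extract_blocks(source):
    """Map each marked block name to the code lines between its markers."""
    blocks = {}
    current = None  # name of the block being read, or None outside markers
    for line in source.splitlines():
        begin = BEGIN.match(line)
        if begin:
            current = begin.group(1)
            blocks[current] = []
        elif END.match(line):
            current = None
        elif current is not None:
            blocks[current].append(line)
    return blocks


def render(document, blocks):
    """Replace each `{{name}}` line in the documentation with the named block."""
    out = []
    for line in document.splitlines():
        ref = re.fullmatch(r"\{\{(\w+)\}\}", line.strip())
        out.extend(blocks[ref.group(1)] if ref else [line])
    return "\n".join(out)


# The source file stays ordinary code, apart from the marker comments.
SOURCE = """\
import os

# BEGIN scan
def scan(path):
    return sorted(os.listdir(path))
# END

# BEGIN main
print(scan("."))
# END
"""

# The documentation lives in its own file and pulls blocks in by name.
DOCS = """\
Scanning a directory is a single call:
{{scan}}
The entry point just prints the result:
{{main}}
"""

rendered = render(DOCS, extract_blocks(SOURCE))
print(rendered)
```

Even in this toy version both complaints show up: the source picks up marker comments that have nothing to do with the program, and because blocks are flat there is no way to show a block with part of its body replaced by a reference to another block.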

Where to go next? Well, I think I’m going to go back and reproduce my original kernel documentation using Idiopidae, to see what the experience is like when only one variable (the tool) changes. If that produces something reasonably good-looking, I think I might invest some time in extending Idiopidae to get it working exactly how I want it to.
