A tale of autotools pedantry

Wed, 09 Jan 2013 16:48:18 +0000

The backstory to this one is too tedious to describe, and besides, whenever I do that I get irrelevant comments about the backstory, also I'm going to try and make my posts more concise this year both for my sake and yours. See, it is working already!

Anyway, I'm using autotools to compile binutils, and for some reason when I run make install, the permissions of the installed directories differ between machines. It wouldn't really matter, except that I'm tar-ing up the result and checking that the tarfiles are identical (for various regression tests, and to validate that builds are reproducible). I could probably just fix this after make install has run by changing the file permissions as I add each file to the archive (I already squelch the username and timestamp in a similar manner), but doing that wouldn't let me have fun appreciating how awesome the autotools are.

At first I assumed this was just a umask problem, and that is certainly part of it (why the Fedora default umask is 002 remains a mystery to me). But after changing my other machine to have a matching 002 umask, the same problem was still happening. (Fun note: if I'd experimented the other way, and changed my umask on Fedora to something sane, I would have managed to avoid this fun.) So at this point, on Fedora make install creates group-writable directories (as expected with a 002 umask), whereas on OS X make install continues to create non-group-writable directories, even with the modified umask.
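For reference, here is a minimal sketch of the umask half of the story, using plain mkdir -p and nothing from autotools. The scratch directory, the a/b path and the dir_mode helper are all made up for the demo, and it uses ls rather than stat because the stat flags differ between GNU and BSD.

```shell
#!/bin/sh
# Show how the umask changes the mode of directories created by mkdir -p.
# The scratch directory and the a/b path are invented for this demo.
scratch=$(mktemp -d) || exit 1

# Print the permission string (e.g. drwxr-xr-x) of a directory.
# Only the first 10 characters are kept, to drop any ACL/SELinux marker.
dir_mode() {
    ls -ld "$1" | cut -c1-10
}

# Each command substitution runs in a subshell, so the caller's umask is untouched.
mode_002=$( umask 002 && mkdir -p "$scratch/u002/a/b" && dir_mode "$scratch/u002/a/b" )
mode_022=$( umask 022 && mkdir -p "$scratch/u022/a/b" && dir_mode "$scratch/u022/a/b" )

echo "umask 002: $mode_002"
echo "umask 022: $mode_022"

rm -rf "$scratch"
```

With a stock mkdir I would expect the first line to show a group-writable directory and the second not, which is the Fedora behaviour described above.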

The best thing about autotools is that it makes the output from make really simple to read and understand, so it didn't take long at all to work out that on Fedora it was using mkdir -p directly to create the directories, whereas on OS X it was using install-sh -c -d. install-sh is a magical shell script that, as far as I can tell, is installed during the auto* process.

Just as the output from running an autoconf Makefile is a pleasure to read, the generated Makefiles are themselves a work of beauty primarily underscored by great clarity and concision, much like my own prose.

In a snap, I found the offending line:

test -z "$(man1dir)" || $(MKDIR_P) "$(DESTDIR)$(man1dir)"

Sarcasm aside, this is pretty simple to grok: if we have a non-empty $(man1dir) variable, create the directory using whatever MKDIR_P expands to. Given the name of the variable, you would assume that it expands to mkdir -p in the common case, which is exactly what it does on my Fedora box. In any case, we are now left with just two interesting questions: Why isn't mkdir -p used on OS X? and Why does install-sh -c -d have different semantics to mkdir -p? and Why does configure choose it if the semantics are different? and Why can't I count?

Well, to answer the first question, the magical configure script does an explicit version check on mkdir using the --version flag. If your mkdir doesn't give back one of the magically blessed strings, you are out of luck. As you can guess, the BSD-derived mkdir on OS X doesn't play nicely here; in fact, it doesn't support the --version flag at all. The interesting thing is that this check is described as checking for a thread-safe mkdir -p, but really it is just checking for a GNU mkdir. This check was added in 1.8.3.
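The check described above boils down to something like the following sketch. The function name choose_mkdir_p is made up, and the patterns are from my memory of a generated configure script, so treat them as an assumption rather than a quote; the real thing also searches the PATH for candidate mkdir programs.

```shell
#!/bin/sh
# Rough sketch of the decision configure makes for MKDIR_P.
# choose_mkdir_p is a hypothetical name; the blessed strings below are
# recalled from a generated configure and may differ between versions.
choose_mkdir_p() {
    # $1 is whatever `mkdir --version 2>&1` printed.
    case $1 in
        'mkdir (GNU coreutils) '* | 'mkdir (coreutils) '* | 'mkdir (fileutils) '4.1*)
            echo 'mkdir -p' ;;
        *)
            echo 'install-sh -c -d' ;;
    esac
}

# Ask the local mkdir. A BSD mkdir prints a usage error here, which matches
# none of the patterns; `|| true` keeps the script going when that happens.
version_output=$(mkdir --version 2>&1) || true
echo "MKDIR_P would be: $(choose_mkdir_p "$version_output")"
```

Note that nothing here tests whether mkdir -p is actually race-free; it only recognises the version banner.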

Early versions of OS X mkdir certainly seem to have the race condition (see mkdir.c); however, more modern versions appear to use mkpath_np underneath, which on my first reading appears to address the issue correctly (see: mkpath_np.c).

OK, so that is a pretty good answer for that first question; looks like autotools saves the day again by protecting us from broken mkdir -p implementations, of which there seem to be many. (Seriously, how do you fuck that up in the first place? Time-of-check vs time-of-use is a pretty simple thing to consider during design. More to the point, how does it stay there for what, 20 years, without getting fixed?) Of course, now that various non-GNU mkdir implementations (well, at least one) seem to have fixed the problem, it would be nice if configure had some way of knowing and didn't have to drop to the sub-optimal install-sh -c -d.
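To be clear about what kind of bug is being talked about, here is the time-of-check vs time-of-use pattern in miniature. This is my own illustration in shell, with made-up function names; it is not the code from mkdir.c.

```shell
#!/bin/sh
# Two ways to "create a directory if it is missing".
# Both function names are invented for this illustration.

# Racy: another process can create the directory between the test and the
# mkdir, in which case mkdir fails even though the directory now exists.
racy_mkdir() {
    [ -d "$1" ] || mkdir "$1"
}

# Safer: try to create first, and only report failure if there is still
# no directory afterwards.
safe_mkdir() {
    mkdir "$1" 2>/dev/null || [ -d "$1" ]
}

scratch=$(mktemp -d) || exit 1
safe_mkdir "$scratch/dir" && echo "created"
safe_mkdir "$scratch/dir" && echo "already there, still fine"
rm -rf "$scratch"
```

A mkdir -p that does the racy version for each path component can fail when two parallel make jobs create the same parent directory, which is presumably why configure cares.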

OK, so now on to trying to work out what is going on with install-sh. In theory this should have the same semantics as mkdir -p. Or at least I think it should, if configure is going to choose it as a substitute for mkdir -p. Now this script does a lot, but the interesting bit is:

(umask $mkdir_umask &&
    eval "\$doit_exec \$mkdirprog $prefixes")

This explicitly sets the umask to the value of the variable mkdir_umask before executing mkdir (in a roundabout way).

The reason for this is pretty much explained in the source:

# Create intermediate dirs using mode 755 as modified by the umask.
# This is like FreeBSD 'install' as of 1997-10-28.

and then backed up by this commit message. Basically, using mode 755 as the base mode is safer than using mkdir's default of 777. On some levels I've got to agree (see my earlier outrage at Fedora's default umask), but shouldn't this kind of script obey whatever I specify as my umask? If I was collaborating on a group project, and had explicitly set my umask to make that easy, then this would be pretty annoying. Of course, my main problem here is that the behaviour is different to that of mkdir -p.
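As I read it, "mode 755 as modified by the umask" amounts to running mkdir with my umask plus the group and other write bits (022) forced on. The sketch below just does that arithmetic; effective_umask is a made-up helper and not something that exists in install-sh, which computes mkdir_umask in its own way.

```shell
#!/bin/sh
# Sketch of "mode 755 as modified by the umask": the umask that mkdir ends
# up running under is the caller's umask OR-ed with 022.
# effective_umask is a hypothetical helper, not a function from install-sh.
effective_umask() {
    # $1 is an octal umask such as 002; the extra leading 0 makes the shell
    # parse it as an octal constant.
    printf '%03o\n' $(( 0$1 | 022 ))
}

for u in 002 022 027 077; do
    echo "umask $u -> mkdir runs with umask $(effective_umask "$u")"
done
```

So a umask of 002 is quietly treated as 022, which is why the directories on OS X come out as 755 whatever I do, while stricter umasks such as 027 or 077 pass through unchanged.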

So, the best workaround for this is to just not use a umask of 002 in the first place. To be honest, I'm not really sure whether this behaviour is a bug or a feature, and, if it is a bug, whether the mkdir -p or the install-sh -d behaviour is the correct one.

The real moral of this story is that if you are super pedantic about the little things, you get to find out really uninteresting details about software you'd rather not use, while wasting time that could be better spent on just about anything else. Sorry, I mean: you get to find out interesting things about the software you use every day, making you a more knowledgeable engineer.
