The backstory to this one is too tedious to describe, and besides, whenever I do that I get irrelevant comments about the backstory. Also, I'm going to try to make my posts more concise this year, both for my sake and yours. See, it is working already!
Anyway, I'm using autotools
to compile binutils, and for some reason
when I run make install
, the permissions of directories are different on
different machines. I mean, it doesn't really matter other than the fact that I'm tar-ing
up the result and checking the tarfiles are identical (for various regression tests
and validating that builds are reproducible). And I could probably just fix this after
make install
has run by changing the file permissions as I add the file
to the archive (since I already squelch the username and timestamp in a similar manner),
but doing that wouldn't let me have fun appreciating how awesome the autotools are.
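For the record, the boring fix would look something like the sketch below. It assumes the install was staged into a scratch directory and that clearing the group/other write bits on directories is the only normalization needed. `normalize_dir_modes` and `dir_mode` are names I made up for illustration, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: clear the group/other write bits on every directory
# below the staging root, so a 002 and a 022 umask give the same tree.
normalize_dir_modes() {
    find "$1" -type d -exec chmod go-w {} +
}

# Print the permission string of a directory (ls is used because stat's
# flags differ between GNU and BSD).
dir_mode() {
    ls -ld "$1" | cut -c1-10
}

# Demo: build a group-writable tree, then normalize it.
stage=$(mktemp -d) || exit 1
(umask 002 && mkdir -p "$stage/usr/share/man/man1")
echo "before: $(dir_mode "$stage/usr/share/man/man1")"
normalize_dir_modes "$stage"
echo "after:  $(dir_mode "$stage/usr/share/man/man1")"
rm -rf "$stage"
```

Running this before building the archive would hide the difference rather than explain it.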
At first I assumed this was just a umask
problem, and
that is certainly part of it (why the Fedora default
umask
is 002
remains a mystery to me). Anyway,
after changing my other machine to have a matching 002
umask
the same problem was happening. (Fun note, if I'd experimented the other way, and changed
my umask
on Fedora to something sane, I would have also managed to avoid this
fun.) So anyway, at this point on Fedora make install
creates group-writable
directories (as expected with a 002
umask
), however, on OS X
make install
continues to create non group-writable directories, even
with the modified umask
.
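To make the expected behaviour concrete, here is a minimal sketch of how the umask feeds into the mode of a directory created by a plain mkdir -p. The helper names (`dir_mode`, `mode_under_umask`) and the directory layout are made up for illustration.

```shell
#!/bin/sh
# Print the permission string (e.g. drwxrwxr-x) of a directory.
# ls is used rather than stat, since stat's flags differ between GNU and BSD.
dir_mode() {
    ls -ld "$1" | cut -c1-10
}

# Create a directory with mkdir -p under the given umask and report its mode.
# The subshell keeps the umask change from leaking into the caller.
mode_under_umask() {
    (
        umask "$1"
        scratch=$(mktemp -d) || exit 1
        mkdir -p "$scratch/share/man/man1"
        dir_mode "$scratch/share/man/man1"
        rm -rf "$scratch"
    )
}

echo "umask 022: $(mode_under_umask 022)"   # drwxr-xr-x
echo "umask 002: $(mode_under_umask 002)"   # drwxrwxr-x
```

This is the behaviour I was seeing on Fedora; the OS X build was not following it.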
The best thing about autotools is that it makes the output from
make
really simple to read and understand, so it didn't
take long at all to work out that on Fedora it was directly using
mkdir -p
to create the directories, however on OS X it
was using install-sh -c -d
to create the
directories. install-sh
is a magical shell script that,
as far as I can tell, is installed during the auto*
process.
Just as the output from running an autoconf Makefile
is a pleasure to read, the generated Makefiles are themselves a work
of beauty primarily underscored by great clarity and concision, much
like my own prose.
In a snap, I found the offending line:
test -z "$(man1dir)" || $(MKDIR_P) "$(DESTDIR)$(man1dir)"
Sarcasm aside, this is pretty simple to grok; if we have a
non empty $(man1dir)
variable, create the directory
using whatever MKDIR_P
expands to. Given the
name of the variable, you would have to assume that this generally
expands to mkdir -p
in the common case, which
is exactly what it does on my Fedora box. In any case, we are now
just left with two interesting questions: Why isn't mkdir -p
used on OS X? and Why does install-sh -c -d
have different
semantics to mkdir -p
? and Why does configure choose that
if the semantics are different? and Why can't I count?
Well, to answer the first question, the magical configure
script does an explicit version check on mkdir
using
the --version
flag. If your mkdir
doesn't
give back one of the magically blessed strings, you are out of luck.
As you can guess the BSD derived mkdir
on OS X doesn't
play nicely here, in fact it doesn't even support the --version
flag at all. The interesting thing is that this check is described as
checking for a thread-safe mkdir -p
, but really it is just
checking for a GNU mkdir. This check was added in 1.8.3.
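A simplified sketch of that kind of check is below. The banner patterns are from my memory of generated configure scripts and should be treated as an assumption, not a quote. The function names are made up, and the real script does more work (searching PATH, caching the result).

```shell
#!/bin/sh
# Succeed if a `mkdir --version` banner looks like one of the accepted
# GNU-style banners. The patterns are an assumption, not copied from configure.
is_blessed_mkdir_banner() {
    case $1 in
        'mkdir (GNU coreutils) '* | 'mkdir (coreutils) '* | 'mkdir (fileutils) '4.1*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Pick a directory-creation command: the given mkdir if its banner is
# accepted, otherwise fall back to the install-sh wrapper.
choose_mkdir_p() {
    prog=${1:-mkdir}
    banner=$("$prog" --version 2>&1)
    if is_blessed_mkdir_banner "$banner"; then
        echo "$prog -p"
    else
        echo "install-sh -c -d"
    fi
}

choose_mkdir_p mkdir
```

A mkdir that rejects --version just prints a usage error, which matches none of the patterns, so it falls through to the install-sh branch without ever being tested for the race itself.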
Early versions of OS X mkdir
certainly seem to have
the race-condition (see mkdir.c),
however more modern versions appear to use mkpath_np
underneath, which
appears on my first reading to correctly address the issue (see: mkpath_np.c).
OK, so that is a pretty good answer for that first question; looks
like autotools saves the day again by protecting us from broken
mkdir -p
implementations, of which there seem to be many.
(Seriously, how do you fuck that up in the first place? Time-of-check
vs time-of-use is a pretty simple thing to consider during
design. More to the point how does it stay there for what, 20 years,
without getting fixed?) Of course, now that various non-GNU mkdir
implementations (well, at least one) seem to have fixed the problem, it would
be nice if configure
had some way of knowing and didn't have
to drop to the sub-optimal install-sh -c -d
.
OK, so now on to trying to work out what is going on with
install-sh
. In theory this should have the same semantics
as mkdir -p
. Or at least I think it should if
configure
is going to choose it in preference to the
mkdir -p
. Now this script does a lot, but the
interesting bit is:
(umask $mkdir_umask && eval "\$doit_exec \$mkdirprog $prefixes")
This is explicitly setting the umask
to a variable mkdir_umask
before executing mkdir
(in a round-about way).
Well, the reason is pretty much explained in the source:
# Create intermediate dirs using mode 755 as modified by the umask.
# This is like FreeBSD 'install' as of 1997-10-28.
and then backed up by this commit message.
Basically, using mode 755 as the base mode is safer than using mkdir
's default of 777. On some levels I've got to agree (see my earlier outrage at Fedora's default umask
), but shouldn't this kind of script obey whatever I specify as my umask
? If I was collaborating on a group project, and had explicitly set my umask
to make that easy, then this would be pretty annoying. Of course, my main problem here is that the behaviour is different to that of mkdir -p
.
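As I read that comment, "mode 755 as modified by the umask" amounts to OR-ing 022 into whatever umask you already have before calling mkdir. The sketch below emulates that reading so the two behaviours can be compared side by side. It is not the code from install-sh, and `restricted_umask` and the other helper names are made up for illustration.

```shell
#!/bin/sh
# Print the permission string of a directory (portable across GNU and BSD).
dir_mode() {
    ls -ld "$1" | cut -c1-10
}

# Hypothetical helper: OR 022 into a umask, so group/other write is always
# masked off. The leading 0 forces the argument to be read as octal.
restricted_umask() {
    printf '%03o\n' "$(( 0$1 | 022 ))"
}

# Create a directory under the given umask and print its resulting mode.
mkdir_mode_with_umask() {
    (
        umask "$1"
        scratch=$(mktemp -d) || exit 1
        mkdir -p "$scratch/a/b"
        dir_mode "$scratch/a/b"
        rm -rf "$scratch"
    )
}

mine=002
echo "mkdir -p alone:   $(mkdir_mode_with_umask "$mine")"
echo "with 755 as base: $(mkdir_mode_with_umask "$(restricted_umask "$mine")")"
```

With a umask of 002 the first line reports a group-writable directory and the second does not, which is the mismatch between my two machines.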
So, the best workaround for this is to just not use a umask
of 002
in the first place. To be honest, I'm not really sure if this behaviour is a bug or a feature, and if it is a bug, whether the mkdir -p
or the install -d
behaviour is correct.
The real moral of this story is that if you are super pedantic about the little things, you get to find out really uninteresting details about software you'd rather not use, while wasting time that could be better spent on just about anything else. Er, I mean: you get to find out interesting things about the software you use every day, making you a more knowledgeable engineer.