I’m an evil masochistic hacker
11 09 2003
A peck of unexpected trouble
It’s been almost a week since I last updated my blog. No, I didn’t finally find a chance to go on vacation to some nice warm country. I mostly used the time to move to my new flat. And to hack on the build system of SpamAssassin. Yes, we planned to release the new version 2.60 almost two weeks ago. But then suddenly the build system began to fall apart.
How it all works (not)
SpamAssassin is basically a Perl module. Perl modules are built by calling a Makefile.PL which uses ExtUtils::MakeMaker to create a Makefile which is then used to build the module. Once you have worked long and extensively enough with EU::MM, you will quickly come to the conclusion that it sucks. Most Perl developers poke it only with a very long stick, and only when it’s really unavoidable (one reason might be the ugly templates h2xs generates). Oh, in theory it’s a great piece of code, it just has two basic flaws. The first one is best given in Chris Winters’ words:
While this is an amazing feat (EU::MM runs on almost as many platforms as Perl, which is huge), it points out an unfortunate problem: installing Perl modules requires a working make. Unixfolk take it for granted, but while a working equivalent (nmake) is available for Win32, it’s another hoop to jump through. If you know what you’re doing it’s easy to get around, but most people don’t — they just want things to work.
The second one is covered pretty well in the presentation “MakeMaker is DOOMED!” by Michael G. Schwern (the current maintainer of EU::MM). To summarize it: “The architecture of MakeMaker is fundamentally flawed” and “there is no safe way to customize MakeMaker”. But SpamAssassin’s build system has to customize the generated Makefile. Heavily.
Why don’t we just use what the system offers?
The reason is that Mail::SpamAssassin isn’t a “normal” Perl module. On top of the common Perl libraries it ships some command line tools (spamassassin, sa-learn), a daemon (spamd) and a client written in C (spamc). Additionally, SpamAssassin depends on its rule definitions (one could even say most of its logic lies within these files) and is heavily configurable.
EU::MM does its job of creating portable Makefiles for Perl libraries and modules well but doesn’t support pure C programs. So the GNU Autotools are used to adapt spamc to the system it’s built for.
It also doesn’t support other standardised file locations for data and config files (as those aren’t needed by Perl modules in most cases). That means we need a bunch of additional targets in the Makefile to build those files if necessary and later copy them to the right location.
Additionally, the tools have to know where the libraries are located, and the libraries need to find the other files they rely on. Those paths can vary depending on the system and are determined when the user creates the Makefile. That means we need a preprocessor which writes those paths into the files when they are built.
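To make the idea of such a preprocessor concrete, here is a minimal sketch in Perl. It is not the actual SpamAssassin code: the sub name `substitute_vars`, the `%paths` hash and the sample values are made up for illustration, and the only thing taken from this post is the `@@VARIABLE@@` placeholder style mentioned further down.

```perl
use strict;
use warnings;

# Hypothetical path settings; in a real build they would be collected
# while Makefile.PL runs. The values here are only examples.
my %paths = (
    PREFIX        => '/usr/local',
    DEF_RULES_DIR => '/usr/local/share/spamassassin',
);

# Replace every @@NAME@@ placeholder with its configured value.
# Unknown placeholders are left untouched so they are easy to spot later.
sub substitute_vars {
    my ($line, $vars) = @_;
    $line =~ s/\@\@([A-Z][A-Z0-9_]*)\@\@/exists $vars->{$1} ? $vars->{$1} : "\@\@$1\@\@"/ge;
    return $line;
}

print substitute_vars('my $rules = "@@DEF_RULES_DIR@@";', \%paths), "\n";
```

Leaving unknown placeholders in place is a design choice of this sketch; a stricter filter could just as well abort the build when it meets a name it does not know.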
And finally, SpamAssassin is not only available in the form of source tarballs or via CPAN. There are also packages for the most popular (Linux) packaging systems: RPMs, debs, ebuilds and probably some other ones I don’t know of. For packaged files it’s often necessary to install the files in a temporary build directory which is not the final location the files will be in. So we need to give the user/packager a chance to build SpamAssassin and have one set of paths written to the files while make install puts the files somewhere else.
All this stuff has to work with Perl back to version 5.005_03 (which is the oldest Perl we support — at least until 2.70 when we’re going to ditch that support and require Perl 5.6.1).
But all this was possible with 2.5x, wasn’t it?
Yes. More or less. The need for these features became apparent over time and they were hacked into the Makefile.PL one by one. This not only resulted in an inconsistent interface but also in a very fragile kludge around half-understood EU::MM internals (that’s not meant as an accusation; it took me a long time to get up the nerve to throw away the long stick and dive into the guts of EU::MM, too). That stuff was so fragile that some small part broke once every few months and somebody had to go and fix it. I alone fixed it at least four times.
One of those occasions was when I (re)wrote the preprocessor which takes care of the mangling of the files. If I remember correctly, this was needed when we found out that Perl 5.8.0 enabled Unicode by default, breaking SpamAssassin’s internal data handling. Our first solution was to fix this with some pack() magic. This turned out to be slow and broke easily, so we went on to add a use bytes to each file. Alas! That wasn’t backward compatible with Perl 5.5. So the preprocessor was born, which removed all those lines. (Later, support for replacing @@VARIABLES@@ was added. Or was that added first and the Unicode stuff followed later…?)
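A filter of that kind can be sketched in a few lines of Perl. This is an illustration under assumptions, not the original preprocessor: the sub name `filter_use_bytes` is invented, and the cut-off relies on the fact that the bytes pragma first shipped with Perl 5.6 (so `$]` values below 5.006 cannot load it).

```perl
use strict;
use warnings;

# Drop "use bytes;" lines when the target Perl is too old to know the pragma.
# $perl_version is expected in the numeric form of $], e.g. 5.00503.
sub filter_use_bytes {
    my ($source, $perl_version) = @_;
    return $source if $perl_version >= 5.006;
    # Remove whole lines that consist only of the pragma statement.
    $source =~ s/^[ \t]*use[ \t]+bytes[ \t]*;[ \t]*\n//mg;
    return $source;
}

my $code = "package Demo;\nuse strict;\nuse bytes;\n1;\n";
print filter_use_bytes($code, 5.00503);
```

Note that deleting lines shifts the line numbers in error messages; replacing the statement with a comment would avoid that, at the cost of a slightly messier installed file.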
The preprocessor is hooked into the build process via EU::MM’s PM_FILTER parameter. But that one was introduced with EU::MM 5.45. Which was shipped with Perl 5.6.1. And 5.6.0. Wait. Was it? Noo… both EU::MMs carry the version number 5.45 but only the latter supports PM_FILTER. *sigh* We want(ed) to be backward compatible with Perl 5.005_03 anyway, and that one had EU::MM 5.4302 which definitely did not support the needed feature. After some discussion we came to the conclusion that forcing users to upgrade EU::MM before they could use the new SpamAssassin was not an option. The solution: A package ExtUtils::Install::Post545 which contains the needed code from 6.05. That one is then hooked into the Makefile writing process via some conditional macro and postamble stuff. So that worked. Not nice, not clean, but it worked.
All we needed were some more directories
That built the base for later hacking. Like the @@VARIABLE@@ replacement code. Which worked fine. Until the need to install “somewhere else” (see above) arose. But support for that was quickly added. After quite some patching we had these variables:
- PREFIX: A default EU::MM variable we pretty much depended on. More on that later. make install copied the files there.
- INST_PREFIX: For packaging, defaulting to $(PREFIX). This was the prefix which was actually written to the files.
- SYSCONFDIR: The base dir for config files. Determined from $(PREFIX), normally something like /etc.
- INST_SYSCONFDIR: Like what $(INST_PREFIX) did for $(PREFIX); the sample config file was copied to $(SYSCONFDIR) on make install but the installed SpamAssassin later on would look for it here instead.
- INST_SITELIB: Another of those. The SpamAssassin applications need to know where the libs are found, even with very weird combinations of installation paths. (So the user doesn’t have to set PERL5LIB.) Defaults to $(INSTALLSITELIB).
- LOCAL_RULES_DIR: The full path to the config dir; some people didn’t like $(SYSCONFDIR)/mail/spamassassin.
- PKG_LOCAL_RULES_DIR: Again for packaging. This was the path the sample config file was copied to on make install.
- DEF_RULES_DIR: Where SpamAssassin looks for its rules and some other stuff. Default was just $(PREFIX)/share/spamassassin.
- PKG_DEF_RULES_DIR: Like PKG_LOCAL_RULES_DIR, you get the point. Wait… aren’t the $(PKG_*) semantics different from the $(INST_*) stuff? Right, some small glitch…
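How such a chain of defaults hangs together can be sketched as follows. This is a simplified model, not the real Makefile.PL: the sub name `resolve_paths` is hypothetical, only some of the variables listed above are covered, and the rule for SYSCONFDIR (and INST_SYSCONFDIR simply following it) is an assumption of the sketch.

```perl
use strict;
use warnings;

# Fill in defaults for a subset of the variables listed above.
# Values passed in by the caller always win over the defaults.
sub resolve_paths {
    my %p = @_;
    die "this sketch needs an explicit PREFIX\n" unless defined $p{PREFIX};

    $p{INST_PREFIX} = $p{PREFIX} unless defined $p{INST_PREFIX};

    unless (defined $p{SYSCONFDIR}) {
        # Assumption: system-wide prefixes use /etc, anything else PREFIX/etc.
        $p{SYSCONFDIR} = $p{PREFIX} =~ m{^/usr(?:/local)?/?$}
            ? '/etc'
            : "$p{PREFIX}/etc";
    }
    # Assumption: the written-to-file variant follows SYSCONFDIR by default.
    $p{INST_SYSCONFDIR} = $p{SYSCONFDIR} unless defined $p{INST_SYSCONFDIR};

    $p{LOCAL_RULES_DIR} = "$p{SYSCONFDIR}/mail/spamassassin"
        unless defined $p{LOCAL_RULES_DIR};
    $p{DEF_RULES_DIR} = "$p{PREFIX}/share/spamassassin"
        unless defined $p{DEF_RULES_DIR};
    return %p;
}

my %resolved = resolve_paths(PREFIX => '/opt/sa');
print "$_ = $resolved{$_}\n" for sort keys %resolved;
```

Everything here hangs off PREFIX. Calling `resolve_paths(PREFIX => '')` yields a DEF_RULES_DIR of /share/spamassassin, which is exactly the kind of symptom the next section is about.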
Again, it was neither nice nor clean but it worked.
It all depends on the $(PREFIX)…
You remember what Michael G. Schwern said in his presentation? “There is no safe way to customize MakeMaker”…
Last week I had the chance to find out how true this is. When I tried to find the reason why the data files were suddenly put into /share/spamassassin (note the missing prefix). And why the SYSCONFDIR was /etc on the one box and /usr/local/etc on another with the same PREFIX. And why the packaging process sometimes mysteriously failed. And why some other weird stuff happened. Sometimes. Not always.
I had already been following the EU::MM mailing list for some months, so I faintly remembered some changes which had rung a bell when I read the postings. Inspecting the generated Makefile and some googling brought me some certainty: The PREFIX was broken in some versions of EU::MM. And starting with version 5.90_01 (the alpha for 6.0) it wasn’t set at all if you didn’t explicitly set it at the command line. Doh!
The reason for this change was obviously that EU::MM/Perl always offered some different “repositories” (the best word I found for that stuff) for files. At least for 5.6.1 there are three choices: PERL, SITE and VENDOR (please read the file PACKAGING in the SpamAssassin 2.60 distribution where I wrote a, hopefully, understandable description of these three values). Before version 6.0 EU::MM just wrote one set of macros, depending on the value of INSTALLDIRS. Beginning with 6.0, it writes three sets of macros: One for each possible value. So you end up with PERLPREFIX, SITEPREFIX and VENDORPREFIX. And PREFIX is now only used to override those! To make a decision based on the chosen prefix I now had to evaluate INSTALLDIRS myself and choose the correct set of macros. Was that change to EU::MM really necessary? The resulting Makefile is really nice and the cruft inside EU::MM was probably reduced by a high factor. But it did a real disservice to developers relying on PREFIX.
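Evaluating INSTALLDIRS by hand boils down to a small lookup. The following Perl sketch shows the idea under assumptions: the sub name `effective_prefix` and the plain hash of macro values are invented for illustration, and the fallback to `site` reflects the documented EU::MM default for INSTALLDIRS rather than anything SpamAssassin-specific.

```perl
use strict;
use warnings;

# Map each INSTALLDIRS value to the prefix macro written by EU::MM 6.x.
my %prefix_macro = (
    perl   => 'PERLPREFIX',
    site   => 'SITEPREFIX',
    vendor => 'VENDORPREFIX',
);

# Pick the effective prefix: an explicit, non-empty PREFIX overrides
# everything, otherwise the macro matching INSTALLDIRS decides.
sub effective_prefix {
    my ($macros) = @_;
    return $macros->{PREFIX}
        if defined $macros->{PREFIX} && length $macros->{PREFIX};
    my $dirs = $macros->{INSTALLDIRS} || 'site';
    my $name = $prefix_macro{$dirs}
        or die "unknown INSTALLDIRS value '$dirs'\n";
    return $macros->{$name};
}

my %macros = (
    INSTALLDIRS  => 'vendor',
    PERLPREFIX   => '/usr',
    SITEPREFIX   => '/usr/local',
    VENDORPREFIX => '/usr',
);
print effective_prefix(\%macros), "\n";
```

The sample values are typical for a Linux box but are only examples; the real values come from the Perl configuration of the machine running Makefile.PL.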
The straw that broke the camel’s back
When I compared the generated Makefile with an old one from 5.005_03 (it’s always a good idea to have several Perl versions on your box when you have to cope with backward compatibility) I noticed another difference: With EU::MM 6.06 the new variable DESTDIR was introduced. What does it do? Simply put, what we always needed for packaging: Building a module with one prefix and installing it below the directory given by DESTDIR. That’s a good thing. Really. Ok, in theory. Not with our code. When I tried to make it work with the several possible versions of EU::MM I quickly came to a point where I didn’t understand my own code.
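The effect of DESTDIR is easy to model: the configured path stays what it is, and the staging directory is simply put in front of it at copy time. The Perl sketch below illustrates that with an invented helper `staged_path` and made-up example paths; it assumes Unix-style absolute paths and is not taken from EU::MM itself.

```perl
use strict;
use warnings;

# Where a file is physically copied during "make install": the configured
# path, relocated below DESTDIR when a staging directory is given.
sub staged_path {
    my ($destdir, $configured) = @_;
    return $configured unless defined $destdir && length $destdir;
    $destdir =~ s{/+$}{};    # avoid a double slash at the seam
    return $destdir . $configured;
}

# The path written into the installed files stays /usr/share/spamassassin,
# only the copy target moves below the staging directory.
print staged_path('/tmp/build-root', '/usr/share/spamassassin'), "\n";
```

On the command line this corresponds to something like `make install DESTDIR=/tmp/build-root` with a Makefile that supports the variable.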
The real problem was that our Makefile.PL was more or less written for EU::MM 5.45 at that point. That meant that if we continued with that structure, we would have had to maintain compatibility in two directions: Stuff like the PM_FILTER hack for Perl 5.5 and EU::MM before 5.45. And another bunch of hacks to “work around” the fixes which went into EU::MM 6.0 and later. After some hours of coding I came to the conclusion that this was impossible. An IRC session with Michael Schwern backed that conclusion:
Schwern: we need to set a SYSCONFDIR (and some others) which is FHS compliant and now we’ve got LOADS of nasty hacks in there which I want to get rid of
yikes
s/want/need/ because 2.60 will be released in a few days and the Makefile is totally borked
Moonflux: And you want to set the SYSCONFDIR based on the PREFIX but can’t because there’s a bunch of prefixes now and you can’t be sure where its going to be installed?
Schwern: more or less. currently I have my $is_sys_build = ($prefix =~ m{^(\Q$(PREFIX)\E|/usr(/local)?)/?$});
Schwern: but $(PREFIX) is empty if no PREFIX was set at all
Moonflux: Heh, that ain’t gonna fly no more.
Moonflux: Well. You’re fucked.
So I ground the SpamAssassin 2.60 release process to a halt with a bug report for the only possibly halfway stable and portable solution I saw: basing our Makefile.PL on EU::MM 6.11, feature-wise, and enabling a bunch of hacks which “backport” everything else needed if the Makefile was built with an older EU::MM.
After one night of hardcore hacking I had a first patch available. Especially the DESTDIR support in newer EU::MM helped a lot. Both Schwern and I thought that this special way of building SpamAssassin was needed only by packagers.
Theo proved us wrong: Seems like the recommended way to install RPMs is to rebuild them for your system. Which meant DESTDIR support for everybody. There were three possible solutions:
- The user building RPMs needs to have a EU::MM 6.11+ around — [...] that [is] not necessarily an updated system lib but can also be just a copy somewhere around. They would not even have to set the PERL5LIB, the Makefile.PL could look automagically in a directory where the user just had to copy the contents of the ‘lib’ dir of the EU::MM package.
- The libs of EU::MM 6.16 could be included in the SRPM (possible?). that would increase the size of the SRPM by about 500k — removing the unneeded platforms would make that about 350k.
- I could hack support for DESTDIR into old EU::MMs via our Makefile.PL. That’s not too hard but requires rewriting of the Makefile (just some search-and-replace of var names) so might break somewhere; I currently know of two places/patterns which I have to exclude from rewriting but those might be different between EU::MM versions. I checked all the shipped versions of the various Perls I have installed and can’t find any difference but that doesn’t mean that there isn’t actually a different version out there.
I decided to go for the last option. That wasn’t very hard. Just Yet another Nasty Hack.
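To give an impression of what rewriting a generated Makefile can look like, here is one conceivable approach in Perl. It is not the actual patch: the sub name `add_destdir`, the choice to touch only `$(INSTALL...)` macros and the rule for skipping definition lines are all assumptions made for this sketch.

```perl
use strict;
use warnings;

# Prefix every *use* of an INSTALL... macro with $(DESTDIR), but leave lines
# that *define* a macro alone. Already prefixed uses are not touched again.
sub add_destdir {
    my ($makefile) = @_;
    my @out;
    for my $line (split /^/m, $makefile) {
        unless ($line =~ /^[A-Z_0-9]+\s*=/) {    # skip "NAME = value" lines
            $line =~ s/(?<!\$\(DESTDIR\))(\$\(INSTALL[A-Z0-9_]*\))/\$(DESTDIR)$1/g;
        }
        push @out, $line;
    }
    return join '', @out;
}

my $demo = "INSTALLSITELIB = /usr/lib/perl5\n"
         . "install:\n"
         . "\tcp -r blib/lib \$(INSTALLSITELIB)\n";
print add_destdir($demo);
```

A plain search-and-replace like this is exactly the kind of thing that can break on an EU::MM version nobody has tested, which is why the exclusions mentioned in the list above matter.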
I never said it would be beautiful
In the end there was something which seemed to work. But the others didn’t like it. Not because it didn’t work. “Sorry - I really can’t [approve] it as it stands, it hurts my head too much,” said Justin. And he was right, I must admit. So it took me another night to refactor it into an understandable form. (Where I got acquainted with some weird stuff related to OOP in Perl.)