bugs all the way down

There are times when being in control of the whole software stack is a mixed blessing.

While doing investigations related to my previous post, I found myself wondering what the arguments and return values of make-method-lambda were in practice, in SBCL. So I did what any self-respecting Lisp programmer would do: instead of looking up its specification and decoding the description, I simply ran (trace sb-mop:make-method-lambda), and then ran my defmethod as normal. I was half-expecting it to break instantly, because the implementation of trace encapsulates named functions in a way that changes the class of the function object. Essentially, it wraps the existing function in a new anonymous function, which is fine for ordinary functions but not so good for generic-function objects. And I was half-right: an odd error occurred, but only after trace had printed the information I wanted.
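To make the "changes the class of the function object" point concrete, here is a minimal sketch in portable Common Lisp. It is not SBCL's actual trace implementation; naive-encapsulate, demo-area and the two variables are hypothetical names made up for illustration.

```lisp
;; A hypothetical generic function to experiment on.
(defgeneric demo-area (shape))
(defmethod demo-area ((side number)) (* side side))

;; Hypothetical stand-in for closure-based encapsulation: replace the
;; global definition with an anonymous function that calls the original.
(defun naive-encapsulate (name)
  (let ((original (fdefinition name)))
    (setf (fdefinition name)
          (lambda (&rest args)
            (format *trace-output* "~&~S called with ~S~%" name args)
            (apply original args)))
    original))

;; Record whether the fdefinition is a generic function before and after.
(defparameter *generic-before*
  (typep (fdefinition 'demo-area) 'generic-function))
(naive-encapsulate 'demo-area)
(defparameter *generic-after*
  (typep (fdefinition 'demo-area) 'generic-function))
```

Calling (demo-area 3) still works after the wrapping, but *generic-after* is false: anything that takes the fdefinition and treats it as a generic function (by handing it to compute-applicable-methods, say) is now in trouble.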

What was the odd error? Well, after successfully calling and returning from make-method-lambda, I got a no-applicable-method error while trying to compute the applicable methods for... make-method-lambda. Wait, what?

SBCL's CLOS has various optimizations in it; some of them have been documented in the SBCL Internals Manual, such as the clever things done to make slot-value fast, and specialized discriminating functions. There are plenty more that are more opaque to the modern user, one of which is the “fast method call” optimization. In that optimization, the normal calling convention for methods within method combination, which involves calling the method's method-function with two arguments – a list of the arguments passed to the generic function, and a list of next methods – is bypassed, with the fast-method-function instead being supplied with a permutation vector (for fast slot access) and next method call (for fast call-next-method) as the first two arguments and the generic function's original arguments as the remainder, unrolled.
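The difference between the two calling conventions can be sketched with a pair of ordinary functions. These only mimic the shape of the argument lists described above; the names are hypothetical, and the real fast-method-function in SBCL does rather more with its first two arguments.

```lisp
;; Standard convention: exactly two arguments, a list of the generic
;; function's arguments and a list of next methods.
(defun demo-standard-method-function (args next-methods)
  (declare (ignore next-methods))
  (destructuring-bind (x y) args
    (+ x y)))

;; Fast convention (illustrative shape only): permutation vector and
;; next-method call come first, then the original arguments, unrolled.
(defun demo-fast-method-function (permutation-vector next-method-call x y)
  (declare (ignore permutation-vector next-method-call))
  (+ x y))

;; The same notional generic-function call, (f 1 2), under each convention.
(defparameter *standard-result*
  (demo-standard-method-function (list 1 2) '()))
(defparameter *fast-result*
  (demo-fast-method-function nil nil 1 2))
```

Both calls compute the same sum; the fast shape simply avoids consing up and then destructuring an argument list on every call.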

In order for this optimization to be valid, the call-method calling convention must be the standard one – if the user is extending or overriding the method invocation protocol, all the optimizations based on assuming that the method invocation protocol is the standard one might be invalid. We have to be conservative, so we need to turn this optimization off if we can't prove that it's valid – and the only case where we can prove that it's valid is if only the system-provided method on make-method-lambda has been called. But we can't communicate that after the fact; although make-method-lambda returns initargs as well as the lambda, an extending method could arbitrarily mess with the lambda while returning the initargs the system-provided method returns. So in order to find out whether the optimization is safe, we have to check whether exactly our system-provided method on make-method-lambda was the applicable one, so there's an explicit call to compute-applicable-methods of make-method-lambda after the method object has been created. And since make-method-lambda is being traced, and hence is not a generic-function any more, it's only natural that there's an error. Hooray! Now we understand what is going on.
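The shape of that safety check can be sketched in portable Common Lisp using a stand-in generic function. Everything prefixed demo- is hypothetical, and the real make-method-lambda takes more arguments than this; the only standard operator doing the work here is compute-applicable-methods.

```lisp
;; Stand-in for make-method-lambda, with a simplified argument list.
(defgeneric demo-make-lambda (gf lambda-form))

;; A class for which a user has extended the protocol.
(defclass demo-custom-gf () ())

;; DEFMETHOD returns the method object, so we can remember which
;; method is the "system-provided" one.
(defparameter *demo-system-method*
  (defmethod demo-make-lambda ((gf t) lambda-form)
    lambda-form))

;; A user extension that messes with the lambda for its own class.
(defmethod demo-make-lambda ((gf demo-custom-gf) lambda-form)
  (list 'wrapped (call-next-method)))

;; The optimization is only assumed safe when the system-provided
;; method is the one and only applicable method.
(defun demo-fast-call-safe-p (gf lambda-form)
  (equal (compute-applicable-methods #'demo-make-lambda
                                     (list gf lambda-form))
         (list *demo-system-method*)))

(defparameter *plain-safe*
  (demo-fast-call-safe-p 'some-gf '(lambda ())))
(defparameter *custom-safe*
  (demo-fast-call-safe-p (make-instance 'demo-custom-gf) '(lambda ())))
```

Here *plain-safe* is true and *custom-safe* is false, because two methods are applicable in the second case. Note that the check needs #'demo-make-lambda to be a real generic function, which is exactly what breaks once it has been replaced by a wrapping closure.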

As for how to fix it, well, how about adding an encapsulations slot to generic-function objects, and handling the encapsulations in sb-mop:compute-discriminating-function? The encapsulation implementation as it currently stands is fairly horrible, abusing as it does special variables and chains of closures; there's a fair chance that encapsulating generic functions in this way will turn out a bit less horrible. So, modify sb-debug::encapsulate, C-c C-c, and package locks strike. In theory we are meant to be able to unlock and continue; in practice, that seems to be true for some package locks but not others. Specifically, the package lock from setting the fdefinition from a non-approved package gives a continuable error, but the ones from compiling special declarations of locked symbols have already taken effect and converted themselves to run-time errors. Curses. So, (mapcar #'unlock-package (list-all-packages)) and try again; then, it all goes well until adding the slot to the generic-function class (and I note in passing that many of the attributes that CL specifies for generic-function, SBCL only gives to standard-generic-function objects), at which point my SLIME repl tells me that something has gone wrong, but not what, because no generic function works any more, including print-object. (This happens depressingly often while working on CLOS.)
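For a flavour of the idea without touching SBCL's own classes, here is a rough user-level sketch. It is emphatically not the patch described here: it puts the slot on a hypothetical subclass, traceable-generic-function, rather than on generic-function itself, and it assumes that the function returned by call-next-method can be wrapped and called like any other function. All names other than the sb-mop and CL ones are made up, and it is best tried form by form at the REPL.

```lisp
;; Hypothetical generic function class carrying its own encapsulations.
(defclass traceable-generic-function (standard-generic-function)
  ((encapsulations :initform '() :accessor encapsulations))
  (:metaclass sb-mop:funcallable-standard-class))

;; Each encapsulation is a function of (next-function &rest args).
;; The first one in the list is the outermost.
(defun call-with-encapsulations (wrappers function args)
  (if (null wrappers)
      (apply function args)
      (apply (first wrappers)
             (lambda (&rest inner-args)
               (call-with-encapsulations (rest wrappers) function inner-args))
             args)))

;; Wrap the normal discriminating function; the slot is read at call
;; time, so adding an encapsulation needs no reinstallation.
(defmethod sb-mop:compute-discriminating-function
    ((gf traceable-generic-function))
  (let ((discriminator (call-next-method)))
    (lambda (&rest args)
      (call-with-encapsulations (encapsulations gf) discriminator args))))

(defgeneric demo-double (x)
  (:generic-function-class traceable-generic-function))
(defmethod demo-double ((x number)) (* 2 x))

;; A logging encapsulation, standing in for what trace would install.
(defparameter *demo-log* '())
(push (lambda (next &rest args)
        (push args *demo-log*)
        (apply next args))
      (encapsulations #'demo-double))

(defparameter *demo-result* (demo-double 21))
```

The point of the design is that (fdefinition 'demo-double) is the same generic function object before and after the encapsulation is added, so code that calls compute-applicable-methods on it carries on working.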

That means it's time for an SBCL rebuild, which is fine because it gives me time to write up this blog entry up to this point. Great, that finishes, and now we go onwards: implementing the functionality we need in compute-discriminating-function is a bit horrible, but this is only a proof-of-concept so we wrap it all up in a labels and stop worrying about 80-column conventions. Then we hit C-c C-c and belatedly remember that redefining methods involves removing them from their generic function and adding them again, and doing that to compute-discriminating-function is likely to have bad consequences. Sure enough:

There is no applicable method for the generic function 
  #<STANDARD-GENERIC-FUNCTION COMPUTE-DISCRIMINATING-FUNCTION (1)>
when called with arguments
  (#<STANDARD-GENERIC-FUNCTION NO-APPLICABLE-METHOD (1)>).

Yes, well. One (shorter) rebuild of just CLOS later, and then a few more edit/build/test cycles, and we can trace generic functions without changing the identity of the fdefinition. Hooray! Wait, what was I intending to do with my evening?