There was a bit of a kerfuffle following the 1.2.2 release of SBCL, regarding the incompatible change in the internals of the backquote reader macro.
Formally, implementations can choose how to implement the backquote reader macro (and its comma-based helpers): the semantics of backquote are defined only after evaluation:
An implementation is free to interpret a backquoted form F1 as any form F2 that, when evaluated, will produce a result that is the same under equal as the result implied by the above definition, provided that the side-effect behavior of the substitute form F2 is also consistent with the description given above.
(CLHS 2.4.6; emphasis mine)
There are also two advisory notes about the representation:
Often an implementation will choose a representation that facilitates pretty printing of the expression, so that (pprint '`(a ,b)) will display `(a ,b) and not, for example, (list 'a b). However, this is not a requirement.
(CLHS 2.4.6.1; added quote in example mine), and:
Implementors who have no particular reason to make one choice or another might wish to refer to IEEE Standard for the Scheme Programming Language, which identifies a popular choice of representation for such expressions that might provide useful to be useful compatibility for some user communities.
(CLHS 2.4.6.1; the Scheme representation reads `(foo ,bar) as (quasiquote (foo (unquote bar))))
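To make the "only after evaluation" point concrete, here is a minimal sketch. The first function relies only on what the standard guarantees: the value of a backquoted form, up to equal. The second, read-backquoted-form (a name made up for this example), just hands back whatever object the current implementation's reader produces for backquoted text, and the shape of that object is not something portable code can rely on.

```lisp
;; Portable: the value of a backquoted form is specified up to EQUAL.
(defun backquote-value-example (b)
  `(a ,b))

;; Not portable: the object READ returns for backquoted source text is
;; up to the implementation. This hypothetical helper only exposes it.
(defun read-backquoted-form (string)
  (values (read-from-string string)))

;; (backquote-value-example 2) is EQUAL to (list 'a 2) in any implementation.
;; (read-backquoted-form "`(a ,b)") may look different from one
;; implementation (or release) to the next.
```

Only the evaluated result is safe to compare across implementations; the unevaluated form is exactly the part that changed in SBCL 1.2.2.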
The problem the new implementation of backquote is attempting to
address is the first one: pretty printing. To understand what the
problem is, an example might help: imagine that we as Common Lisp
programmers (i.e. not implementors, and aiming for portability) have
written a macro bind which is exactly equivalent to let:
(defmacro bind (bindings &body body)
  `(let ,bindings ,@body))
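As a quick check of the "exactly equivalent to let" claim, macroexpand-1 shows the expansion. This is only an illustration: the macro definition is repeated so that the snippet stands alone, and the function name bind-expansion is invented for the example.

```lisp
(defmacro bind (bindings &body body)
  `(let ,bindings ,@body))

(defun bind-expansion (form)
  ;; MACROEXPAND-1 returns two values; keep only the expansion.
  (values (macroexpand-1 form)))

;; (bind-expansion '(bind ((x 2) (z 3)) (+ x z)))
;; is EQUAL to (let ((x 2) (z 3)) (+ x z))
```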
and we want to implement a pretty printer for it, so that (pprint
'(progn (bind ((x 2) (z 3)) (if *print-pretty* (1+ x) (1- y)))))
produces
(progn
  (bind ((x 2)
         (z 3))
    (if *print-pretty*
        (1+ x)
        (1- y))))
What does that look like? Writing pretty-printers is a little bit of a black art; a first stab is something like:
(defun pprint-bind (stream object)
  (pprint-logical-block (stream object :prefix "(" :suffix ")")
    ;; Operator name first, followed by a space.
    (pprint-exit-if-list-exhausted)
    (write (pprint-pop) :stream stream)
    (pprint-exit-if-list-exhausted)
    (write-char #\Space stream)
    ;; The bindings list is its own logical block, one binding per line.
    (pprint-logical-block (stream (pprint-pop) :prefix "(" :suffix ")")
      (pprint-exit-if-list-exhausted)
      (loop
        (write (pprint-pop) :stream stream)
        (pprint-exit-if-list-exhausted)
        (pprint-newline :mandatory stream)))
    ;; Body forms: indented one column past the block start, one per line.
    (pprint-exit-if-list-exhausted)
    (pprint-indent :block 1 stream)
    (pprint-newline :mandatory stream)
    (loop
      (write (pprint-pop) :stream stream)
      (pprint-exit-if-list-exhausted)
      (pprint-newline :mandatory stream))))
(set-pprint-dispatch '(cons (eql bind)) 'pprint-bind)
The loop noise is necessary because we're using :mandatory newlines; a
different newline style, such as :linear, might have let us use a
standard utility function such as pprint-linear. But otherwise, this
is straightforward pretty-printing code, doing roughly the equivalent
of SBCL's internal pprint-let implementation, which is:
(formatter "~:<~^~W~^ ~@_~:<~@{~:<~^~W~@{ ~_~W~}~:>~^ ~_~}~:>~1I~:@_~@{~W~^ ~_~}~:>")
A few tests at the REPL should show that this works with nasty,
malformed inputs (“malformed” in the sense of not respecting the
semantics of bind) as well as expected ones:
(pprint '(bind))
(pprint '(bind x))
(pprint '(bind x y))
(pprint '(bind (x y) z))
(pprint '(bind ((x 1) (y 2)) z))
(pprint '(bind ((x 1) (y 2)) z w))
(pprint '(bind . 3))
(pprint '(bind x . 4))
(pprint '(bind (x . y) z))
(pprint '(bind ((x . 0) (y . 1)) z))
(pprint '(bind ((x) (y)) . z))
(pprint '(bind ((x) y) z . w))
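Eyeballing the output of those forms checks the layout; a complementary check is that whatever gets printed reads back as the same structure. The following is a minimal sketch of such a check, not part of the original test session. The names pprint-round-trips-p and failing-round-trips are made up for this example, and the check assumes that *package* and *readtable* are the same when printing and when reading.

```lisp
(defun pprint-round-trips-p (form)
  "True if FORM, pretty-printed to a string, reads back as an EQUAL object."
  (let* ((*print-pretty* t)
         (*print-readably* nil)
         (text (prin1-to-string form)))
    ;; Output that cannot be read at all counts as a failure too.
    (handler-case (equal form (values (read-from-string text)))
      (error () nil))))

(defun failing-round-trips (forms)
  "Return the subset of FORMS that do not survive a print/read round trip."
  (remove-if #'pprint-round-trips-p forms))
```

Because the helper uses whichever pprint dispatch table is current, it exercises pprint-bind if that has been installed, and the standard printer otherwise.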
Meanwhile, imagine a world where the backquote reader macro simply
wraps (quasiquote ...) around its argument, and comma likewise wraps
(unquote ...):
(set-macro-character #\` (defun read-backquote (stream char)
                           (declare (ignore char))
                           (list 'quasiquote (read stream t nil t))))
(set-macro-character #\, (defun read-comma (stream char)
                           (declare (ignore char))
                           (list 'unquote (read stream t nil t))))
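For experimenting with this imagined world without disturbing the global readtable, one option is to install the two reader macros in a private copy of the standard readtable. This is a sketch under that assumption; read-with-list-quasiquote is a name invented here, and like the definitions above it makes no attempt to handle ,@ or ,. specially.

```lisp
(defun read-backquote (stream char)
  (declare (ignore char))
  (list 'quasiquote (read stream t nil t)))

(defun read-comma (stream char)
  (declare (ignore char))
  (list 'unquote (read stream t nil t)))

(defun read-with-list-quasiquote (string)
  ;; COPY-READTABLE with NIL copies the standard readtable, so the
  ;; reader macros installed here vanish when the binding is undone.
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\` #'read-backquote)
    (set-macro-character #\, #'read-comma)
    (values (read-from-string string))))

;; (read-with-list-quasiquote "`(foo ,bar)")
;; is EQUAL to (quasiquote (foo (unquote bar)))
```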
Writing pretty-printer support for that is easy, right?
(defun pprint-quasiquote (stream object)
  (write-char #\` stream)
  (write (cadr object) :stream stream))
(defun pprint-unquote (stream object)
  (write-char #\, stream)
  (write (cadr object) :stream stream))
(set-pprint-dispatch '(cons (eql quasiquote) (cons t null)) 'pprint-quasiquote)
(set-pprint-dispatch '(cons (eql unquote) (cons t null)) 'pprint-unquote)
(ignoring for the moment what happens if the printed representation of
object happens to start with a @ or .)
(pprint '(quasiquote (x (unquote y))))
The problem arises when we try to combine these two things. In particular, what happens when we attempt to print backquoted forms:
(pprint '`(bind ,y z))
What we would hope to see is something like
`(bind ,y
   z)
but what we actually get is
`(bind (unquote
        y)
   z)
because each of the bindings in bind is printed individually, rather
than the bindings being printed as a whole. And, lest there be hopes
that this can be dealt with by a slightly different way of handling
the pretty printing in pprint-bind, note that it's important that
(pprint '(bind (function y) z))
print as
(bind (function
       y)
  z)
and not as
(bind #'y
  z)
so the only way to handle this is to know the magical symbols involved
in backquote and comma reader macros – but that is not portably
possible. So, we've come to the point where the conclusion is
inevitable: it is not possible for an implementation to support
list-structured quasiquote and unquote reader macros and general
pretty printing for user-defined operators. (This isn’t the only
failure mode for the combination of unquote-as-list-structure and
pretty-printing; it’s surprisingly easy to write pretty-printing
functions that fail to print accurately, not just cosmetically as
above but catastrophically, producing output that cannot be read back
in, or reads as a structurally unequal object to the original.)
The new implementation, by Douglas Katzman, preserves the
implementation of the backquote reader macro as a simple list, but
comma (and related reader macros) read as internal, literal
structures. Since these internal structures are atoms, not lists,
they are handled specially by pprint-logical-block and friends, and
so their own particular pretty-printing routines always fire. The
internal quasiquote macro ends up extracting and arranging for
appropriate evaluation and splicing of unquoted material, and
everything ends up working.
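The difference that atoms make can be sketched portably. The structure below, my-comma, is a made-up stand-in and not SBCL's actual internal representation; print-as-bindings mimics the inner block of a let-style pretty printer, and render-bindings is just a helper to capture the output. When pprint-logical-block is handed something that is not a list, it prints it with write, so the object's own printing method runs; when it is handed the list (unquote y), it walks the elements one by one.

```lisp
;; MY-COMMA is a hypothetical stand-in for an implementation-internal
;; comma object; it is an atom as far as the pretty printer is concerned.
(defstruct (my-comma (:constructor make-my-comma (expr)))
  expr)

(defmethod print-object ((object my-comma) stream)
  (write-char #\, stream)
  (write (my-comma-expr object) :stream stream))

(defun print-as-bindings (stream object)
  ;; Same shape as the inner block of a LET-style printer: treat OBJECT
  ;; as a list of bindings and put each one on its own line.
  (pprint-logical-block (stream object :prefix "(" :suffix ")")
    (pprint-exit-if-list-exhausted)
    (loop
      (write (pprint-pop) :stream stream)
      (pprint-exit-if-list-exhausted)
      (pprint-newline :mandatory stream))))

(defun render-bindings (object)
  (let ((*print-pretty* t))
    (with-output-to-string (stream)
      (print-as-bindings stream object))))

;; (render-bindings (make-my-comma 'y)) stays on one line, as a comma
;; followed by the symbol; (render-bindings '(unquote y)) is split
;; across two lines like any other list of "bindings".
```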
Everything? Well, not quite: one or two programmer libraries out
there implemented some utility functionality – typically variable
renaming, automatic lambda generation, or similar – without performing
a full macroexpansion and proper codewalk. That code was in general
already broken, but it is true that in the past generating an example
to demonstrate the breakage would have to violate the general
expectation of what “normal” Lisp code would look like, whereas as a
result of the new implementation of backquote in SBCL the symptoms of
breakage were much easier to generate. Several of these places were
fixed before the new implementation was activated, such as iterate's
#l macro; among the things to be dealt with after the new
implementation was released was the utility code from
let-over-lambda (a workaround has been installed in the version
distributed from GitHub), and there is still a little bit of fallout
being dealt with (e.g. a regression in the accuracy of
source-location tracking).
But overall, I think the new implementation of backquote has
substantial advantages in maintainability and correctness, and while
it’s always possible to convince maintainers that they’ve made a
mistake, I hope this post explains some of why the change was made.
Meanwhile, I've released SBCL version 1.2.3 – hopefully a much less “exciting” release...