holiday hacking swankr

I'm on holiday! And holidays, as seems to be my usual, involve trains!

I was on a train yesterday, wondering how I could possibly fill two hours of leisure time (it's more straightforward these days than a few years ago, when my fellow travellers were less able to occupy their leisure time with books), when help came from a thoroughly unexpected quarter: I got a bug report for swankr.

I wrote swankr nine years ago, mostly at ISMIR2010. I was about to start the Teclo phase of my ongoing adventures, and an academic colleague had recommended that I learn R; and the moment where it all fell into place was when I discovered that R supported handlers and restarts: enough that writing support for slime's SLDB was a lower-effort endeavour than learning R's default debugger. I implemented support for the bits of slime that I use, and also for displaying images in the REPL -- so that I can present lattice objects as thumbnails of the rendered graph, and have the canned demo of adding two of them together to get a combined graph: generates “oo” sounds at every outing, guaranteed! (Now that I mostly use ggplot2, I need to find a new demo: incrementally building and styling a plot by adding theme, scale, legend objects and so on is actually useful but lacks a wow factor.)

SLIME works by exchanging messages between the emacs front-end and the backend lisp; code on the backend lisp is responsible for parsing the messages, executing whatever is needed to support the request, and sending a response. The messages are, broadly, sexp-based, and the message parsing and executing in the backends is mostly portable Common Lisp, delegating to implementation-specific code for specific implementation-specific support needed.

“Mostly portable Common Lisp” doesn't mean that it'll run in R. The R code is necessarily completely separate, implementing just enough of a Lisp reader to parse the messages. This works fine, because the messages themselves are simple; when the front-end sends a message asking for evaluation for the listener, say, the message is in the form of a sexp, but the form to evaluate is a string of the user-provided text: so as long as we have enough of a sexp reader/evaluator to deal with the SLIME protocol messages, the rest will be handled by the backend's eval function.

... except in some cases. When slime sends a form to be evaluated which contains an embedded presentation object, the presentation is replaced by #.(swank:lookup-presented-object-or-lose 57.) in the string sent for evaluation. This works fine for Common Lisp backends – at least provided *read-eval* is true – but doesn't work elsewhere. One regexp allows us to preprocess the string to evaluate to rewrite this into something else, but what? Enter cunning plan number 1: (ab)use backquote.

Backquote? Yes, R has backquote bquote. It also has moderately crazy function call semantics, so it's possible to: rewrite the string ob be evaluated to contain unquotations; parse the string into a form; funcall the bquote function on that form (effectively performing the unquotations, simulating the read-time evaluation), and then eval the result. And this would be marvellous except that Philipp Marek discovered that his own bquote-using code didn't work. Some investigation later, and the root cause became apparent:

CL-USER> (progn (set 'a 3) (eval (read-from-string "``,a")))
`,a

compare

R> a <- 3; bquote(bquote(.(a)))
bquote(3)

NO NESTED BACKQUOTES?

So, scratch the do.call("bquote", ...) plan. What else is available to us? It turns out that there's a substitute function, which for substitute(expr, env)

returns the parse tree for the (unevaluated) expression ‘expr’, substituting any variables bound in ‘env’.

Great! Because that means we can use the regular expression to rewrite the presentation lookup function calls to be specific variable references, then substitute these variable references from the id-to-object environment that we have anyway, and Bob is your metaphorical uncle. So I did that.

The situation is not perfect, unfortunately. Here's some more of the documentation for substitute:

‘substitute’ works on a purely lexical basis. There is no guarantee that the resulting expression makes any sense.

... and here's some more from do.call:

The behavior of some functions, such as ‘substitute’, will not be the same for functions evaluated using ‘do.call’ as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.

Well that's not ominous at all.