1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
5 #+LaTeX_HEADER: \usepackage[margin=1in]{geometry}
8 1. This paper introduces a new metaobject, the generalizer, which
9 complements the existing specializer metaobject.
10 2. With the help of examples, we show that this metaobject allows for
11 the efficient implementation of complex non-class-based dispatch
12 within the framework of existing metaobject protocols
13 3. We present the generalizer protocol, implemented within the SBCL
14 implementation of Common Lisp
15 4. In combination with previous work, this produces a fully-functional
16 extension of the existing mechanism for method selection and
17 effective method computation, including support for standard and
18 user-defined method combination independent from method selection.
22 The revisions to the original Common Lisp language \cite{CLtL1}
23 included the detailed specification of an object system, known as
24 the Common Lisp Object System (CLOS), which was eventually
25 standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
26 The object system as presented to the standardization committee was
27 formed of three parts, the first two of which covered XXX [what?]
28 and were incorporated into the final standard, and the third,
29 covering a Metaobject Protocol (MOP) for CLOS, was not.
31 Nevertheless, the CLOS MOP has proven to be a robust design, and
32 while many implementations have derived their implementations of
33 CLOS from either the Closette illustrative implementation in
34 \cite{AMOP}, or the Portable Common Loops implementation of CLOS
35 from Xerox Parc, there have been from-scratch reimplementations of
36 CLOS (in at least CLISP; check for others -- ABCL? Lisp500?!)
37 incorporating the majority of the Metaobject Protocol as described.
39 Although it has stood the test of time, the MOP is neither without
40 issues (e.g. M-M-L considered harmful; slot-definition initargs
41 issue) nor a complete framework for the metaprogrammer to implement
42 all conceivable variations of object-oriented behaviour; indeed,
43 while metaprogramming offers some possibilities for customization of
44 the object system behaviour, those possibilities cannot extend
45 arbitrarily in all directions. There is still an expectation that
46 functionality is implemented with methods on generic functions,
47 acting on objects with slots. [XXX find Paepke picture here? Not
48 Paepke; AMOP?]. XXX include typical examples of MOP: object
49 persistence; maybe ref. Kizcales "MOPs: why we want them and what
50 else they can do"? (Fig. 2 in that is good) ORMs; sparse slots.
52 + introspection, e.g. documentation generation
53 + programmatic construction of classes and generic functions
54 e.g. for IDL compilers, model transformations
56 One area of functionality where there is scope for customization by
57 the metaprogrammer is in the mechanics and semantics of method
58 applicability and dispatch. While in principle AMOP allows
59 customization of dispatch in various different ways (the
60 metaprogrammer can define methods on protocol functions such as
61 =compute-applicable-methods=,
62 =compute-applicable-methods-using-classes=), for example, in
63 practice implementation support for this was weak until relatively
64 recently (ref. closer, also check how ContextL and filtered dispatch
66 jmoringe: filtered dispatch uses a custom method combination, i
69 Another potential mechanism for customizing dispatch is implicit in
70 the class structure defined by AMOP: standard specializer objects
71 (instances of =class= and =eql-specializer=) are generalized
72 instances of the =specializer= protocol class, and in principle
73 there are no restrictions on the metaprogrammer constructing
74 additional subclasses. Previous work [Newton/Rhodes] has explored
75 the potential for customizing generic function dispatch using
76 extended specializers, but as of that work the metaprogrammer must
77 override the entirety of the generic function invocation protocol
78 (from =compute-discriminating-function= on down), leading to toy
79 implementations and duplicated effort.
81 This paper introduces a protocol for efficient and controlled
82 handling of arbitrary subclasses of =specializer=. In particular,
83 it introduces the =generalizer= protocol class, which generalizes
84 (ahem) the return value of =class-of=, and allows the metaprogrammer
85 to hook into cacheing schemes to avoid needless recomputation of
86 effective methods for sufficiently similar generic function
87 arguments (See Figure\nbsp\ref{fig:dispatch}).
89 #+CAPTION: Dispatch Comparison
91 #+ATTR_LATEX: width=0.9\linewidth float
92 [[file:figures/dispatch-comparison.pdf]]
94 The remaining sections in this paper can be read in any order. We
95 give some motivating examples in section XX, including
96 reimplementations of examples from previous work, as well as
97 examples which are poorly supported by previous protocols. We
98 describe the protocol itself in section YY, describing each protocol
99 function in detail and, where applicable, relating it to existing
100 protocol functions within the CLOS MOP. We survey related work in
101 more detail in section ZZ, touching on work on customized dispatch
102 schemes in other environments. Finally, we draw our conclusions
103 from this work, and indicate directions for further development, in
104 section WW; reading that section before the others indicates
105 substantial trust in the authors' work.
107 In this section, we present a number of examples of dispatch
108 implemented using our protocol, which we describe in section YY.
109 For reasons of space, the metaprogram code examples in this section
110 do not include some of the necessary support code to run; complete
111 implementations of each of these cases are included in an appendix /
112 in the accompanying repository snapshot / at this location.
114 A note on terminology: we will attempt to distinguish between the
115 user of an individual case of generalized dispatch (the
116 “programmer”), the implementor of a particular case of generalized
117 dispatch (the “metaprogrammer”), and the authors as the designers
118 and implementors of our generalized dispatch protocol (the
119 “metametaprogammer”, or more likely ”we”).
121 - [ ] =cons-specializer= (can be done using filtered dispatch)
122 - [ ] factorial (like filtered dispatch)
123 - [ ] HTTP Accept header
125 - [ ] prototype/multimethod
127 We start by presenting our original use case, performing
128 dispatching on the first element of lists. Semantically, we allow
129 the programmer to specialize any argument of methods with a new
130 kind of specializer, =cons-specializer=, which is applicable if and
131 only if the corresponding object is a =cons= whose =car= is =eql=
132 to the symbol associated with the =cons-specializer=; these
133 specializers are more specific than the =cons= class, but less
134 specific than an =eql-specializer= on any given =cons=.
136 One motivation for the use of this specializer is in an extensible
137 code walker: a new special form can be handled simply by writing an
138 additional method on the walking generic function, seamlessly
139 interoperating with all existing methods.
141 The programmer code using these specializers is unchanged from
142 \cite{Newton.Rhodes.2008}; the benefits of the protocol described
143 here are centered on performance: in an application such as walking
144 source code, we would expect to encounter special forms
145 (distinguished by particular atoms in the =car= position) multiple
146 times, and hence to dispatch to the same effective method
149 (defclass cons-specializer (specializer)
150 ((%car :reader %car :initarg :car)))
151 (defclass cons-generalizer (generalizer)
152 ((%car :reader %car :initarg :car)))
153 (defmethod generalizer-of-using-class ((gf cons-generic-function) arg)
155 ((cons symbol) (make-instance 'cons-generalizer :car (car arg)))
156 (t (call-next-method))))
157 (defmethod generalizer-equal-hash-key ((gf cons-generic-function)
158 (g cons-generalizer))
160 (defmethod specializer-accepts-generalizer-p ((gf cons-generic-function)
162 (g cons-generalizer))
163 (if (eql (%car s) (%car g))
166 (defmethod specializer-accepts-p ((s cons-specializer) o)
167 (and (consp o) (eql (car o) (%car s))))
169 #| less interesting methods elided: jmoringe: (un)parsing, specializer<?, more? |#
172 (defgeneric walk (form env vars)
173 (:generic-function-class cons-generic-function))
174 (defmethod walk ((expr (cons lambda)) env call-stack)
175 (let ((lambda-list (cadr expr))
177 (with-checked-bindings ((bindings-from-ll lambda-list) env call-stack)
179 (walk form env (cons form call-stack))))))
180 (defmethod walk ((expr (cons let)) env call-stack)
181 (with-checked-bindings ((mapcar (lambda (x) (walk (cadr x) env (cons (cadr x) call-stack)) (cons (car x) (make-instance 'binding))) (cadr expr)) env call-stack)
182 (dolist (form (cddr expr))
183 (walk form env (cons form call-stack)))))
186 | implementation | time (ms / 100k calls) | overhead |
187 |-----------------------+------------------------+----------|
188 | cons-gf/no-cache | 9000 | +2700% |
189 | cons-gf | 1500 | +370% |
190 | cons-gf/one-arg-cache | 740 | +130% |
191 | gf/methods | 360 | +14% |
194 Note that in this example there is no strict need for
195 =cons-specializer= and =cons-generalizer= to be distinct classes –
196 just as in the normal protocol involving
197 =compute-applicable-methods= and
198 =compute-applicable-methods-using-classes=, the specializer object
199 for mediating dispatch contains the same information as the object
200 representing the equivalence class of objects to which that
201 specializer is applicable: here it is the =car= of the =cons=
202 object; in the standard dispatch it is the =class= of the object.
203 This feature also characterizes those use cases where the
204 metaprogrammer could straightforwardly use filtered dispatch
205 \cite{Costanza.etal:2008} to implement their dispatch semantics.
206 We will see in section XX.x an example of a case where filtered
207 dispatch is incapable of efficiently implementing the dispatch, but
208 first we present our implementation of the motivating case from
209 \cite{Costanza.etal:2008}.
211 Our second example of the implementation and use of generalized
212 specializers is a reimplementation of one of the examples in
213 \cite{Costanza.etal:2008}: specifically, the factorial function.
214 Here, we will perform dispatch based on the =signum= of the
215 argument, and again, at most one method with a =signum= specializer
216 will be appliable to any given argument, which makes the structure
217 of the specializer implementation very similar to the =cons=
218 specializers in the previous section.
220 We have chosen to compare signum values using \texttt{=}, which
221 means that a method with specializer =(signum 1)= will be
222 applicable to positive floating-point arguments (see the first
223 method on =specializer-accepts-generalizer-p= and the method on
224 =specializer=accepts-p= below). This leads to one subtle
225 difference in behaviour compared to that of the =cons=
226 specializers: in the case of =signum= specializers, the /next/
227 method after any =signum= specializer can be different, depending
228 on the class of the argument. This aspect of the dispatch is
229 handled by the second method on =specializer-accepts-generalizer-p=
232 (defclass signum-specializer (specializer)
233 ((%signum :reader %signum :initarg :signum)))
234 (defclass signum-generalizer (generalizer)
235 ((%signum :reader %signum :initarg :signum)))
236 (defmethod generalizer-of-using-class ((gf signum-generic-function) arg)
238 (real (make-instance 'signum-generalizer :signum (signum arg)))
239 (t (call-next-method))))
240 (defmethod generalizer-equal-hash-key ((gf signum-generic-function)
241 (g signum-specializer))
242 (%signum g)) ; this will create multiple entries for the same emf, but that's OK
243 (defmethod specializer-accepts-generalizer-p ((gf signum-generic-function)
244 (s signum-specializer)
245 (g signum-generalizer))
246 (if (= (%signum s) (%signum g)) ; or EQL?
250 ;; this method is perhaps interesting enough to talk about?
251 (defmethod specializer-accepts-generalizer-p ((gf signum-generic-function) (specializer sb-mop:specializer) (thing signum-specializer))
252 (specializer-accepts-generalizer-p gf specializer (class-of (%signum thing))))
255 (defmethod specializer-accepts-p ((s signum-specializer) o)
256 (and (realp o) (= (%signum s) (signum o))))
258 #| again elide more boring methods |#
261 Given these definitions, and some more straightforward ones elided
262 for reasons of space, we can implement the factorial function as
267 (:generic-function-class signum-generic-function))
268 (defmethod fact ((n (signum 0))) 1)
269 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
272 We do not need to include a method on =(signum -1)=, as the
273 standard =no-applicable-method= protocol will automatically apply to
274 negative real or non-real arguments.
276 Benchmarketing: we chose to benchmark 20! because that is the
277 largest factorial whose answer fits in SBCL's 63-bit fixnums, so as
278 to attempt to measure the maximum effect of dispatch (unobscured by
279 allocation / gc issues)
282 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
285 | implementation | time (ms/10k calls) | overhead |
286 |-------------------------+---------------------+----------|
287 | signum-gf/no-cache | 2400 | +41000% |
288 | signum-gf | 230 | +3800% |
289 | signum-gf/one-arg-cache | 75 | +1100% |
290 | gf/fixnum | 12 | +100% |
293 We could allow the metaprogrammer to improve on the one-argument
294 performance by constructing a specialized cache: for =signum=
295 arguments of =rational= arguments, the logical cache structure is
296 to index a three-element vector with =(1+ signum)=. The current
297 protocol does not provide a way of eliding the two generic function
298 calls for the generic cache; we discuss possible approaches in
300 ** HTTP Accept header
301 In this section, we implement a non-trivial form of dispatch. The
302 application in question is a web server, and specifically to allow
303 the programmer to support RFC 2616 \cite{rfc2616} content
304 negotiation, of particular interest to publishers and consumers of
307 The basic mechanism in content negotiation is as follows: the web
308 client sends an HTTP request with an =Accept:= header, which is a
309 string describing the media types it is willing to receive as a
310 response to the request, along with numerical preferences. The web
311 server compares these stated client preferences with the resources
312 it has available to satisfy this request, and sends the best
313 matching resource in its response.
315 In the case where there are static files on the filesystem, and the
316 web server must merely select between them, there is not much more
317 to say. However, it is not unusual for a web service to be backed
318 by some other form of data, and responses computed and sent on the
319 fly, and in these circumstances the web server must compute which
320 of its known output formats it can use to satisfy the request
321 before actually generating the best matching response.
323 The =accept-specializer= below implements the dispatch. It depends
324 on a lazily-computed =tree= slot to represent the information in
325 the accept header (generated by =parse-accept-string=), and a
326 function =q= to compute the (defaulted) preference level for a
327 given content-type and =tree=; then, method selection and ordering
328 involves finding the =q= for each =accept-specializer='s content
329 type given the =tree=, and sorting them according to the preference
333 (defclass accept-specializer (extended-specializer)
334 ((media-type :initarg :media-type :reader media-type)))
335 (defclass accept-generalizer ()
336 ((header :initarg :header :reader header)
338 (next :initarg :next :reader next)))
339 (defmethod generalizer-equal-hash-key
340 ((gf accept-generic-function) (g accept-generalizer))
341 `(accept-generalizer ,(header g)))
342 (defmethod specializer-accepts-generalizer-p ((gf accept-generic-function) (s acc
343 ept-specializer) (generalizer accept-generalizer))
344 (values (q (media-type s) (tree generalizer)) t))
345 (defmethod specializer-accepts-generalizer-p ((gf accept-generic-function) (s sb-
346 mop:specializer) (generalizer accept-generalizer))
347 (specializer-accepts-generalizer-p gf s (next generalizer)))
349 (defmethod specializer< ((gf accept-generic-function) (s1 accept-specializer) (s2
350 accept-specializer) generalizer)
352 ((string= (media-type s1) (media-type s2)) '=)
353 (t (let ((q1 (q (media-type s1) (tree generalizer)))
354 (q2 (q (media-type s2) (tree generalizer))))
362 (defmethod generalizer-of-using-class ((gf accept-generic-function) (arg tbnl:request))
363 (make-instance 'accept-generalizer
364 :header (tbnl:header-in :accept arg)
365 :next (class-of arg)))
366 (defmethod specializer-accepts-p ((specializer accept-specializer) (obj tbnl:requ
368 (q (media-type specializer) (parse-accept-string (tbnl:header-in :accept obj)))
372 This dispatch can't be done with filtered dispatch, except by
373 generating anonymous classes with all the right mime-types as
374 direct superclasses in dispatch order; the filter would generate
376 (ensure-class nil :direct-superclasses '(text/html image/webp ...))
378 and dispatch the operates using those anonymous classes.
380 While this is possible to do, it is awkward to express content-type
381 negotiation in this way, as it means that the dispatcher must know
382 about the universe of mime-types that clients might declare that
383 they accept, rather than merely the set of mime-types that a
384 particular generic function is capable of serving; handling
385 wildcards in accept strings is particularly awkward in the
388 Note that in this example, the method on =specializer<= involves a
389 nontrivial ordering of methods based on the =q= values specified in
390 the accept header (whereas in sections XX.1 and XX.2 only a single
391 extended specializer could be applicable to any given argument).
393 Also note that the accept specializer protocol is straightforwardly
394 extensible to other suitable objects; for example, one simple
395 debugging aid is to define that an =accept-specializer= should be
396 applicable to =string= objects. This can be done in a modular
397 fashion (see source example NN), and generalizes to dealing with
398 multiple web server libraries, so that content-negotiation methods
399 are applicable to each web server's request objects.
402 (defmethod generalizer-of-using-class ((gf accept-generic-function) (s string))
403 (make-instance 'accept-generalizer
406 (defmethod specializer-accepts-p ((s accept-specializer) (string string))
407 (q (media-type s) (parse-accept-string string)))
409 jmoringe: the name =accept-specializer=, while sensible, may
410 confusing in this context because "accept" occurs as part of the
411 protocol with a different semantic.
412 ** Pattern / xpattern / regex / optima
413 Here's the /really/ interesting bit, but on the other hand we're
414 probably going to run out of space, and the full description of
415 these is going to take us into =make-method-lambda= territory.
416 A second paper? Future work?
419 - [ ] =generalizer-of-using-class= (NB class of gf not class of object)
420 - [ ] =compute-applicable-methods-using-generalizers=
421 - [ ] =generalizer-equal-hash-key=
422 - [ ] =specializer-accepts-generalizer-p=
423 - [ ] =specializer-accepts-p=
425 jmoringe: If I remember correctly, closette has
426 =method-more-specific-p= should we aim for parity with that and
427 use =specializer-more-specific-p=? The downside would be that
428 =-p= indicates a Boolean return value which is not the case here.
430 Description and specification left for reasons of space (we'll see?)
431 - [ ] =same-specializer-p=
432 - [ ] =parse/unparse-specializer-using-class=
433 - [ ] =make-method-specializers-form=
434 - [ ] jmoringe: In an email, I suggested
435 =make-specializer-form-using-class=:
438 Could we change =make-method-specializers-form='s default
439 behaviour to call a new generic function
441 make-specializer-form-using-class gf method name env
443 with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
444 eql-specializers)? This would make it unnecessary to repeat
445 boilerplate along the lines of
447 (flet ((make-parse-form (name)
448 (if <name-is-interesting>
449 <handle-interesting-specializer>
450 <repeat-handling-of-standard-specializers>)))
451 `(list ,@(mapcar #'make-parse-form specializer-names)))
453 for each generic function class.
455 - [ ] =make-method-lambda= revision (use environment arg?)
457 jmoringe: would only be relevant for pattern dispatch, right? I
458 think, we didn't finish the discussion regarding special
459 variables vs. environment vs. new protocol function
461 - [ ] Newton/Rhodes, obv
462 - [ ] filtered dispatch -- the point is that our work continues to
463 be useful in cases where there are unbounded numbers of
464 equivalence classes but each given invokation involves a small
466 - [ ] ContextL / context-oriented programming -- dispatch occurs on
467 hidden layer argument being an instance of an anonymous class with
468 suitably arranged superclasses -- OK because set of layers is
469 bounded and under programmer control
470 - [ ] http://soft.vub.ac.be/Publications/2010/vub-tr-soft-10-04.pdf
471 - [ ] http://soft.vub.ac.be/lambic/files/lambic-ilc09.pdf
472 - [ ] http://soft.vub.ac.be/Publications/2011/vub-soft-phd-11-03.pdf
473 - [ ] Prototypes with Multiple Dispatch
474 http://sauerbraten.org/lee/ecoop.pdf -- extension of Self-style
475 object system to handle multiple equally-privileged "receivers".
476 A good test case for our protocol; handled adequately with
477 generalizer being the tuple of (roles,delegations), with some
478 thought needed for method redefinitions but otherwise working
481 - [ ] Multiple dispatch in Clojure
482 http://clojure.org/multimethods -- seems to allow hierarchy-based,
483 eql and the equivalent of filtered dispatch
485 - protocol for straightforward definition of custom dispatch
486 + interoperates seamlessly with rest of CLOS: method combination,
488 + tolerably efficient: two extra standard gf invokations and one
489 hash table lookup per call on the fast path (but more to be
491 + expressive: handles foms of dispatch not handled elsewhere; all
492 the usual niceties of redefinition, modularity, introspection
494 - compute-cache-handling-functions (and general speed issues)
495 - automatic pattern-specializer generalizer computation
496 - prototype-oriented progamming a la Slate.
498 We thank Lee Salzman, Pascal Costanza, Mikel Evins for their