#+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
#+OPTIONS: toc:nil
-#+LaTeX_HEADER: \usepackage[margin=1in]{geometry}
+#+LaTeX_CLASS: acm_proc_article-sp
+#+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
+
+#+begin_src elisp :exports none
+(add-to-list 'org-latex-classes
+ '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
+ ("\\section{%s}" . "\\section*{%s}")
+ ("\\subsection{%s}" . "\\subsection*{%s}")
+ ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
+ ("\\paragraph{%s}" . "\\paragraph*{%s}")
+ ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
+#+end_src
#+begin_abstract
1. This paper introduces a new metaobject, the generalizer, which
#+end_abstract
* Introduction
- The revisions to the original Common Lisp language \cite{CLtL1}
+ The revisions to the original Common Lisp language \cite{CLtL}
included the detailed specification of an object system, known as
the Common Lisp Object System (CLOS), which was eventually
standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
The object system as presented to the standardization committee was
- formed of three parts, the first two of which covered XXX [what?]
- and were incorporated into the final standard, and the third,
- covering a Metaobject Protocol (MOP) for CLOS, was not.
+ formed of three chapters. The first two chapters covered programmer
+ interface concepts and the functions in the programmer interface
+ \cite[Chapter 28]{CLtL2} and were largely incorporated into the
+ final standard; the third chapter, covering a Metaobject Protocol
+ (MOP) for CLOS, was not.
Nevertheless, the CLOS MOP has proven to be a robust design, and
while many implementations have derived their implementations of
CLOS from either the Closette illustrative implementation in
\cite{AMOP}, or the Portable Common Loops implementation of CLOS
from Xerox Parc, there have been from-scratch reimplementations of
- CLOS (in at least CLISP; check for others -- ABCL? Lisp500?!)
- incorporating the majority of the Metaobject Protocol as described.
-
- Although it has stood the test of time, the MOP is neither without issues
- (e.g. M-M-L considered harmful; slot-definition initargs issue) nor
- a complete framework for the metaprogrammer to implement all
- conceivable variations of object-oriented behaviour; indeed, while
- metaprogramming offers some possibilities for customization of the
- object system behaviour, those possibilities cannot extend
- arbitrarily in all directions. There is still an expectation that
+ CLOS (in at least CLISP; check for others -- Lisp500? CCL?)
+ incorporating substantial fractions of the Metaobject Protocol as
+ described.
+
+ Although it has stood the test of time, the CLOS MOP is neither
+ without issues (e.g. semantic problems with =make-method-lambda=
+ \cite{Costanza.Herzeel:2008}; useful functions such as
+ =compute-effective-slot-definition-initargs= being missing from the
+ standard) nor is it a complete framework for the metaprogrammer to
+ implement all conceivable variations of object-oriented behaviour.
+ While metaprogramming offers some possibilities for customization of
+ the object system behaviour, those possibilities cannot extend
+ arbitrarily in all directions. There is still an expectation that
functionality is implemented with methods on generic functions,
- acting on objects with slots. [XXX find Paepke picture here? Not
- Paepke; AMOP?]. XXX include typical examples of MOP: object
- persistence; maybe ref. Kizcales "MOPs: why we want them and what
- else they can do"? (Fig. 2 in that is good) ORMs; sparse slots.
- jmoringe:
- + introspection, e.g. documentation generation
- + programmatic construction of classes and generic functions
- e.g. for IDL compilers, model transformations
+ acting on objects with slots. Nevertheless, the MOP is flexible,
+ and is used for a number of things, including: documentation
+ generation (where introspective functionality in the MOP is used to
+ extract information from a running system); object-relational
+ mapping and other approaches to object persistence; alternative
+ backing stores for slots (hash-tables or symbols); and programmatic
+ construction of metaobjects, for example for IDL compilers and model
+ transformations.
+
+ [ A picture on MOP flexibility here would be good; I have in my mind
+ one where an object system is a point and the MOP opens up a blob
+ around that point, and I'm sure I've seen it somewhere but I can't
+ remember where. Alternatively, there's Kiczales et al "MOPs: why we
+ want them and what else they can do", fig. 2 ]
One area of functionality where there is scope for customization by
the metaprogrammer is in the mechanics and semantics of method
=compute-applicable-methods=,
=compute-applicable-methods-using-classes=), for example, in
practice implementation support for this was weak until relatively
- recently (ref. closer, also check how ContextL and filtered dispatch
- are implemented).
- jmoringe: filtered dispatch uses a custom method combination, i
- think
+ recently[fn:1].
Another potential mechanism for customizing dispatch is implicit in
the class structure defined by AMOP: standard specializer objects
(instances of =class= and =eql-specializer=) are generalized
instances of the =specializer= protocol class, and in principle
there are no restrictions on the metaprogrammer constructing
- additional subclasses. Previous work [Newton/Rhodes] has explored
- the potential for customizing generic function dispatch using
- extended specializers, but as of that work the metaprogrammer must
- override the entirety of the generic function invocation protocol
- (from =compute-discriminating-function= on down), leading to toy
- implementations and duplicated effort.
+ additional subclasses. Previous work \cite{Newton.Rhodes:2008} has
+ explored the potential for customizing generic function dispatch
+ using extended specializers, but as of that work the metaprogrammer
+ must override the entirety of the generic function invocation
+ protocol (from =compute-discriminating-function= on down), leading
+ to toy implementations and duplicated effort.
This paper introduces a protocol for efficient and controlled
- handling of arbitrary subclasses of =specializer=. In particular,
- it introduces the =generalizer= protocol class, which generalizes
- (ahem) the return value of =class-of=, and allows the metaprogrammer
- to hook into cacheing schemes to avoid needless recomputation of
- effective methods for sufficiently similar generic function
- arguments (See Figure\nbsp\ref{fig:dispatch}).
+ handling of new subclasses of =specializer=. In particular, it
+ introduces the =generalizer= protocol class, which generalizes the
+ return value of =class-of= in method applicability computation, and
+ allows the metaprogrammer to hook into cacheing schemes to avoid
+ needless recomputation of effective methods for sufficiently similar
+ generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
#+CAPTION: Dispatch Comparison
#+LABEL: fig:dispatch
[[file:figures/dispatch-comparison.pdf]]
The remaining sections in this paper can be read in any order. We
- give some motivating examples in section XX, including
+ give some motivating examples in section [[#Examples]], including
reimplementations of examples from previous work, as well as
examples which are poorly supported by previous protocols. We
- describe the protocol itself in section YY, describing each protocol
- function in detail and, where applicable, relating it to existing
- protocol functions within the CLOS MOP. We survey related work in
- more detail in section ZZ, touching on work on customized dispatch
- schemes in other environments. Finally, we draw our conclusions
- from this work, and indicate directions for further development, in
- section WW; reading that section before the others indicates
- substantial trust in the authors' work.
+ describe the protocol itself in section [[#Protocol]], describing each
+ protocol function in detail and, where applicable, relating it to
+ existing protocol functions within the CLOS MOP. We survey related
+ work in more detail in section [[#Related Work]], touching on work on
+ customized dispatch schemes in other environments. Finally, we draw
+ our conclusions from this work, and indicate directions for further
+ development, in section [[#Conclusions]]; reading that section before the
+ others indicates substantial trust in the authors' work.
* Examples
- - [ ] =cons-specializer= (can be done using filtered dispatch)
- - [ ] factorial (like filtered dispatch)
- - [ ] HTTP Accept header
- - [ ] xpattern
- - [ ] prototype/multimethod
-** car-of-cons
- NB this example can be done using filtered dispatch, with a filter
- calling =car= on cons arguments.
-
- Note also that there's no real need for =cons-specializer= and
- =cons-generalizer= to be distinct classes (as with =class= and
- =class=). Also true for =signum=, below; but more interesting
- dispatch reveals the need to split.
+ :PROPERTIES:
+ :CUSTOM_ID: Examples
+ :END:
+ In this section, we present a number of examples of dispatch
+ implemented using our protocol, which we describe in section
+ [[#Protocol]]. For reasons of space, the metaprogram code examples in
+ this section do not include some of the necessary support code to
+ run; complete implementations of each of these cases are included in
+ an appendix / in the accompanying repository snapshot / at this
+ location.
+
+ A note on terminology: we will attempt to distinguish between the
+ user of an individual case of generalized dispatch (the
+ “programmer”), the implementor of a particular case of generalized
+ dispatch (the “metaprogrammer”), and the authors as the designers
+ and implementors of our generalized dispatch protocol (the
+ “metametaprogammer”, or more likely “we”).
+** CONS specializers
+ :PROPERTIES:
+ :CUSTOM_ID: Cons
+ :END:
+ We start by presenting our original use case, performing
+ dispatching on the first element of lists. Semantically, we allow
+ the programmer to specialize any argument of methods with a new
+ kind of specializer, =cons-specializer=, which is applicable if and
+ only if the corresponding object is a =cons= whose =car= is =eql=
+ to the symbol associated with the =cons-specializer=; these
+ specializers are more specific than the =cons= class, but less
+ specific than an =eql-specializer= on any given =cons=.
+
+ One motivation for the use of this specializer is in an extensible
+ code walker: a new special form can be handled simply by writing an
+ additional method on the walking generic function, seamlessly
+ interoperating with all existing methods.
+
+ The programmer code using these specializers is unchanged from
+ \cite{Newton.Rhodes:2008}; the benefits of the protocol described
+ here are centered on performance and generality: in an application
+ such as walking source code, we would expect to encounter special
+ forms (distinguished by particular atoms in the =car= position)
+ multiple times, and hence to dispatch to the same effective method
+ repeatedly. We discuss this in more detail in section [[#Memoization]];
+ we present the metaprogrammer code below.
+
#+begin_src lisp
(defclass cons-specializer (specializer)
((%car :reader %car :initarg :car)))
(defclass cons-generalizer (generalizer)
((%car :reader %car :initarg :car)))
-(defmethod generalizer-of-using-class ((gf cons-generic-function) arg)
+(defmethod generalizer-of-using-class
+ ((gf cons-generic-function) arg)
(typecase arg
- ((cons symbol) (make-instance 'cons-generalizer :car (car arg)))
+ ((cons symbol)
+ (make-instance 'cons-generalizer
+ :car (car arg)))
(t (call-next-method))))
-(defmethod generalizer-equal-hash-key ((gf cons-generic-function)
- (g cons-generalizer))
+(defmethod generalizer-equal-hash-key
+ ((gf cons-generic-function)
+ (g cons-generalizer))
(%car g))
-(defmethod specializer-accepts-generalizer-p ((gf cons-generic-function)
- (s cons-specializer)
- (g cons-generalizer))
+(defmethod specializer-accepts-generalizer-p
+ ((gf cons-generic-function)
+ (s cons-specializer)
+ (g cons-generalizer))
(if (eql (%car s) (%car g))
(values t t)
(values nil t)))
-(defmethod specializer-accepts-p ((s cons-specializer) o)
+(defmethod specializer-accepts-p
+ ((s cons-specializer) o)
(and (consp o) (eql (car o) (%car s))))
+#+end_src
-#| less interesting methods elided: jmoringe: (un)parsing, specializer<?, more? |#
-
-#| XXX insert motivating example from Newton/Rhodes here |#
+The code above shows the core of the use of our protocol. We have
+elided some support code for parsing and unparsing specializers, and
+for handling introspective functions such as finding generic functions
+for a given specializer. We have also elided methods on the protocol
+function =specializer<=; for =cons-specializers= here, specializer
+ordering is trivial, as only one =cons-specializer= can ever be
+applicable to any given argument. See section [[#Accept]] for a case
+where specializer ordering is substantially different.
+
+As in \cite{Newton.Rhodes:2008}, we can use these specializers to
+implement a modular code walker, where we define one method per
+special operator. We show two of those methods below, in the context
+of a walker which checks for unused bindings and uses of unbound
+variables.
+
+#+begin_src
+(defgeneric walk (form env stack)
+ (:generic-function-class cons-generic-function))
+(defmethod walk ((expr (cons lambda)) env call-stack)
+ (let ((lambda-list (cadr expr))
+ (body (cddr expr)))
+ (with-checked-bindings
+ ((bindings-from-ll lambda-list) env call-stack)
+ (dolist (form body)
+ (walk form env (cons form call-stack))))))
+(defmethod walk ((expr (cons let)) env call-stack)
+ (flet ((let-binding (x)
+ (walk (cadr x) env (cons (cadr x) call-stack))
+ (cons (car x) (make-instance 'binding))))
+ (with-checked-bindings
+ ((mapcar #'let-binding (cadr expr)) env call-stack)
+ (dolist (form (cddr expr))
+ (walk form env (cons form call-stack))))))
#+end_src
-** signum
+
+ Note that in this example there is no strict need for
+ =cons-specializer= and =cons-generalizer= to be distinct classes –
+ just as in the normal protocol involving
+ =compute-applicable-methods= and
+ =compute-applicable-methods-using-classes=, the specializer object
+ for mediating dispatch contains the same information as the object
+ representing the equivalence class of objects to which that
+ specializer is applicable: here it is the =car= of the =cons=
+ (which we wrap in a distinct object); in the standard dispatch it
+ is the =class= of the object. This feature also characterizes
+ those use cases where the metaprogrammer could straightforwardly
+ use filtered dispatch \cite{Costanza.etal:2008} to implement their
+ dispatch semantics. We will see in section [[#Accept]] an example
+ of a case where filtered dispatch is incapable of straightforwardly
+ expressing the dispatch, but first we present our implementation of
+ the motivating case from \cite{Costanza.etal:2008}.
+** SIGNUM specializers
+ :PROPERTIES:
+ :CUSTOM_ID: Signum
+ :END:
Our second example of the implementation and use of generalized
specializers is a reimplementation of one of the examples in
\cite{Costanza.etal:2008}: specifically, the factorial function.
((%signum :reader %signum :initarg :signum)))
(defclass signum-generalizer (generalizer)
((%signum :reader %signum :initarg :signum)))
-(defmethod generalizer-of-using-class ((gf signum-generic-function) arg)
+(defmethod generalizer-of-using-class
+ ((gf signum-generic-function) arg)
(typecase arg
- (real (make-instance 'signum-generalizer :signum (signum arg)))
+ (real (make-instance 'signum-generalizer
+ :signum (signum arg)))
(t (call-next-method))))
-(defmethod generalizer-equal-hash-key ((gf signum-generic-function)
- (g signum-specializer))
- (%signum g)) ; this will create multiple entries for the same emf, but that's OK
-(defmethod specializer-accepts-generalizer-p ((gf signum-generic-function)
- (s signum-specializer)
- (g signum-generalizer))
+(defmethod generalizer-equal-hash-key
+ ((gf signum-generic-function)
+ (g signum-specializer))
+ (%signum g))
+(defmethod specializer-accepts-generalizer-p
+ ((gf signum-generic-function)
+ (s signum-specializer)
+ (g signum-generalizer))
(if (= (%signum s) (%signum g)) ; or EQL?
(values t t)
(values nil t)))
-;; this method is perhaps interesting enough to talk about?
-(defmethod specializer-accepts-generalizer-p ((gf signum-generic-function) (specializer sb-mop:specializer) (thing signum-specializer))
- (specializer-accepts-generalizer-p gf specializer (class-of (%signum thing))))
+(defmethod specializer-accepts-generalizer-p
+ ((gf signum-generic-function)
+ (specializer sb-mop:specializer)
+ (thing signum-specializer))
+ (specializer-accepts-generalizer-p
+ gf specializer (class-of (%signum thing))))
-
-(defmethod specializer-accepts-p ((s signum-specializer) o)
+(defmethod specializer-accepts-p
+ ((s signum-specializer) o)
(and (realp o) (= (%signum s) (signum o))))
-
-#| again elide more boring methods |#
#+end_src
-Given these definitions, and some more straightforward ones elided for
-reasons of space, we can implement the factorial function as follows:
+ Given these definitions, and once again some more straightforward
+ ones elided for reasons of space, we can implement the factorial
+ function as follows:
#+begin_src lisp
(defgeneric fact (n)
(defmethod fact ((n (signum 1))) (* n (fact (1- n))))
#+end_src
-We do not need to include a method on (signum -1), as the standard
-=no-applicable-method= protocol will automatically apply to negative
-real or non-real arguments.
-
-Benchmarketing: we chose to benchmark 20! because that is the largest
-factorial whose answer fits in SBCL's 63-bit fixnums, so as to attempt
-to measure the maximum effect of dispatch (unobscured by allocation /
-gc issues)
-
-| fact (signum-gf) | %fact (fun) | %%fact (gf / 1meth) | fact (signum-gf / 1arg hash-special-case) |
-|------------------+-------------+---------------------+-------------------------------------------|
-| 0.284 | 0.004 | 0.016 | 0.032 |
-| 0.076 | 0.008 | 0.012 | 0.024 |
-| 0.072 | 0.004 | 0.012 | 0.116 |
-| 0.264 | 0.004 | 0.008 | 0.120 |
-| 0.292 | 0.008 | 0.012 | 0.120 |
-| 0.264 | 0.004 | 0.016 | 0.084 |
-| 0.276 | 0.008 | 0.012 | 0.092 |
-| 0.264 | 0.008 | 0.012 | 0.036 |
-| 0.276 | 0.008 | 0.004 | 0.104 |
-| 0.272 | 0.004 | 0.012 | 0.020 |
+ We do not need to include a method on =(signum -1)=, as the
+ standard =no-applicable-method= protocol will automatically apply to
+ negative real or non-real arguments.
+** Accept HTTP header specializers
+ :PROPERTIES:
+ :CUSTOM_ID: Accept
+ :END:
+ In this section, we implement a non-trivial form of dispatch. The
+ application in question is a web server, and specifically to allow
+ the programmer to support RFC 2616 \cite{rfc2616} content
+ negotiation, of particular interest to publishers and consumers of
+ REST-style Web APIs.
+
+ The basic mechanism in content negotiation is as follows: the web
+ client sends an HTTP request with an =Accept= header, which is a
+ string describing the media types it is willing to receive as a
+ response to the request, along with numerical preferences. The web
+ server compares these stated client preferences with the resources
+ it has available to satisfy this request, and sends the best
+ matching resource in its response.
+
+ For example, a graphical web browser might by default send an
+ =Accept= header such as
+ =text/html,application/xml;q=0.9,*/*;q=0.8=. This should be
+ interpreted by a web server as meaning that if for a given resource
+ the server can provide content of type =text/html= (i.e. HTML),
+ then it should do so. Otherwise, if it can provide
+ =application/xml= content (i.e. XML of any schema), then that
+ should be provided; failing that, any other content type is
+ acceptable.
+
+ In the case where there are static files on the filesystem, and the
+ web server must merely select between them, there is not much more
+ to say. However, it is not unusual for a web service to be backed
+ by some other form of data, and responses computed and sent on the
+ fly, and in these circumstances the web server must compute which
+ of its known output formats it can use to satisfy the request
+ before actually generating the best matching response. This can be
+ modelled as one generic function responsible for generating the
+ response, with methods corresponding to content-types -- and the
+ generic function must then perform method selection against the
+ request's =Accept= header to compute the appropriate response.
+
+ The =accept-specializer= below implements the dispatch. It depends
+ on a lazily-computed =tree= slot to represent the information in
+ the accept header (generated by =parse-accept-string=), and a
+ function =q= to compute the (defaulted) preference level for a
+ given content-type and =tree=; then, method selection and ordering
+ involves finding the =q= for each =accept-specializer='s content
+ type given the =tree=, and sorting them according to the preference
+ level.
#+begin_src lisp
-(progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
-#+end_src
-** HTTP Accept header
- implement RFC2616 content negotiation
-
- NB this definitely can't be done with filtered dispatch, except by
- generating anonymous classes with all the right mime-types as
- direct superclasses in dispatch order,so the filter does
-#+begin_src lisp
-(ensure-class nil :direct-superclasses '(text/html image/webp ...))
-#+end_src
- and dispatch is defined by using those classes. And that's even
- more awkward than it sounds, because that means that in principle
- all the mime types in the universe need an existence as classes, to
- cater for arbitrary mime types in accept headers. And handling
- wildcards is pretty much impossible, too. See that in
- =specializer<= which involves a nontrivial ordering of methods
- (whereas in two above previous cases only a single extended
- specializer could be applicable to any given argument)
-
- Also of interest: note that we can have these
- specializer/generalizers handle arbitrary objects: the =string= and
- =tbnl:request= methods are independent of each other; this
- generalizes to dealing with multiple web server libraries.
-
- jmoringe: the name =accept-specializer=, while sensible, may
- confusing in this context because "accept" occurs as part of the
- protocol with a different semantic.
-
-#+begin_src lisp
-(defclass accept-specializer (extended-specializer)
+(defclass accept-specializer (specializer)
((media-type :initarg :media-type :reader media-type)))
-(defclass accept-generalizer ()
+(defclass accept-generalizer (generalizer)
((header :initarg :header :reader header)
(tree)
(next :initarg :next :reader next)))
(defmethod generalizer-equal-hash-key
- ((gf accept-generic-function) (g accept-generalizer))
- `(accept-generalizer ,(header g)))
-(defmethod specializer-accepts-generalizer-p ((gf accept-generic-function) (s acc
-ept-specializer) (generalizer accept-generalizer))
+ ((gf accept-generic-function)
+ (g accept-generalizer))
+ `(accept-generalizer ,(header g)))
+(defmethod specializer-accepts-generalizer-p
+ ((gf accept-generic-function)
+ (s accept-specializer)
+ (generalizer accept-generalizer))
(values (q (media-type s) (tree generalizer)) t))
-(defmethod specializer-accepts-generalizer-p ((gf accept-generic-function) (s sb-
-mop:specializer) (generalizer accept-generalizer))
- (specializer-accepts-generalizer-p gf s (next generalizer)))
-
-(defmethod specializer< ((gf accept-generic-function) (s1 accept-specializer) (s2
- accept-specializer) generalizer)
- (cond
- ((string= (media-type s1) (media-type s2)) '=)
- (t (let ((q1 (q (media-type s1) (tree generalizer)))
- (q2 (q (media-type s2) (tree generalizer))))
- (cond
- ((= q1 q2) '=)
- ((< q1 q2) '>)
- (t '<))))))
-
-;; here are the only methods that actually know about TBNL
-(defmethod generalizer-of-using-class ((gf accept-generic-function) (arg tbnl:request))
+(defmethod specializer-accepts-generalizer-p
+ ((gf accept-generic-function)
+ (s specializer)
+ (generalizer accept-generalizer))
+ (specializer-accepts-generalizer-p
+ gf s (next generalizer)))
+
+(defmethod specializer<
+ ((gf accept-generic-function)
+ (s1 accept-specializer)
+ (s2 accept-specializer)
+ (generalizer accept-generalizer))
+ (let ((m1 (media-type s1))
+ (m2 (media-type s2))
+ (tree (tree generalizer)))
+ (cond
+ ((string= m1 m2) '=)
+ (t (let ((q1 (q m1 tree)))
+ (q2 (q m2 tree))))
+ (cond
+ ((= q1 q2) '=)
+ ((< q1 q2) '>)
+ (t '<))))))
+#+end_src
+
+ The metaprogrammer can then support dispatching in this way for
+ suitable objects, such as the =request= object representing a
+ client request in the Hunchentoot web server. The code below
+ implements this, by defining the computation of a suitable
+ =generalizer= object for a given request, and specifying how to
+ compute whether the specializer accepts the given request object
+ (=q= returns a number between 0 and 1 if any pattern in the =tree=
+ matches the media type, and =nil= if the media type cannot be
+ matched at all).
+
+#+begin_src
+(defmethod generalizer-of-using-class
+ ((gf accept-generic-function)
+ (arg tbnl:request))
(make-instance 'accept-generalizer
:header (tbnl:header-in :accept arg)
:next (class-of arg)))
-(defmethod specializer-accepts-p ((specializer accept-specializer) (obj tbnl:requ
-est))
- (q (media-type specializer) (parse-accept-string (tbnl:header-in :accept obj)))
-)
+(defmethod specializer-accepts-p
+ ((specializer accept-specializer)
+ (obj tbnl:request))
+ (let* ((accept (tbnl:header-in :accept obj))
+ (tree (parse-accept-string accept))
+ (q (q (media-type specializer) tree)))
+ (and q (> q 0))))
+#+end_src
+
+ This dispatch cannot be implemented using filtered dispatch, except
+ by generating anonymous classes with all the right mime-types as
+ direct superclasses in dispatch order; the filter would generate
+#+begin_src lisp
+(ensure-class nil :direct-superclasses
+ '(text/html image/webp ...))
+#+end_src
+ and dispatch the operates using those anonymous classes. While
+ this is possible to do, it is awkward to express content-type
+ negotiation in this way, as it means that the dispatcher must know
+ about the universe of mime-types that clients might declare that
+ they accept, rather than merely the set of mime-types that a
+ particular generic function is capable of serving; handling
+ wildcards in accept strings is particularly awkward in the
+ filtering paradigm.
+
+ Note that in this example, the method on =specializer<= involves a
+ nontrivial ordering of methods based on the =q= values specified in
+ the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
+ single extended specializer could be applicable to any given
+ argument).
+
+ Also note that the accept specializer protocol is straightforwardly
+ extensible to other suitable objects; for example, one simple
+ debugging aid is to define that an =accept-specializer= should be
+ applicable to =string= objects. This can be done in a modular
+ fashion (see the code below, which can be completely disconnected
+ from the code for Hunchentoot request objects), and generalizes to
+ dealing with multiple web server libraries, so that
+ content-negotiation methods are applicable to each web server's
+ request objects.
-;; we can define methods on STRING too, for debugging/simulation purposes
-(defmethod generalizer-of-using-class ((gf accept-generic-function) (s string))
+#+begin_src lisp
+(defmethod generalizer-of-using-class
+ ((gf accept-generic-function)
+ (s string))
(make-instance 'accept-generalizer
:header s
:next (class-of s)))
-(defmethod specializer-accepts-p ((s accept-specializer) (string string))
- (q (media-type s) (parse-accept-string string)))
+(defmethod specializer-accepts-p
+ ((s accept-specializer) (string string))
+ (let* ((tree (parse-accept-string string))
+ (q (q (media-type s) tree)))
+ (and q (> q 0))))
#+end_src
-
- jmoringe: The role of =accept-generalizer.tree= and the =q=
- function are hard to understand and may require some
- explanation. However, the example with its distinct, asymmetric
- specializers/generalizers, =accept-generalizer.next= and
- =specializer<= is likely worth it.
-
** Pattern / xpattern / regex / optima
Here's the /really/ interesting bit, but on the other hand we're
probably going to run out of space, and the full description of
these is going to take us into =make-method-lambda= territory.
A second paper? Future work?
* Protocol
-** Generalizer
- - [ ] =generalizer-of-using-class= (NB class of gf not class of object)
- - [ ] =compute-applicable-methods-using-generalizers=
- - [ ] =generalizer-equal-hash-key=
- - [ ] =specializer-accepts-generalizer-p=
- - [ ] =specializer-accepts-p=
- - [ ] =specializer<=
- jmoringe: If I remember correctly, closette has
- =method-more-specific-p= should we aim for parity with that and
- use =specializer-more-specific-p=? The downside would be that
- =-p= indicates a Boolean return value which is not the case here.
+ :PROPERTIES:
+ :CUSTOM_ID: Protocol
+ :END:
+
+ In section [[#Examples]], we have seen a number of code fragments as
+ partial implementations of particular non-standard method dispatch,
+ using =generalizer= metaobjects to mediate between the methods of
+ the generic function and the actual arguments passed to it. In
+ section [[#Generalizer metaobjects]], we go into more detail regarding
+ these =generalizer= metaobjects, describing the generic function
+ invocation protocol in full, and showing how this protocol allows a
+ similar form of effective method cacheing as the standard one does.
+ In section [[#Generalizer performance]], we show the results of some
+ simple performance measurements to highlight the improvement that
+ this protocol can bring over a naïve implementation of generalized
+ dispatch, as well as highlighting the potential for further
+ improvement.
+
+** Generalizer metaobjects
+ :PROPERTIES:
+ :CUSTOM_ID: Generalizer metaobjects
+ :END:
+
+*** Generic function invocation
+ As in the standard generic function invocation protocol, the
+ generic function's actual functionality is provided by a
+ discriminating function. The functionality described in this
+ protocol is implemented by having a distinct subclass of
+ =standard-generic-function=, and a method on
+ =compute-discriminating-function= which produces a custom
+ discriminating function. The basic outline of the discriminating
+ function is the same as the standard one: it must first compute the
+ set of applicable methods given particular arguments; from that, it
+ must compute the effective method by combining the methods
+ appropriately according to the generic function's method
+ combination; finally, it must call the effective method with the
+ arguments.
+
+ Computing the set of applicable methods is done using a pair of
+ functions: =compute-applicable-methods=, the standard metaobject
+ function, and a new function
+ =compute-applicable-methods-using-generalizers=. We define a
+ custom method on =compute-applicable-methods= which tests the
+ applicability of a particular specializer against a given argument
+ using =specializer-accepts-p=, a new protocol function with
+ default implementations on =class= and =eql-specializer= to
+ implement the expected behaviour. In order to order the methods,
+ as required by the protocol, we define a pairwise comparison
+ operator =specializer<= which defines an ordering between
+ specializers for a given generalizer argument (remembering that
+ even in standard CLOS the ordering between =class= specializers
+ can change depending on the actual class of the argument).
+
+ The new =compute-applicable-methods-using-generalizers= is the
+ analogue of the MOP's =compute-applicable-methods-using-classes=.
+ Instead of calling it with the =class-of= each argument, we compute
+ the generalizers of each argument using the new function
+ =generalizer-of-using-class= (where the =-using-class= refers to
+ the class of the generic function rather than the class of the
+ object), and call it with the list of generalizers. As with the
+ standard function, a secondary return value indicates whether the
+ result of the function is definitive for that list of generalizers.
+
+ Thus, in generic function invocation, we first compute the
+ generalizers of the arguments; we compute the ordered set of
+ applicable methods, either from the generalizers or (if that is
+ not definitive) from the arguments themselves; then the normal
+ effective method computation and call can occur. Unfortunately,
+ the nature of an effective method object is not specified, so we
+ have to reach into implementation internals a little in order to
+ call it, but otherwise the remainder of the generic function
+ invocation protocol is unchanged from the standard one. In
+ particular, method combination is completely unchanged;
+ programmers can choose arbitrary method combinations, including
+ user-defined long form combinations, for their generic functions
+ involving generalized dispatch.
+
+*** Effective method memoization
+ :PROPERTIES:
+ :CUSTOM_ID: Memoization
+ :END:
+ The potential efficiency benefit to having =generalizer=
+ metaobjects lies in the use of
+ =compute-applicable-methods-using-generalizers=. If a particular
+ generalized specializer accepts a variety of objects (such as the
+ =signum= specializer accepting all reals with a given sign, or the
+ =accept= specializer accepting all HTTP requests with a particular
+ =Accept= header), then there is the possibility of cacheing and
+ reusing the results of the applicable and effective method
+ computation. If the computation of the applicable method from
+ =compute-applicable-methods-using-generalizers= is definitive,
+ then the ordered set of applicable methods and the effective
+ method can be cached.
+
+ One issue is what to use as the key for that cache. We cannot use
+ the generalizers themselves, as two generalizers that should be
+ considered equal for cache lookup will not compare as =equal= –
+ and indeed even the standard generalizer, the =class=, cannot be
+ used as we must be able to invalidate cache entries upon class
+ redefinition. The issue of =class= generalizers we can solve as
+ in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
+ class, which is distinct for each distinct (re)definition of a
+ class; for arbitrary generalizers, however, there is /a priori/ no
+ good way of computing a suitable hash key automatically, so we
+ allow the metaprogrammer to specify one by defining a method on
+ =generalizer-equal-hash-key=, and combining the hash keys for all
+ required arguments in a list to use as a key in an =equal=
+ hash-table.
+
+ [XXX could we actually compute a suitable hash key using the
+ generalizer's class name and initargs?]
+
+ - [X] =generalizer-of-using-class= (NB class of gf not class of object)
+ - [X] =compute-applicable-methods-using-generalizers=
+ - [X] =generalizer-equal-hash-key=
+ - [X] =specializer-accepts-generalizer-p=
+ - [X] =specializer-accepts-p=
+ - [X] =specializer<=
+** Performance
+ :PROPERTIES:
+ :CUSTOM_ID: Generalizer performance
+ :END:
+ We have argued that the protocol presented here allows for
+ expressive control of method dispatch while preserving the
+ possibility of efficiency. In this section, we quantify the
+ efficiency that the memoization protocol described in section
+ [[#Memoization]] achieves, by comparing it both to the same protocol
+ with no memoization, as well as with equivalent dispatch
+ implementations in the context of methods with regular
+ specializers, and with implementation in straightforward functions.
+
+ In the case of the =cons-specializer=, we benchmark the walker
+ acting on a small but non-trivial form. The implementation
+ strategies in the table below refer to: an implementation in a
+ single function with a large =typecase= to dispatch between all the
+ cases; the natural implementation in terms of a standard generic
+ function with multiple methods (the method on =cons= having a
+ slightly reduced =typecase= to dispatch on the first element, and
+ other methods handling =symbol= and other atoms); and three
+ separate cases using =cons-specializer= objects. As well as
+ measuring the effect of memoization against the full invocation
+ protocol, we can also introduce a special case: when only one
+ argument participates in method selection (all the other required
+ arguments only being specialized on =t=), we can avoid the
+ construction of a list of hash keys and simply use the key
+ from the single active generalizer directly.
+
+ Refer to \cite{Kiczales.Rodriguez:1990}
+
+ | implementation | time (µs/call) | overhead |
+ |-----------------------+----------------+----------|
+ | function | 3.17 | |
+ | standard-gf/methods | 3.6 | +14% |
+ | cons-gf/one-arg-cache | 7.4 | +130% |
+ | cons-gf | 15 | +370% |
+ | cons-gf/no-cache | 90 | +2700% |
+
+ The benchmarking results from this exercise are promising: in
+ particular, the introduction of the effective method cache speeds
+ up the use of generic specializers in this case by a factor of 6,
+ and the one-argument special case by another factor of 2. For this
+ workload, even the one-argument special case only gets to within a
+ factor of 2-3 of the function and standard generic function
+ implementations, but the overall picture is that the memoizability
+ in the protocol does indeed drastically reduce the overhead
+ compared with the full invocation.
+
+ For the =signum-specializer= case, we choose to benchmark the
+ computation of 20!, because that is the largest factorial whose
+ answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
+ worst case for generic dispatch, where the work done within the
+ methods is as small as possible without being meaningless, and in
+ particular does not cause allocation or garbage collection to
+ obscure the picture.
+
+#+begin_src lisp :exports none
+(progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
+#+end_src
+
+ | implementation | time (µs/call) | overhead |
+ |-------------------------+----------------+----------|
+ | function | 0.6 | |
+ | standard-gf/fixnum | 1.2 | +100% |
+ | signum-gf/one-arg-cache | 7.5 | +1100% |
+ | signum-gf | 23 | +3800% |
+ | signum-gf/no-cache | 240 | +41000% |
+
+ The relative picture is similar to the =cons-specializer= case;
+ including a cache saves a factor of 10 in this case, and another
+ factor of 3 for the one-argument cache special case. The cost of
+ the genericity of the protocol here is starker; even the
+ one-argument cache is a factor of 6 slower than the standard
+ generic-function implementation, and a further factor of 2 away
+ from the implementation of factorial as a function. We discuss
+ ways in which we expect to be able to improve performance in
+ section [[#Future Work]].
+
+ We could allow the metaprogrammer to improve on the one-argument
+ performance by constructing a specialized cache: for =signum=
+ arguments of =rational= arguments, the logical cache structure is
+ to index a three-element vector with =(1+ signum)=. The current
+ protocol does not provide a way of eliding the two generic function
+ calls for the generic cache; we discuss possible approaches in
+ section [[#Conclusions]].
** Full protocol
Description and specification left for reasons of space (we'll see?)
- [ ] =same-specializer-p=
jmoringe: would only be relevant for pattern dispatch, right? I
think, we didn't finish the discussion regarding special
variables vs. environment vs. new protocol function
+
* Related Work
- - [ ] Newton/Rhodes
+ :PROPERTIES:
+ :CUSTOM_ID: Related Work
+ :END:
+ - [ ] Newton/Rhodes, \cite{Newton.Rhodes:2008}
- [ ] filtered dispatch -- the point is that our work continues to
be useful in cases where there are unbounded numbers of
equivalence classes but each given invokation involves a small
- number of methods.
+ number of methods. Filtered dispatch works by having a custom
+ discriminating function which wraps the usual one, and augments
+ the set of applicable methods computed with applicable methods
+ from other (hidden) generic functions (one per filter group). It
+ then also has a custom method combination to handle combining
+ these applicable methods. \cite{Costanza.etal:2008}
- [ ] ContextL / context-oriented programming -- dispatch occurs on
hidden layer argument being an instance of an anonymous class with
suitably arranged superclasses -- OK because set of layers is
- bounded and under programmer control
+ bounded and under programmer control. \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}
- [ ] http://soft.vub.ac.be/Publications/2010/vub-tr-soft-10-04.pdf
- [ ] http://soft.vub.ac.be/lambic/files/lambic-ilc09.pdf
- [ ] http://soft.vub.ac.be/Publications/2011/vub-soft-phd-11-03.pdf
A good test case for our protocol; handled adequately with
generalizer being the tuple of (roles,delegations), with some
thought needed for method redefinitions but otherwise working
- fine.
+ fine. \cite{Salzman.Aldrich:2005}
- [ ] Sheeple
+ - [ ] Multiple dispatch in Clojure
+ http://clojure.org/multimethods -- seems to allow hierarchy-based,
+ eql and the equivalent of filtered dispatch
* Conclusions
+ :PROPERTIES:
+ :CUSTOM_ID: Conclusions
+ :END:
+ - protocol for straightforward definition of custom dispatch
+ + interoperates seamlessly with rest of CLOS: method combination,
+ etc.
+ + tolerably efficient: two extra standard gf invokations and one
+ hash table lookup per call on the fast path (but more to be
+ done)
+ + expressive: handles forms of dispatch not handled elsewhere; all
+ the usual niceties of redefinition, modularity, introspection
+** Future Work
+ :PROPERTIES:
+ :CUSTOM_ID: Future Work
+ :END:
+ Although the protocol described in this paper allows for a more
+ efficient implementation, as described in section [[#Memoization]],
+ than computing the applicable and effective methods at each generic
+ function call, the efficiency is still some way away from a
+ baseline of the standard generic-function, let alone a standard
+ function. Most of the invocation protocol is memoized, but there
+ are still two full standard generic-function calls –
+ =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
+ argument per call to a generic function with extended specializers,
+ not to mention a hash table lookup.
+
+ For many applications, the additional flexibility afforded by
+ generalized specializers might be worth the cost in efficiency, but
+ it would still be worth investigating how much the overhead from
+ generalized specializers can be reduced; one possible avenue for
+ investigation is giving greater control over the cacheing strategy
+ to the metaprogrammer.
+
+ As an example, consider the =signum-specializer=. The natural
+ cache structure for a single argument generic function specializing
+ on =signum= is probably a four-element vector, where the first
+ three elements hold the effective methods for =signum= values of
+ -1, 0, and 1, and the fourth holds the cached effective methods for
+ everything else. This would make the invocation of such functions
+ very fast for the (presumed) common case where the argument is in
+ fact a real number. We hope to develop and show the effectiveness
+ of an appropriate protocol to allow the metaprogrammer to construct
+ and exploit such cacheing strategies, and (more speculatively) to
+ implement the lookup of an effective method function in other ways.
+
+ We also aim to demonstrate support within this protocol for some
+ particular cases of generalized specializers which seem to have
+ widespread demand (in as much as any language extension can be said
+ to be in “demand”). In particular, we have preliminary work
+ towards supporting efficient dispatch over pattern specializers
+ such as implemented in the \textsf{Optima} library[fn:2], and over
+ a prototype object system similar to that in Slate
+ \cite{Salzman.Aldrich:2005}. Our current source code for the work
+ described in this paper can be seen in the git source code
+ repository at [[http://christophe.rhodes.io/git/specializable.git]],
+ which will be updated with future developments.
** Acknowledgments
- We thank Lee Salzman, Pascal Costanza, Mikel Evins for their
- helpful discussions
+ We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
+ and informative discussions, and all the respondents to one
+ author's call for imaginative uses for generalized specializers.
+
+\bibliographystyle{plain}
+\bibliography{crhodes,specializers}
+
+* Footnotes
+
+[fn:1] the \textsf{Closer to MOP} project attempts to harmonize the
+ different implementations of the metaobject protocol in Common Lisp.
+
+[fn:2] https://github.com/m2ym/optima