X-Git-Url: http://christophe.rhodes.io/gitweb/?p=paper-els-specializers.git;a=blobdiff_plain;f=els-specializers.org;h=179120587a537442d2a49c6aff04f7de8562e8ee;hp=d7e129df5a32090cae7ee9c98e8ac209ad957948;hb=HEAD;hpb=fc053e7df033d7fb70f389579105326e2f13c9a0 diff --git a/els-specializers.org b/els-specializers.org index d7e129d..1791205 100644 --- a/els-specializers.org +++ b/els-specializers.org @@ -2,56 +2,132 @@ #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau #+OPTIONS: toc:nil -#+LaTeX_HEADER: \usepackage[margin=1in]{geometry} +#+LaTeX_CLASS: acm_proc_article-sp +#+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax} +#+LaTeX_HEADER: \renewcommand{\baselinestretch}{0.99} + +#+begin_src elisp :exports none +;;; use C-x C-e on this if org refuses to export +(add-to-list 'org-latex-classes + '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}" + ("\\section{%s}" . "\\section*{%s}") + ("\\subsection{%s}" . "\\subsection*{%s}") + ("\\subsubsection{%s}" . "\\subsubsection*{%s}") + ("\\paragraph{%s}" . "\\paragraph*{%s}") + ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +(add-to-list 'org-latex-classes + '("sig-alternate" "\\documentclass{sig-alternate}" + ("\\section{%s}" . "\\section*{%s}") + ("\\subsection{%s}" . "\\subsection*{%s}") + ("\\subsubsection{%s}" . "\\subsubsection*{%s}") + ("\\paragraph{%s}" . "\\paragraph*{%s}") + ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +(set (make-local-variable 'org-latex-pdf-process) + '("latexmk -f -pdf -bibtex %f")) +(set (make-local-variable 'org-latex-title-command) + "\\numberofauthors{3} +\\author{ +\\alignauthor Christophe Rhodes\\\\ + \\affaddr{Department of Computing}\\\\ + \\affaddr{Goldsmiths, University of London}\\\\ + \\affaddr{London SE14 6NW}\\\\ + \\email{c.rhodes@gold.ac.uk} +\\alignauthor Jan Moringen\\\\ + \\affaddr{Universität Bielefeld}\\\\ + \\affaddr{Technische Fakultät}\\\\ + \\affaddr{33594 Bielefeld}\\\\ + \\email{jmoringe@techfak.uni-bielefeld.de} +\\alignauthor David Lichteblau\\\\ + \\affaddr{ZenRobotics Ltd}\\\\ + \\affaddr{Vilhonkatu 5 A}\\\\ + \\affaddr{FI-00100 Helsinki}\\\\ + \\email{david@lichteblau.com} +} +\\maketitle") +#+end_src #+begin_abstract -1. This paper introduces a new metaobject, the generalizer, which - complements the existing specializer metaobject. -2. With the help of examples, we show that this metaobject allows for - the efficient implementation of complex non-class-based dispatch - within the framework of existing metaobject protocols -3. We present the generalizer protocol, implemented within the SBCL - implementation of Common Lisp -4. In combination with previous work, this produces a fully-functional - extension of the existing mechanism for method selection and - effective method computation, including support for standard and - user-defined method combination independent from method selection. +This paper introduces a new metaobject, the generalizer, which +complements the existing specializer metaobject. With the help of +examples, we show that this metaobject allows for the efficient +implementation of complex non-class-based dispatch within the +framework of existing metaobject protocols. We present our +modifications to the generic function invocation protocol from the +/Art of the Metaobject Protocol/; in combination with previous work, +this produces a fully-functional extension of the existing mechanism +for method selection and combination, including support for method +combination completely independent from method selection. We discuss +our implementation, within the SBCL implementation of Common Lisp, and +in that context compare the performance of the new protocol with the +standard one, demonstrating that the new protocol can be tolerably +efficient. #+end_abstract +#+begin_LaTeX +\begin{flushleft} +Report-No.:~\url{http://eprints.gold.ac.uk/id/eprint/9924} +\end{flushleft} +\category{D.1}{Software}{Programming Techniques}[Object-oriented Programming] +\category{D.3.3}{Programming Languages}{Language Constructs and Features} +\terms{Languages, Design} +\keywords{generic functions, specialization-oriented programming, method selection, method combination} +#+end_LaTeX + * Introduction - The revisions to the original Common Lisp language \cite{CLtL1} + The revisions to the original Common Lisp language \cite{CLtL} included the detailed specification of an object system, known as the Common Lisp Object System (CLOS), which was eventually standardized as part of the ANSI Common Lisp standard \cite{CLtS}. The object system as presented to the standardization committee was - formed of three parts, the first two of which covered XXX [what?] - and were incorporated into the final standard, and the third, - covering a Metaobject Protocol (MOP) for CLOS, was not. - - Nevertheless, the CLOS MOP has proven to be a robust design, and - while many implementations have derived their implementations of - CLOS from either the Closette illustrative implementation in - \cite{AMOP}, or the Portable Common Loops implementation of CLOS - from Xerox Parc, there have been from-scratch reimplementations of - CLOS (in at least CLISP; check for others -- ABCL? Lisp500?!) - incorporating the majority of the Metaobject Protocol as described. - - Although it has stood the test of time, the MOP is neither without issues - (e.g. M-M-L considered harmful; slot-definition initargs issue) nor - a complete framework for the metaprogrammer to implement all - conceivable variations of object-oriented behaviour; indeed, while - metaprogramming offers some possibilities for customization of the - object system behaviour, those possibilities cannot extend - arbitrarily in all directions. There is still an expectation that - functionality is implemented with methods on generic functions, - acting on objects with slots. [XXX find Paepke picture here? Not - Paepke; AMOP?]. XXX include typical examples of MOP: object - persistence; maybe ref. Kizcales "MOPs: why we want them and what - else they can do"? (Fig. 2 in that is good) ORMs; sparse slots. - jmoringe: - + introspection, e.g. documentation generation - + programmatic construction of classes and generic functions - e.g. for IDL compilers, model transformations + formed of three chapters. The first two chapters covered programmer + interface concepts and the functions in the programmer interface + \cite[Chapter 28]{CLtL2} and were largely incorporated into the + final standard; the third chapter, covering a Metaobject Protocol + (MOP) for CLOS, was not. + + Nevertheless, the CLOS MOP continued to be developed, and the + version documented in \cite{AMOP} has proven to be a reasonably + robust design. While many implementations have derived their + implementations of CLOS from either the Closette illustrative + implementation in \cite{AMOP}, or the Portable Common Loops + implementation of CLOS from Xerox Parc, there have been largely + from-scratch reimplementations of CLOS (in CLISP[fn:1] and + CCL[fn:2], at least) incorporating substantial fractions of the + Metaobject Protocol as described. + + #+CAPTION: MOP Design Space + #+LABEL: fig:mopdesign + #+ATTR_LATEX: width=\linewidth float + [[file:figures/mop-design-space.pdf]] + + Although it has stood the test of time, the CLOS MOP is neither + without issues (e.g. semantic problems with =make-method-lambda= + \cite{Costanza.Herzeel:2008}; useful functions such as + =compute-effective-slot-definition-initargs= being missing from the + standard) nor is it a complete framework for the metaprogrammer to + implement all conceivable variations of object-oriented behaviour. + While metaprogramming offers some possibilities for customization of + the object system behaviour, those possibilities cannot extend + arbitrarily in all directions (conceptually, if a given object + system is a point in design space, then a MOP for that object system + allows exploration of a region of design space around that point; + see figure \ref{fig:mopdesign}). In the case of the CLOS MOP, there is + still an expectation that functionality is implemented with methods + on generic functions, acting on objects with slots; it is not + possible, for example, to transparently implement support for + “message not understood” as in the message-passing paradigm, because + the analogue of messages (generic functions) need to be defined + before they are used. + + Nevertheless, the MOP is flexible, and is used for a number of + things, including: documentation generation (where introspection in + the MOP is used to extract information from a running system[fn:3]); + object-relational mapping[fn:4] and other approaches to object + persistence \cite{Paepke:1988}; alternative backing stores for slots + (hash-tables \cite{Kiczales.etal:1993} or symbols + \cite{Costanza.Hirschfeld:2005}); and programmatic construction of + metaobjects, for example for interoperability with other language + runtimes' object systems. One area of functionality where there is scope for customization by the metaprogrammer is in the mechanics and semantics of method @@ -61,140 +137,240 @@ =compute-applicable-methods=, =compute-applicable-methods-using-classes=), for example, in practice implementation support for this was weak until relatively - recently (ref. closer, also check how ContextL and filtered dispatch - are implemented). - jmoringe: filtered dispatch uses a custom method combination, i - think + recently[fn:5]. Another potential mechanism for customizing dispatch is implicit in the class structure defined by AMOP: standard specializer objects (instances of =class= and =eql-specializer=) are generalized instances of the =specializer= protocol class, and in principle there are no restrictions on the metaprogrammer constructing - additional subclasses. Previous work [Newton/Rhodes] has explored - the potential for customizing generic function dispatch using - extended specializers, but as of that work the metaprogrammer must + additional subclasses. Previous work \cite{Newton.Rhodes:2008} has + explored the potential for customizing generic function dispatch + using extended specializers, but there the metaprogrammer must override the entirety of the generic function invocation protocol (from =compute-discriminating-function= on down), leading to toy implementations and duplicated effort. This paper introduces a protocol for efficient and controlled - handling of arbitrary subclasses of =specializer=. In particular, - it introduces the =generalizer= protocol class, which generalizes - (ahem) the return value of =class-of=, and allows the metaprogrammer - to hook into cacheing schemes to avoid needless recomputation of - effective methods for sufficiently similar generic function - arguments (See Figure\nbsp\ref{fig:dispatch}). + handling of new subclasses of =specializer=. In particular, it + introduces the =generalizer= protocol class, which generalizes the + return value of =class-of= in method applicability computation, and + allows the metaprogrammer to hook into cacheing schemes to avoid + needless recomputation of effective methods for sufficiently similar + generic function arguments (See Figure\nbsp\ref{fig:dispatch}). #+CAPTION: Dispatch Comparison #+LABEL: fig:dispatch - #+ATTR_LATEX: width=0.9\linewidth float - [[file:figures/dispatch-comparison.pdf]] + #+ATTR_LATEX: width=\linewidth float + [[file:figures/dispatch-relationships.pdf]] The remaining sections in this paper can be read in any order. We - give some motivating examples in section XX, including + give some motivating examples in section [[#Examples]], including reimplementations of examples from previous work, as well as examples which are poorly supported by previous protocols. We - describe the protocol itself in section YY, describing each protocol - function in detail and, where applicable, relating it to existing - protocol functions within the CLOS MOP. We survey related work in - more detail in section ZZ, touching on work on customized dispatch - schemes in other environments. Finally, we draw our conclusions - from this work, and indicate directions for further development, in - section WW; reading that section before the others indicates - substantial trust in the authors' work. + describe the protocol itself in section [[#Protocol]], describing each + protocol function in detail and, where applicable, relating it to + existing protocol functions within the CLOS MOP. We survey related + work in more detail in section [[#Related Work]], touching on work on + customized dispatch schemes in other environments. Finally, we draw + our conclusions from this work, and indicate directions for further + development, in section [[#Conclusions]]; reading that section before the + others indicates substantial trust in the authors' work. * Examples - - [ ] =cons-specializer= (can be done using filtered dispatch) - - [ ] factorial (like filtered dispatch) - - [ ] HTTP Accept header - - [ ] xpattern - - [ ] prototype/multimethod -** car-of-cons - NB this example can be done using filtered dispatch, with a filter - calling =car= on cons arguments. - - Note also that there's no real need for =cons-specializer= and - =cons-generalizer= to be distinct classes (as with =class= and - =class=). Also true for =signum=, below; but more interesting - dispatch reveals the need to split. + :PROPERTIES: + :CUSTOM_ID: Examples + :END: + In this section, we present a number of examples of dispatch + implemented using our protocol, which we describe in section + [[#Protocol]]. For reasons of space, the metaprogram code examples in + this section do not include some of the necessary support code to + run; complete implementations of each of these cases, along with the + integration of this protocol into the SBCL implementation + \cite{Rhodes:2008} of Common Lisp, are included in the authors' + repository[fn:6]. + + A note on terminology: we will attempt to distinguish between the + user of an individual case of generalized dispatch (the + “programmer”), the implementor of a particular case of generalized + dispatch (the “metaprogrammer”), and the authors as the designers + and implementors of our generalized dispatch protocol (the + “metametaprogrammer”, or more likely “we”). +** CONS specializers + :PROPERTIES: + :CUSTOM_ID: Cons + :END: + One motivation for the use of generalized dispatch is in an + extensible code walker: a new special form can be handled simply by + writing an additional method on the walking generic function, + seamlessly interoperating with all existing methods. In this + use-case, dispatch is performed on the first element of lists. + Semantically, we allow the programmer to specialize any argument of + methods with a new kind of specializer, =cons-specializer=, which + is applicable if and only if the corresponding object is a =cons= + whose =car= is =eql= to the symbol associated with the + =cons-specializer=; these specializers are more specific than the + =cons= class, but less specific than an =eql-specializer= on any + given =cons=. + + The programmer code using these specializers is unchanged from + \cite{Newton.Rhodes:2008}; the benefits of the protocol described + here are: that the separation of concerns is complete – method + selection is independent of method combination – and that the + protocol allows for efficient implementation where possible, even + when method selection is customized. In an application such as + walking source code, we would expect to encounter special forms + (distinguished by particular atoms in the =car= position) multiple + times, and hence to dispatch to the same effective method + repeatedly. We discuss the efficiency aspects of the protocol in + more detail in section [[#Memoization]]; we present the metaprogrammer + code to implement the =cons-specializer= below. + #+begin_src lisp (defclass cons-specializer (specializer) ((%car :reader %car :initarg :car))) (defclass cons-generalizer (generalizer) ((%car :reader %car :initarg :car))) -(defmethod generalizer-of-using-class ((gf cons-generic-function) arg) +(defmethod generalizer-of-using-class + ((gf cons-generic-function) arg) (typecase arg - ((cons symbol) (make-instance 'cons-generalizer :car (car arg))) + ((cons symbol) + (make-instance 'cons-generalizer + :car (car arg))) (t (call-next-method)))) -(defmethod generalizer-equal-hash-key ((gf cons-generic-function) - (g cons-generalizer)) +(defmethod generalizer-equal-hash-key + ((gf cons-generic-function) + (g cons-generalizer)) (%car g)) -(defmethod specializer-accepts-generalizer-p ((gf cons-generic-function) - (s cons-specializer) - (g cons-generalizer)) +(defmethod specializer-accepts-generalizer-p + ((gf cons-generic-function) + (s cons-specializer) + (g cons-generalizer)) (if (eql (%car s) (%car g)) (values t t) (values nil t))) -(defmethod specializer-accepts-p ((s cons-specializer) o) +(defmethod specializer-accepts-p + ((s cons-specializer) o) (and (consp o) (eql (car o) (%car s)))) +#+end_src -#| less interesting methods elided: jmoringe: (un)parsing, specializer) - (t '<)))))) - -;; here are the only methods that actually know about TBNL -(defmethod generalizer-of-using-class ((gf accept-generic-function) (arg tbnl:request)) + ((gf accept-generic-function) + (g accept-generalizer)) + `(accept-generalizer ,(header g))) +(defmethod specializer-accepts-generalizer-p + ((gf accept-generic-function) + (s accept-specializer) + (g accept-generalizer)) + (values (q (media-type s) (tree g)) t)) +(defmethod specializer-accepts-generalizer-p + ((gf accept-generic-function) + (s specializer) + (g accept-generalizer)) + (specializer-accepts-generalizer-p + gf s (next g))) + +(defmethod specializer< + ((gf accept-generic-function) + (s1 accept-specializer) + (s2 accept-specializer) + (g accept-generalizer)) + (let ((m1 (media-type s1)) + (m2 (media-type s2)) + (tree (tree g))) + (cond + ((string= m1 m2) '=) + (t (let ((q1 (q m1 tree))) + (q2 (q m2 tree)))) + (cond + ((= q1 q2) '=) + ((< q1 q2) '>) + (t '<)))))) +#+end_src + + The metaprogrammer can then add support for objects representing + client requests, such as instances of the =request= class in the + Hunchentoot[fn:7] web server, by translating these into + =accept-generalizer= instances. The code below implements this, by + defining the computation of a =generalizer= object for a given + request, and specifying how to compute whether the specializer + accepts the given request object (=q= returns a number between 0 + and 1 if any pattern in the =tree= matches the media type, and + =nil= if the media type cannot be matched at all). + +#+begin_src +(defmethod generalizer-of-using-class + ((gf accept-generic-function) + (arg tbnl:request)) (make-instance 'accept-generalizer :header (tbnl:header-in :accept arg) - :next (class-of arg))) -(defmethod specializer-accepts-p ((specializer accept-specializer) (obj tbnl:requ -est)) - (q (media-type specializer) (parse-accept-string (tbnl:header-in :accept obj))) -) - -;; we can define methods on STRING too, for debugging/simulation purposes -(defmethod generalizer-of-using-class ((gf accept-generic-function) (s string)) + :next (call-next-method))) +(defmethod specializer-accepts-p + ((s accept-specializer) + (o tbnl:request)) + (let* ((accept (tbnl:header-in :accept o)) + (tree (parse-accept-string accept)) + (q (q (media-type s) tree))) + (and q (> q 0)))) +#+end_src + + This dispatch cannot be implemented using filtered dispatch, except + by generating anonymous classes with all the right mime-types as + direct superclasses in dispatch order; the filter would generate +#+begin_src lisp +(ensure-class nil :direct-superclasses + '(text/html image/webp ...)) +#+end_src + and dispatch would operate using those anonymous classes. While + this is possible to do, it is awkward to express content-type + negotiation in this way, as it means that the dispatcher must know + about the universe of mime-types that clients might declare that + they accept, rather than merely the set of mime-types that a + particular generic function is capable of serving; handling + wildcards in accept strings is particularly awkward in the + filtering paradigm. + + Note that in this example, the method on =specializer<= involves a + non-trivial ordering of methods based on the =q= values specified + in the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a + single extended specializer could be applicable to any given + argument). + + Also note that the accept specializer protocol is straightforwardly + extensible to other suitable objects; for example, one simple + debugging aid is to define that an =accept-specializer= should be + applicable to =string= objects. This can be done in a modular + fashion (see the code below, which can be completely disconnected + from the code for Hunchentoot request objects), and generalizes to + dealing with multiple web server libraries, so that + content-negotiation methods are applicable to each web server's + request objects. + +#+begin_src lisp +(defmethod generalizer-of-using-class + ((gf accept-generic-function) + (s string)) (make-instance 'accept-generalizer :header s - :next (class-of s))) -(defmethod specializer-accepts-p ((s accept-specializer) (string string)) - (q (media-type s) (parse-accept-string string))) + :next (call-next-method))) +(defmethod specializer-accepts-p + ((s accept-specializer) (o string)) + (let* ((tree (parse-accept-string o)) + (q (q (media-type s) tree))) + (and q (> q 0)))) #+end_src - jmoringe: The role of =accept-generalizer.tree= and the =q= - function are hard to understand and may require some - explanation. However, the example with its distinct, asymmetric - specializers/generalizers, =accept-generalizer.next= and - =specializer<= is likely worth it. - -** Pattern / xpattern / regex / optima + The =next= slot in the =accept-generalizer= is used to deal with + the case of methods specialized on the classes of objects as well + as on the acceptable media types; there is a method on + =specializer-accepts-generalizer-p= for specializers that are not + of type =accept-specializer= which calls the generic function again + with the next generalizer, so that methods specialized on the + classes =tbnl:request= and =string= are treated as applicable to + corresponding objects, though less specific than methods with + =accept-specializer= specializations. + +** COMMENT Pattern / xpattern / regex / optima Here's the /really/ interesting bit, but on the other hand we're probably going to run out of space, and the full description of these is going to take us into =make-method-lambda= territory. A second paper? Future work? * Protocol -** Generalizer - - [ ] =generalizer-of-using-class= (NB class of gf not class of object) - - [ ] =compute-applicable-methods-using-generalizers= - - [ ] =generalizer-equal-hash-key= - - [ ] =specializer-accepts-generalizer-p= - - [ ] =specializer-accepts-p= - - [ ] =specializer<= - jmoringe: If I remember correctly, closette has - =method-more-specific-p= should we aim for parity with that and - use =specializer-more-specific-p=? The downside would be that - =-p= indicates a Boolean return value which is not the case here. + :PROPERTIES: + :CUSTOM_ID: Protocol + :END: + + In section [[#Examples]], we have seen a number of code fragments as + partial implementations of particular non-standard method dispatch + strategies, using =generalizer= metaobjects to mediate between the + methods of the generic function and the actual arguments passed to + it. In section [[#Generalizer metaobjects]], we go into more detail + regarding these =generalizer= metaobjects, describing the generic + function invocation protocol in full, and showing how this protocol + allows a similar form of effective method cacheing as the standard + one does. In section [[#Generalizer performance]], we show the results + of some simple performance measurements on our implementation of + this protocol in the SBCL implementation \cite{Rhodes:2008} of + Common Lisp to highlight the improvement that this protocol can + bring over a naïve implementation of generalized dispatch, as well + as to make the potential for further improvement clear. + +** Generalizer metaobjects + :PROPERTIES: + :CUSTOM_ID: Generalizer metaobjects + :END: + +*** Generic function invocation + As in the standard generic function invocation protocol, the + generic function's actual functionality is provided by a + discriminating function. The functionality described in this + protocol is implemented by having a distinct subclass of + =standard-generic-function=, and a method on + =compute-discriminating-function= which produces a custom + discriminating function. The basic outline of the discriminating + function is the same as the standard one: it must first compute the + set of applicable methods given particular arguments; from that, it + must compute the effective method by combining the methods + appropriately according to the generic function's method + combination; finally, it must call the effective method with the + arguments. + + Computing the set of applicable methods is done using a pair of + functions: =compute-applicable-methods=, the standard metaobject + function, and a new function + =compute-applicable-methods-using-generalizers=. We define a + custom method on =compute-applicable-methods= which tests the + applicability of a particular specializer against a given argument + using =specializer-accepts-p=, a new protocol function with + default implementations on =class= and =eql-specializer= to + implement the expected behaviour. To order the methods, as + required by the protocol, we define a pairwise comparison operator + =specializer<= which defines an ordering between specializers for + a given generalizer argument (remembering that even in standard + CLOS the ordering between =class= specializers can change + depending on the actual class of the argument). + + The new =compute-applicable-methods-using-generalizers= is the + analogue of the MOP's =compute-applicable-methods-using-classes=. + Instead of calling it with the =class-of= each argument, we + compute the generalizers of each argument using the new function + =generalizer-of-using-class= (where the =-using-class= refers to + the class of the generic function rather than the class of the + object), and call =compute-applicable-methods-using-generalizers= + with the generic function and list of generalizers. As with + =compute-applicable-methods-using-classes=, a secondary return + value indicates whether the result of the function is definitive + for that list of generalizers. + + Thus, in generic function invocation, we first compute the + generalizers of the arguments; we compute the ordered set of + applicable methods, either from the generalizers or (if that is + not definitive) from the arguments themselves; then the normal + effective method computation and call can occur. Unfortunately, + the nature of an effective method function is not specified, so we + have to reach into implementation internals a little in order to + call it, but otherwise the remainder of the generic function + invocation protocol is unchanged from the standard one. In + particular, method combination is completely unchanged; + programmers can choose arbitrary method combinations, including + user-defined long form combinations, for their generic functions + involving generalized dispatch. + +*** Effective method memoization + :PROPERTIES: + :CUSTOM_ID: Memoization + :END: + The potential efficiency benefit to having =generalizer= + metaobjects lies in the use of + =compute-applicable-methods-using-generalizers=. If a particular + generalized specializer accepts a variety of objects (such as the + =signum= specializer accepting all reals with a given sign, or the + =accept= specializer accepting all HTTP requests with a particular + =Accept= header), then there is the possibility of cacheing and + reusing the results of the applicable and effective method + computation. If the computation of the applicable method from + =compute-applicable-methods-using-generalizers= is definitive, + then the ordered set of applicable methods and the effective + method can be cached. + + One issue is what to use as the key for that cache. We cannot use + the generalizers themselves, as two generalizers that should be + considered equal for cache lookup will not compare as =equal= – + and indeed even the standard generalizer, the =class=, cannot + easily be used as we must be able to invalidate cache entries upon + class redefinition. The issue of =class= generalizers we can + solve as in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= + of a class, which is distinct for each distinct (re)definition of + a class; for arbitrary generalizers, however, there is /a priori/ + no good way of computing a suitable hash key automatically, so we + allow the metaprogrammer to specify one by defining a method on + =generalizer-equal-hash-key=, and combining the hash keys for all + required arguments in a list to use as a key in an =equal= + hash-table. + +#+begin_comment + [could we actually compute a suitable hash key using the + generalizer's class name and initargs?] +#+end_comment + +*** COMMENT + - [X] =generalizer-of-using-class= (NB class of gf not class of object) + - [X] =compute-applicable-methods-using-generalizers= + - [X] =generalizer-equal-hash-key= + - [X] =specializer-accepts-generalizer-p= + - [X] =specializer-accepts-p= + - [X] =specializer<= +** Performance + :PROPERTIES: + :CUSTOM_ID: Generalizer performance + :END: + We have argued that the protocol presented here allows for + expressive control of method dispatch while preserving the + possibility of efficiency. In this section, we quantify the + efficiency that the memoization protocol described in section + [[#Memoization]] achieves, by comparing it both to the same protocol + with no memoization, as well as with equivalent dispatch + implementations in the context of methods with regular specializers + (in an implementation similar to that in + \cite{Kiczales.Rodriguez:1990}), and with implementation in + straightforward functions. We performed our benchmarks on a + quad-core X-series ThinkPad with 8GB of RAM running Debian + GNU/Linux, and took the mean of the 10 central samples of 20 runs, + with the number of iterations per run chosen so as to take + substantially over the clock resolution for the fastest case. + Despite these precautions, we advise against reading too much into + these numbers, which are best used as an order-of-magnitude + estimate. + + In the case of the =cons-specializer=, we benchmark the walker + acting on a small but non-trivial form. The implementation + strategies in the table below refer to: an implementation in a + single function with a large =typecase= to dispatch between all the + cases; the natural implementation in terms of a standard generic + function with multiple methods (the method on =cons= having a + slightly reduced =typecase= to dispatch on the first element, and + other methods handling =symbol= and other atoms); and three + separate cases using =cons-specializer= objects. As well as + measuring the effect of memoization against the full invocation + protocol, we can also introduce a special case: when only one + argument participates in method selection (all the other required + arguments only being specialized on =t=), we can avoid the + construction of a list of hash keys and simply use the key + from the single active generalizer directly. + + | implementation | time (µs/call) | overhead | + |-----------------------+----------------+----------| + | function | 3.17 | | + | standard-gf/methods | 3.6 | +14% | + | cons-gf/one-arg-cache | 7.4 | +130% | + | cons-gf | 15 | +370% | + | cons-gf/no-cache | 90 | +2700% | + + The benchmarking results from this exercise are promising: in + particular, the introduction of the effective method cache speeds + up the use of generic specializers in this case by a factor of 6, + and the one-argument special case by another factor of 2. For this + workload, even the one-argument special case only gets to within a + factor of 2-3 of the function and standard generic function + implementations, but the overall picture is that the memoizability + in the protocol does indeed drastically reduce the overhead + compared with the full invocation. + + For the =signum-specializer= case, we choose to benchmark the + computation of 20!, because that is the largest factorial whose + answer fits in SBCL's 63-bit fixnums – in an attempt to measure the + worst case for generic dispatch, where the work done within the + methods is as small as possible without being meaningless, and in + particular does not cause heap allocation or garbage collection to + obscure the picture. + +#+begin_src lisp :exports none +(progn (gc :full t) (time (dotimes (i 10000) (%fact 20)))) +#+end_src + + | implementation | time (µs/call) | overhead | + |-------------------------+----------------+----------| + | function | 0.6 | | + | standard-gf/fixnum | 1.2 | +100% | + | signum-gf/one-arg-cache | 7.5 | +1100% | + | signum-gf | 23 | +3800% | + | signum-gf/no-cache | 240 | +41000% | + + The relative picture is similar to the =cons-specializer= case; + including a cache saves a factor of 10 in this case, and another + factor of 3 for the one-argument cache special case. The cost of + the genericity of the protocol here is starker; even the + one-argument cache is a factor of 6 slower than the standard + generic-function implementation, and a further factor of 2 away + from the implementation of factorial as a function. We discuss + ways in which we expect to be able to improve performance in + section [[#Future Work]]. + + We could allow the metaprogrammer to improve on the one-argument + performance by constructing a specialized cache: for =signum= + arguments of =rational= arguments, the logical cache structure is + to index a three-element vector with =(1+ signum)=. The current + protocol does not provide a way of eliding the two generic function + calls for the generic cache; we discuss possible approaches in + section [[#Conclusions]]. ** Full protocol - Description and specification left for reasons of space (we'll see?) - - [ ] =same-specializer-p= - - [ ] =parse/unparse-specializer-using-class= - - [ ] =make-method-specializers-form= - - [ ] jmoringe: In an email, I suggested - =make-specializer-form-using-class=: - - #+begin_quote - Could we change =make-method-specializers-form='s default - behaviour to call a new generic function - #+begin_src - make-specializer-form-using-class gf method name env - #+end_src - with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for - eql-specializers)? This would make it unnecessary to repeat - boilerplate along the lines of - #+begin_src lisp - (flet ((make-parse-form (name) - (if - - ))) - `(list ,@(mapcar #'make-parse-form specializer-names))) - #+end_src - for each generic function class. - #+end_quote - - [ ] =make-method-lambda= revision (use environment arg?) - - jmoringe: would only be relevant for pattern dispatch, right? I - think, we didn't finish the discussion regarding special - variables vs. environment vs. new protocol function + The protocol described in this paper is only part of a complete + protocol for =specializer= and =generalizer= metaobjects. Our + development of this protocol is as yet incomplete; the work + described here augments that in \cite{Newton.Rhodes:2008}, but is + yet relatively untested – and additionally our recent experience of + working with that earlier protocol suggests that there might be + useful additions to the handling of =specializer= metaobjects, + independent of the =generalizer= idea presented here. + +*** COMMENT ideas + Description and specification left for reasons of space (we'll see?) + - [ ] =same-specializer-p= + - [ ] =parse/unparse-specializer-using-class= + - [ ] =make-method-specializers-form= + - [ ] jmoringe: In an email, I suggested + =make-specializer-form-using-class=: + + #+begin_quote + Could we change =make-method-specializers-form='s default + behaviour to call a new generic function + #+begin_src + make-specializer-form-using-class gf method name env + #+end_src + with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for + eql-specializers)? This would make it unnecessary to repeat + boilerplate along the lines of + #+begin_src lisp + (flet ((make-parse-form (name) + (if + + ))) + `(list ,@(mapcar #'make-parse-form specializer-names))) + #+end_src + for each generic function class. + #+end_quote + - [ ] =make-method-lambda= revision (use environment arg?) + + jmoringe: would only be relevant for pattern dispatch, right? I + think, we didn't finish the discussion regarding special + variables vs. environment vs. new protocol function + * Related Work - - [ ] Newton/Rhodes - - [ ] filtered dispatch -- the point is that our work continues to - be useful in cases where there are unbounded numbers of - equivalence classes but each given invokation involves a small - number of methods. - - [ ] ContextL / context-oriented programming -- dispatch occurs on - hidden layer argument being an instance of an anonymous class with - suitably arranged superclasses -- OK because set of layers is - bounded and under programmer control - - [ ] http://soft.vub.ac.be/Publications/2010/vub-tr-soft-10-04.pdf - - [ ] http://soft.vub.ac.be/lambic/files/lambic-ilc09.pdf - - [ ] http://soft.vub.ac.be/Publications/2011/vub-soft-phd-11-03.pdf - - [ ] Prototypes with Multiple Dispatch - http://sauerbraten.org/lee/ecoop.pdf -- extension of Self-style - object system to handle multiple equally-privileged "receivers". - A good test case for our protocol; handled adequately with - generalizer being the tuple of (roles,delegations), with some - thought needed for method redefinitions but otherwise working - fine. - - [ ] Sheeple + :PROPERTIES: + :CUSTOM_ID: Related Work + :END: + + The work presented here builds on specializer-oriented programming + described in \cite{Newton.Rhodes:2008}. Approximately + contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was + introduced to address some of the same use cases: filtered dispatch + works by having a custom discriminating function which wraps the + usual one, where the wrapping function augments the set of + applicable methods with applicable methods from other (hidden) + generic functions, one per filter group; this step is not memoized, + and using =eql= methods to capture behaviours of equivalence classes + means that it is hard to see how it could be. The methods are then + combined using a custom method combination to mimic the standard + one; in principle implementors of other method combinations could + cater for filtered dispatch, but they would have to explicitly + modify their method combinations. The Clojure programming language + supports multimethods[fn:8] with a variant of filtered dispatch as + well as hierarchical and identity-based method selectors. + + In context-oriented programming + \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch + occurs by maintaining the context state as an anonymous class with + the superclasses representing all the currently active layers; this + is then passed as a hidden argument to context-aware functions. The + set of layers is known and under programmer control, as layers must + be defined beforehand. + + In some sense, all dispatch schemes are specializations of predicate + dispatch \cite{Ernst.etal:1998}. The main problem with predicate + dispatch is its expressiveness: with arbitrary predicates able to + control dispatch, it is essentially impossible to perform any + substantial precomputation, or even to automatically determine an + ordering of methods given a set of arguments. Even Clojure's + restricted dispatch scheme provides an explicit operator for stating + a preference order among methods, where here we provide an operator + to order specializers; in filtered dispatch the programmer + implicitly gives the system an order of precedence, through the + lexical ordering of filter specification in a filtered function + definition. + + The Slate programming environment combines prototype-oriented + programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in + that context, the analogue of an argument's class (in Common Lisp) + as a representation of the equivalence class of objects with the + same behaviour is the tuple of roles and delegations: objects with + the same roles and delegations tuple behave the same, much as + objects with the same generalizer have the same behaviour in the + protocol described in this paper. + + The idea of generalization is of course not new, and arises in other + contexts. Perhaps of particular interest is generalization in the + context of partial evaluation; for example, \cite{Ruf:1993} + considers generalization in online partial evaluation, where sets of + possible values are represented by a type system construct + representing an upper bound. Exploring the relationship between + generalizer metaobjects and approximation in type systems might + yield strategies for automatically computing suitable generalizers + and cache functions for a variety of forms of generalized dispatch. * Conclusions + :PROPERTIES: + :CUSTOM_ID: Conclusions + :END: + In this paper, we have presented a new generalizer metaobject + protocol allowing the metaprogrammer to implement in a + straightforward manner metaobjects to implement custom method + selection, rather than the standard method selection as standardized + in Common Lisp. This protocol seamlessly interoperates with the + rest of CLOS and Common Lisp in general; the programmer (the user of + the custom specializer metaobjects) may without constraints use + arbitrary method combination, intercede in effective method + combination, or write custom method function implementations. The + protocol is expressive, in that it handles forms of dispatch not + possible in more restricted dispatch systems, while not suffering + from the indeterminism present in predicate dispatch through the use + of explicit ordering predicates. + + The protocol is also reasonably efficient; the metaprogrammer can + indicate that a particular effective method computation can be + memoized, and under those circumstances much of the overhead is + amortized (though there remains a substantial overhead compared with + standard generic-function or regular function calls). We discuss + how the efficiency could be improved below. +** Future work + :PROPERTIES: + :CUSTOM_ID: Future Work + :END: + Although the protocol described in this paper allows for a more + efficient implementation, as described in section [[#Memoization]], + than computing the applicable and effective methods at each generic + function call, the efficiency is still some way away from a + baseline of the standard generic-function, let alone a standard + function. Most of the invocation protocol is memoized, but there + are still two full standard generic-function calls – + =generalizer-of-using-class= and =generalizer-equal-hash-key= – per + argument per call to a generic function with extended specializers, + not to mention a hash table lookup. + + For many applications, the additional flexibility afforded by + generalized specializers might be worth the cost in efficiency, but + it would still be worth investigating how much the overhead from + generalized specializers can be reduced; one possible avenue for + investigation is giving greater control over the cacheing strategy + to the metaprogrammer. + + As an example, consider the =signum-specializer=. The natural + cache structure for a single argument generic function specializing + on =signum= is probably a four-element vector, where the first + three elements hold the effective methods for =signum= values of + -1, 0, and 1, and the fourth holds the cached effective methods for + everything else. This would make the invocation of such functions + very fast for the (presumed) common case where the argument is in + fact a real number. We hope to develop and show the effectiveness + of an appropriate protocol to allow the metaprogrammer to construct + and exploit such cacheing strategies, and (more speculatively) to + implement the lookup of an effective method function in other ways. + + We also aim to demonstrate support within this protocol for some + particular cases of generalized specializers which seem to have + widespread demand (in as much as any language extension can be said + to be in “demand”). In particular, we have preliminary work + towards supporting efficient dispatch over pattern specializers + such as implemented in the \textsf{Optima} library[fn:9], and over + a prototype object system similar to that in Slate + \cite{Salzman.Aldrich:2005}. Our current source code for the work + described in this paper can be seen in the git source code + repository at [[http://christophe.rhodes.io/git/specializable.git]], + which will be updated with future developments. + + Finally, after further experimentation (and, ideally, non-trivial + use in production) if this protocol stands up to use as we hope, we + aim to produce a standards-quality document so that other + implementors of Common Lisp can, if they choose, independently + reimplement the protocol, and so that users can use the protocol + with confidence that the semantics will not change in a + backwards-incompatible fashion. ** Acknowledgments - We thank Lee Salzman, Pascal Costanza, Mikel Evins for their - helpful discussions + We thank the anonymous reviewers for their helpful suggestions and + comments on the submitted version of this paper. We also thank Lee + Salzman, Pascal Costanza and Mikel Evins for helpful and + informative discussions, and all the respondents to the first + author's request for imaginative uses for generalized specializers. + +\bibliographystyle{plain} +\bibliography{crhodes,specializers} + +* Footnotes + +[fn:1] GNU CLISP, at http://www.clisp.org/ + +[fn:2] Clozure Common Lisp, at http://ccl.clozure.com/ + +[fn:3] as in many of the systems surveyed at +https://sites.google.com/site/sabraonthehill/lisp-document-generation-apps + +[fn:4] e.g. CLSQL, at http://clsql.b9.com/ + +[fn:5] the \textsf{Closer to MOP} project, at + http://common-lisp.net/project/closer/, attempts to harmonize the + different implementations of the metaobject protocol in Common + Lisp. + +[fn:6] the tag =els2014-submission= in +http://christophe.rhodes.io/git/specializable.git corresponds to the +code repository at the point of submitting this paper. + +[fn:7] Hunchentoot is a web server written in Common Lisp, allowing +the user to write handler functions to compute responses to requests; +http://weitz.de/hunchentoot/ + +[fn:8] http://clojure.org/multimethods + +[fn:9] https://github.com/m2ym/optima + +