els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 (set (make-local-variable 'org-latex-title-command)
  18      "\\numberofauthors{3}
  19 \\author{
  20 \\alignauthor Christophe Rhodes\\\\
  21   \\affaddr{Department of Computing}\\\\
  22   \\affaddr{Goldsmiths, University of London}\\\\
  23   \\affaddr{London SE14 6NW}\\\\
  24   \\email{c.rhodes@gold.ac.uk}
  25 \\alignauthor Jan Moringen\\\\
  26   \\affaddr{Universität Bielefeld}\\\\
  27   \\affaddr{Technische Fakultät}\\\\
  28   \\affaddr{33594 Bielefeld}\\\\
  29   \\email{jmoringe@techfak.uni-bielefeld.de}
  30 \\alignauthor David Lichteblau\\\\
  31   \\affaddr{ZenRobotics Ltd}\\\\
  32   \\affaddr{Vilhonkatu 5 A}\\\\
  33   \\affaddr{FI-00100 Helsinki}\\\\
  34   \\email{david@lichteblau.com}
  35 }
  36 \\maketitle")
  37 #+end_src
  38
  39 #+begin_abstract
  40 This paper introduces a new metaobject, the generalizer, which
  41 complements the existing specializer metaobject.  With the help of
  42 examples, we show that this metaobject allows for the efficient
  43 implementation of complex non-class-based dispatch within the
  44 framework of existing metaobject protocols.  We present our
  45 modifications to the generic function invocation protocol from the
  46 /Art of the Metaobject Protocol/; in combination with previous work,
  47 this produces a fully-functional extension of the existing mechanism
  48 for method selection and combination, including support for method
  49 combination completely independent from method selection.  We discuss
  50 our implementation, within the SBCL implementation of Common Lisp, and
  51 in that context compare the performance of the new protocol with the
  52 standard one, demonstrating that the new protocol can be tolerably
  53 efficient.
  54 #+end_abstract
  55
  56 #+begin_LaTeX
  57 \begin{flushleft}
  58 Report-No.:~\url{http://eprints.gold.ac.uk/id/eprint/9924}
  59 \end{flushleft}
  60 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  61 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  62 \terms{Languages, Design}
  63 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  64 #+end_LaTeX
  65
  66 * Introduction
  67   The revisions to the original Common Lisp language \cite{CLtL}
  68   included the detailed specification of an object system, known as
  69   the Common Lisp Object System (CLOS), which was eventually
  70   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  71   The object system as presented to the standardization committee was
  72   formed of three chapters.  The first two chapters covered programmer
  73   interface concepts and the functions in the programmer interface
  74   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  75   final standard; the third chapter, covering a Metaobject Protocol
  76   (MOP) for CLOS, was not.
  77
  78   Nevertheless, the CLOS MOP continued to be developed, and the
  79   version documented in \cite{AMOP} has proven to be a reasonably
  80   robust design.  While many implementations have derived their
  81   implementations of CLOS from either the Closette illustrative
  82   implementation in \cite{AMOP}, or the Portable Common Loops
  83   implementation of CLOS from Xerox Parc, there have been largely
  84   from-scratch reimplementations of CLOS (in CLISP[fn:1] and
  85   CCL[fn:2], at least) incorporating substantial fractions of the
  86   Metaobject Protocol as described.
  87
  88   #+CAPTION:    MOP Design Space
  89   #+LABEL:      fig:mopdesign
  90   #+ATTR_LATEX: width=\linewidth float
  91   [[file:figures/mop-design-space.pdf]]
  92
  93   Although it has stood the test of time, the CLOS MOP is neither
  94   without issues (e.g. semantic problems with =make-method-lambda=
  95   \cite{Costanza.Herzeel:2008}; useful functions such as
  96   =compute-effective-slot-definition-initargs= being missing from the
  97   standard) nor is it a complete framework for the metaprogrammer to
  98   implement all conceivable variations of object-oriented behaviour.
  99   While metaprogramming offers some possibilities for customization of
 100   the object system behaviour, those possibilities cannot extend
 101   arbitrarily in all directions (conceptually, if a given object
 102   system is a point in design space, then a MOP for that object system
 103   allows exploration of a region of design space around that point;
 104   see figure \ref{fig:mopdesign}).  In the case of the CLOS MOP, there is
 105   still an expectation that functionality is implemented with methods
 106   on generic functions, acting on objects with slots; it is not
 107   possible, for example, to transparently implement support for
 108   “message not understood” as in the message-passing paradigm, because
 109   the analogue of messages (generic functions) need to be defined
 110   before they are used.
 111
 112   Nevertheless, the MOP is flexible, and is used for a number of
 113   things, including: documentation generation (where introspection in
 114   the MOP is used to extract information from a running system);
 115   object-relational mapping and other approaches to object
 116   persistence; alternative backing stores for slots (hash-tables or
 117   symbols); and programmatic construction of metaobjects, for example
 118   for IDL compilers and model transformations.
 119
 120   One area of functionality where there is scope for customization by
 121   the metaprogrammer is in the mechanics and semantics of method
 122   applicability and dispatch.  While in principle AMOP allows
 123   customization of dispatch in various different ways (the
 124   metaprogrammer can define methods on protocol functions such as
 125   =compute-applicable-methods=,
 126   =compute-applicable-methods-using-classes=), for example, in
 127   practice implementation support for this was weak until relatively
 128   recently[fn:3].
 129
 130   Another potential mechanism for customizing dispatch is implicit in
 131   the class structure defined by AMOP: standard specializer objects
 132   (instances of =class= and =eql-specializer=) are generalized
 133   instances of the =specializer= protocol class, and in principle
 134   there are no restrictions on the metaprogrammer constructing
 135   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 136   explored the potential for customizing generic function dispatch
 137   using extended specializers, but there the metaprogrammer must
 138   override the entirety of the generic function invocation protocol
 139   (from =compute-discriminating-function= on down), leading to toy
 140   implementations and duplicated effort.
 141
 142   This paper introduces a protocol for efficient and controlled
 143   handling of new subclasses of =specializer=.  In particular, it
 144   introduces the =generalizer= protocol class, which generalizes the
 145   return value of =class-of= in method applicability computation, and
 146   allows the metaprogrammer to hook into cacheing schemes to avoid
 147   needless recomputation of effective methods for sufficiently similar
 148   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 149
 150   #+CAPTION:    Dispatch Comparison
 151   #+LABEL:      fig:dispatch
 152   #+ATTR_LATEX: width=\linewidth float
 153   [[file:figures/dispatch-relationships.pdf]]
 154
 155   The remaining sections in this paper can be read in any order.  We
 156   give some motivating examples in section [[#Examples]], including
 157   reimplementations of examples from previous work, as well as
 158   examples which are poorly supported by previous protocols.  We
 159   describe the protocol itself in section [[#Protocol]], describing each
 160   protocol function in detail and, where applicable, relating it to
 161   existing protocol functions within the CLOS MOP.  We survey related
 162   work in more detail in section [[#Related Work]], touching on work on
 163   customized dispatch schemes in other environments.  Finally, we draw
 164   our conclusions from this work, and indicate directions for further
 165   development, in section [[#Conclusions]]; reading that section before the
 166   others indicates substantial trust in the authors' work.
 167 * Examples
 168   :PROPERTIES:
 169   :CUSTOM_ID: Examples
 170   :END:
 171   In this section, we present a number of examples of dispatch
 172   implemented using our protocol, which we describe in section
 173   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 174   this section do not include some of the necessary support code to
 175   run; complete implementations of each of these cases, along with the
 176   integration of this protocol into the SBCL implementation
 177   \cite{Rhodes:2008} of Common Lisp, are included in the authors'
 178   repository[fn:4].
 179
 180   A note on terminology: we will attempt to distinguish between the
 181   user of an individual case of generalized dispatch (the
 182   “programmer”), the implementor of a particular case of generalized
 183   dispatch (the “metaprogrammer”), and the authors as the designers
 184   and implementors of our generalized dispatch protocol (the
 185   “metametaprogammer”, or more likely “we”).
 186 ** CONS specializers
 187    :PROPERTIES:
 188    :CUSTOM_ID: Cons
 189    :END:
 190    One motivation for the use of generalized dispatch is in an
 191    extensible code walker: a new special form can be handled simply by
 192    writing an additional method on the walking generic function,
 193    seamlessly interoperating with all existing methods. In this
 194    use-case, dispatch is performed on the first element of lists.
 195    Semantically, we allow the programmer to specialize any argument of
 196    methods with a new kind of specializer, =cons-specializer=, which
 197    is applicable if and only if the corresponding object is a =cons=
 198    whose =car= is =eql= to the symbol associated with the
 199    =cons-specializer=; these specializers are more specific than the
 200    =cons= class, but less specific than an =eql-specializer= on any
 201    given =cons=.
 202
 203    The programmer code using these specializers is unchanged from
 204    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 205    here are: that the separation of concerns is complete – method
 206    selection is independent of method combination – and that the
 207    protocol allows for efficient implementation where possible, even
 208    when method selection is customized.  In an application such as
 209    walking source code, we would expect to encounter special forms
 210    (distinguished by particular atoms in the =car= position) multiple
 211    times, and hence to dispatch to the same effective method
 212    repeatedly.  We discuss the efficiency aspects of the protocol in
 213    more detail in section [[#Memoization]]; we present the metaprogrammer
 214    code to implement the =cons-specializer= below.
 215
 216 #+begin_src lisp
 217 (defclass cons-specializer (specializer)
 218   ((%car :reader %car :initarg :car)))
 219 (defclass cons-generalizer (generalizer)
 220   ((%car :reader %car :initarg :car)))
 221 (defmethod generalizer-of-using-class
 222     ((gf cons-generic-function) arg)
 223   (typecase arg
 224     ((cons symbol)
 225      (make-instance 'cons-generalizer
 226                     :car (car arg)))
 227     (t (call-next-method))))
 228 (defmethod generalizer-equal-hash-key
 229     ((gf cons-generic-function)
 230      (g cons-generalizer))
 231   (%car g))
 232 (defmethod specializer-accepts-generalizer-p
 233     ((gf cons-generic-function)
 234      (s cons-specializer)
 235      (g cons-generalizer))
 236   (if (eql (%car s) (%car g))
 237       (values t t)
 238       (values nil t)))
 239 (defmethod specializer-accepts-p
 240     ((s cons-specializer) o)
 241   (and (consp o) (eql (car o) (%car s))))
 242 #+end_src
 243
 244 The code above shows a minimal use of our protocol.  We have elided
 245 some support code for parsing and unparsing specializers, and for
 246 handling introspective functions such as finding generic functions for
 247 a given specializer.  We have also elided methods on the protocol
 248 functions =specializer<= and =same-specializer-p=; for
 249 =cons-specializer= objects, specializer ordering is trivial, as only
 250 one =cons-specializer= (up to equality) can ever be applicable to any
 251 given argument.  See section [[#Accept]] for a case where specializer
 252 ordering is non-trivial.
 253
 254 As in \cite{Newton.Rhodes:2008}, the programmer can use these
 255 specializers to implement a modular code walker, where they define one
 256 method per special operator.  We show two of those methods below, in
 257 the context of a walker which checks for unused bindings and uses of
 258 unbound variables.
 259
 260 #+begin_src
 261 (defgeneric walk (form env stack)
 262   (:generic-function-class cons-generic-function))
 263 (defmethod walk
 264     ((expr (cons lambda)) env call-stack)
 265   (let ((lambda-list (cadr expr))
 266         (body (cddr expr)))
 267     (with-checked-bindings
 268         ((bindings-from-ll lambda-list)
 269          env call-stack)
 270       (dolist (form body)
 271         (walk form env (cons form call-stack))))))
 272 (defmethod walk
 273     ((expr (cons let)) env call-stack)
 274   (flet ((let-binding (x)
 275            (walk (cadr x) env
 276                  (cons (cadr x) call-stack))
 277            (cons (car x)
 278                  (make-instance 'binding))))
 279     (with-checked-bindings
 280         ((mapcar #'let-binding (cadr expr))
 281           env call-stack)
 282       (dolist (form (cddr expr))
 283         (walk form env (cons form call-stack))))))
 284 #+end_src
 285
 286    Note that in this example there is no strict need for
 287    =cons-specializer= and =cons-generalizer= to be distinct classes.
 288    In standard generic function dispatch, the =class= functions both
 289    as the specializer for methods and as the generalizer for generic
 290    function arguments; we can think of the dispatch implemented by
 291    =cons-specializer= objects as providing for subclasses of the
 292    =cons= class distinguished by the =car= of the =cons=.  This
 293    analogy also characterizes those use cases where the metaprogrammer
 294    could straightforwardly use filtered dispatch
 295    \cite{Costanza.etal:2008} to implement their dispatch semantics.
 296    We will see in section [[#Accept]] an example of a case where filtered
 297    dispatch is incapable of straightforwardly expressing the dispatch,
 298    but first we present our implementation of the motivating case from
 299    \cite{Costanza.etal:2008}.
 300 ** SIGNUM specializers
 301    :PROPERTIES:
 302    :CUSTOM_ID: Signum
 303    :END:
 304    Our second example of the implementation and use of generalized
 305    specializers is a reimplementation of one of the examples in
 306    \cite{Costanza.etal:2008}: specifically, the factorial function.
 307    Here, dispatch will be performed based on the =signum= of the
 308    argument, and again, at most one method with a =signum= specializer
 309    will be applicable to any given argument, which makes the structure
 310    of the specializer implementation very similar to the =cons=
 311    specializers in the previous section.
 312
 313    The metaprogrammer has chosen in the example below to compare
 314    signum values using \texttt{=}, which means that a method with
 315    specializer =(signum 1)= will be applicable to positive
 316    floating-point arguments (see the first method on
 317    =specializer-accepts-generalizer-p= and the method on
 318    =specializer-accepts-p= below).  This leads to one subtle
 319    difference in behaviour compared to that of the =cons=
 320    specializers: in the case of =signum= specializers, the /next/
 321    method after any =signum= specializer can be different, depending
 322    on the class of the argument.  This aspect of the dispatch is
 323    handled by the second method on =specializer-accepts-generalizer-p=
 324    below.
 325
 326 #+begin_src lisp
 327 (defclass signum-specializer (specializer)
 328   ((%signum :reader %signum :initarg :signum)))
 329 (defclass signum-generalizer (generalizer)
 330   ((%signum :reader %signum :initarg :signum)))
 331 (defmethod generalizer-of-using-class
 332     ((gf signum-generic-function) (arg real))
 333   (make-instance 'signum-generalizer
 334                  :signum (signum arg)))
 335 (defmethod generalizer-equal-hash-key
 336     ((gf signum-generic-function)
 337      (g signum-generalizer))
 338   (%signum g))
 339 (defmethod specializer-accepts-generalizer-p
 340     ((gf signum-generic-function)
 341      (s signum-specializer)
 342      (g signum-generalizer))
 343   (if (= (%signum s) (%signum g))
 344       (values t t)
 345       (values nil t)))
 346
 347 (defmethod specializer-accepts-generalizer-p
 348     ((gf signum-generic-function)
 349      (s specializer)
 350      (g signum-generalizer))
 351   (specializer-accepts-generalizer-p
 352    gf s (class-of (%signum g))))
 353
 354 (defmethod specializer-accepts-p
 355     ((s signum-specializer) o)
 356   (and (realp o) (= (%signum s) (signum o))))
 357 #+end_src
 358
 359    Given these definitions, and once again some more straightforward
 360    ones elided for reasons of space, the programmer can implement the
 361    factorial function as follows:
 362
 363 #+begin_src lisp
 364 (defgeneric fact (n)
 365   (:generic-function-class signum-generic-function))
 366 (defmethod fact ((n (signum 0))) 1)
 367 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 368 #+end_src
 369
 370    The programmer does not need to include a method on =(signum -1)=,
 371    as the standard =no-applicable-method= protocol will automatically
 372    apply to negative real or non-real arguments.
 373 ** Accept HTTP header specializers
 374    :PROPERTIES:
 375    :CUSTOM_ID: Accept
 376    :END:
 377    In this section, we implement a non-trivial form of dispatch.  The
 378    application in question is a web server, and specifically to allow
 379    the programmer to support RFC 2616 \cite{rfc2616} content
 380    negotiation, of particular interest to publishers and consumers of
 381    REST-style Web APIs.
 382
 383    The basic mechanism in content negotiation is as follows: the web
 384    client sends an HTTP request with an =Accept= header, which is a
 385    string describing the media types it is willing to receive as a
 386    response to the request, along with numerical preferences.  The web
 387    server compares these stated client preferences with the resources
 388    it has available to satisfy this request, and sends the best
 389    matching resource in its response.
 390
 391    For example, a graphical web browser might send an =Accept= header
 392    of =text/html,application/xml;q=0.9,*/*;q=0.8= for a request of a
 393    resource typed in to the URL bar.  This should be interpreted as
 394    meaning that: if the server can provide content of type =text/html=
 395    (i.e. HTML) for that resource, then it should do so.  Otherwise, if
 396    it can provide =application/xml= content (i.e. XML of any schema),
 397    then that should be provided; failing that, any other content type
 398    is acceptable.
 399
 400    In the case where there are static files on the filesystem, and the
 401    web server must merely select between them, there is not much more
 402    to say.  However, it is not unusual for a web service to be backed
 403    by some other form of data, and responses computed and sent on the
 404    fly, and in these circumstances the web server must compute which
 405    of its known output formats it can use to satisfy the request
 406    before actually generating the best matching response.  This can be
 407    modelled as one generic function responsible for generating the
 408    response, with methods corresponding to content-types -- and the
 409    generic function must then perform method selection against the
 410    request's =Accept= header to compute the appropriate response.
 411
 412    The =accept-specializer= below implements this dispatch.  It depends
 413    on a lazily-computed =tree= slot to represent the information in
 414    the accept header (generated by =parse-accept-string=), and a
 415    function =q= to compute the (defaulted) preference level for a
 416    given content-type and =tree=; then, method selection and ordering
 417    involves finding the =q= for each =accept-specializer='s content
 418    type given the =tree=, and sorting them according to the preference
 419    level.
 420
 421 #+begin_src lisp
 422 (defclass accept-specializer (specializer)
 423   ((media-type :initarg :media-type :reader media-type)))
 424 (defclass accept-generalizer (generalizer)
 425   ((header :initarg :header :reader header)
 426    (tree)
 427    (next :initarg :next :reader next)))
 428 (defmethod generalizer-equal-hash-key
 429     ((gf accept-generic-function)
 430      (g accept-generalizer))
 431   `(accept-generalizer ,(header g)))
 432 (defmethod specializer-accepts-generalizer-p
 433     ((gf accept-generic-function)
 434      (s accept-specializer)
 435      (g accept-generalizer))
 436   (values (q (media-type s) (tree g)) t))
 437 (defmethod specializer-accepts-generalizer-p
 438     ((gf accept-generic-function)
 439      (s specializer)
 440      (g accept-generalizer))
 441   (specializer-accepts-generalizer-p
 442    gf s (next g)))
 443
 444 (defmethod specializer<
 445     ((gf accept-generic-function)
 446      (s1 accept-specializer)
 447      (s2 accept-specializer)
 448      (g accept-generalizer))
 449   (let ((m1 (media-type s1))
 450         (m2 (media-type s2))
 451         (tree (tree g)))
 452     (cond
 453       ((string= m1 m2) '=)
 454       (t (let ((q1 (q m1 tree)))
 455                (q2 (q m2 tree))))
 456            (cond
 457              ((= q1 q2) '=)
 458              ((< q1 q2) '>)
 459              (t '<))))))
 460 #+end_src
 461
 462    The metaprogrammer can then add support for objects representing
 463    client requests, such as instances of the =request= class in the
 464    Hunchentoot[fn:5] web server, by translating these into
 465    =accept-generalizer= instances.  The code below implements this, by
 466    defining the computation of a =generalizer= object for a given
 467    request, and specifying how to compute whether the specializer
 468    accepts the given request object (=q= returns a number between 0
 469    and 1 if any pattern in the =tree= matches the media type, and
 470    =nil= if the media type cannot be matched at all).
 471
 472 #+begin_src
 473 (defmethod generalizer-of-using-class
 474     ((gf accept-generic-function)
 475      (arg tbnl:request))
 476   (make-instance 'accept-generalizer
 477                  :header (tbnl:header-in :accept arg)
 478                  :next (call-next-method)))
 479 (defmethod specializer-accepts-p
 480     ((s accept-specializer)
 481      (o tbnl:request))
 482   (let* ((accept (tbnl:header-in :accept o))
 483          (tree (parse-accept-string accept))
 484          (q (q (media-type s) tree)))
 485     (and q (> q 0))))
 486 #+end_src
 487
 488    This dispatch cannot be implemented using filtered dispatch, except
 489    by generating anonymous classes with all the right mime-types as
 490    direct superclasses in dispatch order; the filter would generate
 491 #+begin_src lisp
 492 (ensure-class nil :direct-superclasses
 493  '(text/html image/webp ...))
 494 #+end_src
 495    and dispatch would operate using those anonymous classes.  While
 496    this is possible to do, it is awkward to express content-type
 497    negotiation in this way, as it means that the dispatcher must know
 498    about the universe of mime-types that clients might declare that
 499    they accept, rather than merely the set of mime-types that a
 500    particular generic function is capable of serving; handling
 501    wildcards in accept strings is particularly awkward in the
 502    filtering paradigm.
 503
 504    Note that in this example, the method on =specializer<= involves a
 505    non-trivial ordering of methods based on the =q= values specified
 506    in the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 507    single extended specializer could be applicable to any given
 508    argument).
 509
 510    Also note that the accept specializer protocol is straightforwardly
 511    extensible to other suitable objects; for example, one simple
 512    debugging aid is to define that an =accept-specializer= should be
 513    applicable to =string= objects.  This can be done in a modular
 514    fashion (see the code below, which can be completely disconnected
 515    from the code for Hunchentoot request objects), and generalizes to
 516    dealing with multiple web server libraries, so that
 517    content-negotiation methods are applicable to each web server's
 518    request objects.
 519
 520 #+begin_src lisp
 521 (defmethod generalizer-of-using-class
 522     ((gf accept-generic-function)
 523      (s string))
 524   (make-instance 'accept-generalizer
 525                  :header s
 526                  :next (call-next-method)))
 527 (defmethod specializer-accepts-p
 528     ((s accept-specializer) (o string))
 529   (let* ((tree (parse-accept-string o))
 530          (q (q (media-type s) tree)))
 531     (and q (> q 0))))
 532 #+end_src
 533
 534    The =next= slot in the =accept-generalizer= is used to deal with
 535    the case of methods specialized on the classes of objects as well
 536    as on the acceptable media types; there is a method on
 537    =specializer-accepts-generalizer-p= for specializers that are not
 538    of type =accept-specializer= which calls the generic function again
 539    with the next generalizer, so that methods specialized on the
 540    classes =tbnl:request= and =string= are treated as applicable to
 541    corresponding objects, though less specific than methods with
 542    =accept-specializer= specializations.
 543
 544 ** COMMENT Pattern / xpattern / regex / optima
 545    Here's the /really/ interesting bit, but on the other hand we're
 546    probably going to run out of space, and the full description of
 547    these is going to take us into =make-method-lambda= territory.
 548    A second paper?  Future work?
 549 * Protocol
 550   :PROPERTIES:
 551   :CUSTOM_ID: Protocol
 552   :END:
 553
 554   In section [[#Examples]], we have seen a number of code fragments as
 555   partial implementations of particular non-standard method dispatch
 556   strategies, using =generalizer= metaobjects to mediate between the
 557   methods of the generic function and the actual arguments passed to
 558   it.  In section [[#Generalizer metaobjects]], we go into more detail
 559   regarding these =generalizer= metaobjects, describing the generic
 560   function invocation protocol in full, and showing how this protocol
 561   allows a similar form of effective method cacheing as the standard
 562   one does.  In section [[#Generalizer performance]], we show the results
 563   of some simple performance measurements on our implementation of
 564   this protocol in the SBCL implementation \cite{Rhodes:2008} of
 565   Common Lisp to highlight the improvement that this protocol can
 566   bring over a naïve implementation of generalized dispatch, as well
 567   as to make the potential for further improvement clear.
 568
 569 ** Generalizer metaobjects
 570    :PROPERTIES:
 571    :CUSTOM_ID: Generalizer metaobjects
 572    :END:
 573
 574 *** Generic function invocation
 575     As in the standard generic function invocation protocol, the
 576     generic function's actual functionality is provided by a
 577     discriminating function.  The functionality described in this
 578     protocol is implemented by having a distinct subclass of
 579     =standard-generic-function=, and a method on
 580     =compute-discriminating-function= which produces a custom
 581     discriminating function.  The basic outline of the discriminating
 582     function is the same as the standard one: it must first compute the
 583     set of applicable methods given particular arguments; from that, it
 584     must compute the effective method by combining the methods
 585     appropriately according to the generic function's method
 586     combination; finally, it must call the effective method with the
 587     arguments.
 588
 589     Computing the set of applicable methods is done using a pair of
 590     functions: =compute-applicable-methods=, the standard metaobject
 591     function, and a new function
 592     =compute-applicable-methods-using-generalizers=.  We define a
 593     custom method on =compute-applicable-methods= which tests the
 594     applicability of a particular specializer against a given argument
 595     using =specializer-accepts-p=, a new protocol function with
 596     default implementations on =class= and =eql-specializer= to
 597     implement the expected behaviour.  To order the methods, as
 598     required by the protocol, we define a pairwise comparison operator
 599     =specializer<= which defines an ordering between specializers for
 600     a given generalizer argument (remembering that even in standard
 601     CLOS the ordering between =class= specializers can change
 602     depending on the actual class of the argument).
 603
 604     The new =compute-applicable-methods-using-generalizers= is the
 605     analogue of the MOP's =compute-applicable-methods-using-classes=.
 606     Instead of calling it with the =class-of= each argument, we
 607     compute the generalizers of each argument using the new function
 608     =generalizer-of-using-class= (where the =-using-class= refers to
 609     the class of the generic function rather than the class of the
 610     object), and call =compute-applicable-methods-using-generalizers=
 611     with the generic function and list of generalizers.  As with
 612     =compute-applicable-methods-using-classes=, a secondary return
 613     value indicates whether the result of the function is definitive
 614     for that list of generalizers.
 615
 616     Thus, in generic function invocation, we first compute the
 617     generalizers of the arguments; we compute the ordered set of
 618     applicable methods, either from the generalizers or (if that is
 619     not definitive) from the arguments themselves; then the normal
 620     effective method computation and call can occur.  Unfortunately,
 621     the nature of an effective method function is not specified, so we
 622     have to reach into implementation internals a little in order to
 623     call it, but otherwise the remainder of the generic function
 624     invocation protocol is unchanged from the standard one.  In
 625     particular, method combination is completely unchanged;
 626     programmers can choose arbitrary method combinations, including
 627     user-defined long form combinations, for their generic functions
 628     involving generalized dispatch.
 629
 630 *** Effective method memoization
 631     :PROPERTIES:
 632     :CUSTOM_ID: Memoization
 633     :END:
 634     The potential efficiency benefit to having =generalizer=
 635     metaobjects lies in the use of
 636     =compute-applicable-methods-using-generalizers=.  If a particular
 637     generalized specializer accepts a variety of objects (such as the
 638     =signum= specializer accepting all reals with a given sign, or the
 639     =accept= specializer accepting all HTTP requests with a particular
 640     =Accept= header), then there is the possibility of cacheing and
 641     reusing the results of the applicable and effective method
 642     computation.  If the computation of the applicable method from
 643     =compute-applicable-methods-using-generalizers= is definitive,
 644     then the ordered set of applicable methods and the effective
 645     method can be cached.
 646
 647     One issue is what to use as the key for that cache.  We cannot use
 648     the generalizers themselves, as two generalizers that should be
 649     considered equal for cache lookup will not compare as =equal= –
 650     and indeed even the standard generalizer, the =class=, cannot
 651     easily be used as we must be able to invalidate cache entries upon
 652     class redefinition.  The issue of =class= generalizers we can
 653     solve as in \cite{Kiczales.Rodriguez:1990} by using the =wrapper=
 654     of a class, which is distinct for each distinct (re)definition of
 655     a class; for arbitrary generalizers, however, there is /a priori/
 656     no good way of computing a suitable hash key automatically, so we
 657     allow the metaprogrammer to specify one by defining a method on
 658     =generalizer-equal-hash-key=, and combining the hash keys for all
 659     required arguments in a list to use as a key in an =equal=
 660     hash-table.
 661
 662 #+begin_comment
 663     [could we actually compute a suitable hash key using the
 664     generalizer's class name and initargs?]
 665 #+end_comment
 666
 667 *** COMMENT
 668     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 669     - [X] =compute-applicable-methods-using-generalizers=
 670     - [X] =generalizer-equal-hash-key=
 671     - [X] =specializer-accepts-generalizer-p=
 672     - [X] =specializer-accepts-p=
 673     - [X] =specializer<=
 674 ** Performance
 675    :PROPERTIES:
 676    :CUSTOM_ID: Generalizer performance
 677    :END:
 678    We have argued that the protocol presented here allows for
 679    expressive control of method dispatch while preserving the
 680    possibility of efficiency.  In this section, we quantify the
 681    efficiency that the memoization protocol described in section
 682    [[#Memoization]] achieves, by comparing it both to the same protocol
 683    with no memoization, as well as with equivalent dispatch
 684    implementations in the context of methods with regular specializers
 685    (in an implementation similar to that in
 686    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 687    straightforward functions.
 688
 689    In the case of the =cons-specializer=, we benchmark the walker
 690    acting on a small but non-trivial form.  The implementation
 691    strategies in the table below refer to: an implementation in a
 692    single function with a large =typecase= to dispatch between all the
 693    cases; the natural implementation in terms of a standard generic
 694    function with multiple methods (the method on =cons= having a
 695    slightly reduced =typecase= to dispatch on the first element, and
 696    other methods handling =symbol= and other atoms); and three
 697    separate cases using =cons-specializer= objects.  As well as
 698    measuring the effect of memoization against the full invocation
 699    protocol, we can also introduce a special case: when only one
 700    argument participates in method selection (all the other required
 701    arguments only being specialized on =t=), we can avoid the
 702    construction of a list of hash keys and simply use the key
 703    from the single active generalizer directly.
 704
 705    | implementation        | time (µs/call) | overhead |
 706    |-----------------------+----------------+----------|
 707    | function              |           3.17 |          |
 708    | standard-gf/methods   |            3.6 |     +14% |
 709    | cons-gf/one-arg-cache |            7.4 |    +130% |
 710    | cons-gf               |             15 |    +370% |
 711    | cons-gf/no-cache      |             90 |   +2700% |
 712
 713    The benchmarking results from this exercise are promising: in
 714    particular, the introduction of the effective method cache speeds
 715    up the use of generic specializers in this case by a factor of 6,
 716    and the one-argument special case by another factor of 2.  For this
 717    workload, even the one-argument special case only gets to within a
 718    factor of 2-3 of the function and standard generic function
 719    implementations, but the overall picture is that the memoizability
 720    in the protocol does indeed drastically reduce the overhead
 721    compared with the full invocation.
 722
 723    For the =signum-specializer= case, we choose to benchmark the
 724    computation of 20!, because that is the largest factorial whose
 725    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 726    worst case for generic dispatch, where the work done within the
 727    methods is as small as possible without being meaningless, and in
 728    particular does not cause heap allocation or garbage collection to
 729    obscure the picture.
 730
 731 #+begin_src lisp :exports none
 732 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 733 #+end_src
 734
 735    | implementation          | time (µs/call) | overhead |
 736    |-------------------------+----------------+----------|
 737    | function                |            0.6 |          |
 738    | standard-gf/fixnum      |            1.2 |    +100% |
 739    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 740    | signum-gf               |             23 |   +3800% |
 741    | signum-gf/no-cache      |            240 |  +41000% |
 742
 743    The relative picture is similar to the =cons-specializer= case;
 744    including a cache saves a factor of 10 in this case, and another
 745    factor of 3 for the one-argument cache special case.  The cost of
 746    the genericity of the protocol here is starker; even the
 747    one-argument cache is a factor of 6 slower than the standard
 748    generic-function implementation, and a further factor of 2 away
 749    from the implementation of factorial as a function.  We discuss
 750    ways in which we expect to be able to improve performance in
 751    section [[#Future Work]].
 752
 753    We could allow the metaprogrammer to improve on the one-argument
 754    performance by constructing a specialized cache: for =signum=
 755    arguments of =rational= arguments, the logical cache structure is
 756    to index a three-element vector with =(1+ signum)=.  The current
 757    protocol does not provide a way of eliding the two generic function
 758    calls for the generic cache; we discuss possible approaches in
 759    section [[#Conclusions]].
 760 ** Full protocol
 761    The protocol described in this paper is only part of a complete
 762    protocol for =specializer= and =generalizer= metaobjects.  Our
 763    development of this protocol is as yet incomplete; the work
 764    described here augments that in \cite{Newton.Rhodes:2008}, but is
 765    yet relatively untested – and additionally our recent experience of
 766    working with that earlier protocol suggests that there might be
 767    useful additions to the handling of =specializer= metaobjects,
 768    independent of the =generalizer= idea presented here.
 769
 770 *** COMMENT ideas
 771     Description and specification left for reasons of space (we'll see?)
 772     - [ ] =same-specializer-p=
 773     - [ ] =parse/unparse-specializer-using-class=
 774     - [ ] =make-method-specializers-form=
 775     - [ ] jmoringe: In an email, I suggested
 776       =make-specializer-form-using-class=:
 777
 778       #+begin_quote
 779       Could we change =make-method-specializers-form='s default
 780       behaviour to call a new generic function
 781       #+begin_src
 782         make-specializer-form-using-class gf method name env
 783       #+end_src
 784       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 785       eql-specializers)? This would make it unnecessary to repeat
 786       boilerplate along the lines of
 787       #+begin_src lisp
 788       (flet ((make-parse-form (name)
 789                (if <name-is-interesting>
 790                  <handle-interesting-specializer>
 791                  <repeat-handling-of-standard-specializers>)))
 792         `(list ,@(mapcar #'make-parse-form specializer-names)))
 793       #+end_src
 794       for each generic function class.
 795       #+end_quote
 796     - [ ] =make-method-lambda= revision (use environment arg?)
 797
 798       jmoringe: would only be relevant for pattern dispatch, right? I
 799       think, we didn't finish the discussion regarding special
 800       variables vs. environment vs. new protocol function
 801
 802 * Related Work
 803   :PROPERTIES:
 804   :CUSTOM_ID: Related Work
 805   :END:
 806
 807   The work presented here builds on specializer-oriented programming
 808   described in \cite{Newton.Rhodes:2008}.  Approximately
 809   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 810   introduced to address some of the same use cases: filtered dispatch
 811   works by having a custom discriminating function which wraps the
 812   usual one, where the wrapping function augments the set of
 813   applicable methods with applicable methods from other (hidden)
 814   generic functions, one per filter group; this step is not memoized,
 815   and using =eql= methods to capture behaviours of equivalence classes
 816   means that it is hard to see how it could be.  The methods are then
 817   combined using a custom method combination to mimic the standard
 818   one; in principle implementors of other method combinations could
 819   cater for filtered dispatch, but they would have to explicitly
 820   modify their method combinations.  The Clojure programming language
 821   supports multimethods[fn:6] with a variant of filtered dispatch as
 822   well as hierarchical and identity-based method selectors.
 823
 824   In context-oriented programming
 825   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 826   occurs by maintaining the context state as an anonymous class with
 827   the superclasses representing all the currently active layers; this
 828   is then passed as a hidden argument to context-aware functions.  The
 829   set of layers is known and under programmer control, as layers must
 830   be defined beforehand.
 831
 832   In some sense, all dispatch schemes are specializations of predicate
 833   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 834   dispatch is its expressiveness: with arbitrary predicates able to
 835   control dispatch, it is essentially impossible to perform any
 836   substantial precomputation, or even to automatically determine an
 837   ordering of methods given a set of arguments.  Even Clojure's
 838   restricted dispatch scheme provides an explicit operator for stating
 839   a preference order among methods, where here we provide an operator
 840   to order specializers; in filtered dispatch the programmer
 841   implicitly gives the system an order of precedence, through the
 842   lexical ordering of filter specification in a filtered function
 843   definition.
 844
 845   The Slate programming environment combines prototype-oriented
 846   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 847   that context, the analogue of an argument's class (in Common Lisp)
 848   as a representation of the equivalence class of objects with the
 849   same behaviour is the tuple of roles and delegations: objects with
 850   the same roles and delegations tuple behave the same, much as
 851   objects with the same generalizer have the same behaviour in the
 852   protocol described in this paper.
 853
 854   The idea of generalization is of course not new, and arises in other
 855   contexts.  Perhaps of particular interest is generalization in the
 856   context of partial evaluation; for example, \cite{Ruf:1993}
 857   considers generalization in online partial evaluation, where sets of
 858   possible values are represented by a type system construct
 859   representing an upper bound.  Exploring the relationship between
 860   generalizer metaobjects and approximation in type systems might
 861   yield strategies for automatically computing suitable generalizers
 862   and cache functions for a variety of forms of generalized dispatch.
 863 * Conclusions
 864   :PROPERTIES:
 865   :CUSTOM_ID: Conclusions
 866   :END:
 867   In this paper, we have presented a new generalizer metaobject
 868   protocol allowing the metaprogrammer to implement in a
 869   straightforward manner metaobjects to implement custom method
 870   selection, rather than the standard method selection as standardized
 871   in Common Lisp.  This protocol seamlessly interoperates with the
 872   rest of CLOS and Common Lisp in general; the programmer (the user of
 873   the custom specializer metaobjects) may without constraints use
 874   arbitrary method combination, intercede in effective method
 875   combination, or write custom method function implementations.  The
 876   protocol is expressive, in that it handles forms of dispatch not
 877   possible in more restricted dispatch systems, while not suffering
 878   from the indeterminism present in predicate dispatch through the use
 879   of explicit ordering predicates.
 880
 881   The protocol is also reasonably efficient; the metaprogrammer can
 882   indicate that a particular effective method computation can be
 883   memoized, and under those circumstances much of the overhead is
 884   amortized (though there remains a substantial overhead compared with
 885   standard generic-function or regular function calls).  We discuss
 886   how the efficiency could be improved below.
 887 ** Future work
 888    :PROPERTIES:
 889    :CUSTOM_ID: Future Work
 890    :END:
 891    Although the protocol described in this paper allows for a more
 892    efficient implementation, as described in section [[#Memoization]],
 893    than computing the applicable and effective methods at each generic
 894    function call, the efficiency is still some way away from a
 895    baseline of the standard generic-function, let alone a standard
 896    function.  Most of the invocation protocol is memoized, but there
 897    are still two full standard generic-function calls –
 898    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 899    argument per call to a generic function with extended specializers,
 900    not to mention a hash table lookup.
 901
 902    For many applications, the additional flexibility afforded by
 903    generalized specializers might be worth the cost in efficiency, but
 904    it would still be worth investigating how much the overhead from
 905    generalized specializers can be reduced; one possible avenue for
 906    investigation is giving greater control over the cacheing strategy
 907    to the metaprogrammer.
 908
 909    As an example, consider the =signum-specializer=.  The natural
 910    cache structure for a single argument generic function specializing
 911    on =signum= is probably a four-element vector, where the first
 912    three elements hold the effective methods for =signum= values of
 913    -1, 0, and 1, and the fourth holds the cached effective methods for
 914    everything else.  This would make the invocation of such functions
 915    very fast for the (presumed) common case where the argument is in
 916    fact a real number.  We hope to develop and show the effectiveness
 917    of an appropriate protocol to allow the metaprogrammer to construct
 918    and exploit such cacheing strategies, and (more speculatively) to
 919    implement the lookup of an effective method function in other ways.
 920
 921    We also aim to demonstrate support within this protocol for some
 922    particular cases of generalized specializers which seem to have
 923    widespread demand (in as much as any language extension can be said
 924    to be in “demand”).  In particular, we have preliminary work
 925    towards supporting efficient dispatch over pattern specializers
 926    such as implemented in the \textsf{Optima} library[fn:7], and over
 927    a prototype object system similar to that in Slate
 928    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 929    described in this paper can be seen in the git source code
 930    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 931    which will be updated with future developments.
 932
 933    Finally, after further experimentation (and, ideally, non-trivial
 934    use in production) if this protocol stands up to use as we hope, we
 935    aim to produce a standards-quality document so that other
 936    implementors of Common Lisp can, if they choose, independently
 937    reimplement the protocol, and so that users can use the protocol
 938    with confidence that the semantics will not change in a
 939    backwards-incompatible fashion.
 940 ** Acknowledgments
 941    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 942    and informative discussions, and all the respondents to the first
 943    author's request for imaginative uses for generalized specializers.
 944
 945 \bibliographystyle{plain}
 946 \bibliography{crhodes,specializers}
 947
 948 * Footnotes
 949
 950 [fn:1] GNU CLISP, at http://www.clisp.org/
 951
 952 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 953
 954 [fn:3] the \textsf{Closer to MOP} project, at
 955    http://common-lisp.net/project/closer/, attempts to harmonize the
 956    different implementations of the metaobject protocol in Common
 957    Lisp.
 958
 959 [fn:4] the tag =els2014-submission= in
 960 http://christophe.rhodes.io/git/specializable.git corresponds to the
 961 code repository at the point of submitting this paper.
 962
 963 [fn:5] Hunchentoot is a web server written in Common Lisp, allowing
 964 the user to write handler functions to compute responses to requests;
 965 http://weitz.de/hunchentoot/
 966
 967 [fn:6] http://clojure.org/multimethods
 968
 969 [fn:7] https://github.com/m2ym/optima