els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 (set (make-local-variable 'org-latex-title-command)
  18      "\\numberofauthors{3}
  19 \\author{
  20 \\alignauthor Christophe Rhodes\\\\
  21   \\affaddr{Department of Computing}\\\\
  22   \\affaddr{Goldsmiths, University of London}\\\\
  23   \\affaddr{London SE14 6NW}\\\\
  24   \\email{c.rhodes@gold.ac.uk}
  25 \\alignauthor Jan Moringen\\\\
  26   \\affaddr{Universität Bielefeld}\\\\
  27   \\affaddr{Technische Fakultät}\\\\
  28   \\affaddr{33594 Bielefeld}\\\\
  29   \\email{jmoringe@techfak.uni-bielefeld.de}
  30 \\alignauthor David Lichteblau\\\\
  31   \\affaddr{ZenRobotics Ltd}\\\\
  32   \\affaddr{Vilhonkatu 5 A}\\\\
  33   \\affaddr{FI-00100 Helsinki}\\\\
  34   \\email{david@lichteblau.com}
  35 }
  36 \\maketitle")
  37 #+end_src
  38
  39 #+begin_abstract
  40 This paper introduces a new metaobject, the generalizer, which
  41 complements the existing specializer metaobject.  With the help of
  42 examples, we show that this metaobject allows for the efficient
  43 implementation of complex non-class-based dispatch within the
  44 framework of existing metaobject protocols.  We present our
  45 modifications to the generic function invocation protocol from the
  46 /Art of the Metaobject Protocol/; in combination with previous work,
  47 this produces a fully-functional extension of the existing mechanism
  48 for method selection and combination, including support for method
  49 combination completely independent from method selection.  We discuss
  50 our implementation, within the SBCL implementation of Common Lisp, and
  51 in that context compare the performance of the new protocol with the
  52 standard one, demonstrating that the new protocol can be tolerably
  53 efficient.
  54 #+end_abstract
  55
  56 #+begin_LaTeX
  57 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  58 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  59 \terms{Languages, Design}
  60 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  61 #+end_LaTeX
  62
  63 * Introduction
  64   The revisions to the original Common Lisp language \cite{CLtL}
  65   included the detailed specification of an object system, known as
  66   the Common Lisp Object System (CLOS), which was eventually
  67   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  68   The object system as presented to the standardization committee was
  69   formed of three chapters.  The first two chapters covered programmer
  70   interface concepts and the functions in the programmer interface
  71   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  72   final standard; the third chapter, covering a Metaobject Protocol
  73   (MOP) for CLOS, was not.
  74
  75   Nevertheless, the CLOS MOP has proven to be a robust design, and
  76   while many implementations have derived their implementations of
  77   CLOS from either the Closette illustrative implementation in
  78   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  79   from Xerox Parc, there have been largely from-scratch
  80   reimplementations of CLOS (in CLISP[fn:1] and CCL[fn:2], at least)
  81   incorporating substantial fractions of the Metaobject Protocol as
  82   described.
  83
  84   Although it has stood the test of time, the CLOS MOP is neither
  85   without issues (e.g. semantic problems with =make-method-lambda=
  86   \cite{Costanza.Herzeel:2008}; useful functions such as
  87   =compute-effective-slot-definition-initargs= being missing from the
  88   standard) nor is it a complete framework for the metaprogrammer to
  89   implement all conceivable variations of object-oriented behaviour.
  90   While metaprogramming offers some possibilities for customization of
  91   the object system behaviour, those possibilities cannot extend
  92   arbitrarily in all directions.  There is still an expectation that
  93   functionality is implemented with methods on generic functions,
  94   acting on objects with slots; it is not possible, for example, to
  95   transparently implement support for “message not understood” as in
  96   the message-passing paradigm, because the analogue of messages
  97   (generic functions) need to be defined before they are used.
  98
  99   Nevertheless, the MOP is flexible, and is used for a number of
 100   things, including: documentation generation (where introspective
 101   functionality in the MOP is used to extract information from a
 102   running system); object-relational mapping and other approaches to
 103   object persistence; alternative backing stores for slots
 104   (hash-tables or symbols); and programmatic construction of
 105   metaobjects, for example for IDL compilers and model
 106   transformations.
 107
 108   [ XXX: A picture on MOP flexibility here would be good; I have in my mind
 109   one where an object system is a point and the MOP opens up a blob
 110   around that point, and I'm sure I've seen it somewhere but I can't
 111   remember where.  Alternatively, there's Kiczales et al "MOPs: why we
 112   want them and what else they can do", fig. 2 ]
 113   [AMOP, page 5] paints that picture, but again, only using words :)
 114
 115   One area of functionality where there is scope for customization by
 116   the metaprogrammer is in the mechanics and semantics of method
 117   applicability and dispatch.  While in principle AMOP allows
 118   customization of dispatch in various different ways (the
 119   metaprogrammer can define methods on protocol functions such as
 120   =compute-applicable-methods=,
 121   =compute-applicable-methods-using-classes=), for example, in
 122   practice implementation support for this was weak until relatively
 123   recently[fn:3].
 124
 125   Another potential mechanism for customizing dispatch is implicit in
 126   the class structure defined by AMOP: standard specializer objects
 127   (instances of =class= and =eql-specializer=) are generalized
 128   instances of the =specializer= protocol class, and in principle
 129   there are no restrictions on the metaprogrammer constructing
 130   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 131   explored the potential for customizing generic function dispatch
 132   using extended specializers, but as of that work the metaprogrammer
 133   must override the entirety of the generic function invocation
 134   protocol (from =compute-discriminating-function= on down), leading
 135   to toy implementations and duplicated effort.
 136
 137   This paper introduces a protocol for efficient and controlled
 138   handling of new subclasses of =specializer=.  In particular, it
 139   introduces the =generalizer= protocol class, which generalizes the
 140   return value of =class-of= in method applicability computation, and
 141   allows the metaprogrammer to hook into cacheing schemes to avoid
 142   needless recomputation of effective methods for sufficiently similar
 143   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 144
 145   #+CAPTION:    Dispatch Comparison
 146   #+LABEL:      fig:dispatch
 147   #+ATTR_LATEX: width=0.9\linewidth float
 148   [[file:figures/dispatch-comparison.pdf]]
 149
 150   The remaining sections in this paper can be read in any order.  We
 151   give some motivating examples in section [[#Examples]], including
 152   reimplementations of examples from previous work, as well as
 153   examples which are poorly supported by previous protocols.  We
 154   describe the protocol itself in section [[#Protocol]], describing each
 155   protocol function in detail and, where applicable, relating it to
 156   existing protocol functions within the CLOS MOP.  We survey related
 157   work in more detail in section [[#Related Work]], touching on work on
 158   customized dispatch schemes in other environments.  Finally, we draw
 159   our conclusions from this work, and indicate directions for further
 160   development, in section [[#Conclusions]]; reading that section before the
 161   others indicates substantial trust in the authors' work.
 162 * Examples
 163   :PROPERTIES:
 164   :CUSTOM_ID: Examples
 165   :END:
 166   In this section, we present a number of examples of dispatch
 167   implemented using our protocol, which we describe in section
 168   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 169   this section do not include some of the necessary support code to
 170   run; complete implementations of each of these cases are included in
 171   an appendix / in the accompanying repository snapshot / at this
 172   location.
 173
 174   A note on terminology: we will attempt to distinguish between the
 175   user of an individual case of generalized dispatch (the
 176   “programmer”), the implementor of a particular case of generalized
 177   dispatch (the “metaprogrammer”), and the authors as the designers
 178   and implementors of our generalized dispatch protocol (the
 179   “metametaprogammer”, or more likely “we”).
 180 ** CONS specializers
 181    :PROPERTIES:
 182    :CUSTOM_ID: Cons
 183    :END:
 184    One motivation for the use of generalized dispatch is in an
 185    extensible code walker: a new special form can be handled simply by
 186    writing an additional method on the walking generic function,
 187    seamlessly interoperating with all existing methods. In this
 188    use-case, dispatch is performed on the first element of lists.
 189    Semantically, we allow the programmer to specialize any argument of
 190    methods with a new kind of specializer, =cons-specializer=, which
 191    is applicable if and only if the corresponding object is a =cons=
 192    whose =car= is =eql= to the symbol associated with the
 193    =cons-specializer=; these specializers are more specific than the
 194    =cons= class, but less specific than an =eql-specializer= on any
 195    given =cons=.
 196
 197    The programmer code using these specializers is unchanged from
 198    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 199    here are: that the separation of concerns is complete – method
 200    selection is independent of method combination – and that the
 201    protocol allows where possible for efficient implementation even
 202    when method selection is customized.  In an application such as
 203    walking source code, we would expect to encounter special forms
 204    (distinguished by particular atoms in the =car= position) multiple
 205    times, and hence to dispatch to the same effective method
 206    repeatedly.  We discuss the efficiency aspects of the protocol in
 207    more detail in section [[#Memoization]]; we present the metaprogrammer
 208    code to implement the =cons-specializer= below.
 209
 210 #+begin_src lisp
 211 (defclass cons-specializer (specializer)
 212   ((%car :reader %car :initarg :car)))
 213 (defclass cons-generalizer (generalizer)
 214   ((%car :reader %car :initarg :car)))
 215 (defmethod generalizer-of-using-class
 216     ((gf cons-generic-function) arg)
 217   (typecase arg
 218     ((cons symbol)
 219      (make-instance 'cons-generalizer
 220                     :car (car arg)))
 221     (t (call-next-method))))
 222 (defmethod generalizer-equal-hash-key
 223     ((gf cons-generic-function)
 224      (g cons-generalizer))
 225   (%car g))
 226 (defmethod specializer-accepts-generalizer-p
 227     ((gf cons-generic-function)
 228      (s cons-specializer)
 229      (g cons-generalizer))
 230   (if (eql (%car s) (%car g))
 231       (values t t)
 232       (values nil t)))
 233 (defmethod specializer-accepts-p
 234     ((s cons-specializer) o)
 235   (and (consp o) (eql (car o) (%car s))))
 236 #+end_src
 237
 238 The code above shows a minimal use of our protocol.  We have elided
 239 some support code for parsing and unparsing specializers, and for
 240 handling introspective functions such as finding generic functions for
 241 a given specializer.  We have also elided methods on the protocol
 242 functions =specializer<= and =same-specializer-p=; for
 243 =cons-specializer= objects, specializer ordering is trivial, as only
 244 one =cons-specializer= (up to equality) can ever be applicable to any
 245 given argument.  See section [[#Accept]] for a case where specializer
 246 ordering is non-trivial.
 247
 248 As in \cite{Newton.Rhodes:2008}, the programmer can use these
 249 specializers to implement a modular code walker, where they define one
 250 method per special operator.  We show two of those methods below, in
 251 the context of a walker which checks for unused bindings and uses of
 252 unbound variables.
 253
 254 #+begin_src
 255 (defgeneric walk (form env stack)
 256   (:generic-function-class cons-generic-function))
 257 (defmethod walk ((expr (cons lambda)) env call-stack)
 258   (let ((lambda-list (cadr expr))
 259         (body (cddr expr)))
 260     (with-checked-bindings
 261         ((bindings-from-ll lambda-list) env call-stack)
 262       (dolist (form body)
 263         (walk form env (cons form call-stack))))))
 264 (defmethod walk ((expr (cons let)) env call-stack)
 265   (flet ((let-binding (x)
 266            (walk (cadr x) env (cons (cadr x) call-stack))
 267            (cons (car x) (make-instance 'binding))))
 268     (with-checked-bindings
 269         ((mapcar #'let-binding (cadr expr)) env call-stack)
 270       (dolist (form (cddr expr))
 271         (walk form env (cons form call-stack))))))
 272 #+end_src
 273
 274    Note that in this example there is no strict need for
 275    =cons-specializer= and =cons-generalizer= to be distinct classes.
 276    In standard generic function dispatch, the =class= functions both
 277    as the specializer for methods and as the generalizer for generic
 278    function arguments; we can think of the dispatch implemented by
 279    =cons-specializer= objects as providing for subclasses of the
 280    =cons= class distinguished by the =car= of the =cons=.  This
 281    analogy also characterizes those use cases where the metaprogrammer
 282    could straightforwardly use filtered dispatch
 283    \cite{Costanza.etal:2008} to implement their dispatch semantics.
 284    We will see in section [[#Accept]] an example of a case where filtered
 285    dispatch is incapable of straightforwardly expressing the dispatch,
 286    but first we present our implementation of the motivating case from
 287    \cite{Costanza.etal:2008}.
 288 ** SIGNUM specializers
 289    :PROPERTIES:
 290    :CUSTOM_ID: Signum
 291    :END:
 292    Our second example of the implementation and use of generalized
 293    specializers is a reimplementation of one of the examples in
 294    \cite{Costanza.etal:2008}: specifically, the factorial function.
 295    Here, we will perform dispatch based on the =signum= of the
 296    argument, and again, at most one method with a =signum= specializer
 297    will be appliable to any given argument, which makes the structure
 298    of the specializer implementation very similar to the =cons=
 299    specializers in the previous section.
 300
 301    The metaprogrammer has chosen in the example below to compare
 302    signum values using \texttt{=}, which means that a method with
 303    specializer =(signum 1)= will be applicable to positive
 304    floating-point arguments (see the first method on
 305    =specializer-accepts-generalizer-p= and the method on
 306    =specializer-accepts-p= below).  This leads to one subtle
 307    difference in behaviour compared to that of the =cons=
 308    specializers: in the case of =signum= specializers, the /next/
 309    method after any =signum= specializer can be different, depending
 310    on the class of the argument.  This aspect of the dispatch is
 311    handled by the second method on =specializer-accepts-generalizer-p=
 312    below.
 313
 314 #+begin_src lisp
 315 (defclass signum-specializer (specializer)
 316   ((%signum :reader %signum :initarg :signum)))
 317 (defclass signum-generalizer (generalizer)
 318   ((%signum :reader %signum :initarg :signum)))
 319 (defmethod generalizer-of-using-class
 320     ((gf signum-generic-function) (arg real))
 321   (make-instance 'signum-generalizer
 322                  :signum (signum arg)))
 323 (defmethod generalizer-equal-hash-key
 324     ((gf signum-generic-function)
 325      (g signum-generalizer))
 326   (%signum g))
 327 (defmethod specializer-accepts-generalizer-p
 328     ((gf signum-generic-function)
 329      (s signum-specializer)
 330      (g signum-generalizer))
 331   (if (= (%signum s) (%signum g))
 332       (values t t)
 333       (values nil t)))
 334
 335 (defmethod specializer-accepts-generalizer-p
 336     ((gf signum-generic-function)
 337      (s specializer)
 338      (g signum-generalizer))
 339   (specializer-accepts-generalizer-p
 340    gf s (class-of (%signum g))))
 341
 342 (defmethod specializer-accepts-p
 343     ((s signum-specializer) o)
 344   (and (realp o) (= (%signum s) (signum o))))
 345 #+end_src
 346
 347    Given these definitions, and once again some more straightforward
 348    ones elided for reasons of space, the programmer can implement the
 349    factorial function as follows:
 350
 351 #+begin_src lisp
 352 (defgeneric fact (n)
 353   (:generic-function-class signum-generic-function))
 354 (defmethod fact ((n (signum 0))) 1)
 355 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 356 #+end_src
 357
 358    We do not need to include a method on =(signum -1)=, as the
 359    standard =no-applicable-method= protocol will automatically apply to
 360    negative real or non-real arguments.
 361 ** Accept HTTP header specializers
 362    :PROPERTIES:
 363    :CUSTOM_ID: Accept
 364    :END:
 365    In this section, we implement a non-trivial form of dispatch.  The
 366    application in question is a web server, and specifically to allow
 367    the programmer to support RFC 2616 \cite{rfc2616} content
 368    negotiation, of particular interest to publishers and consumers of
 369    REST-style Web APIs.
 370
 371    The basic mechanism in content negotiation is as follows: the web
 372    client sends an HTTP request with an =Accept= header, which is a
 373    string describing the media types it is willing to receive as a
 374    response to the request, along with numerical preferences.  The web
 375    server compares these stated client preferences with the resources
 376    it has available to satisfy this request, and sends the best
 377    matching resource in its response.
 378
 379    For example, a graphical web browser might by default send an
 380    =Accept= header such as
 381    =text/html,application/xml;q=0.9,*/*;q=0.8=.  This should be
 382    interpreted as meaning that if for a given resource the server can
 383    provide content of type =text/html= (i.e. HTML), then it should do
 384    so.  Otherwise, if it can provide =application/xml= content
 385    (i.e. XML of any schema), then that should be provided; failing
 386    that, any other content type is acceptable.
 387
 388    In the case where there are static files on the filesystem, and the
 389    web server must merely select between them, there is not much more
 390    to say.  However, it is not unusual for a web service to be backed
 391    by some other form of data, and responses computed and sent on the
 392    fly, and in these circumstances the web server must compute which
 393    of its known output formats it can use to satisfy the request
 394    before actually generating the best matching response.  This can be
 395    modelled as one generic function responsible for generating the
 396    response, with methods corresponding to content-types -- and the
 397    generic function must then perform method selection against the
 398    request's =Accept= header to compute the appropriate response.
 399
 400    The =accept-specializer= below implements this dispatch.  It depends
 401    on a lazily-computed =tree= slot to represent the information in
 402    the accept header (generated by =parse-accept-string=), and a
 403    function =q= to compute the (defaulted) preference level for a
 404    given content-type and =tree=; then, method selection and ordering
 405    involves finding the =q= for each =accept-specializer='s content
 406    type given the =tree=, and sorting them according to the preference
 407    level.
 408
 409 #+begin_src lisp
 410 (defclass accept-specializer (specializer)
 411   ((media-type :initarg :media-type :reader media-type)))
 412 (defclass accept-generalizer (generalizer)
 413   ((header :initarg :header :reader header)
 414    (tree)
 415    (next :initarg :next :reader next)))
 416 (defmethod generalizer-equal-hash-key
 417     ((gf accept-generic-function)
 418      (g accept-generalizer))
 419   `(accept-generalizer ,(header g)))
 420 (defmethod specializer-accepts-generalizer-p
 421     ((gf accept-generic-function)
 422      (s accept-specializer)
 423      (g accept-generalizer))
 424   (values (q (media-type s) (tree g)) t))
 425 (defmethod specializer-accepts-generalizer-p
 426     ((gf accept-generic-function)
 427      (s specializer)
 428      (g accept-generalizer))
 429   (specializer-accepts-generalizer-p
 430    gf s (next g)))
 431
 432 (defmethod specializer<
 433     ((gf accept-generic-function)
 434      (s1 accept-specializer)
 435      (s2 accept-specializer)
 436      (g accept-generalizer))
 437   (let ((m1 (media-type s1))
 438         (m2 (media-type s2))
 439         (tree (tree g)))
 440     (cond
 441       ((string= m1 m2) '=)
 442       (t (let ((q1 (q m1 tree)))
 443                (q2 (q m2 tree))))
 444            (cond
 445              ((= q1 q2) '=)
 446              ((< q1 q2) '>)
 447              (t '<))))))
 448 #+end_src
 449
 450    The metaprogrammer can then add support for objects representing
 451    client requesting, such as instances of the =request= class in the
 452    Hunchentoot web server, by translating these into
 453    =accept-generalizer= instances.  The code below implements this, by
 454    defining the computation of a =generalizer= object for a given
 455    request, and specifying how to compute whether the specializer
 456    accepts the given request object (=q= returns a number between 0
 457    and 1 if any pattern in the =tree= matches the media type, and
 458    =nil= if the media type cannot be matched at all).
 459
 460 #+begin_src
 461 (defmethod generalizer-of-using-class
 462     ((gf accept-generic-function)
 463      (arg tbnl:request))
 464   (make-instance 'accept-generalizer
 465                  :header (tbnl:header-in :accept arg)
 466                  :next (class-of arg)))
 467 (defmethod specializer-accepts-p
 468     ((s accept-specializer)
 469      (o tbnl:request))
 470   (let* ((accept (tbnl:header-in :accept o))
 471          (tree (parse-accept-string accept))
 472          (q (q (media-type s) tree)))
 473     (and q (> q 0))))
 474 #+end_src
 475
 476    This dispatch cannot be implemented using filtered dispatch, except
 477    by generating anonymous classes with all the right mime-types as
 478    direct superclasses in dispatch order; the filter would generate
 479 #+begin_src lisp
 480 (ensure-class nil :direct-superclasses
 481  '(text/html image/webp ...))
 482 #+end_src
 483    and dispatch operates using those anonymous classes.  While
 484    this is possible to do, it is awkward to express content-type
 485    negotiation in this way, as it means that the dispatcher must know
 486    about the universe of mime-types that clients might declare that
 487    they accept, rather than merely the set of mime-types that a
 488    particular generic function is capable of serving; handling
 489    wildcards in accept strings is particularly awkward in the
 490    filtering paradigm.
 491
 492    Note that in this example, the method on =specializer<= involves a
 493    nontrivial ordering of methods based on the =q= values specified in
 494    the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 495    single extended specializer could be applicable to any given
 496    argument).
 497
 498    Also note that the accept specializer protocol is straightforwardly
 499    extensible to other suitable objects; for example, one simple
 500    debugging aid is to define that an =accept-specializer= should be
 501    applicable to =string= objects.  This can be done in a modular
 502    fashion (see the code below, which can be completely disconnected
 503    from the code for Hunchentoot request objects), and generalizes to
 504    dealing with multiple web server libraries, so that
 505    content-negotiation methods are applicable to each web server's
 506    request objects.
 507
 508 #+begin_src lisp
 509 (defmethod generalizer-of-using-class
 510     ((gf accept-generic-function)
 511      (s string))
 512   (make-instance 'accept-generalizer
 513                  :header s
 514                  :next (class-of s)))
 515 (defmethod specializer-accepts-p
 516     ((s accept-specializer) (o string))
 517   (let* ((tree (parse-accept-string o))
 518          (q (q (media-type s) tree)))
 519     (and q (> q 0))))
 520 #+end_src
 521
 522    The =next= slot in the =accept-generalizer= is present to deal with
 523    the case of methods specialized on the classes of objects as well
 524    as on the acceptable media types; there is a method on
 525    =specializer-accepts-generalizer-p= for specializers that are not
 526    of type =accept-specializer= which calls the generic function again
 527    with the next generalizer, so that methods specialized on the
 528    classes =tbnl:request= and =string= are treated as applicable to
 529    corresponding objects, though less specific than methods with
 530    =accept-specializer= specializations.
 531
 532 ** COMMENT Pattern / xpattern / regex / optima
 533    Here's the /really/ interesting bit, but on the other hand we're
 534    probably going to run out of space, and the full description of
 535    these is going to take us into =make-method-lambda= territory.
 536    A second paper?  Future work?
 537 * Protocol
 538   :PROPERTIES:
 539   :CUSTOM_ID: Protocol
 540   :END:
 541
 542   In section [[#Examples]], we have seen a number of code fragments as
 543   partial implementations of particular non-standard method dispatch,
 544   using =generalizer= metaobjects to mediate between the methods of
 545   the generic function and the actual arguments passed to it.  In
 546   section [[#Generalizer metaobjects]], we go into more detail regarding
 547   these =generalizer= metaobjects, describing the generic function
 548   invocation protocol in full, and showing how this protocol allows a
 549   similar form of effective method cacheing as the standard one does.
 550   In section [[#Generalizer performance]], we show the results of some
 551   simple performance measurements on our implementation of this
 552   protocol in the SBCL implementation \cite{Rhodes:2008} of Common
 553   Lisp to highlight the improvement that this protocol can bring over
 554   a naïve implementation of generalized dispatch, as well as
 555   to make the potential for further improvement clear.
 556
 557 ** Generalizer metaobjects
 558    :PROPERTIES:
 559    :CUSTOM_ID: Generalizer metaobjects
 560    :END:
 561
 562 *** Generic function invocation
 563     As in the standard generic function invocation protocol, the
 564     generic function's actual functionality is provided by a
 565     discriminating function.  The functionality described in this
 566     protocol is implemented by having a distinct subclass of
 567     =standard-generic-function=, and a method on
 568     =compute-discriminating-function= which produces a custom
 569     discriminating function.  The basic outline of the discriminating
 570     function is the same as the standard one: it must first compute the
 571     set of applicable methods given particular arguments; from that, it
 572     must compute the effective method by combining the methods
 573     appropriately according to the generic function's method
 574     combination; finally, it must call the effective method with the
 575     arguments.
 576
 577     Computing the set of applicable methods is done using a pair of
 578     functions: =compute-applicable-methods=, the standard metaobject
 579     function, and a new function
 580     =compute-applicable-methods-using-generalizers=.  We define a
 581     custom method on =compute-applicable-methods= which tests the
 582     applicability of a particular specializer against a given argument
 583     using =specializer-accepts-p=, a new protocol function with
 584     default implementations on =class= and =eql-specializer= to
 585     implement the expected behaviour.  In order to order the methods,
 586     as required by the protocol, we define a pairwise comparison
 587     operator =specializer<= which defines an ordering between
 588     specializers for a given generalizer argument (remembering that
 589     even in standard CLOS the ordering between =class= specializers
 590     can change depending on the actual class of the argument).
 591
 592     The new =compute-applicable-methods-using-generalizers= is the
 593     analogue of the MOP's =compute-applicable-methods-using-classes=.
 594     Instead of calling it with the =class-of= each argument, we compute
 595     the generalizers of each argument using the new function
 596     =generalizer-of-using-class= (where the =-using-class= refers to
 597     the class of the generic function rather than the class of the
 598     object), and call it with the list of generalizers.  As with the
 599     standard function, a secondary return value indicates whether the
 600     result of the function is definitive for that list of generalizers.
 601
 602     Thus, in generic function invocation, we first compute the
 603     generalizers of the arguments; we compute the ordered set of
 604     applicable methods, either from the generalizers or (if that is
 605     not definitive) from the arguments themselves; then the normal
 606     effective method computation and call can occur.  Unfortunately,
 607     the nature of an effective method object is not specified, so we
 608     have to reach into implementation internals a little in order to
 609     call it, but otherwise the remainder of the generic function
 610     invocation protocol is unchanged from the standard one.  In
 611     particular, method combination is completely unchanged;
 612     programmers can choose arbitrary method combinations, including
 613     user-defined long form combinations, for their generic functions
 614     involving generalized dispatch.
 615
 616 *** Effective method memoization
 617     :PROPERTIES:
 618     :CUSTOM_ID: Memoization
 619     :END:
 620     The potential efficiency benefit to having =generalizer=
 621     metaobjects lies in the use of
 622     =compute-applicable-methods-using-generalizers=.  If a particular
 623     generalized specializer accepts a variety of objects (such as the
 624     =signum= specializer accepting all reals with a given sign, or the
 625     =accept= specializer accepting all HTTP requests with a particular
 626     =Accept= header), then there is the possibility of cacheing and
 627     reusing the results of the applicable and effective method
 628     computation.  If the computation of the applicable method from
 629     =compute-applicable-methods-using-generalizers= is definitive,
 630     then the ordered set of applicable methods and the effective
 631     method can be cached.
 632
 633     One issue is what to use as the key for that cache.  We cannot use
 634     the generalizers themselves, as two generalizers that should be
 635     considered equal for cache lookup will not compare as =equal= –
 636     and indeed even the standard generalizer, the =class=, cannot be
 637     used as we must be able to invalidate cache entries upon class
 638     redefinition.  The issue of =class= generalizers we can solve as
 639     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 640     class, which is distinct for each distinct (re)definition of a
 641     class; for arbitrary generalizers, however, there is /a priori/ no
 642     good way of computing a suitable hash key automatically, so we
 643     allow the metaprogrammer to specify one by defining a method on
 644     =generalizer-equal-hash-key=, and combining the hash keys for all
 645     required arguments in a list to use as a key in an =equal=
 646     hash-table.
 647
 648     [XXX could we actually compute a suitable hash key using the
 649     generalizer's class name and initargs?]
 650
 651 *** COMMENT
 652     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 653     - [X] =compute-applicable-methods-using-generalizers=
 654     - [X] =generalizer-equal-hash-key=
 655     - [X] =specializer-accepts-generalizer-p=
 656     - [X] =specializer-accepts-p=
 657     - [X] =specializer<=
 658 ** Performance
 659    :PROPERTIES:
 660    :CUSTOM_ID: Generalizer performance
 661    :END:
 662    We have argued that the protocol presented here allows for
 663    expressive control of method dispatch while preserving the
 664    possibility of efficiency.  In this section, we quantify the
 665    efficiency that the memoization protocol described in section
 666    [[#Memoization]] achieves, by comparing it both to the same protocol
 667    with no memoization, as well as with equivalent dispatch
 668    implementations in the context of methods with regular specializers
 669    (in an implementation similar to that in
 670    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 671    straightforward functions.
 672
 673    In the case of the =cons-specializer=, we benchmark the walker
 674    acting on a small but non-trivial form.  The implementation
 675    strategies in the table below refer to: an implementation in a
 676    single function with a large =typecase= to dispatch between all the
 677    cases; the natural implementation in terms of a standard generic
 678    function with multiple methods (the method on =cons= having a
 679    slightly reduced =typecase= to dispatch on the first element, and
 680    other methods handling =symbol= and other atoms); and three
 681    separate cases using =cons-specializer= objects.  As well as
 682    measuring the effect of memoization against the full invocation
 683    protocol, we can also introduce a special case: when only one
 684    argument participates in method selection (all the other required
 685    arguments only being specialized on =t=), we can avoid the
 686    construction of a list of hash keys and simply use the key
 687    from the single active generalizer directly.
 688
 689    | implementation        | time (µs/call) | overhead |
 690    |-----------------------+----------------+----------|
 691    | function              |           3.17 |          |
 692    | standard-gf/methods   |            3.6 |     +14% |
 693    | cons-gf/one-arg-cache |            7.4 |    +130% |
 694    | cons-gf               |             15 |    +370% |
 695    | cons-gf/no-cache      |             90 |   +2700% |
 696
 697    The benchmarking results from this exercise are promising: in
 698    particular, the introduction of the effective method cache speeds
 699    up the use of generic specializers in this case by a factor of 6,
 700    and the one-argument special case by another factor of 2.  For this
 701    workload, even the one-argument special case only gets to within a
 702    factor of 2-3 of the function and standard generic function
 703    implementations, but the overall picture is that the memoizability
 704    in the protocol does indeed drastically reduce the overhead
 705    compared with the full invocation.
 706
 707    For the =signum-specializer= case, we choose to benchmark the
 708    computation of 20!, because that is the largest factorial whose
 709    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 710    worst case for generic dispatch, where the work done within the
 711    methods is as small as possible without being meaningless, and in
 712    particular does not cause allocation or garbage collection to
 713    obscure the picture.
 714
 715 #+begin_src lisp :exports none
 716 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 717 #+end_src
 718
 719    | implementation          | time (µs/call) | overhead |
 720    |-------------------------+----------------+----------|
 721    | function                |            0.6 |          |
 722    | standard-gf/fixnum      |            1.2 |    +100% |
 723    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 724    | signum-gf               |             23 |   +3800% |
 725    | signum-gf/no-cache      |            240 |  +41000% |
 726
 727    The relative picture is similar to the =cons-specializer= case;
 728    including a cache saves a factor of 10 in this case, and another
 729    factor of 3 for the one-argument cache special case.  The cost of
 730    the genericity of the protocol here is starker; even the
 731    one-argument cache is a factor of 6 slower than the standard
 732    generic-function implementation, and a further factor of 2 away
 733    from the implementation of factorial as a function.  We discuss
 734    ways in which we expect to be able to improve performance in
 735    section [[#Future Work]].
 736
 737    We could allow the metaprogrammer to improve on the one-argument
 738    performance by constructing a specialized cache: for =signum=
 739    arguments of =rational= arguments, the logical cache structure is
 740    to index a three-element vector with =(1+ signum)=.  The current
 741    protocol does not provide a way of eliding the two generic function
 742    calls for the generic cache; we discuss possible approaches in
 743    section [[#Conclusions]].
 744 ** Full protocol
 745    The protocol described in this paper is only part of a complete
 746    protocol for =specializer= and =generalizer= metaobjects.  Our
 747    development of this protocol is as yet incomplete; the work
 748    described here augments that in \cite{Newton.Rhodes:2008}, but is
 749    yet relatively untested – and additionally our recent experience of
 750    working with that earlier protocol suggests that there might be
 751    useful additions to the handling of =specializer= metaobjects,
 752    independent of the =generalizer= idea presented here.
 753
 754 *** COMMENT ideas
 755     Description and specification left for reasons of space (we'll see?)
 756     - [ ] =same-specializer-p=
 757     - [ ] =parse/unparse-specializer-using-class=
 758     - [ ] =make-method-specializers-form=
 759     - [ ] jmoringe: In an email, I suggested
 760       =make-specializer-form-using-class=:
 761
 762       #+begin_quote
 763       Could we change =make-method-specializers-form='s default
 764       behaviour to call a new generic function
 765       #+begin_src
 766         make-specializer-form-using-class gf method name env
 767       #+end_src
 768       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 769       eql-specializers)? This would make it unnecessary to repeat
 770       boilerplate along the lines of
 771       #+begin_src lisp
 772       (flet ((make-parse-form (name)
 773                (if <name-is-interesting>
 774                  <handle-interesting-specializer>
 775                  <repeat-handling-of-standard-specializers>)))
 776         `(list ,@(mapcar #'make-parse-form specializer-names)))
 777       #+end_src
 778       for each generic function class.
 779       #+end_quote
 780     - [ ] =make-method-lambda= revision (use environment arg?)
 781
 782       jmoringe: would only be relevant for pattern dispatch, right? I
 783       think, we didn't finish the discussion regarding special
 784       variables vs. environment vs. new protocol function
 785
 786 * Related Work
 787   :PROPERTIES:
 788   :CUSTOM_ID: Related Work
 789   :END:
 790
 791   The work presented here builds on specializer-oriented programming
 792   described in \cite{Newton.Rhodes:2008}.  Approximately
 793   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 794   introduced to address some of the same use cases: filtered dispatch
 795   works by having a custom discriminating function which wraps the
 796   usual one, where the wrapping function augments the set of
 797   applicable methods with applicable methods from other (hidden)
 798   generic functions, one per filter group; this step is not memoized,
 799   and using =eql= methods to capture behaviours of equivalence classes
 800   means that it is hard to see how it could be.  The methods are then
 801   combined using a custom method combination to mimic the standard
 802   one; in principle implementors of other method combinations could
 803   cater for filtered dispatch, but they would have to explicitly
 804   modify their method combinations.  The Clojure programming language
 805   supports multimethods[fn:5] with a variant of filtered dispatch as
 806   well as hierachical and identity-based method selectors.
 807
 808   In context-oriented programming
 809   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 810   occurs by maintaining the context state as an anonymous class with
 811   the superclasses representing all the currently active layers; this
 812   is then passed as a hidden argument to context-aware functions.  The
 813   set of layers is known and under programmer control, as layers must
 814   be defined beforehand.
 815
 816   In some sense, all dispatch schemes are specializations of predicate
 817   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 818   dispatch is its expressiveness: with arbitrary predicates able to
 819   control dispatch, it is essentially impossible to perform any
 820   substantial precomputation, or even to automatically determine an
 821   ordering of methods given a set of arguments.  Even Clojure's
 822   restricted dispatch scheme provides an explicit operator for stating
 823   a preference order among methods, where here we provide an operator
 824   to order specializers; in filtered dispatch the programmer
 825   implicitly gives the system an order of precedence, through the
 826   lexical ordering of filter specification in a filtered function
 827   definition.
 828
 829   The Slate programming environment combines prototype-oriented
 830   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 831   that context, the analogue of an argument's class (in Common Lisp)
 832   as a representation of the equivalence class of objects with the
 833   same behaviour is the tuple of roles and delegations: objects with
 834   the same roles and delegations tuple behave the same, much as
 835   objects with the same generalizer have the same behaviour in the
 836   protocol described in this paper.
 837
 838   The idea of generalization is of course not new, and arises in other
 839   contexts.  Perhaps of particular interest is generalization in the
 840   context of partial evaluation; for example, \cite{Ruf:1993}
 841   considers generalization in online partial evaluation, where sets of
 842   possible values are represented by a type system construct
 843   representing an upper bound.  The relationship between generalizer
 844   metaobjects and approximation in type systems could be further
 845   explored.
 846 * Conclusions
 847   :PROPERTIES:
 848   :CUSTOM_ID: Conclusions
 849   :END:
 850   In this paper, we have presented a new generalizer metaobject
 851   protocol allowing the metaprogrammer to implement in a
 852   straightforward manner metaobjects to implement custom method
 853   selection, rather than the standard method selection as standardized
 854   in Common Lisp.  This protocol seamlessly interoperates with the
 855   rest of CLOS and Common Lisp in general; the programmer (the user of
 856   the custom specializer metaobjects) may without constraints use
 857   arbitrary method combination, intercede in effective method
 858   combination, or write custom method function implementations.  The
 859   protocol is expressive, in that it handles forms of dispatch not
 860   possible in more restricted dispatch systems, while not suffering
 861   from the indeterminism present in predicate dispatch through the use
 862   of explicit ordering predicates.
 863
 864   The protocol is also reasonably efficient; the metaprogrammer can
 865   indicate that a particular effective method computation can be
 866   memoized, and under those circumstances much of the overhead is
 867   amortized (though there remains a substantial overhead compared with
 868   standard generic-function or regular function calls).  We discuss
 869   how the efficiency could be improved below.
 870 ** Future Work
 871    :PROPERTIES:
 872    :CUSTOM_ID: Future Work
 873    :END:
 874    Although the protocol described in this paper allows for a more
 875    efficient implementation, as described in section [[#Memoization]],
 876    than computing the applicable and effective methods at each generic
 877    function call, the efficiency is still some way away from a
 878    baseline of the standard generic-function, let alone a standard
 879    function.  Most of the invocation protocol is memoized, but there
 880    are still two full standard generic-function calls –
 881    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 882    argument per call to a generic function with extended specializers,
 883    not to mention a hash table lookup.
 884
 885    For many applications, the additional flexibility afforded by
 886    generalized specializers might be worth the cost in efficiency, but
 887    it would still be worth investigating how much the overhead from
 888    generalized specializers can be reduced; one possible avenue for
 889    investigation is giving greater control over the cacheing strategy
 890    to the metaprogrammer.
 891
 892    As an example, consider the =signum-specializer=.  The natural
 893    cache structure for a single argument generic function specializing
 894    on =signum= is probably a four-element vector, where the first
 895    three elements hold the effective methods for =signum= values of
 896    -1, 0, and 1, and the fourth holds the cached effective methods for
 897    everything else.  This would make the invocation of such functions
 898    very fast for the (presumed) common case where the argument is in
 899    fact a real number.  We hope to develop and show the effectiveness
 900    of an appropriate protocol to allow the metaprogrammer to construct
 901    and exploit such cacheing strategies, and (more speculatively) to
 902    implement the lookup of an effective method function in other ways.
 903
 904    We also aim to demonstrate support within this protocol for some
 905    particular cases of generalized specializers which seem to have
 906    widespread demand (in as much as any language extension can be said
 907    to be in “demand”).  In particular, we have preliminary work
 908    towards supporting efficient dispatch over pattern specializers
 909    such as implemented in the \textsf{Optima} library[fn:4], and over
 910    a prototype object system similar to that in Slate
 911    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 912    described in this paper can be seen in the git source code
 913    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 914    which will be updated with future developments.
 915
 916    Finally, after further experimentation (and, ideally, non-trivial
 917    use in production) if this protocol stands up to use as we hope, we
 918    aim to produce a standards-quality document so that other
 919    implementors of Common Lisp can, if they choose, independently
 920    reimplement the protocol, and so that users can use the protocol
 921    with confidence that the semantics will not change in a
 922    backwards-incompatible fashion.
 923 ** Acknowledgments
 924    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 925    and informative discussions, and all the respondents to one
 926    author's request for imaginative uses for generalized specializers.
 927
 928 \bibliographystyle{plain}
 929 \bibliography{crhodes,specializers}
 930
 931 * Footnotes
 932
 933 [fn:1] GNU CLISP, at http://www.clisp.org/
 934
 935 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 936
 937 [fn:3] the \textsf{Closer to MOP} project, at
 938    http://common-lisp.net/project/closer/, attempts to harmonize the
 939    different implementations of the metaobject protocol in Common
 940    Lisp.
 941
 942 [fn:4] https://github.com/m2ym/optima
 943
 944 [fn:5] http://clojure.org/multimethods