els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 (add-to-list 'org-latex-classes
  10              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  11                ("\\section{%s}" . "\\section*{%s}")
  12                ("\\subsection{%s}" . "\\subsection*{%s}")
  13                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  14                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  15                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  16 #+end_src
  17
  18 #+begin_abstract
  19 1. This paper introduces a new metaobject, the generalizer, which
  20    complements the existing specializer metaobject.
  21 2. With the help of examples, we show that this metaobject allows for
  22    the efficient implementation of complex non-class-based dispatch
  23    within the framework of existing metaobject protocols
  24 3. We present the generalizer protocol, implemented within the SBCL
  25    implementation of Common Lisp
  26 4. In combination with previous work, this produces a fully-functional
  27    extension of the existing mechanism for method selection and
  28    effective method computation, including support for standard and
  29    user-defined method combination independent from method selection.
  30 #+end_abstract
  31
  32 * Introduction
  33   The revisions to the original Common Lisp language \cite{CLtL}
  34   included the detailed specification of an object system, known as
  35   the Common Lisp Object System (CLOS), which was eventually
  36   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  37   The object system as presented to the standardization committee was
  38   formed of three chapters.  The first two chapters covered programmer
  39   interface concepts and the functions in the programmer interface
  40   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  41   final standard; the third chapter, covering a Metaobject Protocol
  42   (MOP) for CLOS, was not.
  43
  44   Nevertheless, the CLOS MOP has proven to be a robust design, and
  45   while many implementations have derived their implementations of
  46   CLOS from either the Closette illustrative implementation in
  47   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  48   from Xerox Parc, there have been from-scratch reimplementations of
  49   CLOS (in at least CLISP; check for others -- Lisp500?  CCL?)
  50   incorporating substantial fractions of the Metaobject Protocol as
  51   described.
  52
  53   Although it has stood the test of time, the CLOS MOP is neither
  54   without issues (e.g. semantic problems with =make-method-lambda=
  55   \cite{Costanza.Herzeel:2008}; useful functions such as
  56   =compute-effective-slot-definition-initargs= being missing from the
  57   standard) nor is it a complete framework for the metaprogrammer to
  58   implement all conceivable variations of object-oriented behaviour.
  59   While metaprogramming offers some possibilities for customization of
  60   the object system behaviour, those possibilities cannot extend
  61   arbitrarily in all directions.  There is still an expectation that
  62   functionality is implemented with methods on generic functions,
  63   acting on objects with slots.  Nevertheless, the MOP is flexible,
  64   and is used for a number of things, including: documentation
  65   generation (where introspective functionality in the MOP is used to
  66   extract information from a running system); object-relational
  67   mapping and other approaches to object persistence; alternative
  68   backing stores for slots (hash-tables or symbols); and programmatic
  69   construction of metaobjects, for example for IDL compilers and model
  70   transformations.
  71
  72   [ A picture on MOP flexibility here would be good; I have in my mind
  73   one where an object system is a point and the MOP opens up a blob
  74   around that point, and I'm sure I've seen it somewhere but I can't
  75   remember where.  Alternatively, there's Kiczales et al "MOPs: why we
  76   want them and what else they can do", fig. 2 ]
  77
  78   One area of functionality where there is scope for customization by
  79   the metaprogrammer is in the mechanics and semantics of method
  80   applicability and dispatch.  While in principle AMOP allows
  81   customization of dispatch in various different ways (the
  82   metaprogrammer can define methods on protocol functions such as
  83   =compute-applicable-methods=,
  84   =compute-applicable-methods-using-classes=), for example, in
  85   practice implementation support for this was weak until relatively
  86   recently[fn:1].
  87
  88   Another potential mechanism for customizing dispatch is implicit in
  89   the class structure defined by AMOP: standard specializer objects
  90   (instances of =class= and =eql-specializer=) are generalized
  91   instances of the =specializer= protocol class, and in principle
  92   there are no restrictions on the metaprogrammer constructing
  93   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
  94   explored the potential for customizing generic function dispatch
  95   using extended specializers, but as of that work the metaprogrammer
  96   must override the entirety of the generic function invocation
  97   protocol (from =compute-discriminating-function= on down), leading
  98   to toy implementations and duplicated effort.
  99
 100   This paper introduces a protocol for efficient and controlled
 101   handling of new subclasses of =specializer=.  In particular, it
 102   introduces the =generalizer= protocol class, which generalizes the
 103   return value of =class-of= in method applicability computation, and
 104   allows the metaprogrammer to hook into cacheing schemes to avoid
 105   needless recomputation of effective methods for sufficiently similar
 106   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 107
 108   #+CAPTION:    Dispatch Comparison
 109   #+LABEL:      fig:dispatch
 110   #+ATTR_LATEX: width=0.9\linewidth float
 111   [[file:figures/dispatch-comparison.pdf]]
 112
 113   The remaining sections in this paper can be read in any order.  We
 114   give some motivating examples in section [[#Examples]], including
 115   reimplementations of examples from previous work, as well as
 116   examples which are poorly supported by previous protocols.  We
 117   describe the protocol itself in section [[#Protocol]], describing each
 118   protocol function in detail and, where applicable, relating it to
 119   existing protocol functions within the CLOS MOP.  We survey related
 120   work in more detail in section [[#Related Work]], touching on work on
 121   customized dispatch schemes in other environments.  Finally, we draw
 122   our conclusions from this work, and indicate directions for further
 123   development, in section [[#Conclusions]]; reading that section before the
 124   others indicates substantial trust in the authors' work.
 125 * Examples
 126   :PROPERTIES:
 127   :CUSTOM_ID: Examples
 128   :END:
 129   In this section, we present a number of examples of dispatch
 130   implemented using our protocol, which we describe in section
 131   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 132   this section do not include some of the necessary support code to
 133   run; complete implementations of each of these cases are included in
 134   an appendix / in the accompanying repository snapshot / at this
 135   location.
 136
 137   A note on terminology: we will attempt to distinguish between the
 138   user of an individual case of generalized dispatch (the
 139   “programmer”), the implementor of a particular case of generalized
 140   dispatch (the “metaprogrammer”), and the authors as the designers
 141   and implementors of our generalized dispatch protocol (the
 142   “metametaprogammer”, or more likely “we”).
 143 ** CONS specializers
 144    :PROPERTIES:
 145    :CUSTOM_ID: Cons
 146    :END:
 147    We start by presenting our original use case, performing
 148    dispatching on the first element of lists.  Semantically, we allow
 149    the programmer to specialize any argument of methods with a new
 150    kind of specializer, =cons-specializer=, which is applicable if and
 151    only if the corresponding object is a =cons= whose =car= is =eql=
 152    to the symbol associated with the =cons-specializer=; these
 153    specializers are more specific than the =cons= class, but less
 154    specific than an =eql-specializer= on any given =cons=.
 155
 156    One motivation for the use of this specializer is in an extensible
 157    code walker: a new special form can be handled simply by writing an
 158    additional method on the walking generic function, seamlessly
 159    interoperating with all existing methods.
 160
 161    The programmer code using these specializers is unchanged from
 162    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 163    here are centered on performance and generality: in an application
 164    such as walking source code, we would expect to encounter special
 165    forms (distinguished by particular atoms in the =car= position)
 166    multiple times, and hence to dispatch to the same effective method
 167    repeatedly.  We discuss this in more detail in section [[#Memoization]];
 168    we present the metaprogrammer code below.
 169
 170 #+begin_src lisp
 171 (defclass cons-specializer (specializer)
 172   ((%car :reader %car :initarg :car)))
 173 (defclass cons-generalizer (generalizer)
 174   ((%car :reader %car :initarg :car)))
 175 (defmethod generalizer-of-using-class
 176     ((gf cons-generic-function) arg)
 177   (typecase arg
 178     ((cons symbol)
 179      (make-instance 'cons-generalizer
 180                     :car (car arg)))
 181     (t (call-next-method))))
 182 (defmethod generalizer-equal-hash-key
 183     ((gf cons-generic-function)
 184      (g cons-generalizer))
 185   (%car g))
 186 (defmethod specializer-accepts-generalizer-p
 187     ((gf cons-generic-function)
 188      (s cons-specializer)
 189      (g cons-generalizer))
 190   (if (eql (%car s) (%car g))
 191       (values t t)
 192       (values nil t)))
 193 (defmethod specializer-accepts-p
 194     ((s cons-specializer) o)
 195   (and (consp o) (eql (car o) (%car s))))
 196 #+end_src
 197
 198 The code above shows the core of the use of our protocol.  We have
 199 elided some support code for parsing and unparsing specializers, and
 200 for handling introspective functions such as finding generic functions
 201 for a given specializer.  We have also elided methods on the protocol
 202 function =specializer<=; for =cons-specializers= here, specializer
 203 ordering is trivial, as only one =cons-specializer= can ever be
 204 applicable to any given argument.  See section [[#Accept]] for a case
 205 where specializer ordering is substantially different.
 206
 207 As in \cite{Newton.Rhodes:2008}, we can use these specializers to
 208 implement a modular code walker, where we define one method per
 209 special operator.  We show two of those methods below, in the context
 210 of a walker which checks for unused bindings and uses of unbound
 211 variables.
 212
 213 #+begin_src
 214 (defgeneric walk (form env stack)
 215   (:generic-function-class cons-generic-function))
 216 (defmethod walk ((expr (cons lambda)) env call-stack)
 217   (let ((lambda-list (cadr expr))
 218         (body (cddr expr)))
 219     (with-checked-bindings
 220         ((bindings-from-ll lambda-list) env call-stack)
 221       (dolist (form body)
 222         (walk form env (cons form call-stack))))))
 223 (defmethod walk ((expr (cons let)) env call-stack)
 224   (flet ((let-binding (x)
 225            (walk (cadr x) env (cons (cadr x) call-stack))
 226            (cons (car x) (make-instance 'binding))))
 227     (with-checked-bindings
 228         ((mapcar #'let-binding (cadr expr)) env call-stack)
 229       (dolist (form (cddr expr))
 230         (walk form env (cons form call-stack))))))
 231 #+end_src
 232
 233    Note that in this example there is no strict need for
 234    =cons-specializer= and =cons-generalizer= to be distinct classes –
 235    just as in the normal protocol involving
 236    =compute-applicable-methods= and
 237    =compute-applicable-methods-using-classes=, the specializer object
 238    for mediating dispatch contains the same information as the object
 239    representing the equivalence class of objects to which that
 240    specializer is applicable: here it is the =car= of the =cons=
 241    (which we wrap in a distinct object); in the standard dispatch it
 242    is the =class= of the object.  This feature also characterizes
 243    those use cases where the metaprogrammer could straightforwardly
 244    use filtered dispatch \cite{Costanza.etal:2008} to implement their
 245    dispatch semantics.  We will see in section [[#Accept]] an example
 246    of a case where filtered dispatch is incapable of straightforwardly
 247    expressing the dispatch, but first we present our implementation of
 248    the motivating case from \cite{Costanza.etal:2008}.
 249 ** SIGNUM specializers
 250    :PROPERTIES:
 251    :CUSTOM_ID: Signum
 252    :END:
 253    Our second example of the implementation and use of generalized
 254    specializers is a reimplementation of one of the examples in
 255    \cite{Costanza.etal:2008}: specifically, the factorial function.
 256    Here, we will perform dispatch based on the =signum= of the
 257    argument, and again, at most one method with a =signum= specializer
 258    will be appliable to any given argument, which makes the structure
 259    of the specializer implementation very similar to the =cons=
 260    specializers in the previous section.
 261
 262    We have chosen to compare signum values using \texttt{=}, which
 263    means that a method with specializer =(signum 1)= will be
 264    applicable to positive floating-point arguments (see the first
 265    method on =specializer-accepts-generalizer-p= and the method on
 266    =specializer=accepts-p= below).  This leads to one subtle
 267    difference in behaviour compared to that of the =cons=
 268    specializers: in the case of =signum= specializers, the /next/
 269    method after any =signum= specializer can be different, depending
 270    on the class of the argument.  This aspect of the dispatch is
 271    handled by the second method on =specializer-accepts-generalizer-p=
 272    below.
 273 #+begin_src lisp
 274 (defclass signum-specializer (specializer)
 275   ((%signum :reader %signum :initarg :signum)))
 276 (defclass signum-generalizer (generalizer)
 277   ((%signum :reader %signum :initarg :signum)))
 278 (defmethod generalizer-of-using-class
 279     ((gf signum-generic-function) arg)
 280   (typecase arg
 281     (real (make-instance 'signum-generalizer
 282                          :signum (signum arg)))
 283     (t (call-next-method))))
 284 (defmethod generalizer-equal-hash-key
 285     ((gf signum-generic-function)
 286      (g signum-specializer))
 287   (%signum g))
 288 (defmethod specializer-accepts-generalizer-p
 289     ((gf signum-generic-function)
 290      (s signum-specializer)
 291      (g signum-generalizer))
 292   (if (= (%signum s) (%signum g)) ; or EQL?
 293       (values t t)
 294       (values nil t)))
 295
 296 (defmethod specializer-accepts-generalizer-p
 297     ((gf signum-generic-function)
 298      (specializer sb-mop:specializer)
 299      (thing signum-specializer))
 300   (specializer-accepts-generalizer-p
 301    gf specializer (class-of (%signum thing))))
 302
 303 (defmethod specializer-accepts-p
 304     ((s signum-specializer) o)
 305   (and (realp o) (= (%signum s) (signum o))))
 306 #+end_src
 307
 308    Given these definitions, and once again some more straightforward
 309    ones elided for reasons of space, we can implement the factorial
 310    function as follows:
 311
 312 #+begin_src lisp
 313 (defgeneric fact (n)
 314   (:generic-function-class signum-generic-function))
 315 (defmethod fact ((n (signum 0))) 1)
 316 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 317 #+end_src
 318
 319    We do not need to include a method on =(signum -1)=, as the
 320    standard =no-applicable-method= protocol will automatically apply to
 321    negative real or non-real arguments.
 322 ** Accept HTTP header specializers
 323    :PROPERTIES:
 324    :CUSTOM_ID: Accept
 325    :END:
 326    In this section, we implement a non-trivial form of dispatch.  The
 327    application in question is a web server, and specifically to allow
 328    the programmer to support RFC 2616 \cite{rfc2616} content
 329    negotiation, of particular interest to publishers and consumers of
 330    REST-style Web APIs.
 331
 332    The basic mechanism in content negotiation is as follows: the web
 333    client sends an HTTP request with an =Accept= header, which is a
 334    string describing the media types it is willing to receive as a
 335    response to the request, along with numerical preferences.  The web
 336    server compares these stated client preferences with the resources
 337    it has available to satisfy this request, and sends the best
 338    matching resource in its response.
 339
 340    For example, a graphical web browser might by default send an
 341    =Accept= header such as
 342    =text/html,application/xml;q=0.9,*/*;q=0.8=.  This should be
 343    interpreted by a web server as meaning that if for a given resource
 344    the server can provide content of type =text/html= (i.e. HTML),
 345    then it should do so.  Otherwise, if it can provide
 346    =application/xml= content (i.e. XML of any schema), then that
 347    should be provided; failing that, any other content type is
 348    acceptable.
 349
 350    In the case where there are static files on the filesystem, and the
 351    web server must merely select between them, there is not much more
 352    to say.  However, it is not unusual for a web service to be backed
 353    by some other form of data, and responses computed and sent on the
 354    fly, and in these circumstances the web server must compute which
 355    of its known output formats it can use to satisfy the request
 356    before actually generating the best matching response.  This can be
 357    modelled as one generic function responsible for generating the
 358    response, with methods corresponding to content-types -- and the
 359    generic function must then perform method selection against the
 360    request's =Accept= header to compute the appropriate response.
 361
 362    The =accept-specializer= below implements the dispatch.  It depends
 363    on a lazily-computed =tree= slot to represent the information in
 364    the accept header (generated by =parse-accept-string=), and a
 365    function =q= to compute the (defaulted) preference level for a
 366    given content-type and =tree=; then, method selection and ordering
 367    involves finding the =q= for each =accept-specializer='s content
 368    type given the =tree=, and sorting them according to the preference
 369    level.
 370
 371 #+begin_src lisp
 372 (defclass accept-specializer (specializer)
 373   ((media-type :initarg :media-type :reader media-type)))
 374 (defclass accept-generalizer (generalizer)
 375   ((header :initarg :header :reader header)
 376    (tree)
 377    (next :initarg :next :reader next)))
 378 (defmethod generalizer-equal-hash-key
 379     ((gf accept-generic-function)
 380      (g accept-generalizer))
 381   `(accept-generalizer ,(header g)))
 382 (defmethod specializer-accepts-generalizer-p
 383     ((gf accept-generic-function)
 384      (s accept-specializer)
 385      (generalizer accept-generalizer))
 386   (values (q (media-type s) (tree generalizer)) t))
 387 (defmethod specializer-accepts-generalizer-p
 388     ((gf accept-generic-function)
 389      (s specializer)
 390      (generalizer accept-generalizer))
 391   (specializer-accepts-generalizer-p
 392    gf s (next generalizer)))
 393
 394 (defmethod specializer<
 395     ((gf accept-generic-function)
 396      (s1 accept-specializer)
 397      (s2 accept-specializer)
 398      (generalizer accept-generalizer))
 399   (let ((m1 (media-type s1))
 400         (m2 (media-type s2))
 401         (tree (tree generalizer)))
 402     (cond
 403       ((string= m1 m2) '=)
 404       (t (let ((q1 (q m1 tree)))
 405                (q2 (q m2 tree))))
 406            (cond
 407              ((= q1 q2) '=)
 408              ((< q1 q2) '>)
 409              (t '<))))))
 410 #+end_src
 411
 412    The metaprogrammer can then support dispatching in this way for
 413    suitable objects, such as the =request= object representing a
 414    client request in the Hunchentoot web server.  The code below
 415    implements this, by defining the computation of a suitable
 416    =generalizer= object for a given request, and specifying how to
 417    compute whether the specializer accepts the given request object
 418    (=q= returns a number between 0 and 1 if any pattern in the =tree=
 419    matches the media type, and =nil= if the media type cannot be
 420    matched at all).
 421
 422 #+begin_src
 423 (defmethod generalizer-of-using-class
 424     ((gf accept-generic-function)
 425      (arg tbnl:request))
 426   (make-instance 'accept-generalizer
 427                  :header (tbnl:header-in :accept arg)
 428                  :next (class-of arg)))
 429 (defmethod specializer-accepts-p
 430     ((specializer accept-specializer)
 431      (obj tbnl:request))
 432   (let* ((accept (tbnl:header-in :accept obj))
 433          (tree (parse-accept-string accept))
 434          (q (q (media-type specializer) tree)))
 435     (and q (> q 0))))
 436 #+end_src
 437
 438    This dispatch cannot be implemented using filtered dispatch, except
 439    by generating anonymous classes with all the right mime-types as
 440    direct superclasses in dispatch order; the filter would generate
 441 #+begin_src lisp
 442 (ensure-class nil :direct-superclasses
 443  '(text/html image/webp ...))
 444 #+end_src
 445    and dispatch the operates using those anonymous classes.  While
 446    this is possible to do, it is awkward to express content-type
 447    negotiation in this way, as it means that the dispatcher must know
 448    about the universe of mime-types that clients might declare that
 449    they accept, rather than merely the set of mime-types that a
 450    particular generic function is capable of serving; handling
 451    wildcards in accept strings is particularly awkward in the
 452    filtering paradigm.
 453
 454    Note that in this example, the method on =specializer<= involves a
 455    nontrivial ordering of methods based on the =q= values specified in
 456    the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 457    single extended specializer could be applicable to any given
 458    argument).
 459
 460    Also note that the accept specializer protocol is straightforwardly
 461    extensible to other suitable objects; for example, one simple
 462    debugging aid is to define that an =accept-specializer= should be
 463    applicable to =string= objects.  This can be done in a modular
 464    fashion (see the code below, which can be completely disconnected
 465    from the code for Hunchentoot request objects), and generalizes to
 466    dealing with multiple web server libraries, so that
 467    content-negotiation methods are applicable to each web server's
 468    request objects.
 469
 470 #+begin_src lisp
 471 (defmethod generalizer-of-using-class
 472     ((gf accept-generic-function)
 473      (s string))
 474   (make-instance 'accept-generalizer
 475                  :header s
 476                  :next (class-of s)))
 477 (defmethod specializer-accepts-p
 478     ((s accept-specializer) (string string))
 479   (let* ((tree (parse-accept-string string))
 480          (q (q (media-type s) tree)))
 481     (and q (> q 0))))
 482 #+end_src
 483 ** Pattern / xpattern / regex / optima
 484    Here's the /really/ interesting bit, but on the other hand we're
 485    probably going to run out of space, and the full description of
 486    these is going to take us into =make-method-lambda= territory.
 487    A second paper?  Future work?
 488 * Protocol
 489   :PROPERTIES:
 490   :CUSTOM_ID: Protocol
 491   :END:
 492
 493   In section [[#Examples]], we have seen a number of code fragments as
 494   partial implementations of particular non-standard method dispatch,
 495   using =generalizer= metaobjects to mediate between the methods of
 496   the generic function and the actual arguments passed to it.  In
 497   section [[#Generalizer metaobjects]], we go into more detail regarding
 498   these =generalizer= metaobjects, describing the generic function
 499   invocation protocol in full, and showing how this protocol allows a
 500   similar form of effective method cacheing as the standard one does.
 501   In section [[#Generalizer performance]], we show the results of some
 502   simple performance measurements to highlight the improvement that
 503   this protocol can bring over a naïve implementation of generalized
 504   dispatch, as well as highlighting the potential for further
 505   improvement.
 506
 507 ** Generalizer metaobjects
 508    :PROPERTIES:
 509    :CUSTOM_ID: Generalizer metaobjects
 510    :END:
 511
 512 *** Generic function invocation
 513     As in the standard generic function invocation protocol, the
 514     generic function's actual functionality is provided by a
 515     discriminating function.  The functionality described in this
 516     protocol is implemented by having a distinct subclass of
 517     =standard-generic-function=, and a method on
 518     =compute-discriminating-function= which produces a custom
 519     discriminating function.  The basic outline of the discriminating
 520     function is the same as the standard one: it must first compute the
 521     set of applicable methods given particular arguments; from that, it
 522     must compute the effective method by combining the methods
 523     appropriately according to the generic function's method
 524     combination; finally, it must call the effective method with the
 525     arguments.
 526
 527     Computing the set of applicable methods is done using a pair of
 528     functions: =compute-applicable-methods=, the standard metaobject
 529     function, and a new function
 530     =compute-applicable-methods-using-generalizers=.  We define a
 531     custom method on =compute-applicable-methods= which tests the
 532     applicability of a particular specializer against a given argument
 533     using =specializer-accepts-p=, a new protocol function with
 534     default implementations on =class= and =eql-specializer= to
 535     implement the expected behaviour.  In order to order the methods,
 536     as required by the protocol, we define a pairwise comparison
 537     operator =specializer<= which defines an ordering between
 538     specializers for a given generalizer argument (remembering that
 539     even in standard CLOS the ordering between =class= specializers
 540     can change depending on the actual class of the argument).
 541
 542     The new =compute-applicable-methods-using-generalizers= is the
 543     analogue of the MOP's =compute-applicable-methods-using-classes=.
 544     Instead of calling it with the =class-of= each argument, we compute
 545     the generalizers of each argument using the new function
 546     =generalizer-of-using-class= (where the =-using-class= refers to
 547     the class of the generic function rather than the class of the
 548     object), and call it with the list of generalizers.  As with the
 549     standard function, a secondary return value indicates whether the
 550     result of the function is definitive for that list of generalizers.
 551
 552     Thus, in generic function invocation, we first compute the
 553     generalizers of the arguments; we compute the ordered set of
 554     applicable methods, either from the generalizers or (if that is
 555     not definitive) from the arguments themselves; then the normal
 556     effective method computation and call can occur.  Unfortunately,
 557     the nature of an effective method object is not specified, so we
 558     have to reach into implementation internals a little in order to
 559     call it, but otherwise the remainder of the generic function
 560     invocation protocol is unchanged from the standard one.  In
 561     particular, method combination is completely unchanged;
 562     programmers can choose arbitrary method combinations, including
 563     user-defined long form combinations, for their generic functions
 564     involving generalized dispatch.
 565
 566 *** Effective method memoization
 567     :PROPERTIES:
 568     :CUSTOM_ID: Memoization
 569     :END:
 570     The potential efficiency benefit to having =generalizer=
 571     metaobjects lies in the use of
 572     =compute-applicable-methods-using-generalizers=.  If a particular
 573     generalized specializer accepts a variety of objects (such as the
 574     =signum= specializer accepting all reals with a given sign, or the
 575     =accept= specializer accepting all HTTP requests with a particular
 576     =Accept= header), then there is the possibility of cacheing and
 577     reusing the results of the applicable and effective method
 578     computation.  If the computation of the applicable method from
 579     =compute-applicable-methods-using-generalizers= is definitive,
 580     then the ordered set of applicable methods and the effective
 581     method can be cached.
 582
 583     One issue is what to use as the key for that cache.  We cannot use
 584     the generalizers themselves, as two generalizers that should be
 585     considered equal for cache lookup will not compare as =equal= –
 586     and indeed even the standard generalizer, the =class=, cannot be
 587     used as we must be able to invalidate cache entries upon class
 588     redefinition.  The issue of =class= generalizers we can solve as
 589     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 590     class, which is distinct for each distinct (re)definition of a
 591     class; for arbitrary generalizers, however, there is /a priori/ no
 592     good way of computing a suitable hash key automatically, so we
 593     allow the metaprogrammer to specify one by defining a method on
 594     =generalizer-equal-hash-key=, and combining the hash keys for all
 595     required arguments in a list to use as a key in an =equal=
 596     hash-table.
 597
 598     [XXX could we actually compute a suitable hash key using the
 599     generalizer's class name and initargs?]
 600
 601     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 602     - [X] =compute-applicable-methods-using-generalizers=
 603     - [X] =generalizer-equal-hash-key=
 604     - [X] =specializer-accepts-generalizer-p=
 605     - [X] =specializer-accepts-p=
 606     - [X] =specializer<=
 607 ** Performance
 608    :PROPERTIES:
 609    :CUSTOM_ID: Generalizer performance
 610    :END:
 611    We have argued that the protocol presented here allows for
 612    expressive control of method dispatch while preserving the
 613    possibility of efficiency.  In this section, we quantify the
 614    efficiency that the memoization protocol described in section
 615    [[#Memoization]] achieves, by comparing it both to the same protocol
 616    with no memoization, as well as with equivalent dispatch
 617    implementations in the context of methods with regular
 618    specializers, and with implementation in straightforward functions.
 619
 620    In the case of the =cons-specializer=, we benchmark the walker
 621    acting on a small but non-trivial form.  The implementation
 622    strategies in the table below refer to: an implementation in a
 623    single function with a large =typecase= to dispatch between all the
 624    cases; the natural implementation in terms of a standard generic
 625    function with multiple methods (the method on =cons= having a
 626    slightly reduced =typecase= to dispatch on the first element, and
 627    other methods handling =symbol= and other atoms); and three
 628    separate cases using =cons-specializer= objects.  As well as
 629    measuring the effect of memoization against the full invocation
 630    protocol, we can also introduce a special case: when only one
 631    argument participates in method selection (all the other required
 632    arguments only being specialized on =t=), we can avoid the
 633    construction of a list of hash keys and simply use the key
 634    from the single active generalizer directly.
 635
 636    Refer to \cite{Kiczales.Rodriguez:1990}
 637
 638    | implementation        | time (µs/call) | overhead |
 639    |-----------------------+----------------+----------|
 640    | function              |           3.17 |          |
 641    | standard-gf/methods   |            3.6 |     +14% |
 642    | cons-gf/one-arg-cache |            7.4 |    +130% |
 643    | cons-gf               |             15 |    +370% |
 644    | cons-gf/no-cache      |             90 |   +2700% |
 645
 646    The benchmarking results from this exercise are promising: in
 647    particular, the introduction of the effective method cache speeds
 648    up the use of generic specializers in this case by a factor of 6,
 649    and the one-argument special case by another factor of 2.  For this
 650    workload, even the one-argument special case only gets to within a
 651    factor of 2-3 of the function and standard generic function
 652    implementations, but the overall picture is that the memoizability
 653    in the protocol does indeed drastically reduce the overhead
 654    compared with the full invocation.
 655
 656    For the =signum-specializer= case, we choose to benchmark the
 657    computation of 20!, because that is the largest factorial whose
 658    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 659    worst case for generic dispatch, where the work done within the
 660    methods is as small as possible without being meaningless, and in
 661    particular does not cause allocation or garbage collection to
 662    obscure the picture.
 663
 664 #+begin_src lisp :exports none
 665 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 666 #+end_src
 667
 668    | implementation          | time (µs/call) | overhead |
 669    |-------------------------+----------------+----------|
 670    | function                |            0.6 |          |
 671    | standard-gf/fixnum      |            1.2 |    +100% |
 672    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 673    | signum-gf               |             23 |   +3800% |
 674    | signum-gf/no-cache      |            240 |  +41000% |
 675
 676    The relative picture is similar to the =cons-specializer= case;
 677    including a cache saves a factor of 10 in this case, and another
 678    factor of 3 for the one-argument cache special case.  The cost of
 679    the genericity of the protocol here is starker; even the
 680    one-argument cache is a factor of 6 slower than the standard
 681    generic-function implementation, and a further factor of 2 away
 682    from the implementation of factorial as a function.  We discuss
 683    ways in which we expect to be able to improve performance in
 684    section [[#Future Work]].
 685
 686    We could allow the metaprogrammer to improve on the one-argument
 687    performance by constructing a specialized cache: for =signum=
 688    arguments of =rational= arguments, the logical cache structure is
 689    to index a three-element vector with =(1+ signum)=.  The current
 690    protocol does not provide a way of eliding the two generic function
 691    calls for the generic cache; we discuss possible approaches in
 692    section [[#Conclusions]].
 693 ** Full protocol
 694    Description and specification left for reasons of space (we'll see?)
 695    - [ ] =same-specializer-p=
 696    - [ ] =parse/unparse-specializer-using-class=
 697    - [ ] =make-method-specializers-form=
 698    - [ ] jmoringe: In an email, I suggested
 699      =make-specializer-form-using-class=:
 700
 701      #+begin_quote
 702      Could we change =make-method-specializers-form='s default
 703      behaviour to call a new generic function
 704      #+begin_src
 705        make-specializer-form-using-class gf method name env
 706      #+end_src
 707      with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 708      eql-specializers)? This would make it unnecessary to repeat
 709      boilerplate along the lines of
 710      #+begin_src lisp
 711      (flet ((make-parse-form (name)
 712               (if <name-is-interesting>
 713                 <handle-interesting-specializer>
 714                 <repeat-handling-of-standard-specializers>)))
 715        `(list ,@(mapcar #'make-parse-form specializer-names)))
 716      #+end_src
 717      for each generic function class.
 718      #+end_quote
 719    - [ ] =make-method-lambda= revision (use environment arg?)
 720
 721      jmoringe: would only be relevant for pattern dispatch, right? I
 722      think, we didn't finish the discussion regarding special
 723      variables vs. environment vs. new protocol function
 724
 725 * Related Work
 726   :PROPERTIES:
 727   :CUSTOM_ID: Related Work
 728   :END:
 729   - [ ] Newton/Rhodes, \cite{Newton.Rhodes:2008}
 730   - [ ] filtered dispatch -- the point is that our work continues to
 731     be useful in cases where there are unbounded numbers of
 732     equivalence classes but each given invokation involves a small
 733     number of methods.  Filtered dispatch works by having a custom
 734     discriminating function which wraps the usual one, and augments
 735     the set of applicable methods computed with applicable methods
 736     from other (hidden) generic functions (one per filter group).  It
 737     then also has a custom method combination to handle combining
 738     these applicable methods.  \cite{Costanza.etal:2008}
 739   - [ ] ContextL / context-oriented programming -- dispatch occurs on
 740     hidden layer argument being an instance of an anonymous class with
 741     suitably arranged superclasses -- OK because set of layers is
 742     bounded and under programmer control.  \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}
 743   - [ ] http://soft.vub.ac.be/Publications/2010/vub-tr-soft-10-04.pdf
 744   - [ ] http://soft.vub.ac.be/lambic/files/lambic-ilc09.pdf
 745   - [ ] http://soft.vub.ac.be/Publications/2011/vub-soft-phd-11-03.pdf
 746   - [ ] Prototypes with Multiple Dispatch
 747     http://sauerbraten.org/lee/ecoop.pdf -- extension of Self-style
 748     object system to handle multiple equally-privileged "receivers".
 749     A good test case for our protocol; handled adequately with
 750     generalizer being the tuple of (roles,delegations), with some
 751     thought needed for method redefinitions but otherwise working
 752     fine. \cite{Salzman.Aldrich:2005}
 753   - [ ] Sheeple
 754   - [ ] Multiple dispatch in Clojure
 755     http://clojure.org/multimethods -- seems to allow hierarchy-based,
 756     eql and the equivalent of filtered dispatch
 757 * Conclusions
 758   :PROPERTIES:
 759   :CUSTOM_ID: Conclusions
 760   :END:
 761   - protocol for straightforward definition of custom dispatch
 762     + interoperates seamlessly with rest of CLOS: method combination,
 763       etc.
 764     + tolerably efficient: two extra standard gf invokations and one
 765       hash table lookup per call on the fast path (but more to be
 766       done)
 767     + expressive: handles forms of dispatch not handled elsewhere; all
 768       the usual niceties of redefinition, modularity, introspection
 769 ** Future Work
 770    :PROPERTIES:
 771    :CUSTOM_ID: Future Work
 772    :END:
 773    Although the protocol described in this paper allows for a more
 774    efficient implementation, as described in section [[#Memoization]],
 775    than computing the applicable and effective methods at each generic
 776    function call, the efficiency is still some way away from a
 777    baseline of the standard generic-function, let alone a standard
 778    function.  Most of the invocation protocol is memoized, but there
 779    are still two full standard generic-function calls –
 780    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 781    argument per call to a generic function with extended specializers,
 782    not to mention a hash table lookup.
 783
 784    For many applications, the additional flexibility afforded by
 785    generalized specializers might be worth the cost in efficiency, but
 786    it would still be worth investigating how much the overhead from
 787    generalized specializers can be reduced; one possible avenue for
 788    investigation is giving greater control over the cacheing strategy
 789    to the metaprogrammer.
 790
 791    As an example, consider the =signum-specializer=.  The natural
 792    cache structure for a single argument generic function specializing
 793    on =signum= is probably a four-element vector, where the first
 794    three elements hold the effective methods for =signum= values of
 795    -1, 0, and 1, and the fourth holds the cached effective methods for
 796    everything else.  This would make the invocation of such functions
 797    very fast for the (presumed) common case where the argument is in
 798    fact a real number.  We hope to develop and show the effectiveness
 799    of an appropriate protocol to allow the metaprogrammer to construct
 800    and exploit such cacheing strategies, and (more speculatively) to
 801    implement the lookup of an effective method function in other ways.
 802
 803    We also aim to demonstrate support within this protocol for some
 804    particular cases of generalized specializers which seem to have
 805    widespread demand (in as much as any language extension can be said
 806    to be in “demand”).  In particular, we have preliminary work
 807    towards supporting efficient dispatch over pattern specializers
 808    such as implemented in the \textsf{Optima} library[fn:2], and over
 809    a prototype object system similar to that in Slate
 810    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 811    described in this paper can be seen in the git source code
 812    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 813    which will be updated with future developments.
 814 ** Acknowledgments
 815    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 816    and informative discussions, and all the respondents to one
 817    author's call for imaginative uses for generalized specializers.
 818
 819 \bibliographystyle{plain}
 820 \bibliography{crhodes,specializers}
 821
 822 * Footnotes
 823
 824 [fn:1] the \textsf{Closer to MOP} project attempts to harmonize the
 825    different implementations of the metaobject protocol in Common Lisp.
 826
 827 [fn:2] https://github.com/m2ym/optima