els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 #+end_src
  18
  19 #+begin_abstract
  20 This paper introduces a new metaobject, the generalizer, which
  21 complements the existing specializer metaobject.  With the help of
  22 examples, we show that this metaobject allows for the efficient
  23 implementation of complex non-class-based dispatch within the
  24 framework of existing metaobject protocols.  We present our
  25 modifications to the generic function invocation protocol from the
  26 /Art of the Metaobject Protocol/; in combination with previous work,
  27 this produces a fully-functional extension of the existing mechanism
  28 for method selection and combination, including support for method
  29 combination completely independent from method selection.  We discuss
  30 our implementation, within the SBCL implementation of Common Lisp, and
  31 in that context compare the performance of the new protocol with the
  32 standard one, demonstrating that the new protocol can be tolerably
  33 efficient.
  34 #+end_abstract
  35
  36 #+begin_LaTeX
  37 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  38 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  39 \terms{Languages, Design}
  40 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  41 #+end_LaTeX
  42
  43 * Introduction
  44   The revisions to the original Common Lisp language \cite{CLtL}
  45   included the detailed specification of an object system, known as
  46   the Common Lisp Object System (CLOS), which was eventually
  47   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  48   The object system as presented to the standardization committee was
  49   formed of three chapters.  The first two chapters covered programmer
  50   interface concepts and the functions in the programmer interface
  51   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  52   final standard; the third chapter, covering a Metaobject Protocol
  53   (MOP) for CLOS, was not.
  54
  55   Nevertheless, the CLOS MOP has proven to be a robust design, and
  56   while many implementations have derived their implementations of
  57   CLOS from either the Closette illustrative implementation in
  58   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  59   from Xerox Parc, there have been largely from-scratch
  60   reimplementations of CLOS (in CLISP[fn:1] and CCL[fn:2], at least)
  61   incorporating substantial fractions of the Metaobject Protocol as
  62   described.
  63
  64   Although it has stood the test of time, the CLOS MOP is neither
  65   without issues (e.g. semantic problems with =make-method-lambda=
  66   \cite{Costanza.Herzeel:2008}; useful functions such as
  67   =compute-effective-slot-definition-initargs= being missing from the
  68   standard) nor is it a complete framework for the metaprogrammer to
  69   implement all conceivable variations of object-oriented behaviour.
  70   While metaprogramming offers some possibilities for customization of
  71   the object system behaviour, those possibilities cannot extend
  72   arbitrarily in all directions.  There is still an expectation that
  73   functionality is implemented with methods on generic functions,
  74   acting on objects with slots.  Nevertheless, the MOP is flexible,
  75   and is used for a number of things, including: documentation
  76   generation (where introspective functionality in the MOP is used to
  77   extract information from a running system); object-relational
  78   mapping and other approaches to object persistence; alternative
  79   backing stores for slots (hash-tables or symbols); and programmatic
  80   construction of metaobjects, for example for IDL compilers and model
  81   transformations.
  82
  83   [ XXX: A picture on MOP flexibility here would be good; I have in my mind
  84   one where an object system is a point and the MOP opens up a blob
  85   around that point, and I'm sure I've seen it somewhere but I can't
  86   remember where.  Alternatively, there's Kiczales et al "MOPs: why we
  87   want them and what else they can do", fig. 2 ]
  88
  89   One area of functionality where there is scope for customization by
  90   the metaprogrammer is in the mechanics and semantics of method
  91   applicability and dispatch.  While in principle AMOP allows
  92   customization of dispatch in various different ways (the
  93   metaprogrammer can define methods on protocol functions such as
  94   =compute-applicable-methods=,
  95   =compute-applicable-methods-using-classes=), for example, in
  96   practice implementation support for this was weak until relatively
  97   recently[fn:3].
  98
  99   Another potential mechanism for customizing dispatch is implicit in
 100   the class structure defined by AMOP: standard specializer objects
 101   (instances of =class= and =eql-specializer=) are generalized
 102   instances of the =specializer= protocol class, and in principle
 103   there are no restrictions on the metaprogrammer constructing
 104   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 105   explored the potential for customizing generic function dispatch
 106   using extended specializers, but as of that work the metaprogrammer
 107   must override the entirety of the generic function invocation
 108   protocol (from =compute-discriminating-function= on down), leading
 109   to toy implementations and duplicated effort.
 110
 111   This paper introduces a protocol for efficient and controlled
 112   handling of new subclasses of =specializer=.  In particular, it
 113   introduces the =generalizer= protocol class, which generalizes the
 114   return value of =class-of= in method applicability computation, and
 115   allows the metaprogrammer to hook into cacheing schemes to avoid
 116   needless recomputation of effective methods for sufficiently similar
 117   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 118
 119   #+CAPTION:    Dispatch Comparison
 120   #+LABEL:      fig:dispatch
 121   #+ATTR_LATEX: width=0.9\linewidth float
 122   [[file:figures/dispatch-comparison.pdf]]
 123
 124   The remaining sections in this paper can be read in any order.  We
 125   give some motivating examples in section [[#Examples]], including
 126   reimplementations of examples from previous work, as well as
 127   examples which are poorly supported by previous protocols.  We
 128   describe the protocol itself in section [[#Protocol]], describing each
 129   protocol function in detail and, where applicable, relating it to
 130   existing protocol functions within the CLOS MOP.  We survey related
 131   work in more detail in section [[#Related Work]], touching on work on
 132   customized dispatch schemes in other environments.  Finally, we draw
 133   our conclusions from this work, and indicate directions for further
 134   development, in section [[#Conclusions]]; reading that section before the
 135   others indicates substantial trust in the authors' work.
 136 * Examples
 137   :PROPERTIES:
 138   :CUSTOM_ID: Examples
 139   :END:
 140   In this section, we present a number of examples of dispatch
 141   implemented using our protocol, which we describe in section
 142   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 143   this section do not include some of the necessary support code to
 144   run; complete implementations of each of these cases are included in
 145   an appendix / in the accompanying repository snapshot / at this
 146   location.
 147
 148   A note on terminology: we will attempt to distinguish between the
 149   user of an individual case of generalized dispatch (the
 150   “programmer”), the implementor of a particular case of generalized
 151   dispatch (the “metaprogrammer”), and the authors as the designers
 152   and implementors of our generalized dispatch protocol (the
 153   “metametaprogammer”, or more likely “we”).
 154 ** CONS specializers
 155    :PROPERTIES:
 156    :CUSTOM_ID: Cons
 157    :END:
 158    We start by presenting our original use case, performing
 159    dispatching on the first element of lists.  Semantically, we allow
 160    the programmer to specialize any argument of methods with a new
 161    kind of specializer, =cons-specializer=, which is applicable if and
 162    only if the corresponding object is a =cons= whose =car= is =eql=
 163    to the symbol associated with the =cons-specializer=; these
 164    specializers are more specific than the =cons= class, but less
 165    specific than an =eql-specializer= on any given =cons=.
 166
 167    One motivation for the use of this specializer is in an extensible
 168    code walker: a new special form can be handled simply by writing an
 169    additional method on the walking generic function, seamlessly
 170    interoperating with all existing methods.
 171
 172    The programmer code using these specializers is unchanged from
 173    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 174    here are centered on performance and generality: in an application
 175    such as walking source code, we would expect to encounter special
 176    forms (distinguished by particular atoms in the =car= position)
 177    multiple times, and hence to dispatch to the same effective method
 178    repeatedly.  We discuss this in more detail in section [[#Memoization]];
 179    we present the metaprogrammer code below.
 180
 181 #+begin_src lisp
 182 (defclass cons-specializer (specializer)
 183   ((%car :reader %car :initarg :car)))
 184 (defclass cons-generalizer (generalizer)
 185   ((%car :reader %car :initarg :car)))
 186 (defmethod generalizer-of-using-class
 187     ((gf cons-generic-function) arg)
 188   (typecase arg
 189     ((cons symbol)
 190      (make-instance 'cons-generalizer
 191                     :car (car arg)))
 192     (t (call-next-method))))
 193 (defmethod generalizer-equal-hash-key
 194     ((gf cons-generic-function)
 195      (g cons-generalizer))
 196   (%car g))
 197 (defmethod specializer-accepts-generalizer-p
 198     ((gf cons-generic-function)
 199      (s cons-specializer)
 200      (g cons-generalizer))
 201   (if (eql (%car s) (%car g))
 202       (values t t)
 203       (values nil t)))
 204 (defmethod specializer-accepts-p
 205     ((s cons-specializer) o)
 206   (and (consp o) (eql (car o) (%car s))))
 207 #+end_src
 208
 209 The code above shows the core of the use of our protocol.  We have
 210 elided some support code for parsing and unparsing specializers, and
 211 for handling introspective functions such as finding generic functions
 212 for a given specializer.  We have also elided methods on the protocol
 213 function =specializer<=; for =cons-specializers= here, specializer
 214 ordering is trivial, as only one =cons-specializer= can ever be
 215 applicable to any given argument.  See section [[#Accept]] for a case
 216 where specializer ordering is substantially different.
 217
 218 As in \cite{Newton.Rhodes:2008}, we can use these specializers to
 219 implement a modular code walker, where we define one method per
 220 special operator.  We show two of those methods below, in the context
 221 of a walker which checks for unused bindings and uses of unbound
 222 variables.
 223
 224 #+begin_src
 225 (defgeneric walk (form env stack)
 226   (:generic-function-class cons-generic-function))
 227 (defmethod walk ((expr (cons lambda)) env call-stack)
 228   (let ((lambda-list (cadr expr))
 229         (body (cddr expr)))
 230     (with-checked-bindings
 231         ((bindings-from-ll lambda-list) env call-stack)
 232       (dolist (form body)
 233         (walk form env (cons form call-stack))))))
 234 (defmethod walk ((expr (cons let)) env call-stack)
 235   (flet ((let-binding (x)
 236            (walk (cadr x) env (cons (cadr x) call-stack))
 237            (cons (car x) (make-instance 'binding))))
 238     (with-checked-bindings
 239         ((mapcar #'let-binding (cadr expr)) env call-stack)
 240       (dolist (form (cddr expr))
 241         (walk form env (cons form call-stack))))))
 242 #+end_src
 243
 244    Note that in this example there is no strict need for
 245    =cons-specializer= and =cons-generalizer= to be distinct classes –
 246    just as in the normal protocol involving
 247    =compute-applicable-methods= and
 248    =compute-applicable-methods-using-classes=, the specializer object
 249    for mediating dispatch contains the same information as the object
 250    representing the equivalence class of objects to which that
 251    specializer is applicable: here it is the =car= of the =cons=
 252    (which we wrap in a distinct object); in the standard dispatch it
 253    is the =class= of the object.  This feature also characterizes
 254    those use cases where the metaprogrammer could straightforwardly
 255    use filtered dispatch \cite{Costanza.etal:2008} to implement their
 256    dispatch semantics.  We will see in section [[#Accept]] an example
 257    of a case where filtered dispatch is incapable of straightforwardly
 258    expressing the dispatch, but first we present our implementation of
 259    the motivating case from \cite{Costanza.etal:2008}.
 260 ** SIGNUM specializers
 261    :PROPERTIES:
 262    :CUSTOM_ID: Signum
 263    :END:
 264    Our second example of the implementation and use of generalized
 265    specializers is a reimplementation of one of the examples in
 266    \cite{Costanza.etal:2008}: specifically, the factorial function.
 267    Here, we will perform dispatch based on the =signum= of the
 268    argument, and again, at most one method with a =signum= specializer
 269    will be appliable to any given argument, which makes the structure
 270    of the specializer implementation very similar to the =cons=
 271    specializers in the previous section.
 272
 273    We have chosen to compare signum values using \texttt{=}, which
 274    means that a method with specializer =(signum 1)= will be
 275    applicable to positive floating-point arguments (see the first
 276    method on =specializer-accepts-generalizer-p= and the method on
 277    =specializer=accepts-p= below).  This leads to one subtle
 278    difference in behaviour compared to that of the =cons=
 279    specializers: in the case of =signum= specializers, the /next/
 280    method after any =signum= specializer can be different, depending
 281    on the class of the argument.  This aspect of the dispatch is
 282    handled by the second method on =specializer-accepts-generalizer-p=
 283    below.
 284 #+begin_src lisp
 285 (defclass signum-specializer (specializer)
 286   ((%signum :reader %signum :initarg :signum)))
 287 (defclass signum-generalizer (generalizer)
 288   ((%signum :reader %signum :initarg :signum)))
 289 (defmethod generalizer-of-using-class
 290     ((gf signum-generic-function) arg)
 291   (typecase arg
 292     (real (make-instance 'signum-generalizer
 293                          :signum (signum arg)))
 294     (t (call-next-method))))
 295 (defmethod generalizer-equal-hash-key
 296     ((gf signum-generic-function)
 297      (g signum-specializer))
 298   (%signum g))
 299 (defmethod specializer-accepts-generalizer-p
 300     ((gf signum-generic-function)
 301      (s signum-specializer)
 302      (g signum-generalizer))
 303   (if (= (%signum s) (%signum g)) ; or EQL?
 304       (values t t)
 305       (values nil t)))
 306
 307 (defmethod specializer-accepts-generalizer-p
 308     ((gf signum-generic-function)
 309      (specializer sb-mop:specializer)
 310      (thing signum-specializer))
 311   (specializer-accepts-generalizer-p
 312    gf specializer (class-of (%signum thing))))
 313
 314 (defmethod specializer-accepts-p
 315     ((s signum-specializer) o)
 316   (and (realp o) (= (%signum s) (signum o))))
 317 #+end_src
 318
 319    Given these definitions, and once again some more straightforward
 320    ones elided for reasons of space, we can implement the factorial
 321    function as follows:
 322
 323 #+begin_src lisp
 324 (defgeneric fact (n)
 325   (:generic-function-class signum-generic-function))
 326 (defmethod fact ((n (signum 0))) 1)
 327 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 328 #+end_src
 329
 330    We do not need to include a method on =(signum -1)=, as the
 331    standard =no-applicable-method= protocol will automatically apply to
 332    negative real or non-real arguments.
 333 ** Accept HTTP header specializers
 334    :PROPERTIES:
 335    :CUSTOM_ID: Accept
 336    :END:
 337    In this section, we implement a non-trivial form of dispatch.  The
 338    application in question is a web server, and specifically to allow
 339    the programmer to support RFC 2616 \cite{rfc2616} content
 340    negotiation, of particular interest to publishers and consumers of
 341    REST-style Web APIs.
 342
 343    The basic mechanism in content negotiation is as follows: the web
 344    client sends an HTTP request with an =Accept= header, which is a
 345    string describing the media types it is willing to receive as a
 346    response to the request, along with numerical preferences.  The web
 347    server compares these stated client preferences with the resources
 348    it has available to satisfy this request, and sends the best
 349    matching resource in its response.
 350
 351    For example, a graphical web browser might by default send an
 352    =Accept= header such as
 353    =text/html,application/xml;q=0.9,*/*;q=0.8=.  This should be
 354    interpreted by a web server as meaning that if for a given resource
 355    the server can provide content of type =text/html= (i.e. HTML),
 356    then it should do so.  Otherwise, if it can provide
 357    =application/xml= content (i.e. XML of any schema), then that
 358    should be provided; failing that, any other content type is
 359    acceptable.
 360
 361    In the case where there are static files on the filesystem, and the
 362    web server must merely select between them, there is not much more
 363    to say.  However, it is not unusual for a web service to be backed
 364    by some other form of data, and responses computed and sent on the
 365    fly, and in these circumstances the web server must compute which
 366    of its known output formats it can use to satisfy the request
 367    before actually generating the best matching response.  This can be
 368    modelled as one generic function responsible for generating the
 369    response, with methods corresponding to content-types -- and the
 370    generic function must then perform method selection against the
 371    request's =Accept= header to compute the appropriate response.
 372
 373    The =accept-specializer= below implements the dispatch.  It depends
 374    on a lazily-computed =tree= slot to represent the information in
 375    the accept header (generated by =parse-accept-string=), and a
 376    function =q= to compute the (defaulted) preference level for a
 377    given content-type and =tree=; then, method selection and ordering
 378    involves finding the =q= for each =accept-specializer='s content
 379    type given the =tree=, and sorting them according to the preference
 380    level.
 381
 382 #+begin_src lisp
 383 (defclass accept-specializer (specializer)
 384   ((media-type :initarg :media-type :reader media-type)))
 385 (defclass accept-generalizer (generalizer)
 386   ((header :initarg :header :reader header)
 387    (tree)
 388    (next :initarg :next :reader next)))
 389 (defmethod generalizer-equal-hash-key
 390     ((gf accept-generic-function)
 391      (g accept-generalizer))
 392   `(accept-generalizer ,(header g)))
 393 (defmethod specializer-accepts-generalizer-p
 394     ((gf accept-generic-function)
 395      (s accept-specializer)
 396      (generalizer accept-generalizer))
 397   (values (q (media-type s) (tree generalizer)) t))
 398 (defmethod specializer-accepts-generalizer-p
 399     ((gf accept-generic-function)
 400      (s specializer)
 401      (generalizer accept-generalizer))
 402   (specializer-accepts-generalizer-p
 403    gf s (next generalizer)))
 404
 405 (defmethod specializer<
 406     ((gf accept-generic-function)
 407      (s1 accept-specializer)
 408      (s2 accept-specializer)
 409      (generalizer accept-generalizer))
 410   (let ((m1 (media-type s1))
 411         (m2 (media-type s2))
 412         (tree (tree generalizer)))
 413     (cond
 414       ((string= m1 m2) '=)
 415       (t (let ((q1 (q m1 tree)))
 416                (q2 (q m2 tree))))
 417            (cond
 418              ((= q1 q2) '=)
 419              ((< q1 q2) '>)
 420              (t '<))))))
 421 #+end_src
 422
 423    The metaprogrammer can then support dispatching in this way for
 424    suitable objects, such as the =request= object representing a
 425    client request in the Hunchentoot web server.  The code below
 426    implements this, by defining the computation of a suitable
 427    =generalizer= object for a given request, and specifying how to
 428    compute whether the specializer accepts the given request object
 429    (=q= returns a number between 0 and 1 if any pattern in the =tree=
 430    matches the media type, and =nil= if the media type cannot be
 431    matched at all).
 432
 433 #+begin_src
 434 (defmethod generalizer-of-using-class
 435     ((gf accept-generic-function)
 436      (arg tbnl:request))
 437   (make-instance 'accept-generalizer
 438                  :header (tbnl:header-in :accept arg)
 439                  :next (class-of arg)))
 440 (defmethod specializer-accepts-p
 441     ((specializer accept-specializer)
 442      (obj tbnl:request))
 443   (let* ((accept (tbnl:header-in :accept obj))
 444          (tree (parse-accept-string accept))
 445          (q (q (media-type specializer) tree)))
 446     (and q (> q 0))))
 447 #+end_src
 448
 449    This dispatch cannot be implemented using filtered dispatch, except
 450    by generating anonymous classes with all the right mime-types as
 451    direct superclasses in dispatch order; the filter would generate
 452 #+begin_src lisp
 453 (ensure-class nil :direct-superclasses
 454  '(text/html image/webp ...))
 455 #+end_src
 456    and dispatch the operates using those anonymous classes.  While
 457    this is possible to do, it is awkward to express content-type
 458    negotiation in this way, as it means that the dispatcher must know
 459    about the universe of mime-types that clients might declare that
 460    they accept, rather than merely the set of mime-types that a
 461    particular generic function is capable of serving; handling
 462    wildcards in accept strings is particularly awkward in the
 463    filtering paradigm.
 464
 465    Note that in this example, the method on =specializer<= involves a
 466    nontrivial ordering of methods based on the =q= values specified in
 467    the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 468    single extended specializer could be applicable to any given
 469    argument).
 470
 471    Also note that the accept specializer protocol is straightforwardly
 472    extensible to other suitable objects; for example, one simple
 473    debugging aid is to define that an =accept-specializer= should be
 474    applicable to =string= objects.  This can be done in a modular
 475    fashion (see the code below, which can be completely disconnected
 476    from the code for Hunchentoot request objects), and generalizes to
 477    dealing with multiple web server libraries, so that
 478    content-negotiation methods are applicable to each web server's
 479    request objects.
 480
 481 #+begin_src lisp
 482 (defmethod generalizer-of-using-class
 483     ((gf accept-generic-function)
 484      (s string))
 485   (make-instance 'accept-generalizer
 486                  :header s
 487                  :next (class-of s)))
 488 (defmethod specializer-accepts-p
 489     ((s accept-specializer) (string string))
 490   (let* ((tree (parse-accept-string string))
 491          (q (q (media-type s) tree)))
 492     (and q (> q 0))))
 493 #+end_src
 494 ** COMMENT Pattern / xpattern / regex / optima
 495    Here's the /really/ interesting bit, but on the other hand we're
 496    probably going to run out of space, and the full description of
 497    these is going to take us into =make-method-lambda= territory.
 498    A second paper?  Future work?
 499 * Protocol
 500   :PROPERTIES:
 501   :CUSTOM_ID: Protocol
 502   :END:
 503
 504   In section [[#Examples]], we have seen a number of code fragments as
 505   partial implementations of particular non-standard method dispatch,
 506   using =generalizer= metaobjects to mediate between the methods of
 507   the generic function and the actual arguments passed to it.  In
 508   section [[#Generalizer metaobjects]], we go into more detail regarding
 509   these =generalizer= metaobjects, describing the generic function
 510   invocation protocol in full, and showing how this protocol allows a
 511   similar form of effective method cacheing as the standard one does.
 512   In section [[#Generalizer performance]], we show the results of some
 513   simple performance measurements on our implementation of this
 514   protocol in the SBCL implementation \cite{Rhodes:2008} of Common
 515   Lisp to highlight the improvement that this protocol can bring over
 516   a naïve implementation of generalized dispatch, as well as
 517   to make the potential for further improvement clear.
 518
 519 ** Generalizer metaobjects
 520    :PROPERTIES:
 521    :CUSTOM_ID: Generalizer metaobjects
 522    :END:
 523
 524 *** Generic function invocation
 525     As in the standard generic function invocation protocol, the
 526     generic function's actual functionality is provided by a
 527     discriminating function.  The functionality described in this
 528     protocol is implemented by having a distinct subclass of
 529     =standard-generic-function=, and a method on
 530     =compute-discriminating-function= which produces a custom
 531     discriminating function.  The basic outline of the discriminating
 532     function is the same as the standard one: it must first compute the
 533     set of applicable methods given particular arguments; from that, it
 534     must compute the effective method by combining the methods
 535     appropriately according to the generic function's method
 536     combination; finally, it must call the effective method with the
 537     arguments.
 538
 539     Computing the set of applicable methods is done using a pair of
 540     functions: =compute-applicable-methods=, the standard metaobject
 541     function, and a new function
 542     =compute-applicable-methods-using-generalizers=.  We define a
 543     custom method on =compute-applicable-methods= which tests the
 544     applicability of a particular specializer against a given argument
 545     using =specializer-accepts-p=, a new protocol function with
 546     default implementations on =class= and =eql-specializer= to
 547     implement the expected behaviour.  In order to order the methods,
 548     as required by the protocol, we define a pairwise comparison
 549     operator =specializer<= which defines an ordering between
 550     specializers for a given generalizer argument (remembering that
 551     even in standard CLOS the ordering between =class= specializers
 552     can change depending on the actual class of the argument).
 553
 554     The new =compute-applicable-methods-using-generalizers= is the
 555     analogue of the MOP's =compute-applicable-methods-using-classes=.
 556     Instead of calling it with the =class-of= each argument, we compute
 557     the generalizers of each argument using the new function
 558     =generalizer-of-using-class= (where the =-using-class= refers to
 559     the class of the generic function rather than the class of the
 560     object), and call it with the list of generalizers.  As with the
 561     standard function, a secondary return value indicates whether the
 562     result of the function is definitive for that list of generalizers.
 563
 564     Thus, in generic function invocation, we first compute the
 565     generalizers of the arguments; we compute the ordered set of
 566     applicable methods, either from the generalizers or (if that is
 567     not definitive) from the arguments themselves; then the normal
 568     effective method computation and call can occur.  Unfortunately,
 569     the nature of an effective method object is not specified, so we
 570     have to reach into implementation internals a little in order to
 571     call it, but otherwise the remainder of the generic function
 572     invocation protocol is unchanged from the standard one.  In
 573     particular, method combination is completely unchanged;
 574     programmers can choose arbitrary method combinations, including
 575     user-defined long form combinations, for their generic functions
 576     involving generalized dispatch.
 577
 578 *** Effective method memoization
 579     :PROPERTIES:
 580     :CUSTOM_ID: Memoization
 581     :END:
 582     The potential efficiency benefit to having =generalizer=
 583     metaobjects lies in the use of
 584     =compute-applicable-methods-using-generalizers=.  If a particular
 585     generalized specializer accepts a variety of objects (such as the
 586     =signum= specializer accepting all reals with a given sign, or the
 587     =accept= specializer accepting all HTTP requests with a particular
 588     =Accept= header), then there is the possibility of cacheing and
 589     reusing the results of the applicable and effective method
 590     computation.  If the computation of the applicable method from
 591     =compute-applicable-methods-using-generalizers= is definitive,
 592     then the ordered set of applicable methods and the effective
 593     method can be cached.
 594
 595     One issue is what to use as the key for that cache.  We cannot use
 596     the generalizers themselves, as two generalizers that should be
 597     considered equal for cache lookup will not compare as =equal= –
 598     and indeed even the standard generalizer, the =class=, cannot be
 599     used as we must be able to invalidate cache entries upon class
 600     redefinition.  The issue of =class= generalizers we can solve as
 601     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 602     class, which is distinct for each distinct (re)definition of a
 603     class; for arbitrary generalizers, however, there is /a priori/ no
 604     good way of computing a suitable hash key automatically, so we
 605     allow the metaprogrammer to specify one by defining a method on
 606     =generalizer-equal-hash-key=, and combining the hash keys for all
 607     required arguments in a list to use as a key in an =equal=
 608     hash-table.
 609
 610     [XXX could we actually compute a suitable hash key using the
 611     generalizer's class name and initargs?]
 612
 613 *** COMMENT
 614     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 615     - [X] =compute-applicable-methods-using-generalizers=
 616     - [X] =generalizer-equal-hash-key=
 617     - [X] =specializer-accepts-generalizer-p=
 618     - [X] =specializer-accepts-p=
 619     - [X] =specializer<=
 620 ** Performance
 621    :PROPERTIES:
 622    :CUSTOM_ID: Generalizer performance
 623    :END:
 624    We have argued that the protocol presented here allows for
 625    expressive control of method dispatch while preserving the
 626    possibility of efficiency.  In this section, we quantify the
 627    efficiency that the memoization protocol described in section
 628    [[#Memoization]] achieves, by comparing it both to the same protocol
 629    with no memoization, as well as with equivalent dispatch
 630    implementations in the context of methods with regular specializers
 631    (in an implementation similar to that in
 632    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 633    straightforward functions.
 634
 635    In the case of the =cons-specializer=, we benchmark the walker
 636    acting on a small but non-trivial form.  The implementation
 637    strategies in the table below refer to: an implementation in a
 638    single function with a large =typecase= to dispatch between all the
 639    cases; the natural implementation in terms of a standard generic
 640    function with multiple methods (the method on =cons= having a
 641    slightly reduced =typecase= to dispatch on the first element, and
 642    other methods handling =symbol= and other atoms); and three
 643    separate cases using =cons-specializer= objects.  As well as
 644    measuring the effect of memoization against the full invocation
 645    protocol, we can also introduce a special case: when only one
 646    argument participates in method selection (all the other required
 647    arguments only being specialized on =t=), we can avoid the
 648    construction of a list of hash keys and simply use the key
 649    from the single active generalizer directly.
 650
 651    | implementation        | time (µs/call) | overhead |
 652    |-----------------------+----------------+----------|
 653    | function              |           3.17 |          |
 654    | standard-gf/methods   |            3.6 |     +14% |
 655    | cons-gf/one-arg-cache |            7.4 |    +130% |
 656    | cons-gf               |             15 |    +370% |
 657    | cons-gf/no-cache      |             90 |   +2700% |
 658
 659    The benchmarking results from this exercise are promising: in
 660    particular, the introduction of the effective method cache speeds
 661    up the use of generic specializers in this case by a factor of 6,
 662    and the one-argument special case by another factor of 2.  For this
 663    workload, even the one-argument special case only gets to within a
 664    factor of 2-3 of the function and standard generic function
 665    implementations, but the overall picture is that the memoizability
 666    in the protocol does indeed drastically reduce the overhead
 667    compared with the full invocation.
 668
 669    For the =signum-specializer= case, we choose to benchmark the
 670    computation of 20!, because that is the largest factorial whose
 671    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 672    worst case for generic dispatch, where the work done within the
 673    methods is as small as possible without being meaningless, and in
 674    particular does not cause allocation or garbage collection to
 675    obscure the picture.
 676
 677 #+begin_src lisp :exports none
 678 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 679 #+end_src
 680
 681    | implementation          | time (µs/call) | overhead |
 682    |-------------------------+----------------+----------|
 683    | function                |            0.6 |          |
 684    | standard-gf/fixnum      |            1.2 |    +100% |
 685    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 686    | signum-gf               |             23 |   +3800% |
 687    | signum-gf/no-cache      |            240 |  +41000% |
 688
 689    The relative picture is similar to the =cons-specializer= case;
 690    including a cache saves a factor of 10 in this case, and another
 691    factor of 3 for the one-argument cache special case.  The cost of
 692    the genericity of the protocol here is starker; even the
 693    one-argument cache is a factor of 6 slower than the standard
 694    generic-function implementation, and a further factor of 2 away
 695    from the implementation of factorial as a function.  We discuss
 696    ways in which we expect to be able to improve performance in
 697    section [[#Future Work]].
 698
 699    We could allow the metaprogrammer to improve on the one-argument
 700    performance by constructing a specialized cache: for =signum=
 701    arguments of =rational= arguments, the logical cache structure is
 702    to index a three-element vector with =(1+ signum)=.  The current
 703    protocol does not provide a way of eliding the two generic function
 704    calls for the generic cache; we discuss possible approaches in
 705    section [[#Conclusions]].
 706 ** Full protocol
 707    The protocol described in this paper is only part of a complete
 708    protocol for =specializer= and =generalizer= metaobjects.  Our
 709    development of this protocol is as yet incomplete; the work
 710    described here augments that in \cite{Newton.Rhodes:2008}, but is
 711    yet relatively untested – and additionally our recent experience of
 712    working with that earlier protocol suggests that there might be
 713    useful additions to the handling of =specializer= metaobjects,
 714    independent of the =generalizer= idea presented here.
 715
 716 *** COMMENT ideas
 717     Description and specification left for reasons of space (we'll see?)
 718     - [ ] =same-specializer-p=
 719     - [ ] =parse/unparse-specializer-using-class=
 720     - [ ] =make-method-specializers-form=
 721     - [ ] jmoringe: In an email, I suggested
 722       =make-specializer-form-using-class=:
 723
 724       #+begin_quote
 725       Could we change =make-method-specializers-form='s default
 726       behaviour to call a new generic function
 727       #+begin_src
 728         make-specializer-form-using-class gf method name env
 729       #+end_src
 730       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 731       eql-specializers)? This would make it unnecessary to repeat
 732       boilerplate along the lines of
 733       #+begin_src lisp
 734       (flet ((make-parse-form (name)
 735                (if <name-is-interesting>
 736                  <handle-interesting-specializer>
 737                  <repeat-handling-of-standard-specializers>)))
 738         `(list ,@(mapcar #'make-parse-form specializer-names)))
 739       #+end_src
 740       for each generic function class.
 741       #+end_quote
 742     - [ ] =make-method-lambda= revision (use environment arg?)
 743
 744       jmoringe: would only be relevant for pattern dispatch, right? I
 745       think, we didn't finish the discussion regarding special
 746       variables vs. environment vs. new protocol function
 747
 748 * Related Work
 749   :PROPERTIES:
 750   :CUSTOM_ID: Related Work
 751   :END:
 752
 753   The work presented here builds on specializer-oriented programming
 754   described in \cite{Newton.Rhodes:2008}.  Approximately
 755   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 756   introduced to address some of the same use cases: filtered dispatch
 757   works by having a custom discriminating function which wraps the
 758   usual one, where the wrapping function augments the set of
 759   applicable methods with applicable methods from other (hidden)
 760   generic functions, one per filter group; this step is not memoized,
 761   and using =eql= methods to capture behaviours of equivalence classes
 762   means that it is hard to see how it could be.  The methods are then
 763   combined using a custom method combination to mimic the standard
 764   one; in principle implementors of other method combinations could
 765   cater for filtered dispatch, but they would have to explicitly
 766   modify their method combinations.  The Clojure programming language
 767   supports multimethods[fn:5] with a variant of filtered dispatch as
 768   well as hierachical and identity-based method selectors.
 769
 770   In context-oriented programming
 771   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 772   occurs by maintaining the context state as an anonymous class with
 773   the superclasses representing all the currently active layers; this
 774   is then passed as a hidden argument to context-aware functions.  The
 775   set of layers is known and under programmer control, as layers must
 776   be defined beforehand.
 777
 778   In some sense, all dispatch schemes are specializations of predicate
 779   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 780   dispatch is its expressiveness: with arbitrary predicates able to
 781   control dispatch, it is essentially impossible to perform any
 782   substantial precomputation, or even to automatically determine an
 783   ordering of methods given a set of arguments.  Even Clojure's
 784   restricted dispatch scheme provides an explicit operator for stating
 785   a preference order among methods, where here we provide an operator
 786   to order specializers; in filtered dispatch the programmer
 787   implicitly gives the system an order of precedence, through the
 788   lexical ordering of filter specification in a filtered function
 789   definition.
 790
 791   The Slate programming environment combines prototype-oriented
 792   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 793   that context, the analogue of an argument's class (in Common Lisp)
 794   as a representation of the equivalence class of objects with the
 795   same behaviour is the tuple of roles and delegations: objects with
 796   the same roles and delegations tuple behave the same, much as
 797   objects with the same generalizer have the same behaviour in the
 798   protocol described in this paper.
 799
 800   The idea of generalization is of course not new, and arises in other
 801   contexts.  Perhaps of particular interest is generalization in the
 802   context of partial evaluation; for example, \cite{Ruf:1993}
 803   considers generalization in online partial evaluation, where sets of
 804   possible values are represented by a type system construct
 805   representing an upper bound.  The relationship between generalizer
 806   metaobjects and approximation in type systems could be further
 807   explored.
 808 * Conclusions
 809   :PROPERTIES:
 810   :CUSTOM_ID: Conclusions
 811   :END:
 812   In this paper, we have presented a new generalizer metaobject
 813   protocol allowing the metaprogrammer to implement in a
 814   straightforward manner metaobjects to implement custom method
 815   selection, rather than the standard method selection as standardized
 816   in Common Lisp.  This protocol seamlessly interoperates with the
 817   rest of CLOS and Common Lisp in general; the programmer (the user of
 818   the custom specializer metaobjects) may without constraints use
 819   arbitrary method combination, intercede in effective method
 820   combination, or write custom method function implementations.  The
 821   protocol is expressive, in that it handles forms of dispatch not
 822   possible in more restricted dispatch systems, while not suffering
 823   from the indeterminism present in predicate dispatch through the use
 824   of explicit ordering predicates.
 825
 826   The protocol is also reasonably efficient; the metaprogrammer can
 827   indicate that a particular effective method computation can be
 828   memoized, and under those circumstances much of the overhead is
 829   amortized (though there remains a substantial overhead compared with
 830   standard generic-function or regular function calls).  We discuss
 831   how the efficiency could be improved below.
 832 ** Future Work
 833    :PROPERTIES:
 834    :CUSTOM_ID: Future Work
 835    :END:
 836    Although the protocol described in this paper allows for a more
 837    efficient implementation, as described in section [[#Memoization]],
 838    than computing the applicable and effective methods at each generic
 839    function call, the efficiency is still some way away from a
 840    baseline of the standard generic-function, let alone a standard
 841    function.  Most of the invocation protocol is memoized, but there
 842    are still two full standard generic-function calls –
 843    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 844    argument per call to a generic function with extended specializers,
 845    not to mention a hash table lookup.
 846
 847    For many applications, the additional flexibility afforded by
 848    generalized specializers might be worth the cost in efficiency, but
 849    it would still be worth investigating how much the overhead from
 850    generalized specializers can be reduced; one possible avenue for
 851    investigation is giving greater control over the cacheing strategy
 852    to the metaprogrammer.
 853
 854    As an example, consider the =signum-specializer=.  The natural
 855    cache structure for a single argument generic function specializing
 856    on =signum= is probably a four-element vector, where the first
 857    three elements hold the effective methods for =signum= values of
 858    -1, 0, and 1, and the fourth holds the cached effective methods for
 859    everything else.  This would make the invocation of such functions
 860    very fast for the (presumed) common case where the argument is in
 861    fact a real number.  We hope to develop and show the effectiveness
 862    of an appropriate protocol to allow the metaprogrammer to construct
 863    and exploit such cacheing strategies, and (more speculatively) to
 864    implement the lookup of an effective method function in other ways.
 865
 866    We also aim to demonstrate support within this protocol for some
 867    particular cases of generalized specializers which seem to have
 868    widespread demand (in as much as any language extension can be said
 869    to be in “demand”).  In particular, we have preliminary work
 870    towards supporting efficient dispatch over pattern specializers
 871    such as implemented in the \textsf{Optima} library[fn:4], and over
 872    a prototype object system similar to that in Slate
 873    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 874    described in this paper can be seen in the git source code
 875    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 876    which will be updated with future developments.
 877
 878    Finally, after further experimentation (and, ideally, non-trivial
 879    use in production) if this protocol stands up to use as we hope, we
 880    aim to produce a standards-quality document so that other
 881    implementors of Common Lisp can, if they choose, independently
 882    reimplement the protocol, and so that users can use the protocol
 883    with confidence that the semantics will not change in a
 884    backwards-incompatible fashion.
 885 ** Acknowledgments
 886    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 887    and informative discussions, and all the respondents to one
 888    author's request for imaginative uses for generalized specializers.
 889
 890 \bibliographystyle{plain}
 891 \bibliography{crhodes,specializers}
 892
 893 * Footnotes
 894
 895 [fn:1] GNU CLISP, at http://www.clisp.org/
 896
 897 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 898
 899 [fn:3] the \textsf{Closer to MOP} project, at
 900    http://common-lisp.net/project/closer/, attempts to harmonize the
 901    different implementations of the metaobject protocol in Common
 902    Lisp.
 903
 904 [fn:4] https://github.com/m2ym/optima
 905
 906 [fn:5] http://clojure.org/multimethods
 907
 908