els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 (set (make-local-variable 'org-latex-title-command)
  18      "\\numberofauthors{3}
  19 \\author{
  20 \\alignauthor Christophe Rhodes\\\\
  21   \\affaddr{Department of Computing}\\\\
  22   \\affaddr{Goldsmiths, University of London}\\\\
  23   \\affaddr{London SE14 6NW}\\\\
  24   \\email{c.rhodes@gold.ac.uk}
  25 \\alignauthor Jan Moringen\\\\
  26   \\affaddr{Universität Bielefeld}\\\\
  27   \\affaddr{Technische Fakultät}\\\\
  28   \\affaddr{33594 Bielefeld}\\\\
  29   \\email{jmoringe@techfak.uni-bielefeld.de}
  30 \\alignauthor David Lichteblau\\\\
  31   \\affaddr{ZenRobotics Ltd}\\\\
  32   \\affaddr{Vilhonkatu 5 A}\\\\
  33   \\affaddr{FI-00100 Helsinki}\\\\
  34   \\email{david@lichteblau.com}
  35 #+end_src
  36
  37 #+begin_abstract
  38 This paper introduces a new metaobject, the generalizer, which
  39 complements the existing specializer metaobject.  With the help of
  40 examples, we show that this metaobject allows for the efficient
  41 implementation of complex non-class-based dispatch within the
  42 framework of existing metaobject protocols.  We present our
  43 modifications to the generic function invocation protocol from the
  44 /Art of the Metaobject Protocol/; in combination with previous work,
  45 this produces a fully-functional extension of the existing mechanism
  46 for method selection and combination, including support for method
  47 combination completely independent from method selection.  We discuss
  48 our implementation, within the SBCL implementation of Common Lisp, and
  49 in that context compare the performance of the new protocol with the
  50 standard one, demonstrating that the new protocol can be tolerably
  51 efficient.
  52 #+end_abstract
  53
  54 #+begin_LaTeX
  55 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  56 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  57 \terms{Languages, Design}
  58 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  59 #+end_LaTeX
  60
  61 * Introduction
  62   The revisions to the original Common Lisp language \cite{CLtL}
  63   included the detailed specification of an object system, known as
  64   the Common Lisp Object System (CLOS), which was eventually
  65   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  66   The object system as presented to the standardization committee was
  67   formed of three chapters.  The first two chapters covered programmer
  68   interface concepts and the functions in the programmer interface
  69   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  70   final standard; the third chapter, covering a Metaobject Protocol
  71   (MOP) for CLOS, was not.
  72
  73   Nevertheless, the CLOS MOP has proven to be a robust design, and
  74   while many implementations have derived their implementations of
  75   CLOS from either the Closette illustrative implementation in
  76   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  77   from Xerox Parc, there have been largely from-scratch
  78   reimplementations of CLOS (in CLISP[fn:1] and CCL[fn:2], at least)
  79   incorporating substantial fractions of the Metaobject Protocol as
  80   described.
  81
  82   Although it has stood the test of time, the CLOS MOP is neither
  83   without issues (e.g. semantic problems with =make-method-lambda=
  84   \cite{Costanza.Herzeel:2008}; useful functions such as
  85   =compute-effective-slot-definition-initargs= being missing from the
  86   standard) nor is it a complete framework for the metaprogrammer to
  87   implement all conceivable variations of object-oriented behaviour.
  88   While metaprogramming offers some possibilities for customization of
  89   the object system behaviour, those possibilities cannot extend
  90   arbitrarily in all directions.  There is still an expectation that
  91   functionality is implemented with methods on generic functions,
  92   acting on objects with slots.  Nevertheless, the MOP is flexible,
  93   and is used for a number of things, including: documentation
  94   generation (where introspective functionality in the MOP is used to
  95   extract information from a running system); object-relational
  96   mapping and other approaches to object persistence; alternative
  97   backing stores for slots (hash-tables or symbols); and programmatic
  98   construction of metaobjects, for example for IDL compilers and model
  99   transformations.
 100
 101   [ XXX: A picture on MOP flexibility here would be good; I have in my mind
 102   one where an object system is a point and the MOP opens up a blob
 103   around that point, and I'm sure I've seen it somewhere but I can't
 104   remember where.  Alternatively, there's Kiczales et al "MOPs: why we
 105   want them and what else they can do", fig. 2 ]
 106
 107   One area of functionality where there is scope for customization by
 108   the metaprogrammer is in the mechanics and semantics of method
 109   applicability and dispatch.  While in principle AMOP allows
 110   customization of dispatch in various different ways (the
 111   metaprogrammer can define methods on protocol functions such as
 112   =compute-applicable-methods=,
 113   =compute-applicable-methods-using-classes=), for example, in
 114   practice implementation support for this was weak until relatively
 115   recently[fn:3].
 116
 117   Another potential mechanism for customizing dispatch is implicit in
 118   the class structure defined by AMOP: standard specializer objects
 119   (instances of =class= and =eql-specializer=) are generalized
 120   instances of the =specializer= protocol class, and in principle
 121   there are no restrictions on the metaprogrammer constructing
 122   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 123   explored the potential for customizing generic function dispatch
 124   using extended specializers, but as of that work the metaprogrammer
 125   must override the entirety of the generic function invocation
 126   protocol (from =compute-discriminating-function= on down), leading
 127   to toy implementations and duplicated effort.
 128
 129   This paper introduces a protocol for efficient and controlled
 130   handling of new subclasses of =specializer=.  In particular, it
 131   introduces the =generalizer= protocol class, which generalizes the
 132   return value of =class-of= in method applicability computation, and
 133   allows the metaprogrammer to hook into cacheing schemes to avoid
 134   needless recomputation of effective methods for sufficiently similar
 135   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 136
 137   #+CAPTION:    Dispatch Comparison
 138   #+LABEL:      fig:dispatch
 139   #+ATTR_LATEX: width=0.9\linewidth float
 140   [[file:figures/dispatch-comparison.pdf]]
 141
 142   The remaining sections in this paper can be read in any order.  We
 143   give some motivating examples in section [[#Examples]], including
 144   reimplementations of examples from previous work, as well as
 145   examples which are poorly supported by previous protocols.  We
 146   describe the protocol itself in section [[#Protocol]], describing each
 147   protocol function in detail and, where applicable, relating it to
 148   existing protocol functions within the CLOS MOP.  We survey related
 149   work in more detail in section [[#Related Work]], touching on work on
 150   customized dispatch schemes in other environments.  Finally, we draw
 151   our conclusions from this work, and indicate directions for further
 152   development, in section [[#Conclusions]]; reading that section before the
 153   others indicates substantial trust in the authors' work.
 154 * Examples
 155   :PROPERTIES:
 156   :CUSTOM_ID: Examples
 157   :END:
 158   In this section, we present a number of examples of dispatch
 159   implemented using our protocol, which we describe in section
 160   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 161   this section do not include some of the necessary support code to
 162   run; complete implementations of each of these cases are included in
 163   an appendix / in the accompanying repository snapshot / at this
 164   location.
 165
 166   A note on terminology: we will attempt to distinguish between the
 167   user of an individual case of generalized dispatch (the
 168   “programmer”), the implementor of a particular case of generalized
 169   dispatch (the “metaprogrammer”), and the authors as the designers
 170   and implementors of our generalized dispatch protocol (the
 171   “metametaprogammer”, or more likely “we”).
 172 ** CONS specializers
 173    :PROPERTIES:
 174    :CUSTOM_ID: Cons
 175    :END:
 176    We start by presenting our original use case, performing
 177    dispatching on the first element of lists.  Semantically, we allow
 178    the programmer to specialize any argument of methods with a new
 179    kind of specializer, =cons-specializer=, which is applicable if and
 180    only if the corresponding object is a =cons= whose =car= is =eql=
 181    to the symbol associated with the =cons-specializer=; these
 182    specializers are more specific than the =cons= class, but less
 183    specific than an =eql-specializer= on any given =cons=.
 184
 185    One motivation for the use of this specializer is in an extensible
 186    code walker: a new special form can be handled simply by writing an
 187    additional method on the walking generic function, seamlessly
 188    interoperating with all existing methods.
 189
 190    The programmer code using these specializers is unchanged from
 191    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 192    here are centered on performance and generality: in an application
 193    such as walking source code, we would expect to encounter special
 194    forms (distinguished by particular atoms in the =car= position)
 195    multiple times, and hence to dispatch to the same effective method
 196    repeatedly.  We discuss this in more detail in section [[#Memoization]];
 197    we present the metaprogrammer code below.
 198
 199 #+begin_src lisp
 200 (defclass cons-specializer (specializer)
 201   ((%car :reader %car :initarg :car)))
 202 (defclass cons-generalizer (generalizer)
 203   ((%car :reader %car :initarg :car)))
 204 (defmethod generalizer-of-using-class
 205     ((gf cons-generic-function) arg)
 206   (typecase arg
 207     ((cons symbol)
 208      (make-instance 'cons-generalizer
 209                     :car (car arg)))
 210     (t (call-next-method))))
 211 (defmethod generalizer-equal-hash-key
 212     ((gf cons-generic-function)
 213      (g cons-generalizer))
 214   (%car g))
 215 (defmethod specializer-accepts-generalizer-p
 216     ((gf cons-generic-function)
 217      (s cons-specializer)
 218      (g cons-generalizer))
 219   (if (eql (%car s) (%car g))
 220       (values t t)
 221       (values nil t)))
 222 (defmethod specializer-accepts-p
 223     ((s cons-specializer) o)
 224   (and (consp o) (eql (car o) (%car s))))
 225 #+end_src
 226
 227 The code above shows the core of the use of our protocol.  We have
 228 elided some support code for parsing and unparsing specializers, and
 229 for handling introspective functions such as finding generic functions
 230 for a given specializer.  We have also elided methods on the protocol
 231 function =specializer<=; for =cons-specializers= here, specializer
 232 ordering is trivial, as only one =cons-specializer= can ever be
 233 applicable to any given argument.  See section [[#Accept]] for a case
 234 where specializer ordering is substantially different.
 235
 236 As in \cite{Newton.Rhodes:2008}, we can use these specializers to
 237 implement a modular code walker, where we define one method per
 238 special operator.  We show two of those methods below, in the context
 239 of a walker which checks for unused bindings and uses of unbound
 240 variables.
 241
 242 #+begin_src
 243 (defgeneric walk (form env stack)
 244   (:generic-function-class cons-generic-function))
 245 (defmethod walk ((expr (cons lambda)) env call-stack)
 246   (let ((lambda-list (cadr expr))
 247         (body (cddr expr)))
 248     (with-checked-bindings
 249         ((bindings-from-ll lambda-list) env call-stack)
 250       (dolist (form body)
 251         (walk form env (cons form call-stack))))))
 252 (defmethod walk ((expr (cons let)) env call-stack)
 253   (flet ((let-binding (x)
 254            (walk (cadr x) env (cons (cadr x) call-stack))
 255            (cons (car x) (make-instance 'binding))))
 256     (with-checked-bindings
 257         ((mapcar #'let-binding (cadr expr)) env call-stack)
 258       (dolist (form (cddr expr))
 259         (walk form env (cons form call-stack))))))
 260 #+end_src
 261
 262    Note that in this example there is no strict need for
 263    =cons-specializer= and =cons-generalizer= to be distinct classes –
 264    just as in the normal protocol involving
 265    =compute-applicable-methods= and
 266    =compute-applicable-methods-using-classes=, the specializer object
 267    for mediating dispatch contains the same information as the object
 268    representing the equivalence class of objects to which that
 269    specializer is applicable: here it is the =car= of the =cons=
 270    (which we wrap in a distinct object); in the standard dispatch it
 271    is the =class= of the object.  This feature also characterizes
 272    those use cases where the metaprogrammer could straightforwardly
 273    use filtered dispatch \cite{Costanza.etal:2008} to implement their
 274    dispatch semantics.  We will see in section [[#Accept]] an example
 275    of a case where filtered dispatch is incapable of straightforwardly
 276    expressing the dispatch, but first we present our implementation of
 277    the motivating case from \cite{Costanza.etal:2008}.
 278 ** SIGNUM specializers
 279    :PROPERTIES:
 280    :CUSTOM_ID: Signum
 281    :END:
 282    Our second example of the implementation and use of generalized
 283    specializers is a reimplementation of one of the examples in
 284    \cite{Costanza.etal:2008}: specifically, the factorial function.
 285    Here, we will perform dispatch based on the =signum= of the
 286    argument, and again, at most one method with a =signum= specializer
 287    will be appliable to any given argument, which makes the structure
 288    of the specializer implementation very similar to the =cons=
 289    specializers in the previous section.
 290
 291    We have chosen to compare signum values using \texttt{=}, which
 292    means that a method with specializer =(signum 1)= will be
 293    applicable to positive floating-point arguments (see the first
 294    method on =specializer-accepts-generalizer-p= and the method on
 295    =specializer=accepts-p= below).  This leads to one subtle
 296    difference in behaviour compared to that of the =cons=
 297    specializers: in the case of =signum= specializers, the /next/
 298    method after any =signum= specializer can be different, depending
 299    on the class of the argument.  This aspect of the dispatch is
 300    handled by the second method on =specializer-accepts-generalizer-p=
 301    below.
 302 #+begin_src lisp
 303 (defclass signum-specializer (specializer)
 304   ((%signum :reader %signum :initarg :signum)))
 305 (defclass signum-generalizer (generalizer)
 306   ((%signum :reader %signum :initarg :signum)))
 307 (defmethod generalizer-of-using-class
 308     ((gf signum-generic-function) arg)
 309   (typecase arg
 310     (real (make-instance 'signum-generalizer
 311                          :signum (signum arg)))
 312     (t (call-next-method))))
 313 (defmethod generalizer-equal-hash-key
 314     ((gf signum-generic-function)
 315      (g signum-specializer))
 316   (%signum g))
 317 (defmethod specializer-accepts-generalizer-p
 318     ((gf signum-generic-function)
 319      (s signum-specializer)
 320      (g signum-generalizer))
 321   (if (= (%signum s) (%signum g)) ; or EQL?
 322       (values t t)
 323       (values nil t)))
 324
 325 (defmethod specializer-accepts-generalizer-p
 326     ((gf signum-generic-function)
 327      (specializer sb-mop:specializer)
 328      (thing signum-specializer))
 329   (specializer-accepts-generalizer-p
 330    gf specializer (class-of (%signum thing))))
 331
 332 (defmethod specializer-accepts-p
 333     ((s signum-specializer) o)
 334   (and (realp o) (= (%signum s) (signum o))))
 335 #+end_src
 336
 337    Given these definitions, and once again some more straightforward
 338    ones elided for reasons of space, we can implement the factorial
 339    function as follows:
 340
 341 #+begin_src lisp
 342 (defgeneric fact (n)
 343   (:generic-function-class signum-generic-function))
 344 (defmethod fact ((n (signum 0))) 1)
 345 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 346 #+end_src
 347
 348    We do not need to include a method on =(signum -1)=, as the
 349    standard =no-applicable-method= protocol will automatically apply to
 350    negative real or non-real arguments.
 351 ** Accept HTTP header specializers
 352    :PROPERTIES:
 353    :CUSTOM_ID: Accept
 354    :END:
 355    In this section, we implement a non-trivial form of dispatch.  The
 356    application in question is a web server, and specifically to allow
 357    the programmer to support RFC 2616 \cite{rfc2616} content
 358    negotiation, of particular interest to publishers and consumers of
 359    REST-style Web APIs.
 360
 361    The basic mechanism in content negotiation is as follows: the web
 362    client sends an HTTP request with an =Accept= header, which is a
 363    string describing the media types it is willing to receive as a
 364    response to the request, along with numerical preferences.  The web
 365    server compares these stated client preferences with the resources
 366    it has available to satisfy this request, and sends the best
 367    matching resource in its response.
 368
 369    For example, a graphical web browser might by default send an
 370    =Accept= header such as
 371    =text/html,application/xml;q=0.9,*/*;q=0.8=.  This should be
 372    interpreted by a web server as meaning that if for a given resource
 373    the server can provide content of type =text/html= (i.e. HTML),
 374    then it should do so.  Otherwise, if it can provide
 375    =application/xml= content (i.e. XML of any schema), then that
 376    should be provided; failing that, any other content type is
 377    acceptable.
 378
 379    In the case where there are static files on the filesystem, and the
 380    web server must merely select between them, there is not much more
 381    to say.  However, it is not unusual for a web service to be backed
 382    by some other form of data, and responses computed and sent on the
 383    fly, and in these circumstances the web server must compute which
 384    of its known output formats it can use to satisfy the request
 385    before actually generating the best matching response.  This can be
 386    modelled as one generic function responsible for generating the
 387    response, with methods corresponding to content-types -- and the
 388    generic function must then perform method selection against the
 389    request's =Accept= header to compute the appropriate response.
 390
 391    The =accept-specializer= below implements the dispatch.  It depends
 392    on a lazily-computed =tree= slot to represent the information in
 393    the accept header (generated by =parse-accept-string=), and a
 394    function =q= to compute the (defaulted) preference level for a
 395    given content-type and =tree=; then, method selection and ordering
 396    involves finding the =q= for each =accept-specializer='s content
 397    type given the =tree=, and sorting them according to the preference
 398    level.
 399
 400 #+begin_src lisp
 401 (defclass accept-specializer (specializer)
 402   ((media-type :initarg :media-type :reader media-type)))
 403 (defclass accept-generalizer (generalizer)
 404   ((header :initarg :header :reader header)
 405    (tree)
 406    (next :initarg :next :reader next)))
 407 (defmethod generalizer-equal-hash-key
 408     ((gf accept-generic-function)
 409      (g accept-generalizer))
 410   `(accept-generalizer ,(header g)))
 411 (defmethod specializer-accepts-generalizer-p
 412     ((gf accept-generic-function)
 413      (s accept-specializer)
 414      (generalizer accept-generalizer))
 415   (values (q (media-type s) (tree generalizer)) t))
 416 (defmethod specializer-accepts-generalizer-p
 417     ((gf accept-generic-function)
 418      (s specializer)
 419      (generalizer accept-generalizer))
 420   (specializer-accepts-generalizer-p
 421    gf s (next generalizer)))
 422
 423 (defmethod specializer<
 424     ((gf accept-generic-function)
 425      (s1 accept-specializer)
 426      (s2 accept-specializer)
 427      (generalizer accept-generalizer))
 428   (let ((m1 (media-type s1))
 429         (m2 (media-type s2))
 430         (tree (tree generalizer)))
 431     (cond
 432       ((string= m1 m2) '=)
 433       (t (let ((q1 (q m1 tree)))
 434                (q2 (q m2 tree))))
 435            (cond
 436              ((= q1 q2) '=)
 437              ((< q1 q2) '>)
 438              (t '<))))))
 439 #+end_src
 440
 441    The metaprogrammer can then support dispatching in this way for
 442    suitable objects, such as the =request= object representing a
 443    client request in the Hunchentoot web server.  The code below
 444    implements this, by defining the computation of a suitable
 445    =generalizer= object for a given request, and specifying how to
 446    compute whether the specializer accepts the given request object
 447    (=q= returns a number between 0 and 1 if any pattern in the =tree=
 448    matches the media type, and =nil= if the media type cannot be
 449    matched at all).
 450
 451 #+begin_src
 452 (defmethod generalizer-of-using-class
 453     ((gf accept-generic-function)
 454      (arg tbnl:request))
 455   (make-instance 'accept-generalizer
 456                  :header (tbnl:header-in :accept arg)
 457                  :next (class-of arg)))
 458 (defmethod specializer-accepts-p
 459     ((specializer accept-specializer)
 460      (obj tbnl:request))
 461   (let* ((accept (tbnl:header-in :accept obj))
 462          (tree (parse-accept-string accept))
 463          (q (q (media-type specializer) tree)))
 464     (and q (> q 0))))
 465 #+end_src
 466
 467    This dispatch cannot be implemented using filtered dispatch, except
 468    by generating anonymous classes with all the right mime-types as
 469    direct superclasses in dispatch order; the filter would generate
 470 #+begin_src lisp
 471 (ensure-class nil :direct-superclasses
 472  '(text/html image/webp ...))
 473 #+end_src
 474    and dispatch the operates using those anonymous classes.  While
 475    this is possible to do, it is awkward to express content-type
 476    negotiation in this way, as it means that the dispatcher must know
 477    about the universe of mime-types that clients might declare that
 478    they accept, rather than merely the set of mime-types that a
 479    particular generic function is capable of serving; handling
 480    wildcards in accept strings is particularly awkward in the
 481    filtering paradigm.
 482
 483    Note that in this example, the method on =specializer<= involves a
 484    nontrivial ordering of methods based on the =q= values specified in
 485    the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 486    single extended specializer could be applicable to any given
 487    argument).
 488
 489    Also note that the accept specializer protocol is straightforwardly
 490    extensible to other suitable objects; for example, one simple
 491    debugging aid is to define that an =accept-specializer= should be
 492    applicable to =string= objects.  This can be done in a modular
 493    fashion (see the code below, which can be completely disconnected
 494    from the code for Hunchentoot request objects), and generalizes to
 495    dealing with multiple web server libraries, so that
 496    content-negotiation methods are applicable to each web server's
 497    request objects.
 498
 499 #+begin_src lisp
 500 (defmethod generalizer-of-using-class
 501     ((gf accept-generic-function)
 502      (s string))
 503   (make-instance 'accept-generalizer
 504                  :header s
 505                  :next (class-of s)))
 506 (defmethod specializer-accepts-p
 507     ((s accept-specializer) (string string))
 508   (let* ((tree (parse-accept-string string))
 509          (q (q (media-type s) tree)))
 510     (and q (> q 0))))
 511 #+end_src
 512 ** COMMENT Pattern / xpattern / regex / optima
 513    Here's the /really/ interesting bit, but on the other hand we're
 514    probably going to run out of space, and the full description of
 515    these is going to take us into =make-method-lambda= territory.
 516    A second paper?  Future work?
 517 * Protocol
 518   :PROPERTIES:
 519   :CUSTOM_ID: Protocol
 520   :END:
 521
 522   In section [[#Examples]], we have seen a number of code fragments as
 523   partial implementations of particular non-standard method dispatch,
 524   using =generalizer= metaobjects to mediate between the methods of
 525   the generic function and the actual arguments passed to it.  In
 526   section [[#Generalizer metaobjects]], we go into more detail regarding
 527   these =generalizer= metaobjects, describing the generic function
 528   invocation protocol in full, and showing how this protocol allows a
 529   similar form of effective method cacheing as the standard one does.
 530   In section [[#Generalizer performance]], we show the results of some
 531   simple performance measurements on our implementation of this
 532   protocol in the SBCL implementation \cite{Rhodes:2008} of Common
 533   Lisp to highlight the improvement that this protocol can bring over
 534   a naïve implementation of generalized dispatch, as well as
 535   to make the potential for further improvement clear.
 536
 537 ** Generalizer metaobjects
 538    :PROPERTIES:
 539    :CUSTOM_ID: Generalizer metaobjects
 540    :END:
 541
 542 *** Generic function invocation
 543     As in the standard generic function invocation protocol, the
 544     generic function's actual functionality is provided by a
 545     discriminating function.  The functionality described in this
 546     protocol is implemented by having a distinct subclass of
 547     =standard-generic-function=, and a method on
 548     =compute-discriminating-function= which produces a custom
 549     discriminating function.  The basic outline of the discriminating
 550     function is the same as the standard one: it must first compute the
 551     set of applicable methods given particular arguments; from that, it
 552     must compute the effective method by combining the methods
 553     appropriately according to the generic function's method
 554     combination; finally, it must call the effective method with the
 555     arguments.
 556
 557     Computing the set of applicable methods is done using a pair of
 558     functions: =compute-applicable-methods=, the standard metaobject
 559     function, and a new function
 560     =compute-applicable-methods-using-generalizers=.  We define a
 561     custom method on =compute-applicable-methods= which tests the
 562     applicability of a particular specializer against a given argument
 563     using =specializer-accepts-p=, a new protocol function with
 564     default implementations on =class= and =eql-specializer= to
 565     implement the expected behaviour.  In order to order the methods,
 566     as required by the protocol, we define a pairwise comparison
 567     operator =specializer<= which defines an ordering between
 568     specializers for a given generalizer argument (remembering that
 569     even in standard CLOS the ordering between =class= specializers
 570     can change depending on the actual class of the argument).
 571
 572     The new =compute-applicable-methods-using-generalizers= is the
 573     analogue of the MOP's =compute-applicable-methods-using-classes=.
 574     Instead of calling it with the =class-of= each argument, we compute
 575     the generalizers of each argument using the new function
 576     =generalizer-of-using-class= (where the =-using-class= refers to
 577     the class of the generic function rather than the class of the
 578     object), and call it with the list of generalizers.  As with the
 579     standard function, a secondary return value indicates whether the
 580     result of the function is definitive for that list of generalizers.
 581
 582     Thus, in generic function invocation, we first compute the
 583     generalizers of the arguments; we compute the ordered set of
 584     applicable methods, either from the generalizers or (if that is
 585     not definitive) from the arguments themselves; then the normal
 586     effective method computation and call can occur.  Unfortunately,
 587     the nature of an effective method object is not specified, so we
 588     have to reach into implementation internals a little in order to
 589     call it, but otherwise the remainder of the generic function
 590     invocation protocol is unchanged from the standard one.  In
 591     particular, method combination is completely unchanged;
 592     programmers can choose arbitrary method combinations, including
 593     user-defined long form combinations, for their generic functions
 594     involving generalized dispatch.
 595
 596 *** Effective method memoization
 597     :PROPERTIES:
 598     :CUSTOM_ID: Memoization
 599     :END:
 600     The potential efficiency benefit to having =generalizer=
 601     metaobjects lies in the use of
 602     =compute-applicable-methods-using-generalizers=.  If a particular
 603     generalized specializer accepts a variety of objects (such as the
 604     =signum= specializer accepting all reals with a given sign, or the
 605     =accept= specializer accepting all HTTP requests with a particular
 606     =Accept= header), then there is the possibility of cacheing and
 607     reusing the results of the applicable and effective method
 608     computation.  If the computation of the applicable method from
 609     =compute-applicable-methods-using-generalizers= is definitive,
 610     then the ordered set of applicable methods and the effective
 611     method can be cached.
 612
 613     One issue is what to use as the key for that cache.  We cannot use
 614     the generalizers themselves, as two generalizers that should be
 615     considered equal for cache lookup will not compare as =equal= –
 616     and indeed even the standard generalizer, the =class=, cannot be
 617     used as we must be able to invalidate cache entries upon class
 618     redefinition.  The issue of =class= generalizers we can solve as
 619     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 620     class, which is distinct for each distinct (re)definition of a
 621     class; for arbitrary generalizers, however, there is /a priori/ no
 622     good way of computing a suitable hash key automatically, so we
 623     allow the metaprogrammer to specify one by defining a method on
 624     =generalizer-equal-hash-key=, and combining the hash keys for all
 625     required arguments in a list to use as a key in an =equal=
 626     hash-table.
 627
 628     [XXX could we actually compute a suitable hash key using the
 629     generalizer's class name and initargs?]
 630
 631 *** COMMENT
 632     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 633     - [X] =compute-applicable-methods-using-generalizers=
 634     - [X] =generalizer-equal-hash-key=
 635     - [X] =specializer-accepts-generalizer-p=
 636     - [X] =specializer-accepts-p=
 637     - [X] =specializer<=
 638 ** Performance
 639    :PROPERTIES:
 640    :CUSTOM_ID: Generalizer performance
 641    :END:
 642    We have argued that the protocol presented here allows for
 643    expressive control of method dispatch while preserving the
 644    possibility of efficiency.  In this section, we quantify the
 645    efficiency that the memoization protocol described in section
 646    [[#Memoization]] achieves, by comparing it both to the same protocol
 647    with no memoization, as well as with equivalent dispatch
 648    implementations in the context of methods with regular specializers
 649    (in an implementation similar to that in
 650    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 651    straightforward functions.
 652
 653    In the case of the =cons-specializer=, we benchmark the walker
 654    acting on a small but non-trivial form.  The implementation
 655    strategies in the table below refer to: an implementation in a
 656    single function with a large =typecase= to dispatch between all the
 657    cases; the natural implementation in terms of a standard generic
 658    function with multiple methods (the method on =cons= having a
 659    slightly reduced =typecase= to dispatch on the first element, and
 660    other methods handling =symbol= and other atoms); and three
 661    separate cases using =cons-specializer= objects.  As well as
 662    measuring the effect of memoization against the full invocation
 663    protocol, we can also introduce a special case: when only one
 664    argument participates in method selection (all the other required
 665    arguments only being specialized on =t=), we can avoid the
 666    construction of a list of hash keys and simply use the key
 667    from the single active generalizer directly.
 668
 669    | implementation        | time (µs/call) | overhead |
 670    |-----------------------+----------------+----------|
 671    | function              |           3.17 |          |
 672    | standard-gf/methods   |            3.6 |     +14% |
 673    | cons-gf/one-arg-cache |            7.4 |    +130% |
 674    | cons-gf               |             15 |    +370% |
 675    | cons-gf/no-cache      |             90 |   +2700% |
 676
 677    The benchmarking results from this exercise are promising: in
 678    particular, the introduction of the effective method cache speeds
 679    up the use of generic specializers in this case by a factor of 6,
 680    and the one-argument special case by another factor of 2.  For this
 681    workload, even the one-argument special case only gets to within a
 682    factor of 2-3 of the function and standard generic function
 683    implementations, but the overall picture is that the memoizability
 684    in the protocol does indeed drastically reduce the overhead
 685    compared with the full invocation.
 686
 687    For the =signum-specializer= case, we choose to benchmark the
 688    computation of 20!, because that is the largest factorial whose
 689    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 690    worst case for generic dispatch, where the work done within the
 691    methods is as small as possible without being meaningless, and in
 692    particular does not cause allocation or garbage collection to
 693    obscure the picture.
 694
 695 #+begin_src lisp :exports none
 696 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 697 #+end_src
 698
 699    | implementation          | time (µs/call) | overhead |
 700    |-------------------------+----------------+----------|
 701    | function                |            0.6 |          |
 702    | standard-gf/fixnum      |            1.2 |    +100% |
 703    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 704    | signum-gf               |             23 |   +3800% |
 705    | signum-gf/no-cache      |            240 |  +41000% |
 706
 707    The relative picture is similar to the =cons-specializer= case;
 708    including a cache saves a factor of 10 in this case, and another
 709    factor of 3 for the one-argument cache special case.  The cost of
 710    the genericity of the protocol here is starker; even the
 711    one-argument cache is a factor of 6 slower than the standard
 712    generic-function implementation, and a further factor of 2 away
 713    from the implementation of factorial as a function.  We discuss
 714    ways in which we expect to be able to improve performance in
 715    section [[#Future Work]].
 716
 717    We could allow the metaprogrammer to improve on the one-argument
 718    performance by constructing a specialized cache: for =signum=
 719    arguments of =rational= arguments, the logical cache structure is
 720    to index a three-element vector with =(1+ signum)=.  The current
 721    protocol does not provide a way of eliding the two generic function
 722    calls for the generic cache; we discuss possible approaches in
 723    section [[#Conclusions]].
 724 ** Full protocol
 725    The protocol described in this paper is only part of a complete
 726    protocol for =specializer= and =generalizer= metaobjects.  Our
 727    development of this protocol is as yet incomplete; the work
 728    described here augments that in \cite{Newton.Rhodes:2008}, but is
 729    yet relatively untested – and additionally our recent experience of
 730    working with that earlier protocol suggests that there might be
 731    useful additions to the handling of =specializer= metaobjects,
 732    independent of the =generalizer= idea presented here.
 733
 734 *** COMMENT ideas
 735     Description and specification left for reasons of space (we'll see?)
 736     - [ ] =same-specializer-p=
 737     - [ ] =parse/unparse-specializer-using-class=
 738     - [ ] =make-method-specializers-form=
 739     - [ ] jmoringe: In an email, I suggested
 740       =make-specializer-form-using-class=:
 741
 742       #+begin_quote
 743       Could we change =make-method-specializers-form='s default
 744       behaviour to call a new generic function
 745       #+begin_src
 746         make-specializer-form-using-class gf method name env
 747       #+end_src
 748       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 749       eql-specializers)? This would make it unnecessary to repeat
 750       boilerplate along the lines of
 751       #+begin_src lisp
 752       (flet ((make-parse-form (name)
 753                (if <name-is-interesting>
 754                  <handle-interesting-specializer>
 755                  <repeat-handling-of-standard-specializers>)))
 756         `(list ,@(mapcar #'make-parse-form specializer-names)))
 757       #+end_src
 758       for each generic function class.
 759       #+end_quote
 760     - [ ] =make-method-lambda= revision (use environment arg?)
 761
 762       jmoringe: would only be relevant for pattern dispatch, right? I
 763       think, we didn't finish the discussion regarding special
 764       variables vs. environment vs. new protocol function
 765
 766 * Related Work
 767   :PROPERTIES:
 768   :CUSTOM_ID: Related Work
 769   :END:
 770
 771   The work presented here builds on specializer-oriented programming
 772   described in \cite{Newton.Rhodes:2008}.  Approximately
 773   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 774   introduced to address some of the same use cases: filtered dispatch
 775   works by having a custom discriminating function which wraps the
 776   usual one, where the wrapping function augments the set of
 777   applicable methods with applicable methods from other (hidden)
 778   generic functions, one per filter group; this step is not memoized,
 779   and using =eql= methods to capture behaviours of equivalence classes
 780   means that it is hard to see how it could be.  The methods are then
 781   combined using a custom method combination to mimic the standard
 782   one; in principle implementors of other method combinations could
 783   cater for filtered dispatch, but they would have to explicitly
 784   modify their method combinations.  The Clojure programming language
 785   supports multimethods[fn:5] with a variant of filtered dispatch as
 786   well as hierachical and identity-based method selectors.
 787
 788   In context-oriented programming
 789   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 790   occurs by maintaining the context state as an anonymous class with
 791   the superclasses representing all the currently active layers; this
 792   is then passed as a hidden argument to context-aware functions.  The
 793   set of layers is known and under programmer control, as layers must
 794   be defined beforehand.
 795
 796   In some sense, all dispatch schemes are specializations of predicate
 797   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 798   dispatch is its expressiveness: with arbitrary predicates able to
 799   control dispatch, it is essentially impossible to perform any
 800   substantial precomputation, or even to automatically determine an
 801   ordering of methods given a set of arguments.  Even Clojure's
 802   restricted dispatch scheme provides an explicit operator for stating
 803   a preference order among methods, where here we provide an operator
 804   to order specializers; in filtered dispatch the programmer
 805   implicitly gives the system an order of precedence, through the
 806   lexical ordering of filter specification in a filtered function
 807   definition.
 808
 809   The Slate programming environment combines prototype-oriented
 810   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 811   that context, the analogue of an argument's class (in Common Lisp)
 812   as a representation of the equivalence class of objects with the
 813   same behaviour is the tuple of roles and delegations: objects with
 814   the same roles and delegations tuple behave the same, much as
 815   objects with the same generalizer have the same behaviour in the
 816   protocol described in this paper.
 817
 818   The idea of generalization is of course not new, and arises in other
 819   contexts.  Perhaps of particular interest is generalization in the
 820   context of partial evaluation; for example, \cite{Ruf:1993}
 821   considers generalization in online partial evaluation, where sets of
 822   possible values are represented by a type system construct
 823   representing an upper bound.  The relationship between generalizer
 824   metaobjects and approximation in type systems could be further
 825   explored.
 826 * Conclusions
 827   :PROPERTIES:
 828   :CUSTOM_ID: Conclusions
 829   :END:
 830   In this paper, we have presented a new generalizer metaobject
 831   protocol allowing the metaprogrammer to implement in a
 832   straightforward manner metaobjects to implement custom method
 833   selection, rather than the standard method selection as standardized
 834   in Common Lisp.  This protocol seamlessly interoperates with the
 835   rest of CLOS and Common Lisp in general; the programmer (the user of
 836   the custom specializer metaobjects) may without constraints use
 837   arbitrary method combination, intercede in effective method
 838   combination, or write custom method function implementations.  The
 839   protocol is expressive, in that it handles forms of dispatch not
 840   possible in more restricted dispatch systems, while not suffering
 841   from the indeterminism present in predicate dispatch through the use
 842   of explicit ordering predicates.
 843
 844   The protocol is also reasonably efficient; the metaprogrammer can
 845   indicate that a particular effective method computation can be
 846   memoized, and under those circumstances much of the overhead is
 847   amortized (though there remains a substantial overhead compared with
 848   standard generic-function or regular function calls).  We discuss
 849   how the efficiency could be improved below.
 850 ** Future Work
 851    :PROPERTIES:
 852    :CUSTOM_ID: Future Work
 853    :END:
 854    Although the protocol described in this paper allows for a more
 855    efficient implementation, as described in section [[#Memoization]],
 856    than computing the applicable and effective methods at each generic
 857    function call, the efficiency is still some way away from a
 858    baseline of the standard generic-function, let alone a standard
 859    function.  Most of the invocation protocol is memoized, but there
 860    are still two full standard generic-function calls –
 861    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 862    argument per call to a generic function with extended specializers,
 863    not to mention a hash table lookup.
 864
 865    For many applications, the additional flexibility afforded by
 866    generalized specializers might be worth the cost in efficiency, but
 867    it would still be worth investigating how much the overhead from
 868    generalized specializers can be reduced; one possible avenue for
 869    investigation is giving greater control over the cacheing strategy
 870    to the metaprogrammer.
 871
 872    As an example, consider the =signum-specializer=.  The natural
 873    cache structure for a single argument generic function specializing
 874    on =signum= is probably a four-element vector, where the first
 875    three elements hold the effective methods for =signum= values of
 876    -1, 0, and 1, and the fourth holds the cached effective methods for
 877    everything else.  This would make the invocation of such functions
 878    very fast for the (presumed) common case where the argument is in
 879    fact a real number.  We hope to develop and show the effectiveness
 880    of an appropriate protocol to allow the metaprogrammer to construct
 881    and exploit such cacheing strategies, and (more speculatively) to
 882    implement the lookup of an effective method function in other ways.
 883
 884    We also aim to demonstrate support within this protocol for some
 885    particular cases of generalized specializers which seem to have
 886    widespread demand (in as much as any language extension can be said
 887    to be in “demand”).  In particular, we have preliminary work
 888    towards supporting efficient dispatch over pattern specializers
 889    such as implemented in the \textsf{Optima} library[fn:4], and over
 890    a prototype object system similar to that in Slate
 891    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 892    described in this paper can be seen in the git source code
 893    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 894    which will be updated with future developments.
 895
 896    Finally, after further experimentation (and, ideally, non-trivial
 897    use in production) if this protocol stands up to use as we hope, we
 898    aim to produce a standards-quality document so that other
 899    implementors of Common Lisp can, if they choose, independently
 900    reimplement the protocol, and so that users can use the protocol
 901    with confidence that the semantics will not change in a
 902    backwards-incompatible fashion.
 903 ** Acknowledgments
 904    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 905    and informative discussions, and all the respondents to one
 906    author's request for imaginative uses for generalized specializers.
 907
 908 \bibliographystyle{plain}
 909 \bibliography{crhodes,specializers}
 910
 911 * Footnotes
 912
 913 [fn:1] GNU CLISP, at http://www.clisp.org/
 914
 915 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 916
 917 [fn:3] the \textsf{Closer to MOP} project, at
 918    http://common-lisp.net/project/closer/, attempts to harmonize the
 919    different implementations of the metaobject protocol in Common
 920    Lisp.
 921
 922 [fn:4] https://github.com/m2ym/optima
 923
 924 [fn:5] http://clojure.org/multimethods
 925
 926