els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 (set (make-local-variable 'org-latex-title-command)
  18      "\\numberofauthors{3}
  19 \\author{
  20 \\alignauthor Christophe Rhodes\\\\
  21   \\affaddr{Department of Computing}\\\\
  22   \\affaddr{Goldsmiths, University of London}\\\\
  23   \\affaddr{London SE14 6NW}\\\\
  24   \\email{c.rhodes@gold.ac.uk}
  25 \\alignauthor Jan Moringen\\\\
  26   \\affaddr{Universität Bielefeld}\\\\
  27   \\affaddr{Technische Fakultät}\\\\
  28   \\affaddr{33594 Bielefeld}\\\\
  29   \\email{jmoringe@techfak.uni-bielefeld.de}
  30 \\alignauthor David Lichteblau\\\\
  31   \\affaddr{ZenRobotics Ltd}\\\\
  32   \\affaddr{Vilhonkatu 5 A}\\\\
  33   \\affaddr{FI-00100 Helsinki}\\\\
  34   \\email{david@lichteblau.com}
  35 #+end_src
  36
  37 #+begin_abstract
  38 This paper introduces a new metaobject, the generalizer, which
  39 complements the existing specializer metaobject.  With the help of
  40 examples, we show that this metaobject allows for the efficient
  41 implementation of complex non-class-based dispatch within the
  42 framework of existing metaobject protocols.  We present our
  43 modifications to the generic function invocation protocol from the
  44 /Art of the Metaobject Protocol/; in combination with previous work,
  45 this produces a fully-functional extension of the existing mechanism
  46 for method selection and combination, including support for method
  47 combination completely independent from method selection.  We discuss
  48 our implementation, within the SBCL implementation of Common Lisp, and
  49 in that context compare the performance of the new protocol with the
  50 standard one, demonstrating that the new protocol can be tolerably
  51 efficient.
  52 #+end_abstract
  53
  54 #+begin_LaTeX
  55 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  56 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  57 \terms{Languages, Design}
  58 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  59 #+end_LaTeX
  60
  61 * Introduction
  62   The revisions to the original Common Lisp language \cite{CLtL}
  63   included the detailed specification of an object system, known as
  64   the Common Lisp Object System (CLOS), which was eventually
  65   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  66   The object system as presented to the standardization committee was
  67   formed of three chapters.  The first two chapters covered programmer
  68   interface concepts and the functions in the programmer interface
  69   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  70   final standard; the third chapter, covering a Metaobject Protocol
  71   (MOP) for CLOS, was not.
  72
  73   Nevertheless, the CLOS MOP has proven to be a robust design, and
  74   while many implementations have derived their implementations of
  75   CLOS from either the Closette illustrative implementation in
  76   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  77   from Xerox Parc, there have been largely from-scratch
  78   reimplementations of CLOS (in CLISP[fn:1] and CCL[fn:2], at least)
  79   incorporating substantial fractions of the Metaobject Protocol as
  80   described.
  81
  82   Although it has stood the test of time, the CLOS MOP is neither
  83   without issues (e.g. semantic problems with =make-method-lambda=
  84   \cite{Costanza.Herzeel:2008}; useful functions such as
  85   =compute-effective-slot-definition-initargs= being missing from the
  86   standard) nor is it a complete framework for the metaprogrammer to
  87   implement all conceivable variations of object-oriented behaviour.
  88   While metaprogramming offers some possibilities for customization of
  89   the object system behaviour, those possibilities cannot extend
  90   arbitrarily in all directions.  There is still an expectation that
  91   functionality is implemented with methods on generic functions,
  92   acting on objects with slots.
  93   Example of something not doable: handling "message not understood"
  94   message from message passing paradigm.
  95   Nevertheless, the MOP is flexible,
  96   and is used for a number of things, including: documentation
  97   generation (where introspective functionality in the MOP is used to
  98   extract information from a running system); object-relational
  99   mapping and other approaches to object persistence; alternative
 100   backing stores for slots (hash-tables or symbols); and programmatic
 101   construction of metaobjects, for example for IDL compilers and model
 102   transformations.
 103
 104   [ XXX: A picture on MOP flexibility here would be good; I have in my mind
 105   one where an object system is a point and the MOP opens up a blob
 106   around that point, and I'm sure I've seen it somewhere but I can't
 107   remember where.  Alternatively, there's Kiczales et al "MOPs: why we
 108   want them and what else they can do", fig. 2 ]
 109   [AMOP, page 5] paints that picture, but again, only using words :)
 110
 111   One area of functionality where there is scope for customization by
 112   the metaprogrammer is in the mechanics and semantics of method
 113   applicability and dispatch.  While in principle AMOP allows
 114   customization of dispatch in various different ways (the
 115   metaprogrammer can define methods on protocol functions such as
 116   =compute-applicable-methods=,
 117   =compute-applicable-methods-using-classes=), for example, in
 118   practice implementation support for this was weak until relatively
 119   recently[fn:3].
 120
 121   Another potential mechanism for customizing dispatch is implicit in
 122   the class structure defined by AMOP: standard specializer objects
 123   (instances of =class= and =eql-specializer=) are generalized
 124   instances of the =specializer= protocol class, and in principle
 125   there are no restrictions on the metaprogrammer constructing
 126   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 127   explored the potential for customizing generic function dispatch
 128   using extended specializers, but as of that work the metaprogrammer
 129   must override the entirety of the generic function invocation
 130   protocol (from =compute-discriminating-function= on down), leading
 131   to toy implementations and duplicated effort.
 132
 133   This paper introduces a protocol for efficient and controlled
 134   handling of new subclasses of =specializer=.  In particular, it
 135   introduces the =generalizer= protocol class, which generalizes the
 136   return value of =class-of= in method applicability computation, and
 137   allows the metaprogrammer to hook into cacheing schemes to avoid
 138   needless recomputation of effective methods for sufficiently similar
 139   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 140
 141   #+CAPTION:    Dispatch Comparison
 142   #+LABEL:      fig:dispatch
 143   #+ATTR_LATEX: width=0.9\linewidth float
 144   [[file:figures/dispatch-comparison.pdf]]
 145
 146   The remaining sections in this paper can be read in any order.  We
 147   give some motivating examples in section [[#Examples]], including
 148   reimplementations of examples from previous work, as well as
 149   examples which are poorly supported by previous protocols.  We
 150   describe the protocol itself in section [[#Protocol]], describing each
 151   protocol function in detail and, where applicable, relating it to
 152   existing protocol functions within the CLOS MOP.  We survey related
 153   work in more detail in section [[#Related Work]], touching on work on
 154   customized dispatch schemes in other environments.  Finally, we draw
 155   our conclusions from this work, and indicate directions for further
 156   development, in section [[#Conclusions]]; reading that section before the
 157   others indicates substantial trust in the authors' work.
 158 * Examples
 159   :PROPERTIES:
 160   :CUSTOM_ID: Examples
 161   :END:
 162   In this section, we present a number of examples of dispatch
 163   implemented using our protocol, which we describe in section
 164   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 165   this section do not include some of the necessary support code to
 166   run; complete implementations of each of these cases are included in
 167   an appendix / in the accompanying repository snapshot / at this
 168   location.
 169
 170   A note on terminology: we will attempt to distinguish between the
 171   user of an individual case of generalized dispatch (the
 172   “programmer”), the implementor of a particular case of generalized
 173   dispatch (the “metaprogrammer”), and the authors as the designers
 174   and implementors of our generalized dispatch protocol (the
 175   “metametaprogammer”, or more likely “we”).
 176 ** CONS specializers
 177    :PROPERTIES:
 178    :CUSTOM_ID: Cons
 179    :END:
 180    One motivation for the use of generalized dispatch is in an
 181    extensible code walker: a new special form can be handled simply by
 182    writing an additional method on the walking generic function,
 183    seamlessly interoperating with all existing methods. In this
 184    use-case, dispatch is performed on the first element of lists.
 185    Semantically, we allow the programmer to specialize any argument of
 186    methods with a new kind of specializer, =cons-specializer=, which
 187    is applicable if and only if the corresponding object is a =cons=
 188    whose =car= is =eql= to the symbol associated with the
 189    =cons-specializer=; these specializers are more specific than the
 190    =cons= class, but less specific than an =eql-specializer= on any
 191    given =cons=.
 192
 193    The programmer code using these specializers is unchanged from
 194    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 195    here are centered on performance and generality:
 196
 197    alternative suggestions: modularity/improved code reuse/separation
 198    of concerns since e.g. arbitrary method combinations can be used
 199
 200    in an application such as walking source code, we would expect to
 201    encounter special forms (distinguished by particular atoms in the
 202    =car= position) multiple times, and hence to dispatch to the same
 203    effective method repeatedly.  We discuss this in more detail in
 204    section [[#Memoization]]; we present the metaprogrammer code below.
 205
 206 #+begin_src lisp
 207 (defclass cons-specializer (specializer)
 208   ((%car :reader %car :initarg :car)))
 209 (defclass cons-generalizer (generalizer)
 210   ((%car :reader %car :initarg :car)))
 211 (defmethod generalizer-of-using-class
 212     ((gf cons-generic-function) arg)
 213   (typecase arg
 214     ((cons symbol)
 215      (make-instance 'cons-generalizer
 216                     :car (car arg)))
 217     (t (call-next-method))))
 218 (defmethod generalizer-equal-hash-key
 219     ((gf cons-generic-function)
 220      (g cons-generalizer))
 221   (%car g))
 222 (defmethod specializer-accepts-generalizer-p
 223     ((gf cons-generic-function)
 224      (s cons-specializer)
 225      (g cons-generalizer))
 226   (if (eql (%car s) (%car g))
 227       (values t t)
 228       (values nil t)))
 229 (defmethod specializer-accepts-p
 230     ((s cons-specializer) o)
 231   (and (consp o) (eql (car o) (%car s))))
 232 #+end_src
 233
 234 The code above shows a minimal use of our protocol.  We have
 235 elided some support code for parsing and unparsing specializers, and
 236 for handling introspective functions such as finding generic functions
 237 for a given specializer.  We have also elided methods on the protocol
 238 function =specializer<=
 239
 240 mention =same-specializer-p=? especially when "only one
 241 =cons-specializer=" touches on specializer equality.
 242
 243 ; for =cons-specializers= here, specializer
 244 ordering is trivial, as only one =cons-specializer= can ever be
 245 applicable to any given argument.  See section [[#Accept]] for a case
 246 where specializer ordering is non-trivial.
 247
 248 As in \cite{Newton.Rhodes:2008}, the programmer can use these
 249 specializers to implement a modular code walker, where they define one
 250 method per special operator.  We show two of those methods below, in
 251 the context of a walker which checks for unused bindings and uses of
 252 unbound variables.
 253
 254 #+begin_src
 255 (defgeneric walk (form env stack)
 256   (:generic-function-class cons-generic-function))
 257 (defmethod walk ((expr (cons lambda)) env call-stack)
 258   (let ((lambda-list (cadr expr))
 259         (body (cddr expr)))
 260     (with-checked-bindings
 261         ((bindings-from-ll lambda-list) env call-stack)
 262       (dolist (form body)
 263         (walk form env (cons form call-stack))))))
 264 (defmethod walk ((expr (cons let)) env call-stack)
 265   (flet ((let-binding (x)
 266            (walk (cadr x) env (cons (cadr x) call-stack))
 267            (cons (car x) (make-instance 'binding))))
 268     (with-checked-bindings
 269         ((mapcar #'let-binding (cadr expr)) env call-stack)
 270       (dolist (form (cddr expr))
 271         (walk form env (cons form call-stack))))))
 272 #+end_src
 273
 274    Note that in this example there is no strict need for
 275    =cons-specializer= and =cons-generalizer= to be distinct classes –
 276    just as in the normal protocol involving
 277    =compute-applicable-methods= and
 278    =compute-applicable-methods-using-classes=, the specializer object
 279    for mediating dispatch contains the same information as the object
 280    representing the equivalence class of objects to which that
 281    specializer is applicable: here it is the =car= of the =cons=
 282    (which we wrap in a distinct object);
 283
 284    the previous sentence is rather long and hard to understand
 285
 286    in the standard dispatch it
 287    is the =class= of the object.  This feature also characterizes
 288    those use cases where the metaprogrammer could straightforwardly
 289    use filtered dispatch \cite{Costanza.etal:2008} to implement their
 290    dispatch semantics.  We will see in section [[#Accept]] an example
 291    of a case where filtered dispatch is incapable of straightforwardly
 292    expressing the dispatch, but first we present our implementation of
 293    the motivating case from \cite{Costanza.etal:2008}.
 294 ** SIGNUM specializers
 295    :PROPERTIES:
 296    :CUSTOM_ID: Signum
 297    :END:
 298    Our second example of the implementation and use of generalized
 299    specializers is a reimplementation of one of the examples in
 300    \cite{Costanza.etal:2008}: specifically, the factorial function.
 301    Here, we will perform dispatch based on the =signum= of the
 302    argument, and again, at most one method with a =signum= specializer
 303    will be appliable to any given argument, which makes the structure
 304    of the specializer implementation very similar to the =cons=
 305    specializers in the previous section.
 306
 307    The metaprogrammer may choose to?
 308    We have chosen to compare signum values using \texttt{=}, which
 309    means that a method with specializer =(signum 1)= will be
 310    applicable to positive floating-point arguments (see the first
 311    method on =specializer-accepts-generalizer-p= and the method on
 312    =specializer-accepts-p= below).  This leads to one subtle
 313    difference in behaviour compared to that of the =cons=
 314    specializers: in the case of =signum= specializers, the /next/
 315    method after any =signum= specializer can be different, depending
 316    on the class of the argument.  This aspect of the dispatch is
 317    handled by the second method on =specializer-accepts-generalizer-p=
 318    below.
 319
 320 #+begin_src lisp
 321 (defclass signum-specializer (specializer)
 322   ((%signum :reader %signum :initarg :signum)))
 323 (defclass signum-generalizer (generalizer)
 324   ((%signum :reader %signum :initarg :signum)))
 325 ;; Not sure whether I am overlooking something but shouldn't this work:
 326 (defmethod generalizer-of-using-class
 327     ((gf signum-generic-function) (arg real))
 328     (make-instance 'signum-generalizer
 329                    :signum (signum arg)))
 330 (defmethod generalizer-equal-hash-key
 331     ((gf signum-generic-function)
 332      (g signum-generalizer))
 333   (%signum g))
 334 (defmethod specializer-accepts-generalizer-p
 335     ((gf signum-generic-function)
 336      (s signum-specializer)
 337      (g signum-generalizer))
 338   ;; Would EQL work here? I'm thinking there might be an invariant of the form
 339   ;; (specializer-accepts-p gf o)
 340   ;; <=>
 341   ;; (specializer-accepts-generalizer-p gf s (generalizer-of-using-class gf o))
 342   ;; Or if that wrong, maybe we should explain why.
 343   (if (= (%signum s) (%signum g)) ; or EQL?
 344       (values t t)
 345       (values nil t)))
 346
 347 (defmethod specializer-accepts-generalizer-p
 348     ((gf signum-generic-function)
 349      (s sb-mop:specializer)
 350      (g signum-generalizer))
 351   (specializer-accepts-generalizer-p
 352    gf s (class-of (%signum g))))
 353
 354 (defmethod specializer-accepts-p
 355     ((s signum-specializer) o)
 356   (and (realp o) (= (%signum s) (signum o))))
 357 #+end_src
 358
 359    Given these definitions, and once again some more straightforward
 360    ones elided for reasons of space, the programmer can implement the
 361    factorial function as follows:
 362
 363 #+begin_src lisp
 364 (defgeneric fact (n)
 365   (:generic-function-class signum-generic-function))
 366 (defmethod fact ((n (signum 0))) 1)
 367 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 368 #+end_src
 369
 370    We do not need to include a method on =(signum -1)=, as the
 371    standard =no-applicable-method= protocol will automatically apply to
 372    negative real or non-real arguments.
 373 ** Accept HTTP header specializers
 374    :PROPERTIES:
 375    :CUSTOM_ID: Accept
 376    :END:
 377    In this section, we implement a non-trivial form of dispatch.  The
 378    application in question is a web server, and specifically to allow
 379    the programmer to support RFC 2616 \cite{rfc2616} content
 380    negotiation, of particular interest to publishers and consumers of
 381    REST-style Web APIs.
 382
 383    The basic mechanism in content negotiation is as follows: the web
 384    client sends an HTTP request with an =Accept= header, which is a
 385    string describing the media types it is willing to receive as a
 386    response to the request, along with numerical preferences.  The web
 387    server compares these stated client preferences with the resources
 388    it has available to satisfy this request, and sends the best
 389    matching resource in its response.
 390
 391    For example, a graphical web browser might by default send an
 392    =Accept= header such as
 393    =text/html,application/xml;q=0.9,*/*;q=0.8=.  This should be
 394    interpreted as meaning that if for a given resource the server can
 395    provide content of type =text/html= (i.e. HTML), then it should do
 396    so.  Otherwise, if it can provide =application/xml= content
 397    (i.e. XML of any schema), then that should be provided; failing
 398    that, any other content type is acceptable.
 399
 400    In the case where there are static files on the filesystem, and the
 401    web server must merely select between them, there is not much more
 402    to say.  However, it is not unusual for a web service to be backed
 403    by some other form of data, and responses computed and sent on the
 404    fly, and in these circumstances the web server must compute which
 405    of its known output formats it can use to satisfy the request
 406    before actually generating the best matching response.  This can be
 407    modelled as one generic function responsible for generating the
 408    response, with methods corresponding to content-types -- and the
 409    generic function must then perform method selection against the
 410    request's =Accept= header to compute the appropriate response.
 411
 412    The =accept-specializer= below implements this dispatch.  It depends
 413    on a lazily-computed =tree= slot to represent the information in
 414    the accept header (generated by =parse-accept-string=), and a
 415    function =q= to compute the (defaulted) preference level for a
 416    given content-type and =tree=; then, method selection and ordering
 417    involves finding the =q= for each =accept-specializer='s content
 418    type given the =tree=, and sorting them according to the preference
 419    level.
 420
 421 #+begin_src lisp
 422 (defclass accept-specializer (specializer)
 423   ((media-type :initarg :media-type :reader media-type)))
 424 (defclass accept-generalizer (generalizer)
 425   ((header :initarg :header :reader header)
 426    (tree)
 427    (next :initarg :next :reader next)))
 428 (defmethod generalizer-equal-hash-key
 429     ((gf accept-generic-function)
 430      (g accept-generalizer))
 431   `(accept-generalizer ,(header g)))
 432 (defmethod specializer-accepts-generalizer-p
 433     ((gf accept-generic-function)
 434      (s accept-specializer)
 435      (g accept-generalizer))
 436   (values (q (media-type s) (tree g)) t))
 437 (defmethod specializer-accepts-generalizer-p
 438     ((gf accept-generic-function)
 439      (s specializer) ; other listing has sb-mop:specializer
 440      (g accept-generalizer))
 441   (specializer-accepts-generalizer-p
 442    gf s (next g)))
 443
 444 (defmethod specializer<
 445     ((gf accept-generic-function)
 446      (s1 accept-specializer)
 447      (s2 accept-specializer)
 448      (g accept-generalizer))
 449   (let ((m1 (media-type s1))
 450         (m2 (media-type s2))
 451         (tree (tree g)))
 452     (cond
 453       ((string= m1 m2) '=)
 454       (t (let ((q1 (q m1 tree)))
 455                (q2 (q m2 tree))))
 456            (cond
 457              ((= q1 q2) '=)
 458              ((< q1 q2) '>)
 459              (t '<))))))
 460 #+end_src
 461
 462    The metaprogrammer can then add support for objects representing
 463    client requesting, such as instances of the =request= class in the
 464    Hunchentoot web server, by translating these into
 465    =accept-generalizer= instances.  The code below implements this, by
 466    defining the computation of a =generalizer= object for a given
 467    request, and specifying how to compute whether the specializer
 468    accepts the given request object (=q= returns a number between 0
 469    and 1 if any pattern in the =tree= matches the media type, and
 470    =nil= if the media type cannot be matched at all).
 471
 472 #+begin_src
 473 (defmethod generalizer-of-using-class
 474     ((gf accept-generic-function)
 475      (arg tbnl:request))
 476   (make-instance 'accept-generalizer
 477                  :header (tbnl:header-in :accept arg)
 478                  :next (class-of arg)))
 479 (defmethod specializer-accepts-p
 480     ((s accept-specializer)
 481      (o tbnl:request))
 482   (let* ((accept (tbnl:header-in :accept o))
 483          (tree (parse-accept-string accept))
 484          (q (q (media-type s) tree)))
 485     (and q (> q 0))))
 486 #+end_src
 487
 488    In the =next= slot in the previous listing and this listing is not
 489    discussed, I think.
 490
 491    This dispatch cannot be implemented using filtered dispatch, except
 492    by generating anonymous classes with all the right mime-types as
 493    direct superclasses in dispatch order; the filter would generate
 494 #+begin_src lisp
 495 (ensure-class nil :direct-superclasses
 496  '(text/html image/webp ...))
 497 #+end_src
 498    and dispatch operates using those anonymous classes.  While
 499    this is possible to do, it is awkward to express content-type
 500    negotiation in this way, as it means that the dispatcher must know
 501    about the universe of mime-types that clients might declare that
 502    they accept, rather than merely the set of mime-types that a
 503    particular generic function is capable of serving; handling
 504    wildcards in accept strings is particularly awkward in the
 505    filtering paradigm.
 506
 507    Note that in this example, the method on =specializer<= involves a
 508    nontrivial ordering of methods based on the =q= values specified in
 509    the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 510    single extended specializer could be applicable to any given
 511    argument).
 512
 513    Also note that the accept specializer protocol is straightforwardly
 514    extensible to other suitable objects; for example, one simple
 515    debugging aid is to define that an =accept-specializer= should be
 516    applicable to =string= objects.  This can be done in a modular
 517    fashion (see the code below, which can be completely disconnected
 518    from the code for Hunchentoot request objects), and generalizes to
 519    dealing with multiple web server libraries, so that
 520    content-negotiation methods are applicable to each web server's
 521    request objects.
 522
 523 #+begin_src lisp
 524 (defmethod generalizer-of-using-class
 525     ((gf accept-generic-function)
 526      (s string))
 527   (make-instance 'accept-generalizer
 528                  :header s
 529                  :next (class-of s)))
 530 (defmethod specializer-accepts-p
 531     ((s accept-specializer) (o string))
 532   (let* ((tree (parse-accept-string o))
 533          (q (q (media-type s) tree)))
 534     (and q (> q 0))))
 535 #+end_src
 536 ** COMMENT Pattern / xpattern / regex / optima
 537    Here's the /really/ interesting bit, but on the other hand we're
 538    probably going to run out of space, and the full description of
 539    these is going to take us into =make-method-lambda= territory.
 540    A second paper?  Future work?
 541 * Protocol
 542   :PROPERTIES:
 543   :CUSTOM_ID: Protocol
 544   :END:
 545
 546   In section [[#Examples]], we have seen a number of code fragments as
 547   partial implementations of particular non-standard method dispatch,
 548   using =generalizer= metaobjects to mediate between the methods of
 549   the generic function and the actual arguments passed to it.  In
 550   section [[#Generalizer metaobjects]], we go into more detail regarding
 551   these =generalizer= metaobjects, describing the generic function
 552   invocation protocol in full, and showing how this protocol allows a
 553   similar form of effective method cacheing as the standard one does.
 554   In section [[#Generalizer performance]], we show the results of some
 555   simple performance measurements on our implementation of this
 556   protocol in the SBCL implementation \cite{Rhodes:2008} of Common
 557   Lisp to highlight the improvement that this protocol can bring over
 558   a naïve implementation of generalized dispatch, as well as
 559   to make the potential for further improvement clear.
 560
 561 ** Generalizer metaobjects
 562    :PROPERTIES:
 563    :CUSTOM_ID: Generalizer metaobjects
 564    :END:
 565
 566 *** Generic function invocation
 567     As in the standard generic function invocation protocol, the
 568     generic function's actual functionality is provided by a
 569     discriminating function.  The functionality described in this
 570     protocol is implemented by having a distinct subclass of
 571     =standard-generic-function=, and a method on
 572     =compute-discriminating-function= which produces a custom
 573     discriminating function.  The basic outline of the discriminating
 574     function is the same as the standard one: it must first compute the
 575     set of applicable methods given particular arguments; from that, it
 576     must compute the effective method by combining the methods
 577     appropriately according to the generic function's method
 578     combination; finally, it must call the effective method with the
 579     arguments.
 580
 581     Computing the set of applicable methods is done using a pair of
 582     functions: =compute-applicable-methods=, the standard metaobject
 583     function, and a new function
 584     =compute-applicable-methods-using-generalizers=.  We define a
 585     custom method on =compute-applicable-methods= which tests the
 586     applicability of a particular specializer against a given argument
 587     using =specializer-accepts-p=, a new protocol function with
 588     default implementations on =class= and =eql-specializer= to
 589     implement the expected behaviour.  In order to order the methods,
 590     as required by the protocol, we define a pairwise comparison
 591     operator =specializer<= which defines an ordering between
 592     specializers for a given generalizer argument (remembering that
 593     even in standard CLOS the ordering between =class= specializers
 594     can change depending on the actual class of the argument).
 595
 596     The new =compute-applicable-methods-using-generalizers= is the
 597     analogue of the MOP's =compute-applicable-methods-using-classes=.
 598     Instead of calling it with the =class-of= each argument, we compute
 599     the generalizers of each argument using the new function
 600     =generalizer-of-using-class= (where the =-using-class= refers to
 601     the class of the generic function rather than the class of the
 602     object), and call it with the list of generalizers.  As with the
 603     standard function, a secondary return value indicates whether the
 604     result of the function is definitive for that list of generalizers.
 605
 606     Thus, in generic function invocation, we first compute the
 607     generalizers of the arguments; we compute the ordered set of
 608     applicable methods, either from the generalizers or (if that is
 609     not definitive) from the arguments themselves; then the normal
 610     effective method computation and call can occur.  Unfortunately,
 611     the nature of an effective method object is not specified, so we
 612     have to reach into implementation internals a little in order to
 613     call it, but otherwise the remainder of the generic function
 614     invocation protocol is unchanged from the standard one.  In
 615     particular, method combination is completely unchanged;
 616     programmers can choose arbitrary method combinations, including
 617     user-defined long form combinations, for their generic functions
 618     involving generalized dispatch.
 619
 620 *** Effective method memoization
 621     :PROPERTIES:
 622     :CUSTOM_ID: Memoization
 623     :END:
 624     The potential efficiency benefit to having =generalizer=
 625     metaobjects lies in the use of
 626     =compute-applicable-methods-using-generalizers=.  If a particular
 627     generalized specializer accepts a variety of objects (such as the
 628     =signum= specializer accepting all reals with a given sign, or the
 629     =accept= specializer accepting all HTTP requests with a particular
 630     =Accept= header), then there is the possibility of cacheing and
 631     reusing the results of the applicable and effective method
 632     computation.  If the computation of the applicable method from
 633     =compute-applicable-methods-using-generalizers= is definitive,
 634     then the ordered set of applicable methods and the effective
 635     method can be cached.
 636
 637     One issue is what to use as the key for that cache.  We cannot use
 638     the generalizers themselves, as two generalizers that should be
 639     considered equal for cache lookup will not compare as =equal= –
 640     and indeed even the standard generalizer, the =class=, cannot be
 641     used as we must be able to invalidate cache entries upon class
 642     redefinition.  The issue of =class= generalizers we can solve as
 643     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 644     class, which is distinct for each distinct (re)definition of a
 645     class; for arbitrary generalizers, however, there is /a priori/ no
 646     good way of computing a suitable hash key automatically, so we
 647     allow the metaprogrammer to specify one by defining a method on
 648     =generalizer-equal-hash-key=, and combining the hash keys for all
 649     required arguments in a list to use as a key in an =equal=
 650     hash-table.
 651
 652     [XXX could we actually compute a suitable hash key using the
 653     generalizer's class name and initargs?]
 654
 655 *** COMMENT
 656     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 657     - [X] =compute-applicable-methods-using-generalizers=
 658     - [X] =generalizer-equal-hash-key=
 659     - [X] =specializer-accepts-generalizer-p=
 660     - [X] =specializer-accepts-p=
 661     - [X] =specializer<=
 662 ** Performance
 663    :PROPERTIES:
 664    :CUSTOM_ID: Generalizer performance
 665    :END:
 666    We have argued that the protocol presented here allows for
 667    expressive control of method dispatch while preserving the
 668    possibility of efficiency.  In this section, we quantify the
 669    efficiency that the memoization protocol described in section
 670    [[#Memoization]] achieves, by comparing it both to the same protocol
 671    with no memoization, as well as with equivalent dispatch
 672    implementations in the context of methods with regular specializers
 673    (in an implementation similar to that in
 674    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 675    straightforward functions.
 676
 677    In the case of the =cons-specializer=, we benchmark the walker
 678    acting on a small but non-trivial form.  The implementation
 679    strategies in the table below refer to: an implementation in a
 680    single function with a large =typecase= to dispatch between all the
 681    cases; the natural implementation in terms of a standard generic
 682    function with multiple methods (the method on =cons= having a
 683    slightly reduced =typecase= to dispatch on the first element, and
 684    other methods handling =symbol= and other atoms); and three
 685    separate cases using =cons-specializer= objects.  As well as
 686    measuring the effect of memoization against the full invocation
 687    protocol, we can also introduce a special case: when only one
 688    argument participates in method selection (all the other required
 689    arguments only being specialized on =t=), we can avoid the
 690    construction of a list of hash keys and simply use the key
 691    from the single active generalizer directly.
 692
 693    | implementation        | time (µs/call) | overhead |
 694    |-----------------------+----------------+----------|
 695    | function              |           3.17 |          |
 696    | standard-gf/methods   |            3.6 |     +14% |
 697    | cons-gf/one-arg-cache |            7.4 |    +130% |
 698    | cons-gf               |             15 |    +370% |
 699    | cons-gf/no-cache      |             90 |   +2700% |
 700
 701    The benchmarking results from this exercise are promising: in
 702    particular, the introduction of the effective method cache speeds
 703    up the use of generic specializers in this case by a factor of 6,
 704    and the one-argument special case by another factor of 2.  For this
 705    workload, even the one-argument special case only gets to within a
 706    factor of 2-3 of the function and standard generic function
 707    implementations, but the overall picture is that the memoizability
 708    in the protocol does indeed drastically reduce the overhead
 709    compared with the full invocation.
 710
 711    For the =signum-specializer= case, we choose to benchmark the
 712    computation of 20!, because that is the largest factorial whose
 713    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 714    worst case for generic dispatch, where the work done within the
 715    methods is as small as possible without being meaningless, and in
 716    particular does not cause allocation or garbage collection to
 717    obscure the picture.
 718
 719 #+begin_src lisp :exports none
 720 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 721 #+end_src
 722
 723    | implementation          | time (µs/call) | overhead |
 724    |-------------------------+----------------+----------|
 725    | function                |            0.6 |          |
 726    | standard-gf/fixnum      |            1.2 |    +100% |
 727    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 728    | signum-gf               |             23 |   +3800% |
 729    | signum-gf/no-cache      |            240 |  +41000% |
 730
 731    The relative picture is similar to the =cons-specializer= case;
 732    including a cache saves a factor of 10 in this case, and another
 733    factor of 3 for the one-argument cache special case.  The cost of
 734    the genericity of the protocol here is starker; even the
 735    one-argument cache is a factor of 6 slower than the standard
 736    generic-function implementation, and a further factor of 2 away
 737    from the implementation of factorial as a function.  We discuss
 738    ways in which we expect to be able to improve performance in
 739    section [[#Future Work]].
 740
 741    We could allow the metaprogrammer to improve on the one-argument
 742    performance by constructing a specialized cache: for =signum=
 743    arguments of =rational= arguments, the logical cache structure is
 744    to index a three-element vector with =(1+ signum)=.  The current
 745    protocol does not provide a way of eliding the two generic function
 746    calls for the generic cache; we discuss possible approaches in
 747    section [[#Conclusions]].
 748 ** Full protocol
 749    The protocol described in this paper is only part of a complete
 750    protocol for =specializer= and =generalizer= metaobjects.  Our
 751    development of this protocol is as yet incomplete; the work
 752    described here augments that in \cite{Newton.Rhodes:2008}, but is
 753    yet relatively untested – and additionally our recent experience of
 754    working with that earlier protocol suggests that there might be
 755    useful additions to the handling of =specializer= metaobjects,
 756    independent of the =generalizer= idea presented here.
 757
 758 *** COMMENT ideas
 759     Description and specification left for reasons of space (we'll see?)
 760     - [ ] =same-specializer-p=
 761     - [ ] =parse/unparse-specializer-using-class=
 762     - [ ] =make-method-specializers-form=
 763     - [ ] jmoringe: In an email, I suggested
 764       =make-specializer-form-using-class=:
 765
 766       #+begin_quote
 767       Could we change =make-method-specializers-form='s default
 768       behaviour to call a new generic function
 769       #+begin_src
 770         make-specializer-form-using-class gf method name env
 771       #+end_src
 772       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 773       eql-specializers)? This would make it unnecessary to repeat
 774       boilerplate along the lines of
 775       #+begin_src lisp
 776       (flet ((make-parse-form (name)
 777                (if <name-is-interesting>
 778                  <handle-interesting-specializer>
 779                  <repeat-handling-of-standard-specializers>)))
 780         `(list ,@(mapcar #'make-parse-form specializer-names)))
 781       #+end_src
 782       for each generic function class.
 783       #+end_quote
 784     - [ ] =make-method-lambda= revision (use environment arg?)
 785
 786       jmoringe: would only be relevant for pattern dispatch, right? I
 787       think, we didn't finish the discussion regarding special
 788       variables vs. environment vs. new protocol function
 789
 790 * Related Work
 791   :PROPERTIES:
 792   :CUSTOM_ID: Related Work
 793   :END:
 794
 795   The work presented here builds on specializer-oriented programming
 796   described in \cite{Newton.Rhodes:2008}.  Approximately
 797   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 798   introduced to address some of the same use cases: filtered dispatch
 799   works by having a custom discriminating function which wraps the
 800   usual one, where the wrapping function augments the set of
 801   applicable methods with applicable methods from other (hidden)
 802   generic functions, one per filter group; this step is not memoized,
 803   and using =eql= methods to capture behaviours of equivalence classes
 804   means that it is hard to see how it could be.  The methods are then
 805   combined using a custom method combination to mimic the standard
 806   one; in principle implementors of other method combinations could
 807   cater for filtered dispatch, but they would have to explicitly
 808   modify their method combinations.  The Clojure programming language
 809   supports multimethods[fn:5] with a variant of filtered dispatch as
 810   well as hierachical and identity-based method selectors.
 811
 812   In context-oriented programming
 813   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 814   occurs by maintaining the context state as an anonymous class with
 815   the superclasses representing all the currently active layers; this
 816   is then passed as a hidden argument to context-aware functions.  The
 817   set of layers is known and under programmer control, as layers must
 818   be defined beforehand.
 819
 820   In some sense, all dispatch schemes are specializations of predicate
 821   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 822   dispatch is its expressiveness: with arbitrary predicates able to
 823   control dispatch, it is essentially impossible to perform any
 824   substantial precomputation, or even to automatically determine an
 825   ordering of methods given a set of arguments.  Even Clojure's
 826   restricted dispatch scheme provides an explicit operator for stating
 827   a preference order among methods, where here we provide an operator
 828   to order specializers; in filtered dispatch the programmer
 829   implicitly gives the system an order of precedence, through the
 830   lexical ordering of filter specification in a filtered function
 831   definition.
 832
 833   The Slate programming environment combines prototype-oriented
 834   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 835   that context, the analogue of an argument's class (in Common Lisp)
 836   as a representation of the equivalence class of objects with the
 837   same behaviour is the tuple of roles and delegations: objects with
 838   the same roles and delegations tuple behave the same, much as
 839   objects with the same generalizer have the same behaviour in the
 840   protocol described in this paper.
 841
 842   The idea of generalization is of course not new, and arises in other
 843   contexts.  Perhaps of particular interest is generalization in the
 844   context of partial evaluation; for example, \cite{Ruf:1993}
 845   considers generalization in online partial evaluation, where sets of
 846   possible values are represented by a type system construct
 847   representing an upper bound.  The relationship between generalizer
 848   metaobjects and approximation in type systems could be further
 849   explored.
 850 * Conclusions
 851   :PROPERTIES:
 852   :CUSTOM_ID: Conclusions
 853   :END:
 854   In this paper, we have presented a new generalizer metaobject
 855   protocol allowing the metaprogrammer to implement in a
 856   straightforward manner metaobjects to implement custom method
 857   selection, rather than the standard method selection as standardized
 858   in Common Lisp.  This protocol seamlessly interoperates with the
 859   rest of CLOS and Common Lisp in general; the programmer (the user of
 860   the custom specializer metaobjects) may without constraints use
 861   arbitrary method combination, intercede in effective method
 862   combination, or write custom method function implementations.  The
 863   protocol is expressive, in that it handles forms of dispatch not
 864   possible in more restricted dispatch systems, while not suffering
 865   from the indeterminism present in predicate dispatch through the use
 866   of explicit ordering predicates.
 867
 868   The protocol is also reasonably efficient; the metaprogrammer can
 869   indicate that a particular effective method computation can be
 870   memoized, and under those circumstances much of the overhead is
 871   amortized (though there remains a substantial overhead compared with
 872   standard generic-function or regular function calls).  We discuss
 873   how the efficiency could be improved below.
 874 ** Future Work
 875    :PROPERTIES:
 876    :CUSTOM_ID: Future Work
 877    :END:
 878    Although the protocol described in this paper allows for a more
 879    efficient implementation, as described in section [[#Memoization]],
 880    than computing the applicable and effective methods at each generic
 881    function call, the efficiency is still some way away from a
 882    baseline of the standard generic-function, let alone a standard
 883    function.  Most of the invocation protocol is memoized, but there
 884    are still two full standard generic-function calls –
 885    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 886    argument per call to a generic function with extended specializers,
 887    not to mention a hash table lookup.
 888
 889    For many applications, the additional flexibility afforded by
 890    generalized specializers might be worth the cost in efficiency, but
 891    it would still be worth investigating how much the overhead from
 892    generalized specializers can be reduced; one possible avenue for
 893    investigation is giving greater control over the cacheing strategy
 894    to the metaprogrammer.
 895
 896    As an example, consider the =signum-specializer=.  The natural
 897    cache structure for a single argument generic function specializing
 898    on =signum= is probably a four-element vector, where the first
 899    three elements hold the effective methods for =signum= values of
 900    -1, 0, and 1, and the fourth holds the cached effective methods for
 901    everything else.  This would make the invocation of such functions
 902    very fast for the (presumed) common case where the argument is in
 903    fact a real number.  We hope to develop and show the effectiveness
 904    of an appropriate protocol to allow the metaprogrammer to construct
 905    and exploit such cacheing strategies, and (more speculatively) to
 906    implement the lookup of an effective method function in other ways.
 907
 908    We also aim to demonstrate support within this protocol for some
 909    particular cases of generalized specializers which seem to have
 910    widespread demand (in as much as any language extension can be said
 911    to be in “demand”).  In particular, we have preliminary work
 912    towards supporting efficient dispatch over pattern specializers
 913    such as implemented in the \textsf{Optima} library[fn:4], and over
 914    a prototype object system similar to that in Slate
 915    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 916    described in this paper can be seen in the git source code
 917    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 918    which will be updated with future developments.
 919
 920    Finally, after further experimentation (and, ideally, non-trivial
 921    use in production) if this protocol stands up to use as we hope, we
 922    aim to produce a standards-quality document so that other
 923    implementors of Common Lisp can, if they choose, independently
 924    reimplement the protocol, and so that users can use the protocol
 925    with confidence that the semantics will not change in a
 926    backwards-incompatible fashion.
 927 ** Acknowledgments
 928    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 929    and informative discussions, and all the respondents to one
 930    author's request for imaginative uses for generalized specializers.
 931
 932 \bibliographystyle{plain}
 933 \bibliography{crhodes,specializers}
 934
 935 * Footnotes
 936
 937 [fn:1] GNU CLISP, at http://www.clisp.org/
 938
 939 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 940
 941 [fn:3] the \textsf{Closer to MOP} project, at
 942    http://common-lisp.net/project/closer/, attempts to harmonize the
 943    different implementations of the metaobject protocol in Common
 944    Lisp.
 945
 946 [fn:4] https://github.com/m2ym/optima
 947
 948 [fn:5] http://clojure.org/multimethods