els-specializers.org

   1 #+TITLE: Generalizers: New Metaobjects for Generalized Dispatch
   2 #+AUTHOR: Christophe Rhodes, Jan Moringen, David Lichteblau
   3 #+OPTIONS: toc:nil
   4
   5 #+LaTeX_CLASS: acm_proc_article-sp
   6 #+LaTeX_HEADER: \DeclareTextFontCommand{\texttt}{\ttfamily\hyphenchar\font=45\relax}
   7
   8 #+begin_src elisp :exports none
   9 ;;; use C-x C-e on this if org refuses to export
  10 (add-to-list 'org-latex-classes
  11              '("acm_proc_article-sp" "\\documentclass{acm_proc_article-sp}"
  12                ("\\section{%s}" . "\\section*{%s}")
  13                ("\\subsection{%s}" . "\\subsection*{%s}")
  14                ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
  15                ("\\paragraph{%s}" . "\\paragraph*{%s}")
  16                ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
  17 (set (make-local-variable 'org-latex-title-command)
  18      "\\numberofauthors{3}
  19 \\author{
  20 \\alignauthor Christophe Rhodes\\\\
  21   \\affaddr{Department of Computing}\\\\
  22   \\affaddr{Goldsmiths, University of London}\\\\
  23   \\affaddr{London SE14 6NW}\\\\
  24   \\email{c.rhodes@gold.ac.uk}
  25 \\alignauthor Jan Moringen\\\\
  26   \\affaddr{Universität Bielefeld}\\\\
  27   \\affaddr{Technische Fakultät}\\\\
  28   \\affaddr{33594 Bielefeld}\\\\
  29   \\email{jmoringe@techfak.uni-bielefeld.de}
  30 \\alignauthor David Lichteblau\\\\
  31   \\affaddr{ZenRobotics Ltd}\\\\
  32   \\affaddr{Vilhonkatu 5 A}\\\\
  33   \\affaddr{FI-00100 Helsinki}\\\\
  34   \\email{david@lichteblau.com}
  35 }
  36 \\maketitle")
  37 #+end_src
  38
  39 #+begin_abstract
  40 This paper introduces a new metaobject, the generalizer, which
  41 complements the existing specializer metaobject.  With the help of
  42 examples, we show that this metaobject allows for the efficient
  43 implementation of complex non-class-based dispatch within the
  44 framework of existing metaobject protocols.  We present our
  45 modifications to the generic function invocation protocol from the
  46 /Art of the Metaobject Protocol/; in combination with previous work,
  47 this produces a fully-functional extension of the existing mechanism
  48 for method selection and combination, including support for method
  49 combination completely independent from method selection.  We discuss
  50 our implementation, within the SBCL implementation of Common Lisp, and
  51 in that context compare the performance of the new protocol with the
  52 standard one, demonstrating that the new protocol can be tolerably
  53 efficient.
  54 #+end_abstract
  55
  56 #+begin_LaTeX
  57 \category{D.1}{Software}{Programming Techniques}[Object-oriented Programming]
  58 \category{D.3.3}{Programming Languages}{Language Constructs and Features}
  59 \terms{Languages, Design}
  60 \keywords{generic functions, specialization-oriented programming, method selection, method combination}
  61 #+end_LaTeX
  62
  63 * Introduction
  64   The revisions to the original Common Lisp language \cite{CLtL}
  65   included the detailed specification of an object system, known as
  66   the Common Lisp Object System (CLOS), which was eventually
  67   standardized as part of the ANSI Common Lisp standard \cite{CLtS}.
  68   The object system as presented to the standardization committee was
  69   formed of three chapters.  The first two chapters covered programmer
  70   interface concepts and the functions in the programmer interface
  71   \cite[Chapter 28]{CLtL2} and were largely incorporated into the
  72   final standard; the third chapter, covering a Metaobject Protocol
  73   (MOP) for CLOS, was not.
  74
  75   Nevertheless, the CLOS MOP has proven to be a robust design, and
  76   while many implementations have derived their implementations of
  77   CLOS from either the Closette illustrative implementation in
  78   \cite{AMOP}, or the Portable Common Loops implementation of CLOS
  79   from Xerox Parc, there have been largely from-scratch
  80   reimplementations of CLOS (in CLISP[fn:1] and CCL[fn:2], at least)
  81   incorporating substantial fractions of the Metaobject Protocol as
  82   described.
  83
  84   #+CAPTION:    MOP Design Space
  85   #+LABEL:      fig:mopdesign
  86   #+ATTR_LATEX: width=\linewidth float
  87   [[file:figures/mop-design-space.pdf]]
  88
  89   Although it has stood the test of time, the CLOS MOP is neither
  90   without issues (e.g. semantic problems with =make-method-lambda=
  91   \cite{Costanza.Herzeel:2008}; useful functions such as
  92   =compute-effective-slot-definition-initargs= being missing from the
  93   standard) nor is it a complete framework for the metaprogrammer to
  94   implement all conceivable variations of object-oriented behaviour.
  95   While metaprogramming offers some possibilities for customization of
  96   the object system behaviour, those possibilities cannot extend
  97   arbitrarily in all directions (conceptually, if a given object
  98   system is a point in design space, then a MOP for that object system
  99   allows exploration of a region of design space around that point;
 100   see figure \ref{fig:mopdesign}).  In the case of the CLOS MOP, there is
 101   still an expectation that functionality is implemented with methods
 102   on generic functions, acting on objects with slots; it is not
 103   possible, for example, to transparently implement support for
 104   “message not understood” as in the message-passing paradigm, because
 105   the analogue of messages (generic functions) need to be defined
 106   before they are used.
 107
 108   Nevertheless, the MOP is flexible, and is used for a number of
 109   things, including: documentation generation (where introspection in
 110   the MOP is used to extract information from a running system);
 111   object-relational mapping and other approaches to object
 112   persistence; alternative backing stores for slots (hash-tables or
 113   symbols); and programmatic construction of metaobjects, for example
 114   for IDL compilers and model transformations.
 115
 116   One area of functionality where there is scope for customization by
 117   the metaprogrammer is in the mechanics and semantics of method
 118   applicability and dispatch.  While in principle AMOP allows
 119   customization of dispatch in various different ways (the
 120   metaprogrammer can define methods on protocol functions such as
 121   =compute-applicable-methods=,
 122   =compute-applicable-methods-using-classes=), for example, in
 123   practice implementation support for this was weak until relatively
 124   recently[fn:3].
 125
 126   Another potential mechanism for customizing dispatch is implicit in
 127   the class structure defined by AMOP: standard specializer objects
 128   (instances of =class= and =eql-specializer=) are generalized
 129   instances of the =specializer= protocol class, and in principle
 130   there are no restrictions on the metaprogrammer constructing
 131   additional subclasses.  Previous work \cite{Newton.Rhodes:2008} has
 132   explored the potential for customizing generic function dispatch
 133   using extended specializers, but there the metaprogrammer must
 134   override the entirety of the generic function invocation protocol
 135   (from =compute-discriminating-function= on down), leading to toy
 136   implementations and duplicated effort.
 137
 138   This paper introduces a protocol for efficient and controlled
 139   handling of new subclasses of =specializer=.  In particular, it
 140   introduces the =generalizer= protocol class, which generalizes the
 141   return value of =class-of= in method applicability computation, and
 142   allows the metaprogrammer to hook into cacheing schemes to avoid
 143   needless recomputation of effective methods for sufficiently similar
 144   generic function arguments (See Figure\nbsp\ref{fig:dispatch}).
 145
 146   #+CAPTION:    Dispatch Comparison
 147   #+LABEL:      fig:dispatch
 148   #+ATTR_LATEX: width=\linewidth float
 149   [[file:figures/dispatch-relationships.pdf]]
 150
 151   The remaining sections in this paper can be read in any order.  We
 152   give some motivating examples in section [[#Examples]], including
 153   reimplementations of examples from previous work, as well as
 154   examples which are poorly supported by previous protocols.  We
 155   describe the protocol itself in section [[#Protocol]], describing each
 156   protocol function in detail and, where applicable, relating it to
 157   existing protocol functions within the CLOS MOP.  We survey related
 158   work in more detail in section [[#Related Work]], touching on work on
 159   customized dispatch schemes in other environments.  Finally, we draw
 160   our conclusions from this work, and indicate directions for further
 161   development, in section [[#Conclusions]]; reading that section before the
 162   others indicates substantial trust in the authors' work.
 163 * Examples
 164   :PROPERTIES:
 165   :CUSTOM_ID: Examples
 166   :END:
 167   In this section, we present a number of examples of dispatch
 168   implemented using our protocol, which we describe in section
 169   [[#Protocol]].  For reasons of space, the metaprogram code examples in
 170   this section do not include some of the necessary support code to
 171   run; complete implementations of each of these cases, along with the
 172   integration of this protocol into the SBCL implementation
 173   \cite{Rhodes:2008} of Common Lisp, are included in an appendix / in
 174   the accompanying repository snapshot / at this location.
 175
 176   A note on terminology: we will attempt to distinguish between the
 177   user of an individual case of generalized dispatch (the
 178   “programmer”), the implementor of a particular case of generalized
 179   dispatch (the “metaprogrammer”), and the authors as the designers
 180   and implementors of our generalized dispatch protocol (the
 181   “metametaprogammer”, or more likely “we”).
 182 ** CONS specializers
 183    :PROPERTIES:
 184    :CUSTOM_ID: Cons
 185    :END:
 186    One motivation for the use of generalized dispatch is in an
 187    extensible code walker: a new special form can be handled simply by
 188    writing an additional method on the walking generic function,
 189    seamlessly interoperating with all existing methods. In this
 190    use-case, dispatch is performed on the first element of lists.
 191    Semantically, we allow the programmer to specialize any argument of
 192    methods with a new kind of specializer, =cons-specializer=, which
 193    is applicable if and only if the corresponding object is a =cons=
 194    whose =car= is =eql= to the symbol associated with the
 195    =cons-specializer=; these specializers are more specific than the
 196    =cons= class, but less specific than an =eql-specializer= on any
 197    given =cons=.
 198
 199    The programmer code using these specializers is unchanged from
 200    \cite{Newton.Rhodes:2008}; the benefits of the protocol described
 201    here are: that the separation of concerns is complete – method
 202    selection is independent of method combination – and that the
 203    protocol allows for efficient implementation where possible, even
 204    when method selection is customized.  In an application such as
 205    walking source code, we would expect to encounter special forms
 206    (distinguished by particular atoms in the =car= position) multiple
 207    times, and hence to dispatch to the same effective method
 208    repeatedly.  We discuss the efficiency aspects of the protocol in
 209    more detail in section [[#Memoization]]; we present the metaprogrammer
 210    code to implement the =cons-specializer= below.
 211
 212 #+begin_src lisp
 213 (defclass cons-specializer (specializer)
 214   ((%car :reader %car :initarg :car)))
 215 (defclass cons-generalizer (generalizer)
 216   ((%car :reader %car :initarg :car)))
 217 (defmethod generalizer-of-using-class
 218     ((gf cons-generic-function) arg)
 219   (typecase arg
 220     ((cons symbol)
 221      (make-instance 'cons-generalizer
 222                     :car (car arg)))
 223     (t (call-next-method))))
 224 (defmethod generalizer-equal-hash-key
 225     ((gf cons-generic-function)
 226      (g cons-generalizer))
 227   (%car g))
 228 (defmethod specializer-accepts-generalizer-p
 229     ((gf cons-generic-function)
 230      (s cons-specializer)
 231      (g cons-generalizer))
 232   (if (eql (%car s) (%car g))
 233       (values t t)
 234       (values nil t)))
 235 (defmethod specializer-accepts-p
 236     ((s cons-specializer) o)
 237   (and (consp o) (eql (car o) (%car s))))
 238 #+end_src
 239
 240 The code above shows a minimal use of our protocol.  We have elided
 241 some support code for parsing and unparsing specializers, and for
 242 handling introspective functions such as finding generic functions for
 243 a given specializer.  We have also elided methods on the protocol
 244 functions =specializer<= and =same-specializer-p=; for
 245 =cons-specializer= objects, specializer ordering is trivial, as only
 246 one =cons-specializer= (up to equality) can ever be applicable to any
 247 given argument.  See section [[#Accept]] for a case where specializer
 248 ordering is non-trivial.
 249
 250 As in \cite{Newton.Rhodes:2008}, the programmer can use these
 251 specializers to implement a modular code walker, where they define one
 252 method per special operator.  We show two of those methods below, in
 253 the context of a walker which checks for unused bindings and uses of
 254 unbound variables.
 255
 256 #+begin_src
 257 (defgeneric walk (form env stack)
 258   (:generic-function-class cons-generic-function))
 259 (defmethod walk
 260     ((expr (cons lambda)) env call-stack)
 261   (let ((lambda-list (cadr expr))
 262         (body (cddr expr)))
 263     (with-checked-bindings
 264         ((bindings-from-ll lambda-list)
 265          env call-stack)
 266       (dolist (form body)
 267         (walk form env (cons form call-stack))))))
 268 (defmethod walk
 269     ((expr (cons let)) env call-stack)
 270   (flet ((let-binding (x)
 271            (walk (cadr x) env
 272                  (cons (cadr x) call-stack))
 273            (cons (car x)
 274                  (make-instance 'binding))))
 275     (with-checked-bindings
 276         ((mapcar #'let-binding (cadr expr))
 277           env call-stack)
 278       (dolist (form (cddr expr))
 279         (walk form env (cons form call-stack))))))
 280 #+end_src
 281
 282    Note that in this example there is no strict need for
 283    =cons-specializer= and =cons-generalizer= to be distinct classes.
 284    In standard generic function dispatch, the =class= functions both
 285    as the specializer for methods and as the generalizer for generic
 286    function arguments; we can think of the dispatch implemented by
 287    =cons-specializer= objects as providing for subclasses of the
 288    =cons= class distinguished by the =car= of the =cons=.  This
 289    analogy also characterizes those use cases where the metaprogrammer
 290    could straightforwardly use filtered dispatch
 291    \cite{Costanza.etal:2008} to implement their dispatch semantics.
 292    We will see in section [[#Accept]] an example of a case where filtered
 293    dispatch is incapable of straightforwardly expressing the dispatch,
 294    but first we present our implementation of the motivating case from
 295    \cite{Costanza.etal:2008}.
 296 ** SIGNUM specializers
 297    :PROPERTIES:
 298    :CUSTOM_ID: Signum
 299    :END:
 300    Our second example of the implementation and use of generalized
 301    specializers is a reimplementation of one of the examples in
 302    \cite{Costanza.etal:2008}: specifically, the factorial function.
 303    Here, dispatch will be performed based on the =signum= of the
 304    argument, and again, at most one method with a =signum= specializer
 305    will be applicable to any given argument, which makes the structure
 306    of the specializer implementation very similar to the =cons=
 307    specializers in the previous section.
 308
 309    The metaprogrammer has chosen in the example below to compare
 310    signum values using \texttt{=}, which means that a method with
 311    specializer =(signum 1)= will be applicable to positive
 312    floating-point arguments (see the first method on
 313    =specializer-accepts-generalizer-p= and the method on
 314    =specializer-accepts-p= below).  This leads to one subtle
 315    difference in behaviour compared to that of the =cons=
 316    specializers: in the case of =signum= specializers, the /next/
 317    method after any =signum= specializer can be different, depending
 318    on the class of the argument.  This aspect of the dispatch is
 319    handled by the second method on =specializer-accepts-generalizer-p=
 320    below.
 321
 322 #+begin_src lisp
 323 (defclass signum-specializer (specializer)
 324   ((%signum :reader %signum :initarg :signum)))
 325 (defclass signum-generalizer (generalizer)
 326   ((%signum :reader %signum :initarg :signum)))
 327 (defmethod generalizer-of-using-class
 328     ((gf signum-generic-function) (arg real))
 329   (make-instance 'signum-generalizer
 330                  :signum (signum arg)))
 331 (defmethod generalizer-equal-hash-key
 332     ((gf signum-generic-function)
 333      (g signum-generalizer))
 334   (%signum g))
 335 (defmethod specializer-accepts-generalizer-p
 336     ((gf signum-generic-function)
 337      (s signum-specializer)
 338      (g signum-generalizer))
 339   (if (= (%signum s) (%signum g))
 340       (values t t)
 341       (values nil t)))
 342
 343 (defmethod specializer-accepts-generalizer-p
 344     ((gf signum-generic-function)
 345      (s specializer)
 346      (g signum-generalizer))
 347   (specializer-accepts-generalizer-p
 348    gf s (class-of (%signum g))))
 349
 350 (defmethod specializer-accepts-p
 351     ((s signum-specializer) o)
 352   (and (realp o) (= (%signum s) (signum o))))
 353 #+end_src
 354
 355    Given these definitions, and once again some more straightforward
 356    ones elided for reasons of space, the programmer can implement the
 357    factorial function as follows:
 358
 359 #+begin_src lisp
 360 (defgeneric fact (n)
 361   (:generic-function-class signum-generic-function))
 362 (defmethod fact ((n (signum 0))) 1)
 363 (defmethod fact ((n (signum 1))) (* n (fact (1- n))))
 364 #+end_src
 365
 366    The programmer does not need to include a method on =(signum -1)=,
 367    as the standard =no-applicable-method= protocol will automatically
 368    apply to negative real or non-real arguments.
 369 ** Accept HTTP header specializers
 370    :PROPERTIES:
 371    :CUSTOM_ID: Accept
 372    :END:
 373    In this section, we implement a non-trivial form of dispatch.  The
 374    application in question is a web server, and specifically to allow
 375    the programmer to support RFC 2616 \cite{rfc2616} content
 376    negotiation, of particular interest to publishers and consumers of
 377    REST-style Web APIs.
 378
 379    The basic mechanism in content negotiation is as follows: the web
 380    client sends an HTTP request with an =Accept= header, which is a
 381    string describing the media types it is willing to receive as a
 382    response to the request, along with numerical preferences.  The web
 383    server compares these stated client preferences with the resources
 384    it has available to satisfy this request, and sends the best
 385    matching resource in its response.
 386
 387    For example, a graphical web browser might send an =Accept= header
 388    of =text/html,application/xml;q=0.9,*/*;q=0.8= for a request of a
 389    resource typed in to the URL bar.  This should be interpreted as
 390    meaning that: if the server can provide content of type =text/html=
 391    (i.e. HTML) for that resource, then it should do so.  Otherwise, if
 392    it can provide =application/xml= content (i.e. XML of any schema),
 393    then that should be provided; failing that, any other content type
 394    is acceptable.
 395
 396    In the case where there are static files on the filesystem, and the
 397    web server must merely select between them, there is not much more
 398    to say.  However, it is not unusual for a web service to be backed
 399    by some other form of data, and responses computed and sent on the
 400    fly, and in these circumstances the web server must compute which
 401    of its known output formats it can use to satisfy the request
 402    before actually generating the best matching response.  This can be
 403    modelled as one generic function responsible for generating the
 404    response, with methods corresponding to content-types -- and the
 405    generic function must then perform method selection against the
 406    request's =Accept= header to compute the appropriate response.
 407
 408    The =accept-specializer= below implements this dispatch.  It depends
 409    on a lazily-computed =tree= slot to represent the information in
 410    the accept header (generated by =parse-accept-string=), and a
 411    function =q= to compute the (defaulted) preference level for a
 412    given content-type and =tree=; then, method selection and ordering
 413    involves finding the =q= for each =accept-specializer='s content
 414    type given the =tree=, and sorting them according to the preference
 415    level.
 416
 417 #+begin_src lisp
 418 (defclass accept-specializer (specializer)
 419   ((media-type :initarg :media-type :reader media-type)))
 420 (defclass accept-generalizer (generalizer)
 421   ((header :initarg :header :reader header)
 422    (tree)
 423    (next :initarg :next :reader next)))
 424 (defmethod generalizer-equal-hash-key
 425     ((gf accept-generic-function)
 426      (g accept-generalizer))
 427   `(accept-generalizer ,(header g)))
 428 (defmethod specializer-accepts-generalizer-p
 429     ((gf accept-generic-function)
 430      (s accept-specializer)
 431      (g accept-generalizer))
 432   (values (q (media-type s) (tree g)) t))
 433 (defmethod specializer-accepts-generalizer-p
 434     ((gf accept-generic-function)
 435      (s specializer)
 436      (g accept-generalizer))
 437   (specializer-accepts-generalizer-p
 438    gf s (next g)))
 439
 440 (defmethod specializer<
 441     ((gf accept-generic-function)
 442      (s1 accept-specializer)
 443      (s2 accept-specializer)
 444      (g accept-generalizer))
 445   (let ((m1 (media-type s1))
 446         (m2 (media-type s2))
 447         (tree (tree g)))
 448     (cond
 449       ((string= m1 m2) '=)
 450       (t (let ((q1 (q m1 tree)))
 451                (q2 (q m2 tree))))
 452            (cond
 453              ((= q1 q2) '=)
 454              ((< q1 q2) '>)
 455              (t '<))))))
 456 #+end_src
 457
 458    The metaprogrammer can then add support for objects representing
 459    client requests, such as instances of the =request= class in the
 460    Hunchentoot web server, by translating these into
 461    =accept-generalizer= instances.  The code below implements this, by
 462    defining the computation of a =generalizer= object for a given
 463    request, and specifying how to compute whether the specializer
 464    accepts the given request object (=q= returns a number between 0
 465    and 1 if any pattern in the =tree= matches the media type, and
 466    =nil= if the media type cannot be matched at all).
 467
 468 #+begin_src
 469 (defmethod generalizer-of-using-class
 470     ((gf accept-generic-function)
 471      (arg tbnl:request))
 472   (make-instance 'accept-generalizer
 473                  :header (tbnl:header-in :accept arg)
 474                  :next (call-next-method)))
 475 (defmethod specializer-accepts-p
 476     ((s accept-specializer)
 477      (o tbnl:request))
 478   (let* ((accept (tbnl:header-in :accept o))
 479          (tree (parse-accept-string accept))
 480          (q (q (media-type s) tree)))
 481     (and q (> q 0))))
 482 #+end_src
 483
 484    This dispatch cannot be implemented using filtered dispatch, except
 485    by generating anonymous classes with all the right mime-types as
 486    direct superclasses in dispatch order; the filter would generate
 487 #+begin_src lisp
 488 (ensure-class nil :direct-superclasses
 489  '(text/html image/webp ...))
 490 #+end_src
 491    and dispatch would operate using those anonymous classes.  While
 492    this is possible to do, it is awkward to express content-type
 493    negotiation in this way, as it means that the dispatcher must know
 494    about the universe of mime-types that clients might declare that
 495    they accept, rather than merely the set of mime-types that a
 496    particular generic function is capable of serving; handling
 497    wildcards in accept strings is particularly awkward in the
 498    filtering paradigm.
 499
 500    Note that in this example, the method on =specializer<= involves a
 501    non-trivial ordering of methods based on the =q= values specified
 502    in the accept header (whereas in sections [[#Cons]] and [[#Signum]] only a
 503    single extended specializer could be applicable to any given
 504    argument).
 505
 506    Also note that the accept specializer protocol is straightforwardly
 507    extensible to other suitable objects; for example, one simple
 508    debugging aid is to define that an =accept-specializer= should be
 509    applicable to =string= objects.  This can be done in a modular
 510    fashion (see the code below, which can be completely disconnected
 511    from the code for Hunchentoot request objects), and generalizes to
 512    dealing with multiple web server libraries, so that
 513    content-negotiation methods are applicable to each web server's
 514    request objects.
 515
 516 #+begin_src lisp
 517 (defmethod generalizer-of-using-class
 518     ((gf accept-generic-function)
 519      (s string))
 520   (make-instance 'accept-generalizer
 521                  :header s
 522                  :next (call-next-method)))
 523 (defmethod specializer-accepts-p
 524     ((s accept-specializer) (o string))
 525   (let* ((tree (parse-accept-string o))
 526          (q (q (media-type s) tree)))
 527     (and q (> q 0))))
 528 #+end_src
 529
 530    The =next= slot in the =accept-generalizer= is used to deal with
 531    the case of methods specialized on the classes of objects as well
 532    as on the acceptable media types; there is a method on
 533    =specializer-accepts-generalizer-p= for specializers that are not
 534    of type =accept-specializer= which calls the generic function again
 535    with the next generalizer, so that methods specialized on the
 536    classes =tbnl:request= and =string= are treated as applicable to
 537    corresponding objects, though less specific than methods with
 538    =accept-specializer= specializations.
 539
 540 ** COMMENT Pattern / xpattern / regex / optima
 541    Here's the /really/ interesting bit, but on the other hand we're
 542    probably going to run out of space, and the full description of
 543    these is going to take us into =make-method-lambda= territory.
 544    A second paper?  Future work?
 545 * Protocol
 546   :PROPERTIES:
 547   :CUSTOM_ID: Protocol
 548   :END:
 549
 550   In section [[#Examples]], we have seen a number of code fragments as
 551   partial implementations of particular non-standard method dispatch
 552   strategies, using =generalizer= metaobjects to mediate between the
 553   methods of the generic function and the actual arguments passed to
 554   it.  In section [[#Generalizer metaobjects]], we go into more detail
 555   regarding these =generalizer= metaobjects, describing the generic
 556   function invocation protocol in full, and showing how this protocol
 557   allows a similar form of effective method cacheing as the standard
 558   one does.  In section [[#Generalizer performance]], we show the results
 559   of some simple performance measurements on our implementation of
 560   this protocol in the SBCL implementation \cite{Rhodes:2008} of
 561   Common Lisp to highlight the improvement that this protocol can
 562   bring over a naïve implementation of generalized dispatch, as well
 563   as to make the potential for further improvement clear.
 564
 565 ** Generalizer metaobjects
 566    :PROPERTIES:
 567    :CUSTOM_ID: Generalizer metaobjects
 568    :END:
 569
 570 *** Generic function invocation
 571     As in the standard generic function invocation protocol, the
 572     generic function's actual functionality is provided by a
 573     discriminating function.  The functionality described in this
 574     protocol is implemented by having a distinct subclass of
 575     =standard-generic-function=, and a method on
 576     =compute-discriminating-function= which produces a custom
 577     discriminating function.  The basic outline of the discriminating
 578     function is the same as the standard one: it must first compute the
 579     set of applicable methods given particular arguments; from that, it
 580     must compute the effective method by combining the methods
 581     appropriately according to the generic function's method
 582     combination; finally, it must call the effective method with the
 583     arguments.
 584
 585     Computing the set of applicable methods is done using a pair of
 586     functions: =compute-applicable-methods=, the standard metaobject
 587     function, and a new function
 588     =compute-applicable-methods-using-generalizers=.  We define a
 589     custom method on =compute-applicable-methods= which tests the
 590     applicability of a particular specializer against a given argument
 591     using =specializer-accepts-p=, a new protocol function with
 592     default implementations on =class= and =eql-specializer= to
 593     implement the expected behaviour.  To order the methods, as
 594     required by the protocol, we define a pairwise comparison operator
 595     =specializer<= which defines an ordering between specializers for
 596     a given generalizer argument (remembering that even in standard
 597     CLOS the ordering between =class= specializers can change
 598     depending on the actual class of the argument).
 599
 600     The new =compute-applicable-methods-using-generalizers= is the
 601     analogue of the MOP's =compute-applicable-methods-using-classes=.
 602     Instead of calling it with the =class-of= each argument, we compute
 603     the generalizers of each argument using the new function
 604     =generalizer-of-using-class= (where the =-using-class= refers to
 605     the class of the generic function rather than the class of the
 606     object), and call it with the list of generalizers.  As with the
 607     standard function, a secondary return value indicates whether the
 608     result of the function is definitive for that list of generalizers.
 609
 610     Thus, in generic function invocation, we first compute the
 611     generalizers of the arguments; we compute the ordered set of
 612     applicable methods, either from the generalizers or (if that is
 613     not definitive) from the arguments themselves; then the normal
 614     effective method computation and call can occur.  Unfortunately,
 615     the nature of an effective method object is not specified, so we
 616     have to reach into implementation internals a little in order to
 617     call it, but otherwise the remainder of the generic function
 618     invocation protocol is unchanged from the standard one.  In
 619     particular, method combination is completely unchanged;
 620     programmers can choose arbitrary method combinations, including
 621     user-defined long form combinations, for their generic functions
 622     involving generalized dispatch.
 623
 624 *** Effective method memoization
 625     :PROPERTIES:
 626     :CUSTOM_ID: Memoization
 627     :END:
 628     The potential efficiency benefit to having =generalizer=
 629     metaobjects lies in the use of
 630     =compute-applicable-methods-using-generalizers=.  If a particular
 631     generalized specializer accepts a variety of objects (such as the
 632     =signum= specializer accepting all reals with a given sign, or the
 633     =accept= specializer accepting all HTTP requests with a particular
 634     =Accept= header), then there is the possibility of cacheing and
 635     reusing the results of the applicable and effective method
 636     computation.  If the computation of the applicable method from
 637     =compute-applicable-methods-using-generalizers= is definitive,
 638     then the ordered set of applicable methods and the effective
 639     method can be cached.
 640
 641     One issue is what to use as the key for that cache.  We cannot use
 642     the generalizers themselves, as two generalizers that should be
 643     considered equal for cache lookup will not compare as =equal= –
 644     and indeed even the standard generalizer, the =class=, cannot be
 645     used as we must be able to invalidate cache entries upon class
 646     redefinition.  The issue of =class= generalizers we can solve as
 647     in \cite{Kiczales.Rodriguez:1990} by using the =wrapper= of a
 648     class, which is distinct for each distinct (re)definition of a
 649     class; for arbitrary generalizers, however, there is /a priori/ no
 650     good way of computing a suitable hash key automatically, so we
 651     allow the metaprogrammer to specify one by defining a method on
 652     =generalizer-equal-hash-key=, and combining the hash keys for all
 653     required arguments in a list to use as a key in an =equal=
 654     hash-table.
 655
 656 #+begin_comment
 657     [could we actually compute a suitable hash key using the
 658     generalizer's class name and initargs?]
 659 #+end_comment
 660
 661 *** COMMENT
 662     - [X] =generalizer-of-using-class= (NB class of gf not class of object)
 663     - [X] =compute-applicable-methods-using-generalizers=
 664     - [X] =generalizer-equal-hash-key=
 665     - [X] =specializer-accepts-generalizer-p=
 666     - [X] =specializer-accepts-p=
 667     - [X] =specializer<=
 668 ** Performance
 669    :PROPERTIES:
 670    :CUSTOM_ID: Generalizer performance
 671    :END:
 672    We have argued that the protocol presented here allows for
 673    expressive control of method dispatch while preserving the
 674    possibility of efficiency.  In this section, we quantify the
 675    efficiency that the memoization protocol described in section
 676    [[#Memoization]] achieves, by comparing it both to the same protocol
 677    with no memoization, as well as with equivalent dispatch
 678    implementations in the context of methods with regular specializers
 679    (in an implementation similar to that in
 680    \cite{Kiczales.Rodriguez:1990}), and with implementation in
 681    straightforward functions.
 682
 683    In the case of the =cons-specializer=, we benchmark the walker
 684    acting on a small but non-trivial form.  The implementation
 685    strategies in the table below refer to: an implementation in a
 686    single function with a large =typecase= to dispatch between all the
 687    cases; the natural implementation in terms of a standard generic
 688    function with multiple methods (the method on =cons= having a
 689    slightly reduced =typecase= to dispatch on the first element, and
 690    other methods handling =symbol= and other atoms); and three
 691    separate cases using =cons-specializer= objects.  As well as
 692    measuring the effect of memoization against the full invocation
 693    protocol, we can also introduce a special case: when only one
 694    argument participates in method selection (all the other required
 695    arguments only being specialized on =t=), we can avoid the
 696    construction of a list of hash keys and simply use the key
 697    from the single active generalizer directly.
 698
 699    | implementation        | time (µs/call) | overhead |
 700    |-----------------------+----------------+----------|
 701    | function              |           3.17 |          |
 702    | standard-gf/methods   |            3.6 |     +14% |
 703    | cons-gf/one-arg-cache |            7.4 |    +130% |
 704    | cons-gf               |             15 |    +370% |
 705    | cons-gf/no-cache      |             90 |   +2700% |
 706
 707    The benchmarking results from this exercise are promising: in
 708    particular, the introduction of the effective method cache speeds
 709    up the use of generic specializers in this case by a factor of 6,
 710    and the one-argument special case by another factor of 2.  For this
 711    workload, even the one-argument special case only gets to within a
 712    factor of 2-3 of the function and standard generic function
 713    implementations, but the overall picture is that the memoizability
 714    in the protocol does indeed drastically reduce the overhead
 715    compared with the full invocation.
 716
 717    For the =signum-specializer= case, we choose to benchmark the
 718    computation of 20!, because that is the largest factorial whose
 719    answer fits in SBCL's 63-bit fixnums – in an attempt to measure the
 720    worst case for generic dispatch, where the work done within the
 721    methods is as small as possible without being meaningless, and in
 722    particular does not cause heap allocation or garbage collection to
 723    obscure the picture.
 724
 725 #+begin_src lisp :exports none
 726 (progn (gc :full t) (time (dotimes (i 10000) (%fact 20))))
 727 #+end_src
 728
 729    | implementation          | time (µs/call) | overhead |
 730    |-------------------------+----------------+----------|
 731    | function                |            0.6 |          |
 732    | standard-gf/fixnum      |            1.2 |    +100% |
 733    | signum-gf/one-arg-cache |            7.5 |   +1100% |
 734    | signum-gf               |             23 |   +3800% |
 735    | signum-gf/no-cache      |            240 |  +41000% |
 736
 737    The relative picture is similar to the =cons-specializer= case;
 738    including a cache saves a factor of 10 in this case, and another
 739    factor of 3 for the one-argument cache special case.  The cost of
 740    the genericity of the protocol here is starker; even the
 741    one-argument cache is a factor of 6 slower than the standard
 742    generic-function implementation, and a further factor of 2 away
 743    from the implementation of factorial as a function.  We discuss
 744    ways in which we expect to be able to improve performance in
 745    section [[#Future Work]].
 746
 747    We could allow the metaprogrammer to improve on the one-argument
 748    performance by constructing a specialized cache: for =signum=
 749    arguments of =rational= arguments, the logical cache structure is
 750    to index a three-element vector with =(1+ signum)=.  The current
 751    protocol does not provide a way of eliding the two generic function
 752    calls for the generic cache; we discuss possible approaches in
 753    section [[#Conclusions]].
 754 ** Full protocol
 755    The protocol described in this paper is only part of a complete
 756    protocol for =specializer= and =generalizer= metaobjects.  Our
 757    development of this protocol is as yet incomplete; the work
 758    described here augments that in \cite{Newton.Rhodes:2008}, but is
 759    yet relatively untested – and additionally our recent experience of
 760    working with that earlier protocol suggests that there might be
 761    useful additions to the handling of =specializer= metaobjects,
 762    independent of the =generalizer= idea presented here.
 763
 764 *** COMMENT ideas
 765     Description and specification left for reasons of space (we'll see?)
 766     - [ ] =same-specializer-p=
 767     - [ ] =parse/unparse-specializer-using-class=
 768     - [ ] =make-method-specializers-form=
 769     - [ ] jmoringe: In an email, I suggested
 770       =make-specializer-form-using-class=:
 771
 772       #+begin_quote
 773       Could we change =make-method-specializers-form='s default
 774       behaviour to call a new generic function
 775       #+begin_src
 776         make-specializer-form-using-class gf method name env
 777       #+end_src
 778       with builtin methods on =sb-mop:specializer=, =symbol=, =cons= (for
 779       eql-specializers)? This would make it unnecessary to repeat
 780       boilerplate along the lines of
 781       #+begin_src lisp
 782       (flet ((make-parse-form (name)
 783                (if <name-is-interesting>
 784                  <handle-interesting-specializer>
 785                  <repeat-handling-of-standard-specializers>)))
 786         `(list ,@(mapcar #'make-parse-form specializer-names)))
 787       #+end_src
 788       for each generic function class.
 789       #+end_quote
 790     - [ ] =make-method-lambda= revision (use environment arg?)
 791
 792       jmoringe: would only be relevant for pattern dispatch, right? I
 793       think, we didn't finish the discussion regarding special
 794       variables vs. environment vs. new protocol function
 795
 796 * Related Work
 797   :PROPERTIES:
 798   :CUSTOM_ID: Related Work
 799   :END:
 800
 801   The work presented here builds on specializer-oriented programming
 802   described in \cite{Newton.Rhodes:2008}.  Approximately
 803   contemporaneously, filtered dispatch \cite{Costanza.etal:2008} was
 804   introduced to address some of the same use cases: filtered dispatch
 805   works by having a custom discriminating function which wraps the
 806   usual one, where the wrapping function augments the set of
 807   applicable methods with applicable methods from other (hidden)
 808   generic functions, one per filter group; this step is not memoized,
 809   and using =eql= methods to capture behaviours of equivalence classes
 810   means that it is hard to see how it could be.  The methods are then
 811   combined using a custom method combination to mimic the standard
 812   one; in principle implementors of other method combinations could
 813   cater for filtered dispatch, but they would have to explicitly
 814   modify their method combinations.  The Clojure programming language
 815   supports multimethods[fn:5] with a variant of filtered dispatch as
 816   well as hierarchical and identity-based method selectors.
 817
 818   In context-oriented programming
 819   \cite{Hirschfeld.etal:2008,Vallejos.etal:2010}, context dispatch
 820   occurs by maintaining the context state as an anonymous class with
 821   the superclasses representing all the currently active layers; this
 822   is then passed as a hidden argument to context-aware functions.  The
 823   set of layers is known and under programmer control, as layers must
 824   be defined beforehand.
 825
 826   In some sense, all dispatch schemes are specializations of predicate
 827   dispatch \cite{Ernst.etal:1998}.  The main problem with predicate
 828   dispatch is its expressiveness: with arbitrary predicates able to
 829   control dispatch, it is essentially impossible to perform any
 830   substantial precomputation, or even to automatically determine an
 831   ordering of methods given a set of arguments.  Even Clojure's
 832   restricted dispatch scheme provides an explicit operator for stating
 833   a preference order among methods, where here we provide an operator
 834   to order specializers; in filtered dispatch the programmer
 835   implicitly gives the system an order of precedence, through the
 836   lexical ordering of filter specification in a filtered function
 837   definition.
 838
 839   The Slate programming environment combines prototype-oriented
 840   programming with multiple dispatch \cite{Salzman.Aldrich:2005}; in
 841   that context, the analogue of an argument's class (in Common Lisp)
 842   as a representation of the equivalence class of objects with the
 843   same behaviour is the tuple of roles and delegations: objects with
 844   the same roles and delegations tuple behave the same, much as
 845   objects with the same generalizer have the same behaviour in the
 846   protocol described in this paper.
 847
 848   The idea of generalization is of course not new, and arises in other
 849   contexts.  Perhaps of particular interest is generalization in the
 850   context of partial evaluation; for example, \cite{Ruf:1993}
 851   considers generalization in online partial evaluation, where sets of
 852   possible values are represented by a type system construct
 853   representing an upper bound.  Exploring the relationship between
 854   generalizer metaobjects and approximation in type systems might
 855   yield strategies for automatically computing suitable generalizers
 856   and cache functions for a variety of forms of generalized dispatch.
 857 * Conclusions
 858   :PROPERTIES:
 859   :CUSTOM_ID: Conclusions
 860   :END:
 861   In this paper, we have presented a new generalizer metaobject
 862   protocol allowing the metaprogrammer to implement in a
 863   straightforward manner metaobjects to implement custom method
 864   selection, rather than the standard method selection as standardized
 865   in Common Lisp.  This protocol seamlessly interoperates with the
 866   rest of CLOS and Common Lisp in general; the programmer (the user of
 867   the custom specializer metaobjects) may without constraints use
 868   arbitrary method combination, intercede in effective method
 869   combination, or write custom method function implementations.  The
 870   protocol is expressive, in that it handles forms of dispatch not
 871   possible in more restricted dispatch systems, while not suffering
 872   from the indeterminism present in predicate dispatch through the use
 873   of explicit ordering predicates.
 874
 875   The protocol is also reasonably efficient; the metaprogrammer can
 876   indicate that a particular effective method computation can be
 877   memoized, and under those circumstances much of the overhead is
 878   amortized (though there remains a substantial overhead compared with
 879   standard generic-function or regular function calls).  We discuss
 880   how the efficiency could be improved below.
 881 ** Future work
 882    :PROPERTIES:
 883    :CUSTOM_ID: Future Work
 884    :END:
 885    Although the protocol described in this paper allows for a more
 886    efficient implementation, as described in section [[#Memoization]],
 887    than computing the applicable and effective methods at each generic
 888    function call, the efficiency is still some way away from a
 889    baseline of the standard generic-function, let alone a standard
 890    function.  Most of the invocation protocol is memoized, but there
 891    are still two full standard generic-function calls –
 892    =generalizer-of-using-class= and =generalizer-equal-hash-key= – per
 893    argument per call to a generic function with extended specializers,
 894    not to mention a hash table lookup.
 895
 896    For many applications, the additional flexibility afforded by
 897    generalized specializers might be worth the cost in efficiency, but
 898    it would still be worth investigating how much the overhead from
 899    generalized specializers can be reduced; one possible avenue for
 900    investigation is giving greater control over the cacheing strategy
 901    to the metaprogrammer.
 902
 903    As an example, consider the =signum-specializer=.  The natural
 904    cache structure for a single argument generic function specializing
 905    on =signum= is probably a four-element vector, where the first
 906    three elements hold the effective methods for =signum= values of
 907    -1, 0, and 1, and the fourth holds the cached effective methods for
 908    everything else.  This would make the invocation of such functions
 909    very fast for the (presumed) common case where the argument is in
 910    fact a real number.  We hope to develop and show the effectiveness
 911    of an appropriate protocol to allow the metaprogrammer to construct
 912    and exploit such cacheing strategies, and (more speculatively) to
 913    implement the lookup of an effective method function in other ways.
 914
 915    We also aim to demonstrate support within this protocol for some
 916    particular cases of generalized specializers which seem to have
 917    widespread demand (in as much as any language extension can be said
 918    to be in “demand”).  In particular, we have preliminary work
 919    towards supporting efficient dispatch over pattern specializers
 920    such as implemented in the \textsf{Optima} library[fn:4], and over
 921    a prototype object system similar to that in Slate
 922    \cite{Salzman.Aldrich:2005}.  Our current source code for the work
 923    described in this paper can be seen in the git source code
 924    repository at [[http://christophe.rhodes.io/git/specializable.git]],
 925    which will be updated with future developments.
 926
 927    Finally, after further experimentation (and, ideally, non-trivial
 928    use in production) if this protocol stands up to use as we hope, we
 929    aim to produce a standards-quality document so that other
 930    implementors of Common Lisp can, if they choose, independently
 931    reimplement the protocol, and so that users can use the protocol
 932    with confidence that the semantics will not change in a
 933    backwards-incompatible fashion.
 934 ** Acknowledgments
 935    We thank Lee Salzman, Pascal Costanza and Mikel Evins for helpful
 936    and informative discussions, and all the respondents to one
 937    author's request for imaginative uses for generalized specializers.
 938
 939 \bibliographystyle{plain}
 940 \bibliography{crhodes,specializers}
 941
 942 * Footnotes
 943
 944 [fn:1] GNU CLISP, at http://www.clisp.org/
 945
 946 [fn:2] Clozure Common Lisp, at http://ccl.clozure.com/
 947
 948 [fn:3] the \textsf{Closer to MOP} project, at
 949    http://common-lisp.net/project/closer/, attempts to harmonize the
 950    different implementations of the metaobject protocol in Common
 951    Lisp.
 952
 953 [fn:4] https://github.com/m2ym/optima
 954
 955 [fn:5] http://clojure.org/multimethods