There's scope for plenty of improvements to SBCL's support for Unicode:
- make it easy to update the Unicode data to a new version. Currently it involves thought, sometimes non-trivial amounts of it, in order to get the character database and magic constants related to it right.
- consider normalizing the names of symbols on creation. (It's perhaps not conformant to do so given non-normalized strings; maybe we should normalize strings too?)
- speaking of normalization, normalizations could be optimized, using the QuickCheck property (`NFC_QC` and similar). That would first need to be included in the character database, and then used.
- we should also try to provide information in the UCD to users, so that they can do their own Unicode-aware processing. (There's some of that in cl-unicode but last I looked there were things substantially missing).
- we don't currently support any kind of language-aware case-insensitive comparison, nor any collation. That's a bit of a shame. (Does it make sense to think about supporting bidi in `format` and/or `pprint-logical-block`?)
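
To make the quick-check point concrete, here is a minimal sketch of the fast path. It is written in Python rather than Lisp purely because its standard library already exposes the pieces, and it illustrates the idea only, not how SBCL does or should implement it. The function name `nfc_fast` is made up for the example. `unicodedata.is_normalized` (Python 3.8 and later) answers whether a string is already in a given form, so the full normalization pass only runs on strings that actually need it.

```python
import unicodedata


def nfc_fast(s: str) -> str:
    """Return s in NFC, skipping the full pass when s is already normalized.

    nfc_fast is a hypothetical helper name used only for this illustration.
    """
    # Cheap question first: is the string already in NFC?
    if unicodedata.is_normalized("NFC", s):
        return s  # common case: hand back the very same string object
    # Otherwise fall back to the full normalization algorithm.
    return unicodedata.normalize("NFC", s)


if __name__ == "__main__":
    decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
    print(nfc_fast(decomposed) == "\u00e9")
    print(nfc_fast("plain ascii"))
```

The win, if there is one, comes from most real-world text already being in NFC, so the common case returns its input untouched without building a new string.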