Gah, must repeat K2e5_179_part_b.png as file1 thanks to a sqlite3 error.

(Can we make sqlite3 actually behave nicely for concurrent access? Might be as simple as installing a busy handler; it would be nice if the command-line sqlite3 client didn't block)
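
A minimal sketch of the busy-handler idea, assuming the results are written through Python's standard sqlite3 module (the notes do not say which binding is used). The helper name open_db, the 30-second default and the results table in the usage note are hypothetical. The timeout argument makes a connection wait on a locked database instead of raising "database is locked" straight away, and WAL journaling lets readers carry on while one writer is active. Whether this is enough to avoid the error above is untested.

```python
import os
import sqlite3
import tempfile


def open_db(path, busy_ms=30000):
    """Open a connection that waits on a locked database instead of failing.

    open_db and busy_ms are illustrative names, not part of the existing code.
    """
    # sqlite3.connect takes its timeout in seconds; it is used as the
    # busy timeout for this connection.
    conn = sqlite3.connect(path, timeout=busy_ms / 1000.0)
    # Write-ahead logging: readers do not block the writer and vice versa.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "matches.db")
    writer = open_db(path)
    writer.execute("CREATE TABLE results (file1 TEXT, file2 TEXT, score REAL)")
    writer.execute("INSERT INTO results VALUES ('a.png', 'b.png', 0.5)")
    writer.commit()
    reader = open_db(path)
    print(reader.execute("SELECT COUNT(*) FROM results").fetchone()[0])
```

On the command-line side, the sqlite3 shell has a `.timeout MS` dot-command that sets the same kind of wait for an interactive session.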

OK, first run is done! It only took a week for 475 files. Options for going faster:

  • serialize and restore k-d trees (maybe 10-30% speedup);
  • do more batch work, e.g. call match A B C and have it do all the pairwise stuff itself (maybe 10-30% speedup);
  • use LSH (a speedup in complexity order, not just a constant factor).
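
To make the LSH option concrete, here is a rough sketch of one common variant, random-hyperplane hashing, assuming the things being matched are fixed-length numeric feature vectors (the notes do not say what the k-d trees hold). The class name HyperplaneLSH, the bit count and the labels are all hypothetical. Each vector is reduced to a short bit string, one bit per random hyperplane, and only vectors landing in the same bucket are compared, so a lookup no longer has to touch every stored vector.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Random-hyperplane LSH index (illustrative, not the existing matcher)."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = random.Random(seed)
        # One random Gaussian normal vector per hash bit.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)
        ]
        self.buckets = defaultdict(list)

    def key(self, vec):
        """Pack the sign of each hyperplane projection into an integer."""
        bits = 0
        for plane in self.planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            bits = (bits << 1) | int(dot >= 0.0)
        return bits

    def add(self, label, vec):
        self.buckets[self.key(vec)].append(label)

    def candidates(self, vec):
        """Return labels sharing this vector's bucket; verify them exactly afterwards."""
        return list(self.buckets.get(self.key(vec), []))


if __name__ == "__main__":
    index = HyperplaneLSH(dim=4, n_bits=8, seed=1)
    index.add("file_a", [1.0, 0.0, 0.0, 0.0])
    print(index.candidates([1.0, 0.0, 0.0, 0.0]))
```

A single table like this misses near neighbours that fall just across a hyperplane, so in practice several tables with different seeds are usually queried and their candidates merged before the exact comparison.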