a case study in contrafactum and parody

It would be nice to attempt to replicate the results of Picker (2001) using search over audio recordings. The paper's main points of discussion are the attributions of various pieces: Tulerunt Dominum, Tu sola es, Lugebat David Absalon/Porro rex operuit, an 8-part Credo, and two chansons, J’ay mis mon cueur and Je prens congie. The modern view, endorsed by Picker (and basically everyone, as far as I can tell), has all of the sacred works deriving from the two chansons, with a tentative attribution of J’ay mis mon cueur to Gombert, along with secure attributions of the music of everything else (there is doubt over whether the text of Tulerunt was set by Gombert). Previous attributions of the set to Josquin, despite motivic similarity between Je prens congie and Mille regretz, are highly doubtful.

So, how can we go about looking at this? In some sense this is all cheating, because we know the answer, but there's probably still an experiment to be done. The first thing to do is to identify a suitable set of recordings: maybe all the CDs I own with Gombert or Josquin motets on them? That list we can keep in a MusicBrainz collection for posterity (though it ought to be computable from the master collection and a sufficiently advanced query), but for the purposes of an interesting experiment we should also augment it with:

De profundis performance and YouTube trail of Tulerunt Dominum
De profundis performance of Mille Regretz

Then, given a sufficiently good chroma feature with an appropriate distance measure, what we would hope to find (the “known” “ground truth”) is that Tulerunt, Lugebat (prima pars) and Je prens congie are more similar to each other than background; that J’ay mis mon cueur and Porro rex (Lugebat seconda pars) are more similar to each other than background; and that the specific relationships Picker (2001) identifies between the 8-part Credo and Je prens congie are picked out:

mm. 1–19 in both Credo and Je prens congie;
mm. 55–61 in Credo to mm. 45–51 in Je prens congie;
mm. 242–248 in Credo to mm. 60–66 in Je prens congie;
(mm. 77–103 in Je prens congie recapitulate mm. 1–27).
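
One way to make the “more similar than background” test concrete, as a minimal sketch only: assume we already have pairwise track distances from whichever chroma/distance combination is under test, and ask whether each expected pair falls below the median distance of the unrelated pairs. The work labels, the `RELATED` grouping and the median threshold are all illustrative assumptions of mine, not anything taken from Picker (2001) or from audioDB.

```python
from statistics import median

# Groups of works expected to be mutually similar (labels are hypothetical keys).
RELATED = [
    {"tulerunt", "lugebat_1", "je_prens_congie"},
    {"jay_mis_mon_cueur", "porro_rex"},
]


def related_pairs(groups):
    """Expand each group into its unordered pairs, stored as sorted tuples."""
    pairs = set()
    for group in groups:
        members = sorted(group)
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                pairs.add((a, b))
    return pairs


def closer_than_background(distances, groups):
    """Check each expected pair against the median background distance.

    `distances` maps sorted (a, b) tuples to a distance value. Every pair
    that is not in an expected group counts as background. Raises
    statistics.StatisticsError if there are no background pairs at all.
    """
    expected = related_pairs(groups)
    background = [d for pair, d in distances.items() if pair not in expected]
    threshold = median(background)
    return {
        pair: distances[pair] < threshold
        for pair in expected
        if pair in distances
    }
```

A median threshold is only one possible reading of “background”; a rank-based measure over the whole collection might turn out to be more robust once real distances are available.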

The devil will be in the detail of chroma and distance calculation: capturing the right information and embedding it in the right space. Multiple candidates for chroma exist:

fftExtract chroma12 / chroma36 (with / without rotation)
NNLS chroma (Mauch and Dixon, 2010) (some preliminary work done: pitch shifting, parameters)
Echo Nest chroma (see the full documentation, and note that they provide confidence values)
AcoustID fingerprinting (NB: must be fixed to compute chroma over the entire track)

Keeping the distance metric as squared Euclidean makes a certain amount of sense from the point of view of minimal code editing (though it might mean transforming the features, if there's a principled reason to do so). Since the recordings are at substantially different tempi, this also makes for a nice example case for investigating the need for time warping in the distance calculation, and indeed whether the Sakoe-Chiba band in the nicely optimized DTW from Rakthanmanon et al. (2012) is good enough, or whether we need to allow for more general subsequence matching as in Smith-Waterman. audioDB as it stands can handle the non-DTW case; for a collection of this size we probably don't need all the DTW optimizations in the UCR Suite, but we do need to modify the code because we have vector-valued features.
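
To get a feel for the banded-DTW question, here is a minimal pure-Python sketch of DTW over vector-valued frames, with squared Euclidean frame cost and a Sakoe-Chiba band. It is not the UCR Suite code (none of its lower-bounding or early-abandoning optimizations are here) and it is whole-sequence rather than subsequence matching. The function names, and the choice to widen the band to at least the length difference so that the end point stays reachable, are my assumptions.

```python
import math


def sq_euclidean(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def dtw_banded(x, y, band):
    """DTW cost between two sequences of feature vectors.

    `band` is the Sakoe-Chiba half-width in frames: cell (i, j) is only
    visited when |i - j| <= band. The band is widened to at least
    |len(x) - len(y)| so that the final cell can always be reached.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    w = max(band, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = sq_euclidean(x[i - 1], y[j - 1])
            # Extend the cheapest of: step in x only, step in y only, diagonal.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

For example, a three-frame sequence and the same sequence at half tempo (each frame repeated) align at zero cost, which plain frame-by-frame squared Euclidean cannot express at all since the lengths differ. A band expressed in absolute frames may well be too tight for performances whose tempi differ by a large factor, which is exactly the thing to test.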

Extra audioDB features that might be necessary:

automatic (12-fold) rotation of query feature;
ignoring or forcing particular regions of a query / target (beyond the “power” measurement – e.g. final chords have high power but are “boring” for motif detection tasks).
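
Both features can be prototyped outside audioDB before touching its code. The sketch below is an illustration under my own assumptions (12-bin chroma frames as Python lists, equal-length query and target, a boolean per-frame mask); none of these names correspond to existing audioDB options. It tries all 12 rotations of the query and keeps the best, skipping any frames the mask excludes.

```python
def rotate_chroma(frames, shift):
    """Rotate each 12-bin frame up by `shift` semitones.

    The content of bin k moves to bin (k + shift) % 12.
    """
    shift %= 12
    if shift == 0:
        return [list(frame) for frame in frames]
    return [list(frame[-shift:]) + list(frame[:-shift]) for frame in frames]


def sq_distance(query, target, mask=None):
    """Sum of squared Euclidean frame distances over equal-length sequences.

    Frames whose `mask` entry is False are ignored (e.g. final chords).
    """
    if len(query) != len(target):
        raise ValueError("query and target must have the same number of frames")
    total = 0.0
    for idx, (q, t) in enumerate(zip(query, target)):
        if mask is not None and not mask[idx]:
            continue
        total += sum((a - b) ** 2 for a, b in zip(q, t))
    return total


def best_rotation(query, target, mask=None):
    """Return (distance, shift) minimising the distance over all 12 rotations."""
    return min(
        (sq_distance(rotate_chroma(query, s), target, mask), s)
        for s in range(12)
    )
```

Rotating the query rather than the database keeps the stored features untouched, at the cost of 12 distance computations per query; with chroma36 features the same idea would need shifts in steps of three bins.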

Bibliography

@InCollection{Picker:2001,
  author =       {Martin Picker},
  title =        "{A spurious motet of Josquin, a chanson by Gombert, and some related works: A case study in contrafactum and parody}",
  booktitle =    {Quellenstudium und musikalische Analyse: Festschrift Martin Just zum 70. Geburtstag},
  publisher =    {Ergon-Verlag},
  year =         {2001},
  editor =       {Peter Niedermüller and Cristina Urchueguía and Oliver Wiener},
  pages =        {33–45},
  address =      {Würzburg},
  OPTannote =    {}
}

@InProceedings{Rakthanmanon.etal:2012,
  author =       {Thanawin Rakthanmanon and Bilson Campana and Abdullah Mueen and Gustavo Batista and Brandon Westover and Qiang Zhu and Jesin Zakaria and Eamonn Keogh},
  title =        "{Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping}",
  booktitle =    {Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year =         {2012},
  editor =       {Qiang Yang and Deepak Agarwal and Jian Pei},
  pages =        {262–270},
  address =      {New York},
  publisher =    {ACM},
  OPTannote =    {}
}

@InProceedings{Mauch.Dixon:2010,
  author =       {Matthias Mauch and Simon Dixon},
  title =        "{Approximate Note Transcription for the Improved Identification of Difficult Chords}",
  booktitle =    {Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010)},
  year =         {2010},
  OPTannote =    {}
}