<p>digital-music-metadata-and-archiving: http://christophe.rhodes.io/notes/blog/posts/2014/digital-music-metadata-and-archiving/ (published 2014-03-21T22:11:12Z; updated 2014-04-04T13:39:13Z)</p>
<p>Many people have their own ways of managing their digital music
collection. This is my way.</p>
<p>I am a relatively late convert to medium-less music. I <em>think</em> I
might have been a relatively early adopter of digital music – or at
least I was actively making purchasing decisions at a time when CDs
were generally available and not too much more expensive than cassette
tapes. But I've <a href="http://christophe.rhodes.io/notes/blog/posts/2014/mobile_world_congress/">written before</a> about my
reluctance to embrace the shiny; I've been resisting digital audio
files for longer than mobile phones have been able to play music (not
counting <a href="http://en.wikipedia.org/wiki/Crazy_Frog">Annoying Things</a>).</p>
<p>And yet, there are obvious and clear advantages to having a digital
and easily portable collection of music – brought home to me when I
was regularly commuting 90 minutes on trains in each direction for
<a href="http://www.teclo.net/">work</a>, and my “mp3 player” was a 1TB external
hard drive weighing upwards of 1kg. Over a few years I used
<a href="http://web.archive.org/web/20111010184247/http://nostatic.org/grip/">grip</a>,
and then
<a href="http://www.burtonini.com/blog/computers/sound-juicer">sound-juicer</a>,
to convert my physical music collection of CDs to audio files; at the
moment, because something (I don't know what) in the sound-juicer
chain generates broken FLACs, I use gstreamer directly:</p>
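<pre><code># cd-discid prints the disc ID followed by the number of tracks,
# so the second space-separated field is the track count.
for track in $(seq 1 "$(cd-discid | cut -d' ' -f2)")
do
    # Rip one track with full paranoia checking and encode it as FLAC.
    gst-launch-0.10 cdparanoiasrc track="$track" paranoia-mode=255 ! \
        flacenc ! filesink location="track$track.flac"
done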
</code></pre>
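<p>The loop depends on the shape of <code>cd-discid</code>'s output: a
disc ID, then the number of tracks, then one frame offset per track and
the disc length in seconds. Here is a minimal sketch of that parsing
step on its own. The function names and the sample line are made up for
illustration, and the zero-padded file names are a variation of mine,
not what the loop above produces.</p>

```shell
#!/bin/sh
# Sketch only: helper names and the sample line below are made up for illustration.

# Print the track count from one line of cd-discid output read on stdin.
# cd-discid prints: disc ID, number of tracks, one frame offset per track,
# and finally the disc length in seconds.
track_count() {
    cut -d' ' -f2
}

# Build a zero-padded output name so that files sort in playing order.
# (The original loop uses unpadded names such as track1.flac.)
track_filename() {
    printf 'track%02d.flac\n' "$1"
}

# Hypothetical cd-discid line for a three-track disc.
sample='1a02e503 3 150 17477 32100 741'

for track in $(seq 1 "$(printf '%s\n' "$sample" | track_count)")
do
    track_filename "$track"
done
```

<p>With a real disc in the drive, <code>cd-discid | track_count</code>
would take the place of the sample line.</p>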
<p>One of the nice things that sound-juicer did automatically was to
import curated metadata from <a href="http://musicbrainz.org/">MusicBrainz</a>;
gstreamer doesn't do that itself, so I have had to learn how to use
the MusicBrainz tagger
<a href="http://musicbrainz.org/doc/MusicBrainz_Picard">picard</a> to add
decent-quality metadata to the audio files automatically – at least
when it's available at MusicBrainz; perhaps more commonly, given my
long-tail predilections, I use picard to use the CD Table of Contents
as an initial seed for providing that metadata for the community.
(There's a reason why most of my CD recordings of opera have not been
format-shifted yet: the
<a href="http://wiki.musicbrainz.org/Style/Specific_types_of_releases/Opera">metadata standards for opera on MusicBrainz</a>
are fiddly, tedious and error-prone).</p>
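<p>After tagging, it can be reassuring to check that nothing essential
was left out. A minimal sketch, assuming the tags have been dumped as
<code>NAME=value</code> lines (the form in which FLAC stores its Vorbis
comments). The function name and the choice of required tags are my
assumptions, and the tag dump in the example is invented.</p>

```shell
#!/bin/sh
# Sketch only: the function name and the list of required tags are assumptions.

# Read Vorbis comments (NAME=value, one per line) on stdin and print the
# name of each required tag that is absent. Tag names are case-insensitive,
# so they are upper-cased before comparison.
missing_tags() {
    present=$(cut -d= -f1 | tr '[:lower:]' '[:upper:]')
    for tag in ARTIST ALBUM TITLE TRACKNUMBER
    do
        if ! printf '%s\n' "$present" | grep -qx "$tag"
        then
            printf '%s\n' "$tag"
        fi
    done
}

# Made-up tag dump standing in for real metaflac output;
# this prints ALBUM and TRACKNUMBER, one per line.
printf 'ARTIST=Example Artist\ntitle=Example Title\n' | missing_tags
```

<p>On a real file the input could come from
<code>metaflac --export-tags-to=- track1.flac | missing_tags</code>, and
<code>flac -t track1.flac</code> checks that the file decodes cleanly,
which is one way to spot a broken FLAC.</p>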
<p>Now I have losslessly-compressed CD-quality audio files with
high-quality metadata, and all is well. But if there's one thing that
working with computers teaches us, it's that if the data is not backed
up, it's already lost: you just don't know it yet. Not only that, but
it would be good to have access to the music collection wherever I am,
ideally <em>without</em> having to carry a “portable” hard drive along with
everything else. The good news is that there's a tool for this:
<a href="https://git-annex.branchable.com/">git-annex</a>. (I suspect it is not
coincidental that I use multiple tools by the same author: git-annex
is written by <a href="https://joeyh.name/">Joey Hess</a>, who is also
responsible for <a href="http://ikiwiki.info/">ikiwiki</a>.) I have git-annex
repositories on computers and external hard drives both at work and at
home, and any additions – or modifications, for example from retagging
– can be synchronised across the checkouts. As long as I remember to
get the content as well as git-annex's symbolic links, distribution
and offsite backup requirements are automatically satisfied, and
git-annex even stores old copies in case of human error, which is
definitely liberating: I have a safety net, so I'm free to try to fly.
(Maybe that's over the top for a digital audio tagging workflow...).</p>
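<p>The synchronisation described above might look roughly like the
following. This is a sketch under assumptions rather than a transcript
of my actual commands: the function names, the <code>DRY_RUN</code>
switch and the commit message are all invented, and as written the
script only prints the commands it would run.</p>

```shell
#!/bin/sh
# Sketch only: function names, the DRY_RUN switch and the commit message are
# illustrative assumptions, not the author's actual commands.

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]
    then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# On the machine where files were added: record and share the changes.
annex_publish() {
    run git annex add . || return
    run git commit -m "$1" || return
    run git annex sync
}

# On every other checkout: fetch the metadata, then the file contents.
# Without the final step only dangling symbolic links arrive.
annex_fetch() {
    run git annex sync || return
    run git annex get .
}

# Print what would be run, without touching any repository.
DRY_RUN=1
annex_publish 'Add newly ripped album'
annex_fetch
```

<p>Retagging files that are already in the annex needs one more step
first, since (as far as I know) an indirect-mode repository keeps
content read-only until <code>git annex unlock</code> is run on it.</p>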
<p>Once the audio files are updated everywhere and checked out, the only
thing remaining for digital audio utopia is to ask the various media
servers to reindex the content. On my GNOME-based desktop, I'm a
little bit out of luck at the moment; the bright new hope for music
playing is <a href="https://wiki.gnome.org/Apps/Music"><code>gnome-music</code></a>, which
uses <a href="https://wiki.gnome.org/Projects/Tracker"><code>tracker</code></a> for its
indexing, and <code>tracker</code> doesn't currently follow symlinks – and the
workaround of checking out an annex in direct mode is like cutting
holes in my safety net. For my home music system, I use Logitech
Media Server, and there it's as simple as
<a href="https://github.com/csrhodes/squeeze-el/"><code>M-x squeeze RET resc TAB RET</code></a>
(and I'd like it to be simpler! Hacking welcome).</p>
<p>To summarize:</p>
<ul>
<li>Workflow:
<ol>
<li>format-shift from CD audio to FLAC using gstreamer</li>
<li>retag using picard</li>
<li>import into git-annex</li>
<li>sync git-annex repositories</li>
<li>perform git-annex get on
<ul>
<li>the external hard drive (the same one!) which acts as the music
source for my <s>Slim</s> <s>Squeezebox</s> Logitech Media Server;</li>
<li>the checkout in <code>~/Music/</code> on my workstation in the office.</li>
</ul>
</li>
<li>cause various servers to rescan or reindex their Music databases.</li>
</ol>
</li>
<li>Current bugs:
<ul>
<li><a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=716780">sound-juicer produces broken FLACs</a></li>
<li><a href="https://bugzilla.gnome.org/show_bug.cgi?id=726264">GNOME tracker does not support indirect git-annex repositories</a></li>
</ul>
</li>
</ul>
<p>All this was brought to mind because I recently made my first purchase
of recordings in digital audio file form: I was listening idly to
Radio 3, and heard <a href="https://www.youtube.com/watch?v=75QBS6V5WO4">this</a>
after the end of the evening concert. My ears tuned in gradually, and
then my brain started sending contradictory messages: “it's by
<a href="http://en.wikipedia.org/wiki/Giovanni_Battista_Pergolesi">Pergolesi</a>”
“it's
<a href="http://en.wikipedia.org/wiki/Alessandro_Parisotti">not by Pergolesi</a>”
“it's by
<a href="http://en.wikipedia.org/wiki/Pulcinella_%28ballet%29">Stravinsky</a>”
“wait what?”. I missed the announcer's statement of what the piece
was, and spent a frustrating evening trying to google things like
“piano arrangement "se tu m'ami"” and totally failing to find it.</p>
<p>The next day, I belatedly realised why they had played what they did:
the following evening's concert was
<a href="http://en.wikipedia.org/wiki/Marc-Andr%C3%A9_Hamelin">Marc-André Hamelin</a>
<a href="http://www.bbc.co.uk/programmes/b03vd8vk">playing Khachaturian</a>, and
again I was listening idly and not really paying attention. But I
paid attention to the encore, announced as: “the Minute
Waltz... played in seconds!” and again, I thought “wait, what?”.
<a href="https://www.youtube.com/watch?v=9XLbOw6WLS0">Have a listen...</a>.</p>
<p>And, convinced that I wanted to have easy access to more of Hamelin's
recorded material, I was pleased to discover that he has recorded for
<a href="http://www.hyperion-records.co.uk/">Hyperion</a> – which company offers
digital downloads in losslessly-encoded DRM-free CD-quality FLAC
format. Hooray! I feel like I have joined the 21st century.</p>