A journey through audio on Linux, systemd, Rust and the "how hard can it be" culture

I took a Raspberry Pi and turned it into a Spotify client with librespot. Sounds easy, eh? Add an S/PDIF output to your Raspberry Pi, add librespot, set up some cron job to start it on system startup and we’re done.

Oh boy, was I wrong. The whole project just wasn’t meant to succeed, I think. Please come. Follow me through some rough terrain to a place where we can just listen to music.

Sellers selling stuff

I got myself a HiFiBerry Digi+ for my Raspberry Pi 3. Unfortunately, the seller didn’t really look up what he was selling and actually sent me a “HiFiBerry Digi”, which only works on the Raspberry Pi 1. The two boards differ in the pinout of the extension connector.

Luckily, it’s just a question of rerouting the signals, see Old DAC<->Raspberry 3 connection. The mapping of the pins on the additional connector P5 of the HiFiBerry DAC to P1 on the Raspberry Pi is:

Signal    HiFiBerry DAC pin# (P5)    Raspberry Pi 3 pin# (P1)
5V        1                          2, 4
BCK       3                          12 (GPIO 18)
LRCK      4                          35 (GPIO 19)
(DIN)     (5)                        (37 (GPIO 20))
DOUT      6                          40 (GPIO 21)
GND       7, 8                       6, 9, 14, 20

Watch out for the copper planes on the HiFiBerry: The top plane is the internal 3.3V supply, not ground! Also, contrary to the pinout printed on the PCB, the odd-numbered pins (1, 3, 5, 7) of P5 face towards P1, while the even-numbered pins (4, 6, 8) of P5 face towards the WM8804 chip.

No sooner said than done, I now had a Raspberry Pi that could send audio via S/PDIF to my speakers.

Let’s move forward to librespot v0.1.1.

Not Implemented Here, part 1: Keeping the cache under control

From using librespot on some other machines, I noticed that there was no way to control the cache size. My Raspberry Pi will use some old junk USB thumb drive as cache, so storage isn’t plentiful, but regardless of that: I need to keep the cache at a certain size. Otherwise, I will run out of space at some point on a device that should “just work”.

Well, not controlling the cache size isn’t really best practice, I’d say. Anyway, so I had to do this myself. Enter prune-cache.d and more boilerplate systemd timer files to have this run every hour. Assuming that I constantly stream at 320 kbps to disk and that I won’t skip songs like crazy, it’s easy to estimate how much data is being written to disk within an hour. A small multiple of that should be kept available every hour.
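To put a number on that estimate, here is a minimal Rust sketch. It is not the actual prune-cache script: the function names, the safety factor of 3 and the "delete oldest files first" policy are all assumptions made for illustration. The file list is passed in as plain tuples so that the logic can be read without any filesystem code.

```rust
/// Bytes written per hour when streaming continuously at `kbps` kilobits per second.
fn bytes_per_hour(kbps: u64) -> u64 {
    kbps * 1000 / 8 * 3600
}

/// Space to keep available: a small multiple of the hourly write volume.
/// The factor is an assumption; the post only says "a small multiple".
fn headroom_bytes(kbps: u64, safety_factor: u64) -> u64 {
    bytes_per_hour(kbps) * safety_factor
}

/// Given cache files as (name, size in bytes, modification time), return the
/// names to delete, oldest first, until the total size fits into `budget`.
/// This policy is hypothetical, not necessarily what prune-cache.d does.
fn files_to_prune(mut files: Vec<(String, u64, u64)>, budget: u64) -> Vec<String> {
    let mut total: u64 = files.iter().map(|f| f.1).sum();
    files.sort_by_key(|f| f.2); // oldest modification time first
    let mut doomed = Vec::new();
    for (name, size, _mtime) in files {
        if total <= budget {
            break;
        }
        total -= size;
        doomed.push(name);
    }
    doomed
}

fn main() {
    // 320 kbps = 40,000 bytes per second = 144,000,000 bytes per hour.
    println!("written per hour: {} bytes", bytes_per_hour(320));
    println!("headroom (factor 3): {} bytes", headroom_bytes(320, 3));

    let files = vec![
        ("a".to_string(), 100, 3),
        ("b".to_string(), 200, 1),
        ("c".to_string(), 300, 2),
    ];
    println!("to delete: {:?}", files_to_prune(files, 350));
}
```

So one hour of uninterrupted 320 kbps streaming is roughly 144 MB, and keeping a few hundred MB free per hourly run leaves a comfortable margin.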

Nice. Now, it’s finally time to fire up librespot and listen to music! Let’s just configure it a bit (-b 320 -c /cache/dir -n endpoint_name --backend alsa --linear-volume --initial-volume 30) and listen to some music!

Ew. What’s that?

Clicks when changing songs

When I change from one song to the next, I hear a click. Sometimes it’s louder, sometimes it’s quieter. That totally destroys my listening experience.

Thinking about how to build a “minimal” program that plays back files, I think I know what’s happening: After one song ends, the program ends the audio stream. For the next song, the audio stream is opened again afterwards. Alternatively, no audio data is provided in between these two points in time. Or maybe before the stream is closed, the signal isn’t faded to zero. In any case, since the backend is ALSA and kind of directly pushes the samples to my S/PDIF interface, without any interpolation, smoothing, multi-client mixing and so on, any discontinuity is reproduced by the speakers.

I tried using ALSA’s dmix and constantly playing back silence on a different input to dmix, to keep the audio stream alive. That improved things significantly, but the issue remained.

Then I noticed that support for gapless playback was added to librespot recently. So I upgraded to HEAD.

Modern software stacks: dependency forests

Upgrading librespot meant compiling it, on my Raspberry Pi. Compiling the 260 targets of librespot and its dependencies took about half an hour, I think. Honestly, I’d prefer if software had fewer dependencies. Of course, halfway through the process, I got strange error messages. Upgrading the Rust compiler rustc and restarting the build from scratch helped. Think of XKCD’s “compiling” meme.

Desktop Audio

The gapless playback helped drastically when playing songs in sequence, but not when starting or stopping the playback (and maybe also not when manually changing songs, I don’t remember).

So, let’s switch audio backends! I thought: Well, maybe this piece of software wasn’t really tested with ALSA, maybe it works best with PulseAudio. Also, PulseAudio is so commonly used that I hoped they had thought about all the edge cases where ramping has to be done to avoid clicks. With these thoughts in mind, I had already built librespot with PulseAudio support in the previous step. Setting it up was again easy, thanks to systemd user services.

Just one more thing: Add timeout=10 to PulseAudio’s default.pa in order to keep the device open for some time, but not indefinitely. This is to play really safe and ensure we get a few seconds of silence before the audio output is shut down when playback stops. Also, issue pactl set-sink-volume 0 100% to increase the volume from 80% to 100%.

And indeed, that solved all my click issues when changing tracks. Ha. Thanks, PulseAudio! Let’s listen to music for some more time!

Clicks when changing volume

Ew. When I change the volume, it still clicks?!

OK, librespot is using its own softvol mixer as gain control by default. Using an ALSA mixer is out of the question. PulseAudio’s ALSA frontend used with the ALSA backend in librespot leads to cracks and pops during normal playback and clicks when changing the volume.

Where’s the good news? I finally have an opportunity to write some Rust.

Ramping the mixer gain in Rust

I dived into librespot and hacked in some ramping to make the gain change smooth, at least if you don’t take the second derivative. ;)
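The idea can be sketched as follows. This is not the patch itself and not librespot's mixer API: the `RampedGain` type, its fields and the ramp length are hypothetical. The gain moves towards its target by a bounded step per sample, so a volume change becomes a short linear ramp instead of a jump. (The ramp is continuous, but its slope has kinks at both ends, hence the joke about the second derivative.)

```rust
/// Hypothetical ramping gain stage; not librespot's actual mixer API.
struct RampedGain {
    current: f32, // gain applied to the most recent sample
    target: f32,  // gain we are ramping towards
    step: f32,    // maximum gain change per sample
}

impl RampedGain {
    /// `ramp_samples` is how many samples a full-scale change (0.0 to 1.0) takes.
    fn new(initial: f32, ramp_samples: u32) -> Self {
        RampedGain {
            current: initial,
            target: initial,
            step: 1.0 / ramp_samples as f32,
        }
    }

    fn set_target(&mut self, target: f32) {
        self.target = target;
    }

    /// Apply the gain in place, moving it one step closer to the target per sample.
    fn process(&mut self, samples: &mut [i16]) {
        for s in samples.iter_mut() {
            let diff = self.target - self.current;
            if diff.abs() <= self.step {
                self.current = self.target;
            } else {
                self.current += self.step * diff.signum();
            }
            // Float-to-int `as` casts truncate towards zero and saturate.
            *s = (*s as f32 * self.current) as i16;
        }
    }
}

fn main() {
    // Ramp from full volume to silence over 4 samples.
    let mut gain = RampedGain::new(1.0, 4);
    gain.set_target(0.0);
    let mut buf = [1000i16; 6];
    gain.process(&mut buf);
    println!("{:?}", buf); // prints [750, 500, 250, 0, 0, 0]
}
```

For interleaved stereo, a real implementation would probably advance the gain once per frame rather than once per sample, so both channels get the same gain. The sketch ignores that for brevity.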

What they do is to actually apply the gain in integer arithmetic with flooring. I would have rounded the signal, but eh. What’s wiggling on the least significant bit in 16-bit audio? Probably no one will ever hear this. Especially since we’re talking about material that came out of an Ogg Vorbis decoder.
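To illustrate the difference between flooring and rounding, here is a small sketch. The 16-bit fixed-point gain format (65536 meaning unity gain) is an assumption chosen for the example and is not taken from librespot's source. An arithmetic right shift floors the product, while adding half an LSB before the shift rounds it.

```rust
/// Number of fractional bits in the (hypothetical) fixed-point gain.
const SHIFT: u32 = 16;

/// Apply `gain` (0..=65536, where 65536 is unity) and floor the result.
/// `>>` on a signed integer is an arithmetic shift, i.e. it rounds towards
/// negative infinity.
fn apply_gain_floor(sample: i16, gain: i32) -> i16 {
    ((sample as i32 * gain) >> SHIFT) as i16
}

/// Same gain, but round to nearest by adding half an LSB before shifting.
fn apply_gain_round(sample: i16, gain: i32) -> i16 {
    ((sample as i32 * gain + (1 << (SHIFT - 1))) >> SHIFT) as i16
}

fn main() {
    let half = 1 << (SHIFT - 1); // gain of 0.5
    for s in [3i16, 1, -3] {
        println!(
            "sample {:>2}: floor {:>2}, round {:>2}",
            s,
            apply_gain_floor(s, half),
            apply_gain_round(s, half)
        );
    }
}
```

The two variants never differ by more than one least significant bit, which matches the point made above: the error is real, but tiny.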

It turns out that to me as a novice, a lot of Rust concepts felt needlessly complicated, yet easy to use. Also, the compiler’s error messages proved very helpful. Really, very helpful. Thanks for that, Rust people! Being able to just hack in some “unsafe” code to mutate otherwise immutable state also made it easy to add the ramping without changing the whole API of how audio effects work in librespot.

Finally, it’s time to listen to some music, isn’t it?

Trying for some more

Well, yes, if librespot came up every time and didn’t sometimes quit so often and so quickly that systemd stops respawning it. It turns out that librespot is very unhappy if it is started without a usable network interface, and the WiFi takes some time to come up. I somehow couldn’t easily make librespot wait until WiFi is available. So I just bumped the retry limits. That way, the daemon came up successfully after all.

Finally: Listening to music

An unexpectedly long time later, I’m finally able to listen to music and forget that the music is fetched from some remote site, cached, decoded and sent to my speakers by a computer. The illusion of having Spotify-capable speakers is perfect.

Finally, we’re able to enjoy some music! I’m sure my neighbours also enjoy it. :D

Wrapping it up