The trouble with latency

Latency is unavoidable in digital signal processing: the analog signal must first be converted into a digital signal, then processed, and finally converted back into an analog signal. Typical times for A/D conversion, D/A conversion, and signal processing are:

| Stage | Delay | Equivalent distance *) |
|---|---|---|
| A/D converter | 0.5 ms | 0.17 m |
| D/A converter | 0.5 ms | 0.17 m |
| Buffering for digital signal processing **) | typically 3.5 - 30 ms | 1.16 m - 9.90 m |

*) At a speed of sound of approx. 330 m/s, sound travels 0.33 m in 1 ms.

**) For example, at a sampling rate of 44.1 kHz and a buffer size of 1024 samples, filling the buffer takes at least 1024 / 44,100 Hz ≈ 23.2 ms.
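The two footnotes can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the 330 m/s figure is the approximate speed of sound used in this article.

```python
SPEED_OF_SOUND_M_PER_S = 330.0  # approximate value used in this article


def buffer_latency_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """Minimum time needed to fill one buffer, in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000.0


def equivalent_distance_m(delay_ms: float,
                          speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Distance sound travels in air during the given delay, in meters."""
    return delay_ms / 1000.0 * speed_m_per_s


if __name__ == "__main__":
    # Compare a few common buffer sizes at 44.1 kHz.
    for size in (128, 256, 512, 1024):
        delay = buffer_latency_ms(size, 44100)
        distance = equivalent_distance_m(delay)
        print(f"{size:5d} samples: {delay:5.1f} ms, about {distance:4.2f} m")
```

Halving the buffer size halves this part of the latency, which is why small buffers matter for monitoring through the computer.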

So what?

... you might think. But: the greater the delay, the more difficult it is to hit the metronome beat precisely. On larger stages, for example, the drums need to be fed to the monitor speakers, since playing together becomes difficult when the drums are more than 3 m away.

In my recording experience:

| Delay | Equivalent distance | Ease of hitting the metronome beat |
|---|---|---|
| 0 ms | 0 m | OK |
| 4.5 ms | 1.5 m | Excellent |
| 6 ms | 2.0 m | Good |
| 10 ms | 3.3 m | OK |
| 15 ms | 5 m | Difficult |
| > 15 ms | > 5 m | Awful |

When playing acoustic piano or acoustic guitar, the distance between the sound source and the player's ears is usually 1 to 2 m. With electric guitars, a distance of 3 m from the speaker to the player is quite common. However, if you try to play more than 10 m away from the speaker, for example with a wireless system, the delay makes it almost impossible to hit the beat.

Singers hear their own voice directly inside their head through bone conduction. Even the slightest delay in the monitor signal results in a comb filter effect: some frequencies are cancelled and the voice sounds strange. Although singers get used to typical monitoring situations on stage, an analog mixer feeding an in-ear monitoring system or headphones is of course a pleasure for them.

The resulting amplitude of two overlapping identical signals, one of which is delayed by dt, as a function of the frequency f is:

$$A(f) = 2 \cdot \left|\cos\left(2 \pi f \cdot \frac{dt}{2}\right)\right|$$

For example, dt = 2 ms leads to complete cancellation (A = 0) at 250 Hz, 750 Hz, 1250 Hz, 1750 Hz, 2250 Hz, ... (250 Hz * 1, 3, 5, ...).
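To see where the notches of the comb filter fall for a given delay, the formula can be evaluated numerically. The following is a small sketch under the assumption that both signals have equal level and the frequency f enters the cosine as 2πf·dt/2; the function names are chosen here for illustration only.

```python
import math


def comb_amplitude(freq_hz: float, delay_s: float) -> float:
    """Amplitude of a signal summed with an equally loud copy delayed by delay_s."""
    return 2.0 * abs(math.cos(math.pi * freq_hz * delay_s))


def null_frequencies(delay_s: float, max_freq_hz: float) -> list:
    """Frequencies with A = 0: the odd multiples of 1 / (2 * delay_s)."""
    first_null = 1.0 / (2.0 * delay_s)
    nulls = []
    k = 0
    while (2 * k + 1) * first_null <= max_freq_hz:
        nulls.append((2 * k + 1) * first_null)
        k += 1
    return nulls


if __name__ == "__main__":
    delay = 0.002  # 2 ms, as in the example above
    print([round(f) for f in null_frequencies(delay, 2500)])
    print(round(comb_amplitude(500, delay), 3))  # amplitude between two nulls
```

The shorter the delay, the higher the first null: at 0.5 ms it sits at 1000 Hz, in the middle of the vocal range.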

So latencies are by definition bad and unnatural?

Definitely not! For example, if you use headphones, delays of 5 to 10 ms make you feel as if you were in a rehearsal situation where the musicians are 1.5 to 3 m apart.
When miking symphony orchestras or choirs, delays of e.g. 30 ms can be added intentionally to make it sound as if the musicians were 10 m apart.

How to avoid excessive latency?

There are several ways to avoid excessive delays:
  • Direct monitoring: Most sound cards (even the cheaper ones) offer direct monitoring, whereby the analog input signal is switched through directly to the output.
  • A better approach is to use an analog recording mixer as follows:

    - Send all instruments to the ALT output by pressing the respective ALT3/4 (=Mute) button.
    - Connect the mixer's ALT output to the sound card's input.
    - Send the sound card's output to one of the mixer's inputs; don't mute that channel.
    - Push the buttons to send both the ALT output and the main mix to the control room (headphones).

    Now you can hear the instruments and the sound card output at the same time without creating a feedback loop.

Measure the delay!

This is an easy way to measure the delay:


[Figure: measurement setup with 1st microphone, headphones, and 2nd microphone]

Use a metronome and record the beat with the 1st microphone. Send the sound to the headphones and record their output with a 2nd microphone placed as close to the headphones as possible.
Compare the two recordings: in this example, there is a delay of 40 ms.
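Instead of comparing the two recordings by eye, the offset can also be estimated by sliding one recording against the other and looking for the best match (cross-correlation). The sketch below is not part of the measurement described above: it uses synthetic click signals in place of real recordings, a low sample rate to keep the pure-Python loop fast, and hypothetical names throughout.

```python
def estimate_delay_samples(reference, delayed, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Shift the delayed recording back by `lag` samples and compare.
        score = sum(r * d for r, d in zip(reference, delayed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


SAMPLE_RATE_HZ = 8000                         # assumed rate for this demo
click = [1.0, -0.8, 0.6, -0.4, 0.2]           # stand-in for a metronome click
true_delay = round(0.040 * SAMPLE_RATE_HZ)    # 40 ms expressed in samples

# "1st microphone": click after 100 samples; "2nd microphone": same click, later.
reference = [0.0] * 100 + click + [0.0] * 895
delayed = [0.0] * (100 + true_delay) + click + [0.0] * (895 - true_delay)

lag = estimate_delay_samples(reference, delayed, max_lag=500)
delay_ms = lag / SAMPLE_RATE_HZ * 1000.0

if __name__ == "__main__":
    print(f"Estimated delay: {lag} samples = {delay_ms:.1f} ms")
```

With real recordings, both tracks would have to be loaded as sample lists at the same sampling rate, and max_lag chosen larger than the longest delay you expect.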