About latencies

Latency is unavoidable in digital signal processing: the analog signal first has to be converted into a digital signal, then the digital signal has to be processed, and finally it has to be converted back into an analog signal. Typical times for the A/D conversion, the D/A conversion, and the signal processing are:

                                              Delay                   Equivalent distance *)
A/D converter                                 0.5 ms                  0.17 m
D/A converter                                 0.5 ms                  0.17 m
Buffering for digital signal processing **)   typically 3.5 - 30 ms   1.16 m - 9.90 m

*) The speed of sound is approx. 330 m/s, so sound travels 0.33 m in 1 ms.

**) For example, at a sampling rate of 44.1 kHz and a buffer size of 1024 samples, it takes at least 1024/44.1 ms = 23.2 ms to fill the buffer.
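The arithmetic in the two footnotes can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the speed of sound is the 330 m/s approximation from the text, and the total assumes that exactly one buffer sits between the A/D and the D/A converter (real setups may chain several).

```python
SPEED_OF_SOUND_M_PER_S = 330.0  # approximation used in the text, not an exact value


def buffer_latency_ms(buffer_size_samples, sample_rate_hz):
    """Minimum time in ms needed to fill one buffer of the given size."""
    return buffer_size_samples / sample_rate_hz * 1000.0


def equivalent_distance_m(delay_ms, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Distance in m that sound travels through air during delay_ms."""
    return delay_ms / 1000.0 * speed_m_per_s


def total_latency_ms(buffer_size_samples, sample_rate_hz, ad_ms=0.5, da_ms=0.5):
    """A/D delay + one buffer + D/A delay (assumes a single buffer in the path)."""
    return ad_ms + buffer_latency_ms(buffer_size_samples, sample_rate_hz) + da_ms


# Print total latency and equivalent distance for some common buffer sizes.
for size in (128, 256, 512, 1024):
    total = total_latency_ms(size, 44100)
    print(f"{size:5d} samples: {total:5.1f} ms, about {equivalent_distance_m(total):.2f} m")
```

Halving the buffer size halves the buffering delay, which is why small buffers matter so much for monitoring while recording.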

So what?

... you might think. But the greater the delay, the harder it is to hit the metronome beat exactly.

For example, on larger stages you need the drums in the monitor speakers, because it is hard to play tightly in the groove when the drum set is more than approximately 3 m away.

My experience from recording:

Delay     Equivalent distance   Feel when hitting the metronome
4.5 ms    1.5 m                 Excellent
6 ms      2.0 m                 Good
10 ms     3.3 m                 OK
15 ms     5 m                   Difficult
> 15 ms   > 5 m                 Awful

When playing acoustic piano or acoustic guitar, there is typically a distance of 1-2 m from the sound source to the player's ears. For electric guitars, a distance of 3 m from the speaker to the player is also still OK. But if you try to play 10 m or more away from the speakers, for example with the help of wireless equipment, it feels like playing on another planet because of the latency.

Singers hear their own voice directly in their head because the sound is transmitted through the bones. Even the smallest delay in the monitored signal results in a comb filter effect, which means that some frequencies are extinguished (cancelled out) and the voice sounds somewhat strange. So although singers are used to this effect from typical monitoring situations on stage, it is of course a pleasure for them to have an analog mixer and an in-ear monitoring system or headphones.

The formula for the resulting amplitude of two overlapped equal signals, one of which is delayed by dt, is:

$$A(f) = 2 \cdot \left| \cos\left( 2 \cdot \pi \cdot f \cdot \frac{dt}{2} \right) \right|$$

Extinctions (A = 0) occur wherever f · dt is an odd multiple of 1/2, that is at f = 1/(2 · dt) * 1, 3, 5, ...

For example, dt = 2 ms (that's what today's digital mixers are able to provide) results in extinctions (A = 0) at 250 Hz, 750 Hz, 1250 Hz, 1750 Hz, 2250 Hz, ... (250 Hz * 1, 3, 5, ...)

For older devices (dt = 10 ms) it's even worse:
Extinctions (A = 0) at 50 Hz, 150 Hz, 250 Hz, 350 Hz, 450 Hz, ... (50 Hz * 1, 3, 5, ...)

As soon as delay times drop below 0.033 ms (33 µs) in the future, the extinctions (A = 0) will lie at 15 kHz, 45 kHz, ... or higher, where probably no one will hear them.
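The extinction frequencies listed above can be reproduced with a short Python sketch. It assumes the amplitude formula is read as A(f) = 2 · |cos(π · f · dt)|, with the frequency f inside the cosine; the function names and the 20 kHz upper limit are made up for this illustration.

```python
import math


def comb_amplitude(freq_hz, delay_s):
    """Amplitude of a tone summed with an equally loud copy delayed by delay_s."""
    return 2.0 * abs(math.cos(math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s, max_freq_hz=20000.0):
    """Frequencies up to max_freq_hz at which the two signals cancel (A = 0)."""
    notches = []
    k = 0
    while True:
        # Cancellation needs f * dt to be an odd multiple of 1/2.
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_freq_hz:
            break
        notches.append(freq)
        k += 1
    return notches


# Delays from the text: 2 ms, 10 ms and 33 microseconds.
for delay_s in (0.002, 0.010, 0.000033):
    first = [round(f) for f in notch_frequencies(delay_s)[:5]]
    print(f"dt = {delay_s * 1000:g} ms: first extinctions at {first} Hz")
```

The shorter the delay, the higher the first extinction lies and the wider the gaps between extinctions become, which is why very short delays are much less audible.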

What to do?

There are several ways to avoid excessive delays:

  • Hardware monitoring (the analog signal is passed through): Most sound cards, even the cheaper ones, have some sort of direct analog output of the recording signal that you can route to your headphones. Of course, your instrument is then "extra dry" (without any effects).
  • Better approach: Use an analog mixer between your instrument and the sound card. Plug your instrument into the analog mixer via a DI box, then send one output to the digital recording system and one to the monitoring system.


Measure the delay!

This is an easy way to measure the delay:


[Figure: measurement setup with 1st microphone, headphones, and 2nd microphone]

Use a separate metronome and record the metronome beat with the 1st microphone.

Send the sound to the headphones and record their output with a 2nd microphone (placed as close to the headphones as possible).

Compare the two recordings: In this example there is a 40 ms delay.
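The comparison can also be done numerically instead of by eye. The following Python sketch is one possible approach, not the method used for the measurement above: it slides the second recording against the first and picks the shift with the best match (a brute-force cross-correlation). The function names, the 8 kHz sample rate and the synthetic click are assumptions for illustration; loading real recordings into lists of samples is left out.

```python
def estimate_delay_samples(reference, delayed, max_lag):
    """Return the shift in samples at which `delayed` best matches `reference`."""
    max_lag = min(max_lag, len(delayed) - 1)
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Compare reference[i] with delayed[i + lag] over the overlapping part.
        n = min(len(reference), len(delayed) - lag)
        score = sum(reference[i] * delayed[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_delay_ms(reference, delayed, sample_rate_hz, max_delay_ms=100.0):
    """Delay of `delayed` relative to `reference` in milliseconds."""
    max_lag = int(max_delay_ms / 1000.0 * sample_rate_hz)
    lag = estimate_delay_samples(reference, delayed, max_lag)
    return lag / sample_rate_hz * 1000.0


# Synthetic stand-in for the two recordings: one short metronome click.
SAMPLE_RATE_HZ = 8000
CLICK = [1.0, -0.8, 0.6, -0.4, 0.2]


def synthetic_recording(click_start, length=1000):
    """Silence with a single click starting at sample index click_start."""
    samples = [0.0] * length
    samples[click_start:click_start + len(CLICK)] = CLICK
    return samples


first_mic = synthetic_recording(100)         # metronome recorded directly
second_mic = synthetic_recording(100 + 320)  # same click, 320 samples later
print(f"{estimate_delay_ms(first_mic, second_mic, SAMPLE_RATE_HZ):.1f} ms")
```

Both recordings have to share the same sample rate and start at the same moment, for example as two tracks of one recording session; otherwise the measured shift includes the start offset.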