Why your own voice sounds weird on playback

… and why that proves you need stereo subwoofers

Text: Elettra Bargiacchi Interview Partner: Thomas Lund Images: Thomas Lund

Understanding auditory envelopment and how it could change the way we think about mixing. A conversation with Thomas Lund.

The title is intentionally provocative. It aims to spark curiosity about a topic that may still seem niche yet could reshape how we think about mixing and monitoring. Let me take a step back. At the latest Tonmeistertagung, I had the pleasure of moderating a talk by Thomas Lund, researcher at Genelec and convenor of a European Commission working group on safe listening. His work focuses on human perception, spatial audio, loudness, and auditory envelopment (AE).

Correlation and Perception

Auditory Envelopment (Thomas Lund)

To understand AE, we should first clarify how our auditory system works. Our brain detects the micro-differences in phase and level between the sound waves reaching our two ears with astonishing precision. These differences allow us not only to localize sound, but also to perceive qualitative differences.

In simple terms:

• A correlated signal (identical L-R) feels centered and stable

• A decorrelated signal (subtle L-R differences) feels wider and more enveloping
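To make the two cases above concrete, here is a minimal sketch that measures how similar a left and a right channel are. It assumes the plain zero-lag Pearson correlation coefficient as the measure, which is a common simplification and not necessarily the metric used in Lund's studies. The function name, sample rate, and test signals are illustrative choices, not taken from the interview.

```python
import math
import random

SAMPLE_RATE = 48_000  # Hz; assumed for illustration
BASS_FREQ = 50.0      # Hz; an arbitrary low-frequency test tone


def interchannel_correlation(left, right):
    """Zero-lag Pearson correlation between two equal-length channels."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    return cov / math.sqrt(var_l * var_r)


# One second of a 50 Hz tone, and the same tone shifted by 90 degrees.
phase = [2.0 * math.pi * BASS_FREQ * n / SAMPLE_RATE for n in range(SAMPLE_RATE)]
tone = [math.sin(p) for p in phase]
tone_quadrature = [math.cos(p) for p in phase]

# Independent noise per channel (seeded so the run is repeatable).
rng = random.Random(0)
noise_l = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_RATE)]
noise_r = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_RATE)]

identical = interchannel_correlation(tone, tone)               # same signal in L and R
quadrature = interchannel_correlation(tone, tone_quadrature)   # 90 degree phase offset
independent = interchannel_correlation(noise_l, noise_r)       # unrelated L and R

print(f"identical L/R:   {identical:+.3f}")
print(f"quadrature L/R:  {quadrature:+.3f}")
print(f"independent L/R: {independent:+.3f}")
```

Identical channels give a coefficient close to 1, while the phase-shifted tone and the independent noise both land close to 0. A single number like this says nothing about how the sound is perceived. It only describes the signal relationship that the article refers to as correlated or decorrelated.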

We usually associate spatial perception with mid and high frequencies, while bass is traditionally regarded as non-directional and therefore reproduced in mono. And here lies the turning point.

Low Frequencies as an Emotional Cue

Thomas Lund’s research suggests that low-frequency decorrelation has a profound perceptual impact: not mainly spatial, but emotional.

“Some years ago, we made some experiments where we just noticed that everyone, from small children to elderly people with hearing loss, could distinguish between correlated and decorrelated low-frequency sound. So we started thinking about what might explain why everybody has an innate idea of correlated versus uncorrelated sound. Among other things, we talked about how, when we listen, just like now, as I’m speaking, we perceive a correlation at low frequencies in our head: that’s basically the most common source of correlated sound.”

In his studies with participants aged 6 to 90, correlated low frequencies were described as small, spooky, and dark, while decorrelated bass was perceived as big, pleasant, and light. Lund defines this perceptual dimension as Auditory Envelopment, with a scale from highly correlated (not enveloping) to highly decorrelated (very enveloping).

Recent studies show that humans between the ages of 6 and 90 all recognise Auditory Envelopment. Photos from the papers and experiments by Thomas Lund.
Why Your Own Voice Sounds Strange

This also explains why we often dislike hearing our recorded voice. “When we speak, we perceive correlated low frequencies internally through bone conduction: from your stomach up through the neck and through the bones, and our brain knows this pattern intimately. A recording instead captures the voice externally, without that internal resonance, so the low-frequency pattern is different. This is why it sounds strange and unfamiliar.”

Optimizing Monitoring and Bass for AE

AE has direct consequences for monitoring and reproduction. In headphones, low-frequency correlation is a key obstacle to proper externalization. Lund hopes that manufacturers will develop headphone systems able to reproduce low-frequency decorrelation with greater precision.

When asked about loudspeaker setups from an AE perspective, Thomas explains: “For monitoring, you need a qualitatively good and well-calibrated loudspeaker system, where speakers are not set up too far from the listening position and two good low-frequency sources form a proper L-R pair. Place your subs just outside your main speakers and keep the crossover point low. If you have two small monitors and send all bass frequencies to one subwoofer, you lose the ability to hear decorrelated sound at very low frequencies. From a monitoring perspective, you should certainly try to have stereo bass as far down as possible. In multichannel systems, multiple discrete low-frequency sources are preferable. You shouldn’t just have one sub.”
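The point about a single subwoofer can be illustrated with a rough signal-level sketch. It assumes a simple one-pole low-pass filter as the bass-management crossover, an 80 Hz crossover frequency, and independent noise as a stand-in for a decorrelated stereo mix. None of these values come from the interview, and the model covers only the feeds sent to the subwoofers, not the room or what reaches the ears.

```python
import math
import random

SAMPLE_RATE = 2_000   # Hz; assumed, kept low so the example runs quickly
CROSSOVER_HZ = 80.0   # assumed crossover frequency, not from the interview
DURATION_S = 10


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """Very simple low-pass filter standing in for a real crossover."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def correlation(a, b):
    """Zero-lag Pearson correlation between two equal-length signals."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical decorrelated stereo mix: independent noise in L and R.
rng = random.Random(1)
num_samples = DURATION_S * SAMPLE_RATE
left = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
right = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]

# Bass band of each channel after the crossover.
low_l = one_pole_lowpass(left, CROSSOVER_HZ, SAMPLE_RATE)
low_r = one_pole_lowpass(right, CROSSOVER_HZ, SAMPLE_RATE)

# Two subs: each one reproduces the bass of its own channel.
stereo_sub_corr = correlation(low_l, low_r)

# One sub: both bass bands are summed into a single feed, so the
# left-side and right-side bass content is the same signal.
mono_feed = [0.5 * (l + r) for l, r in zip(low_l, low_r)]
mono_sub_corr = correlation(mono_feed, mono_feed)

print(f"bass correlation, two subs: {stereo_sub_corr:+.3f}")
print(f"bass correlation, one sub:  {mono_sub_corr:+.3f}")
```

With two subwoofer feeds the bass bands stay close to uncorrelated, as they were in the mix. Once they are summed into one feed, the bass correlation is 1 by construction, whatever the mix contained. That is the loss of low-frequency decorrelation the quote describes.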

A Shift in Perspective

Thomas’ work challenges long-standing assumptions and is actively reshaping how the audio industry approaches monitoring systems and spatial reproduction. Low frequencies emerge as a powerful perceptual dimension, shaping emotion, immersion, and engagement. Perhaps the key question is not whether we need stereo subwoofers, but whether we are ready to reconsider the emotional role of low frequencies in mixing, and to recognise that what we feel in the bass may be a crucial driver of storytelling itself.

Elettra Bargiacchi is an Italian sound designer and composer based in Leipzig. With a background in classical guitar, composition, and Audio Post Production (Abbey Road Institute, London), her work in audio post spans Germany, Italy, the UK, and the US. Passionate about immersive audio and innovation in sound, she has been a Visiting Researcher at the University of Surrey (UK) since 2024, focusing on next-generation audio. She is a member of the VDT R & D Board.

Thomas Lund is the author of pro audio papers on Auditory Envelopment, Loudness and True-peak Level. He is a perceptual researcher at Genelec Oy and convenor of a working group under the European Commission tasked with requirements for safe listening. Coming from a medical background, Thomas contributes to audio standards in EBU, AES, IEC, ITU and WHO. He previously served in Danish healthcare and as CTO at TC Electronic.