The 3 Generations Of Immersive Audio

3 generations of immersive audio on Bobby Owsinski's Music Production Blog

There are a lot of immersive audio formats, and it’s easy to get confused about where they all fit in. In the latest 5th edition of my Mixing Engineer’s Handbook, I’ve made the distinctions between them more obvious by placing the various formats into one of 3 historical generations, as you’ll see in this excerpt.

Immersive audio (or “Surround Sound” as it was formerly known) has actually been with us in one form or another way longer than you might think. Theatrical releases began using the three-channel “curtain of sound” developed by Bell Labs back in the early 1930s, when it was discovered that a dedicated center channel provided the significant benefit of anchoring the center by eliminating “phantom” images (in stereo, the center image shifts as you move around the room). As an added byproduct, this also provided better frequency response matching across the soundfield.

The addition of a rear effects channel to the front three channels dates as far back as 1941 with the “Fantasound” four-channel system utilized by Disney for the film Fantasia, and then again in the 1950s with Fox’s Cinemascope, but it didn’t come into widespread use until the 1970s, when Dolby Stereo became the de facto surround standard. This popular film format uses four channels (left, center, right, and a mono surround, sometimes called LCRS) encoded onto two tracks. Until recently, Dolby Stereo was the standard delivery format for all major shows and films produced for theatrical release and broadcast television, as it has the added advantage of playing back properly in stereo or mono if no decoder is present, which makes it compatible with a wide variety of both new and old theater sound systems. Today it serves more as a backup for movie theaters still able to exhibit movies on 35mm film.

With the advent in the 1980s of digital delivery formats capable of supplying more channels, the number of rear surround channels was increased to two, and a low-frequency effects channel was added to make up the six-channel 5.1, which soon became the modern standard for most films, surround music, and digital television. Today we’ve graduated to far more advanced formats such as 7.1, 11.2, and the totally revolutionary multi-speaker Dolby Atmos system, which debuted in 2012.

And then there was the four-channel Quad from the 1970s, the music industry’s attempt at multichannel music that killed itself as a result of two non-compatible competing standards, both of which suffered from an extremely small sweet spot. Needless to say, that’s a lot of different formats.

1st Generation Immersive Formats

Before we can discuss the panning technique for immersive audio, it’s important to be familiar with the various formats that are available. Immersive audio can be broken down into three distinct generations, with the first two being more “surround” in nature than fully immersive. This is because, for the most part, they’re based on audio coming directly from the available speaker channels (you’ll see the differences soon).

The first generation was the “x.0” formats, or the ones that don’t use a dedicated subwoofer as part of the format. All the formats in this generation were also delivered in analog. These include three-channel Dolby Surround, four-channel LCRS, four-channel Quadraphonic, and the five-channel Dolby Pro Logic formats.

2nd Generation Immersive Formats

The second generation was the “x.1” formats, which incorporated a subwoofer and were also the first to be delivered in digital form. This could mean via a DVD or Blu-ray disc, or electronic transmission for television. Second-generation formats also experimented with more channels in an attempt to achieve a higher degree of envelopment and realism.

That said, the main problem wasn’t in the production of the product but on the consumer side, where the speaker systems were rarely positioned or calibrated properly in the home, so the effect was disappointing.

Formats in this generation include six-channel 5.1, seven-channel 6.1, eight-channel 7.1, eight-channel SDDS, twelve-channel 10.2, and twelve-channel 11.1.
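As a quick illustration of how the “x.0” and “x.1” labels map to the channel counts listed above, here is a minimal Python sketch. The function name channel_count is just an illustrative helper, and it assumes a simple two-part label where the first number is the full-range channels and the second is the low-frequency effects (subwoofer) channels.

```python
def channel_count(label: str) -> int:
    """Total channels for a label such as "5.1".

    Assumes a two-part label: full-range channels, then
    low-frequency effects (subwoofer) channels.
    """
    main, lfe = label.split(".")
    return int(main) + int(lfe)


# The second-generation formats named in the text
for label in ["5.1", "6.1", "7.1", "10.2", "11.1"]:
    print(label, "->", channel_count(label), "channels")
```

Note that 10.2 and 11.1 both come out to twelve channels, and that an “x.0” label such as 5.0 simply adds no subwoofer channel.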

3rd Generation Immersive Formats

Surround sound truly became immersive audio with the introduction of the multichannel Dolby Atmos system in 2012. Thanks to a whole new set of digital production tools and true overhead height speaker channels, mixers were finally able to provide the audio experience that the earlier generations weren’t able to approach. Today the capabilities are so vast that truly the only limitation is the mixer’s imagination.

Just as in the Quad days though, there are multiple technologies vying for the same consumer, with the Sony 360 Reality Audio, DTS:X, and L-Acoustics L-ISA formats also coming on the scene (although L-ISA is used mostly for concert and theater sound). So far, Dolby Atmos has a significant lead in installations, available tools, and song releases, so that’s what we’re going to concentrate on here.

You might be thinking, “Why is immersive audio so different from the previous surround formats?” Probably the biggest difference in 3rd generation systems is that panning went from channel-based to object-based. That means that a mixer is no longer panning towards a speaker but instead into a three-dimensional space in the listening area. Thanks to more speakers being used (up to 64, which is seen mostly in theaters), these objects can envelop the listener in sound in a more seamless and natural way than ever before.
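To make the channel-based versus object-based distinction a little more concrete, here is a rough Python sketch. It is not how the Dolby Atmos renderer works; it’s a simple distance-based panner used purely for illustration, and the speaker names, room coordinates, and rolloff value are all assumptions. The point is that the mix stores only the object’s position, and the speaker gains are worked out at playback for whatever layout happens to be available.

```python
import math

# Hypothetical speaker positions as (x, y, z) in a normalized room:
# x = left/right, y = back/front, z = floor/ceiling.
BED = {
    "L": (-1.0, 1.0, 0.0),
    "C": (0.0, 1.0, 0.0),
    "R": (1.0, 1.0, 0.0),
    "Ls": (-1.0, -1.0, 0.0),
    "Rs": (1.0, -1.0, 0.0),
}
# The same bed with two assumed overhead speakers added
WITH_HEIGHTS = {**BED, "Ltop": (-1.0, 0.0, 1.0), "Rtop": (1.0, 0.0, 1.0)}


def render_object(position, speakers, rolloff=2.0):
    """Return one gain per speaker for a sound object at `position`.

    Illustrative distance-based panner (an assumption, not Dolby's
    method): closer speakers get more level, and the gains are
    normalized so total power stays constant.
    """
    weights = {}
    for name, spk in speakers.items():
        distance = math.dist(position, spk)
        # Small offset avoids dividing by zero when the object
        # sits exactly on a speaker.
        weights[name] = 1.0 / (distance ** rolloff + 1e-6)
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}


# One object, placed up high on the right side of the room
obj = (0.8, 0.0, 0.9)

for layout_name, layout in [("bed only", BED), ("with heights", WITH_HEIGHTS)]:
    gains = render_object(obj, layout)
    print(layout_name, {name: round(g, 3) for name, g in gains.items()})
```

The object’s position never changes between the two layouts. Only the rendered gains do, which is why the same object-based mix can play back on anything from a soundbar to a large theater system.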

And unlike second-generation immersive formats, integration into the typical home is now much easier and more attractive. Now a simple soundbar, a subwoofer, and a couple of wireless speakers will not only get the seal of approval from the family worried about the decor, but provide a very enjoyable listening experience as well.

You can read more from The Mixing Engineer’s Handbook and my other books on the excerpt section of
