Ambisonic audio is a method of capturing and reproducing surround sound that uses a set of channels to capture the direction and intensity of sound sources in a three-dimensional space. The goal of ambisonic audio is to create an immersive listening experience that accurately reproduces the spatial characteristics of sound as it exists in the real world.
The origins of ambisonic audio can be traced back to the 1970s, when a British engineer named Michael Gerzon developed a mathematical model for representing sound in three dimensions. The first-order form of this model, called the Ambisonic B-Format, uses four channels to capture the sound field: W (omnidirectional pressure), X (front-back), Y (left-right), and Z (up-down). The B-Format can be decoded into a variety of different speaker configurations, allowing for flexibility in reproduction.
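As an illustration of what the four channels hold, here is a minimal sketch of panning a single mono sample into first-order B-Format. It assumes the traditional convention in which X points front, Y points left, Z points up, azimuth is measured counter-clockwise from straight ahead, and W is scaled by 1/sqrt(2); other conventions use different channel orders and scalings. The function name `encode_b_format` is made up for this example.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumed convention: azimuth counter-clockwise from straight ahead,
    elevation upward from the horizontal plane, W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                  # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)   # front-back component
    y = sample * math.sin(az) * math.cos(el)   # left-right component
    z = sample * math.sin(el)                  # up-down component
    return w, x, y, z

# A source directly to the listener's left, at ear height:
w, x, y, z = encode_b_format(1.0, azimuth_deg=90.0, elevation_deg=0.0)
```

For this source, all of the directional energy ends up in Y, while X and Z are zero up to rounding error.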
One of the main advantages of ambisonic audio is its ability to accurately capture the direction and movement of sound sources. This is achieved with microphone arrays whose capsules are arranged on a sphere; the simplest is a tetrahedral microphone with four capsules. The resulting audio signals can then be converted into the B-Format, allowing for precise control over the direction and intensity of sound in the final mix.
Another advantage of ambisonic audio is its compatibility with virtual reality (VR) and augmented reality (AR) applications. Because the B-Format captures the complete sound field, the field can be rotated to follow the listener's head orientation and then decoded to headphones or to a variety of different speaker configurations, allowing for realistic spatial audio in VR and AR environments.
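In head-tracked playback, the sound field is typically rotated to counter the listener's head movement before it is decoded. The sketch below shows a yaw rotation followed by a very simple stereo decode using two virtual cardioid microphones. It assumes first-order B-Format with X pointing front, Y pointing left, and W scaled by 1/sqrt(2). The function names are made up for this example, and the virtual-microphone decode is a simplified stand-in for the binaural rendering a real VR engine would use.

```python
import math

def rotate_yaw(w, x, y, z, yaw_deg):
    """Rotate the sound field counter-clockwise about the vertical axis."""
    a = math.radians(yaw_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z  # W and Z are unaffected by yaw

def virtual_cardioid(w, x, y, azimuth_deg):
    """Signal of a horizontal virtual cardioid aimed at azimuth_deg."""
    a = math.radians(azimuth_deg)
    # sqrt(2) undoes the assumed 1/sqrt(2) scaling of W.
    return 0.5 * (math.sqrt(2) * w + x * math.cos(a) + y * math.sin(a))

def decode_stereo(w, x, y, spread_deg=90.0):
    """Decode to stereo with cardioids aimed left and right (assumed layout)."""
    left = virtual_cardioid(w, x, y, +spread_deg)
    right = virtual_cardioid(w, x, y, -spread_deg)
    return left, right

# A unit source straight ahead: W = 1/sqrt(2), X = 1, Y = 0, Z = 0.
front = (1 / math.sqrt(2), 1.0, 0.0, 0.0)

# The listener turns their head 90 degrees to the left, so the field is
# rotated by -90 degrees to compensate.
w, x, y, z = rotate_yaw(*front, yaw_deg=-90.0)
left, right = decode_stereo(w, x, y)
```

After the rotation the source sits on the listener's right, so almost all of its level appears in the `right` output and almost none in `left`.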
In terms of production, the main benefit of ambisonic audio is that the complete sound field of a space is recorded independently of any playback layout. This gives audio engineers more control over the spatial characteristics of the sound in post-production: they can adjust direction and emphasis as needed, without re-recording or synthesizing artificial sound sources.
Ambisonic audio is becoming more popular, and many digital audio workstations now support it, either natively or through plug-ins. Examples include Reaper, Pro Tools, and Ableton Live, some of which offer several ways of decoding and rendering ambisonic sound files.
While ambisonic audio is still considered a niche technology, it has the potential to change the way we capture and reproduce surround sound. Its ability to accurately capture the spatial characteristics of sound and its compatibility with VR and AR applications make it an exciting development in the field of audio production.
What are the differences between Ambisonic A and B?
Ambisonic audio is typically divided into two formats: A-Format and B-Format.
A-Format is the raw output of a first-order ambisonic microphone, typically a tetrahedral array of four directional capsules. It is a 4-channel signal with one channel per capsule, so each channel carries the sound as picked up from that capsule's direction. Because the channel layout depends on the microphone's geometry, A-Format audio needs further processing before it can be used in a playback environment.
B-Format, on the other hand, is a processed version of A-Format that is ready for decoding and playback. It also uses four channels and carries the same information as A-Format, but expressed as W, X, Y, and Z: W holds the omnidirectional (pressure) component, and X, Y, and Z hold the directional components along the front-back, left-right, and up-down axes. B-Format is obtained from A-Format by a matrixing step (sums and differences of the capsule signals, usually followed by correction filters), commonly called A-to-B conversion. Because the result no longer depends on the microphone or on any particular speaker layout, it can be decoded into a variety of speaker configurations, such as mono, stereo, 5.1, and so on.
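The core of the A-Format to B-Format transformation can be sketched as a sum-and-difference matrix. This example assumes a tetrahedral microphone whose capsules point front-left-up (FLU), front-right-down (FRD), back-left-down (BLD), and back-right-up (BRU). The 0.5 scaling factor and the function name `a_to_b` are assumptions for illustration, and the frequency-dependent correction filters that practical converters apply are left out.

```python
def a_to_b(flu, frd, bld, bru):
    """Convert one frame of tetrahedral A-Format samples to B-Format.

    Matrix step only: no capsule-spacing correction filters, and the
    0.5 gain is an assumed scaling (conventions vary between tools).
    """
    w = 0.5 * (flu + frd + bld + bru)   # sum of all capsules: pressure
    x = 0.5 * (flu + frd - bld - bru)   # front minus back
    y = 0.5 * (flu - frd + bld - bru)   # left minus right
    z = 0.5 * (flu - frd - bld + bru)   # up minus down
    return w, x, y, z

# Sound reaching only the two front capsules:
w, x, y, z = a_to_b(1.0, 1.0, 0.0, 0.0)
```

Because the two front capsules sit on opposite sides (left-up and right-down), their left-right and up-down contributions cancel, and only W and X are non-zero.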
To sum up, A-Format is the raw per-capsule output of an ambisonic microphone, while B-Format is a processed version of A-Format that is ready for decoding and playback. A-Format is used to capture the original sound field; B-Format is the result of a mathematical transformation of the A-Format that allows more flexibility in reproducing the sound field.