How Sound Works In The Digital Domain
Overview
Now that we’ve learned how sound works and is perceived in the physical world, let’s take a closer look at the flow of sound in the digital domain.
There are two main concepts to think about here:
Signal Flow - how audio gets in and out of the computer.
Sample Rate and Bit Depth - how audio is stored digitally and how we can use these systems to manipulate our sounds.
We’ll also look at some of the ways these concepts are applied in game audio, for example:
Sample rate and pitch shifting - how they correlate and why it’s important.
The 8-bit effect - what it is and how to achieve it.
Headroom - how to get clean recordings.
File Formats and Compression - how we reduce the size of our audio files before they’re implemented into a game.
Signal Flow
Signal Flow is the overarching term covering how sound travels into and back out of your computer.
This can range from a very simple setup, such as a pair of headphones plugged into a laptop, to highly complex systems that synchronise multiple computers and external hardware processors.
Regardless of how complex the system is, the basic principles and ideas remain the same, and it’s important to understand those ideas for a number of reasons:
Troubleshooting - given the complexity of computers and audio technology, it’s no surprise that things go wrong from time to time. When they do, understanding how the hardware and software you use are configured can make it far easier to fix - trust me, you’ll be doing this a lot more than you think.
Gain Staging - being able to set up your gear to record and monitor at appropriate levels (not too loud, not too quiet) can make your job easier and the results better.
Creative Options - as with all aspects of audio technology, once you understand how it works you can start to get creative with it. For example, when Mick Gordon was composing the Doom (2016) soundtrack, he made “the Doom Machine” - a huge multi-path signal chain full of different bits of analog gear. This unique setup became a key part of Doom’s soundtrack, thanks to Mick's creative use of signal flow.
Here is a great article explaining the basics of signal flow:
https://www.musicianonamission.com/signal-flow/
Sample Rate & Bit Depth
For this next topic, it’s important to have a good grasp of how sound works. We covered this in the previous chapter, but here is a very brief 2-minute video to refresh your memory:
https://youtu.be/XLfQpv2ZRPU
Now that’s out of the way, let’s dive into sample rate & bit depth with this phenomenal overview by iZotope covering all of the basics: https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html
Game Audio Applications
Now that we know the basic theory, it’s time to dive into some of the more practical uses for sample rate and bit depth in the context of game audio.
Preserving quality when pitch shifting
Shifting the pitch of a recorded sound up or down is a common process in sound design.
However, lowering the pitch of a sound can cause a loss of quality: the samples that make up the sound are stretched over a longer period of time, which effectively lowers the sample rate as you lower the pitch.
This is similar to how a digital image becomes blurrier as you zoom in.
The solution to this is to record at higher sample rates, allowing you to slow the audio down further before the sound degrades.
Here is a great comparison by Sage Audio of what it actually sounds like when we pitch a high sample rate recording down vs a low one: https://youtu.be/-pvRlNu2ydM?t=38
Additionally, there are ultra-wide-range microphones, like the Sanken CO-100K, that can record frequencies up to 100 kHz. Capturing this extra high-frequency content gives you even more detail to work with when pitching a sound down.
If you want to listen to a sound comparison check out this video from Imphenzia:
https://youtu.be/e093pWoWCBs
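To put some rough numbers on this, here is a minimal Python sketch. It assumes the pitch shift is done by simply slowing the recording down (varispeed-style resampling), and that the microphone and recorder actually captured content all the way up to half the sample rate (the Nyquist frequency). The function name is made up for this example.

```python
def bandwidth_after_pitch_down(sample_rate_hz, semitones_down):
    """Highest frequency left in a recording after pitching it down.

    Assumption: the shift is a plain slow-down, so every frequency
    is scaled by the same ratio.
    """
    nyquist_hz = sample_rate_hz / 2           # highest frequency the file can hold
    ratio = 2 ** (-semitones_down / 12)       # 12 semitones down = half the frequency
    return nyquist_hz * ratio


# Pitch three recordings down by one octave (12 semitones).
for rate in (44_100, 96_000, 192_000):
    top = bandwidth_after_pitch_down(rate, 12)
    print(f"{rate} Hz recording -> content up to {top:.0f} Hz")
```

Under these assumptions, a 44.1 kHz recording pitched down an octave only has content up to about 11 kHz, so it sounds dull, while a 96 kHz recording still reaches 24 kHz, which covers the full range of human hearing.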
Deliberate sound degradation (the 8-bit effect)
The sound of sample rate or bit depth reduction is unique, and sometimes desirable for sound design!
For example, you might lower the bit depth of a sound to give it a more retro/arcade quality, like the sound effects in early Mario games, or use the harsh, ‘digital’-sounding artifacts to create some sci-fi or UI sound effects.
This kind of deliberate degradation is usually done with a plugin called a bit crusher, demonstrated very nicely in this video: https://youtu.be/dlXs28tnjUI
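If you are curious what a bit crusher does under the hood, here is a very simplified Python sketch. It is not how any particular plugin is implemented. It just shows the two basic ideas: rounding each sample to fewer amplitude steps (bit depth reduction) and repeating samples (a crude sample rate reduction). The function name and the `hold` parameter are made up for this example.

```python
import math


def bit_crush(samples, bits, hold=1):
    """Crush a list of samples in the range -1.0 to 1.0.

    bits: target bit depth (fewer bits = fewer amplitude steps).
    hold: keep each quantised value for this many samples,
          which imitates a lower sample rate.
    """
    levels = 2 ** (bits - 1)                  # steps on each side of zero
    crushed = []
    held = 0.0
    for i, sample in enumerate(samples):
        if i % hold == 0:
            # Snap the sample to the nearest available amplitude step.
            held = round(sample * levels) / levels
            held = max(-1.0, min(1.0, held))
        crushed.append(held)
    return crushed


# One second of a 440 Hz sine wave at 44.1 kHz, crushed to 4 bits
# with every value held for 4 samples.
sine = [math.sin(2 * math.pi * 440 * n / 44_100) for n in range(44_100)]
crushed = bit_crush(sine, bits=4, hold=4)
```

The smooth sine wave turns into a staircase with only a handful of possible values, and those hard steps are where the harsh, ‘digital’ character comes from.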
Headroom
Headroom is the amount of space you have to make a signal louder before it ‘clips’, causing distortion.
When we record sounds we generally want to avoid clipping, unless done on purpose as a stylistic choice, because any detail that is clipped is lost and cannot be recovered.
The higher the bit depth, the greater the dynamic range we can capture. This means we can record at lower levels, leaving more headroom for loud peaks, without the quiet details getting lost in noise. It’s most common to record at a depth of 24 bits, but some modern audio recorders can record 32-bit float files, which pretty much eliminate clipping in the recorded file altogether.
If you are interested in the technical details I highly recommend this explanation of the concept on the Sound Devices website: https://www.sounddevices.com/32-bit-float-files-explained/
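As a rough illustration of the numbers involved, here is a small Python sketch. It assumes standard fixed-point (integer) audio, where 1.0 represents full scale (0 dBFS), and it uses the common rule of thumb of roughly 6 dB of dynamic range per bit. The function names are made up for this example.

```python
import math


def headroom_db(peak_amplitude):
    """How far a peak sits below full scale, in dB.

    peak_amplitude: loudest sample as a fraction of full scale (0 to 1.0).
    """
    return -20 * math.log10(peak_amplitude)


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of integer audio (about 6 dB per bit)."""
    return 20 * math.log10(2 ** bit_depth)


print(round(headroom_db(0.5), 1))       # 6.0 -> a peak at half of full scale leaves about 6 dB
print(round(dynamic_range_db(16)))      # 96
print(round(dynamic_range_db(24)))      # 144
```

This is why 24-bit recording is so forgiving: you can leave a generous amount of headroom and still have far more dynamic range than a 16-bit file.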
File Size
Lowering the sample rate and bit depth of a sound allows us to reduce its file size, which is extremely important in game development, as it helps improve performance and saves memory.
It is still better to use higher sample rates/bit depths when recording sounds and processing them, but storage space isn’t infinite, and in the world of game development it’s important to save memory wherever you can. In these situations, you have to compromise between reducing the file size and preserving the quality of the sound.
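To see how quickly this adds up, here is a minimal Python sketch that works out the size of uncompressed (PCM) audio, which is just duration x sample rate x bytes per sample x channels. It ignores the small file header, and the function name is made up for this example.

```python
def pcm_size_bytes(seconds, sample_rate_hz, bit_depth, channels):
    """Approximate size of uncompressed audio, ignoring the file header."""
    bytes_per_sample = bit_depth / 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


# One minute of stereo audio at a typical recording quality...
full_quality = pcm_size_bytes(60, 48_000, 24, 2)
# ...versus the same minute as a mono, 16-bit, 22.05 kHz file.
reduced = pcm_size_bytes(60, 22_050, 16, 1)

print(full_quality)   # 17280000 bytes (about 17 MB)
print(reduced)        # 2646000 bytes (about 2.6 MB)
```

Multiply that by the hundreds or thousands of sounds in a typical game and you can see why every reduction counts.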
Fortunately, there are a number of options specifically for this purpose...
File Formats & Compression
When preparing a sound for importing into a game, we use compression algorithms to reduce the file size as efficiently as possible without sacrificing too much quality.
Some compression algorithms are lossless, meaning that there is no loss of quality when they are decompressed and played back (e.g. FLAC). However, these do not reduce the file size enough to be practical for game development. Instead, we use lossy compression algorithms. These can reduce file sizes much more effectively, but at the cost of some audio quality (e.g. MP3).
Sound designers usually use a file format called Ogg Vorbis, as it compresses files well with minimal loss of quality, and allows for seamless looping playback (vital when creating ambience sounds or musical loops).
Here is a great recap from the folks at iZotope on the basic concepts and things to avoid when converting audio files to different formats: https://www.youtube.com/embed/oLpGkfUVJkA
And if you want to compare all of the different audio file formats out there and know their pros and cons, then check out this guide from Indie Game Music: https://indiegamemusic.com/formatguide.php
Further Reading
If you’re interested in learning more about digital audio, I would highly recommend reading these two fantastic articles that go into more detail on these very important fundamental topics.
Understanding the Fundamentals of Digital Audio, by Kajal Mishra:
https://circuitdigest.com/article/understanding-the-fundamentals-of-digital-audio