Hi, it's Lachlan here, and welcome to the first episode of
Audio File, a series of videos that explains various technical aspects of audio
and music.
Now, for this first video I wanted to focus on audio bit depths
and 24-bit audio, because of a conversation I was having with a friend
about 16-bit audio, 24-bit audio, and signal-to-noise ratios.
There is a widely held misconception among audiophiles
that 24-bit audio relates to the resolution of an audio file.
It doesn't, and this video explains why.
Now, this is somewhat complex, but I promise you that if you watch this,
it will save you time and money next time someone tries to sell you
a 24-bit audio player.
I didn't understand this stuff for a
while and it took me some reading and
research.
I've linked all the resources I've
looked at in the description of this
video
so if you want a more in-depth explanation of the stuff I'm going to talk about
in this video, you can go ahead and follow the links. So this video will cover
three questions. Number 1: What is audio bit depth?
Number 2: What does a higher bit depth mean? And Number 3:
Why is 24-bit audio a pointless waste of time?
So Number 1. What is audio bit depth?
So as we know, sound comes in a wave. If we want to record this continuous wave
digitally, we record it as a certain number of samples per second.
The number of samples per second is called the sampling rate,
but we will talk about this in another video. So let's draw
a digital signal. Note that in a past video I drew a digital signal
like this: a stair-stepped wave. Let's get this misconception
out of the way first. Depicting digital
audio waveforms
as stair steps is incorrect and
misleading and is one of the main
reasons why the 24-bit
audio myth has been propagated. For a
complete explanation of why representing a digital audio wave
as a stair-stepped wave is wrong, I recommend you watch Xiph.org's
excellent and entertaining video presentation. But for now,
I'll just say this: it is more accurate to depict the way digital audio works
by drawing it as a series of lollipops: we have a number of discrete
individual samples and at each sample we
have an audio level.
When the digital-to-analog converter reconstructs the audio wave
from these individual samples, it creates a smooth wave.
There is no aliasing or stair-stepping in the final output.
The output is always smooth, because the digital-to-analog converter
creates a line that passes through all sample points.
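To see why, here's a rough Python sketch (just my own illustration, using numpy and an arbitrary test tone) of how a set of lollipop samples turns back into a smooth curve. It uses the sinc interpolation that a DAC's reconstruction filter approximates:

```python
import numpy as np

# Sample a band-limited tone, then rebuild it on a fine time grid using
# Whittaker-Shannon (sinc) interpolation -- the ideal version of what a
# DAC's reconstruction filter does.
fs = 8000                          # sampling rate, Hz (arbitrary choice)
f = 440                            # tone frequency, well below fs / 2
n = np.arange(64)                  # 64 discrete "lollipop" samples
samples = np.sin(2 * np.pi * f * n / fs)

t = np.linspace(0, (len(n) - 1) / fs, 4000)   # much finer time grid
reconstructed = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])

# The rebuilt wave is smooth: the biggest jump between neighbouring
# points on the fine grid is tiny -- no stair steps anywhere.
print(np.max(np.abs(np.diff(reconstructed))))
```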
So then what is a bit depth? To answer that
let's go back to our wave. When we sample our wave, we need to assign a level to 
each sample
that best represents the original waveform. We need to give
each sample a value. Computers represent
values
in bits. The word bit is a portmanteau of BINARY DIGIT, i.e. a 1 or a 0.
So when we say bit depth, we are saying how many bits we are using to describe the
different levels
of one audio sample. So let's say we have a bit
depth of 1. This means that we can assign two levels to the waveform,
0 and 1. So we end up with this.
Let's say we have a bit
depth of 2. This means we can have 4 possible combinations:
00, 01, 10, and 11.
When we get up to 16-bit audio, we have 65,536
possible values to describe the audio level.
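In other words, every extra bit doubles the number of levels. Here's the arithmetic as a quick Python sketch:

```python
# Each bit doubles the number of available levels: levels = 2 ** bits.
for bits in (1, 2, 4, 8, 16, 24):
    print(f"{bits:>2} bits -> {2 ** bits:,} levels")
# 16 bits -> 65,536 levels; 24 bits -> 16,777,216 levels.
```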
So you may have noticed that when we
assign these values to the samples
we are rounding up or down to the
nearest level.
This is called quantization error. As we use lower and lower bit depths,
we are forced to round up or down more and more,
because we have fewer and fewer levels to work with. When we compare our original
wave with the waveform we get after quantizing, we can see that the wave
is still smooth, but it's different.
So what does this quantization error sound like?
It doesn't sound like stair-stepping or aliasing or anything like that.
For various reasons, and again I urge you to check out those resources in the
description of this video, this difference due to bit depth just
sounds like plain old noise underneath our original signal.
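Here's a small Python sketch (my own illustration, with an arbitrary 440 Hz test tone) that quantizes a sine wave at a few bit depths and measures the error. The error behaves like a noise floor, shrinking by roughly 6 dB for every extra bit:

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t)         # full-scale test tone

def quantize(x, bits):
    # Round each sample to the nearest of 2**bits evenly spaced levels.
    scale = 2 ** (bits - 1) - 1
    return np.round(x * scale) / scale

for bits in (4, 8, 16):
    error = quantize(signal, bits) - signal  # the quantization error
    snr_db = 10 * np.log10(np.mean(signal ** 2) / np.mean(error ** 2))
    print(f"{bits:>2} bits: SNR ~ {snr_db:.1f} dB")
# Each extra bit buys roughly 6 dB of signal-to-noise ratio.
```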
So that answers our second question:
a higher bit depth means less quantization noise.
So rather than just tell you this, let me show you how this works
in practice. So I have this beautiful
piano piece by Chris
Zabriskie opened up in Adobe
Audition
and you can find a link to this music
track in the description of this video.
This is a 32-bit floating-point music file, so if we have a listen, it sounds
very nice; it's a very good song. It sounds like this:
Now let's see what happens when we take
this music file and convert it to a
4-bit music file. I would turn
your speakers down for this,
because it could get messy. So we'll
convert this to 4 bits, and as you can see,
immediately the waveform becomes something more like
a stair-stepped waveform. But when we
actually listen to this music,
you will hear that the end result
doesn't quite sound the same.
So even though that does sound terrible,
if you listen closely
you can still hear the underlying piano track.
All we have now
is a lot of additional quantization noise.
So let's go back and see what happens
if we convert this file to an 8-bit music file. As you can see here,
immediately the waveform looks more or
less like the 32-bit original waveform.
And if we listen to the music this is what
it sounds like.
So as you can hear, by increasing the bit
depth we've reduced the quantization noise,
but we haven't done anything like
increasing detail or resolution
or anything like that. Now let's hear
what happens when we go up to
a 16-bit file.
So once we're at 16 bits, you can tell
that it is now very hard to hear the quantization noise.
Again, by increasing the bit depth we
have not increased detail or resolution.
We've just reduced the amount of underlying quantization noise.
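If you don't have Adobe Audition handy, you can reproduce this experiment in a few lines of Python. This is just a sketch: "piano.wav" is a placeholder file name, and I'm assuming a 16-bit PCM or float WAV as input:

```python
import numpy as np
from scipy.io import wavfile

# Requantize a track to 4, 8 and 16 bits and listen to the results.
rate, data = wavfile.read("piano.wav")       # placeholder file name
if data.dtype == np.int16:                   # normalise PCM to [-1, 1]
    data = data.astype(np.float32) / 32768.0

for bits in (4, 8, 16):
    scale = 2 ** (bits - 1) - 1
    crushed = np.round(data * scale) / scale
    wavfile.write(f"piano_{bits}bit.wav", rate, crushed.astype(np.float32))
# At 4 bits the noise is unmissable; at 16 bits it is very hard to hear.
```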
So we have our bits and we have our noise.
What this means is that an audio file's bit depth determines the minimum noise floor
of the recording, and therefore it determines the maximum dynamic
range of the recording, since any sound softer than the noise floor
will be masked by the quantization noise. So again,
this answers our second question: a higher bit depth means that you have a higher
dynamic range. With a greater bit depth you can have a bigger difference between
the loudest and softest sound, because you have more room above the quantization
noise floor.
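The rule of thumb behind the numbers that follow: each bit of depth adds about 6 dB of dynamic range, since each bit doubles the number of levels and 20·log10(2) is about 6.02 dB. A quick check:

```python
import math

# Dynamic range of an N-bit file: about 20 * log10(2 ** N) dB,
# i.e. roughly 6.02 dB per bit.
for bits in (16, 24):
    print(f"{bits} bits -> {20 * math.log10(2 ** bits):.1f} dB")
# 16 bits -> 96.3 dB and 24 bits -> 144.5 dB, matching the 96 dB and
# 144 dB figures quoted in this video.
```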
The human ear has a dynamic range of roughly
140 dB. This is the difference between
a pin dropping and a jet engine. A 16-bit
recording has a maximum dynamic
range of 96 decibels, but thanks to a
technique called
dithering, 16-bit audio has a
perceived dynamic range
of something closer to 120 decibels.
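Dithering is worth a sketch of its own. The trick is to add a tiny amount of random noise, on the order of one least-significant bit, before rounding; detail quieter than the last bit then survives as noise instead of vanishing. Here's a minimal illustration (my own, not the Xiph.org demo, with made-up numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 48000
t = np.arange(fs) / fs
scale = 2 ** 15 - 1                    # 16-bit signed full scale

# A tone quieter than one 16-bit step (below -96 dBFS).
tone = 0.4 / scale * np.sin(2 * np.pi * 440 * t)

plain = np.round(tone * scale) / scale          # no dither
# TPDF dither: two uniform noises summed give triangular noise
# spanning about +/- 1 LSB, added before rounding.
noisy = tone * scale + rng.uniform(-0.5, 0.5, fs) + rng.uniform(-0.5, 0.5, fs)
dithered = np.round(noisy) / scale

print(np.max(np.abs(plain)))                    # 0.0 -- the tone vanished
print(np.corrcoef(dithered, tone)[0, 1])        # positive -- the tone
                                                # survives beneath the noise
```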
Again, I'll refer you to the Xiph.org video
for an explanation of dithering. Now, 24-bit audio gives us
a maximum dynamic range of 144 decibels.
That sounds great, you might say! What's the problem with more dynamic range?
Well, when we return to our audio file, we can get a clue.
When we have our file at 16 bits, it's very hard for us to hear the quantization
noise
unless we turn up the volume very high.
Why is this?
Because as you're sitting here watching
this video there's probably noise from
your computer.
Noise from the outside. Noise from the train. Noise from the air conditioning.
Noise from the DAC and amplifier in your equipment.
So in a 16-bit recording, in order to hear the quietest sound
in the recording, you would need to play it
at a volume where the loudest sound
is at least 96 decibels higher than the
quietest sound.
This means that if you are in a library with a 30 decibel noise floor,
the loudest sound in the music would be at the volume of a jackhammer
(30 dB + 96 dB = 126 dB).
And this means that if you are in that same
library with a 24-bit recording, in order to hear the
softest sound in the recording,
the loudest sound in the recording would
need to be at the volume of a rocket launch
(30 dB + 144 dB = 174 dB). It's actually
speculated that listening to music
this loud might not just destroy your hearing;
it might send you into a coma
or kill you. Obviously you are not
going to listen that loud.
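To put rough numbers on that: the playback level you'd need is simply your room's noise floor plus the format's dynamic range. A quick sanity check using the figures from this video:

```python
# Peak level needed so the quietest detail clears the room's noise floor.
room_db = 30                              # a quiet library
for fmt, dynamic_range_db in (("16-bit", 96), ("24-bit", 144)):
    print(f"{fmt}: peaks at ~{room_db + dynamic_range_db} dB SPL")
# 16-bit: ~126 dB (jackhammer); 24-bit: ~174 dB (rocket launch territory).
```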
So the end result is that no matter how much bit depth you have,
you cannot overcome environmental noise
unless you listen to music
in a soundproof room. 16-bit audio
was chosen as the standard because
audio engineers thought that 96 decibels of
dynamic range,
or 120 decibels of dynamic range with dithering,
would be more than enough for anyone who
wasn't listening to music in space.
And most recordings these days only have
about 24 decibels of dynamic range to
begin with.
Good classical recordings might have a
dynamic range of just over 60 decibels.
So 24-bit audio, with a dynamic range of
144 decibels,
is a pointless waste of time for music files,
sold by audio companies on the myth
that bit depth corresponds to the
resolution of an audio file.
It is a waste of hard drive space and a
waste of money.
So why does 24-bit audio even exist?
Because for musicians, when they are producing music
in the studio, 24-bit means that they can
mess around with the levels of their
tracks and have room to move
without introducing more noise when they
increase the gain.
That's why musicians use it and that's
why it's become an industry buzzword.
When the CD is mastered and you are
listening with your headphones
24-bit audio is pointless unless you are
listening to music in a vacuum.
So next time someone tells you that they
can hear the difference between 24-bit
audio and 16-bit audio, ask them if they do
their music listening in space.
Or better yet, share this video with them, and hopefully we can move the industry
into focusing on things that
actually make a difference. Click the
like button if you found this video
helpful,
and please share it with your audiophile friends. Check out the links in the
description of this video.
I am very, very grateful to the original
authors for helping me understand this
quite complex area. I'm looking forward to
your comments, and as always,
happy listening!
