DESTON BENNETT: Thanks
for coming out today.
Again, my name's
Deston Bennett.
I'm with the Grammy producers
and engineers wing.
The Grammy Awards, as many of
you may know, are the only
music awards that are peer
determined, meaning it's not
the public that votes.
Those who vote are members of
the Recording Academy who are
hands-on music creators--
artists, songwriters,
musicians,
producers, and engineers.
From the very beginning at our
founding in 1956, the basis
for the Grammy Awards process
has been a commitment to
excellence.
The Recording Academy's original
credo clearly states
that the awards are not about
sales, and they're not about
popularity.
Musical excellence in all areas
is the only criterion
Grammy voters are charged with
using to determine who gets
nominated, and what will win.
Knowing that little bit of
information should help you
whenever there's some
controversy about the Grammys,
as there sometimes can be.
For example, take the year jazz
bassist Esperanza Spalding won
the Best New Artist category
against fellow nominees Drake,
Florence and the Machine,
Mumford and
Sons, and Justin Bieber.
She won because the majority
of the Grammy voters were
familiar with Esperanza and her
work, and they saw her as
a stand-out that year.
It's pretty cool when
you think about it.
Many of you who are musicians
or audio producers or
engineers may be eligible to
be members of the Academy.
And I'm happy to speak to you
about that after this is over
if you like.
You can also get some more
information or join by
visiting Grammy365.com.
Today specifically, we're here
to talk about excellence in
sound, something that's key
to great recordings.
The P&E wing has partnered with
the Consumer Electronics
Association and others on an
initiative we call Quality
Sound Matters.
We represent people who truly
understand the difference good
sound makes, and we want to
share their enthusiasm and
excitement about quality
with everybody.
Today, we have a very cool
presentation from a
Grammy-winning engineer that
we think you'll enjoy.
And I want to give a big thank
you to Neil Annala and Joe
Rosenberg for bringing
us here today.
We'd also like to thank JBL
and Prism Sound for this
amazing sound system you're
going to hear today, too.
The speakers in particular
incorporate some very new,
exciting technology that
you're amongst
the first to hear.
And to top it off, I really want
to introduce an amazing
engineer producer who's worked
with artists including
Metallica, Linkin Park,
Green Day, and U2,
along with last week's--
well, not last week, but a
recent number one album, the
Black Sabbath project.
He's a two-time Grammy winner
for his work on the Red Hot
Chili Peppers' "Stadium
Arcadium" project, as well as
Adele's "21" album.
He has another interesting
honor.
In 2012, he was named the
International Engineer of the
Year by England's Music
Producers Guild.
Please welcome Andrew Scheps.
[APPLAUSE]
ANDREW SCHEPS: First of all,
thanks for coming.
This is as full a house as we
could have in here, I think.
So thank you so much, and thanks
again to Neil and Joe
for putting this together.
This is awesome.
So later on, we're going to
listen to a bunch of stuff,
which is the point
of what I do.
And the Recording Academy has
been really great about
sponsoring me to do this talk
all over the country.
The idea of the talk, it was
originally put together for
what the Recording Academy
called their Grammy Future Now
conference, which was sort of a
mini, one-day TED conference
for producers and engineers, for
people who make music in
Los Angeles.
And since then, I've gone around
the country, and most of
the time I give this
presentation to
producers and engineers.
And it's because there's a
lot of information in the
presentation that, as people who
make records, we sort of
kind of know, but we don't
actually know.
And so I'm trying to put numbers
and facts behind the
things we think we know so that
when we listen, you can
actually compare things and
you know what it is you're
listening to, and why there
might be differences and
things like that.
So I'll start the way I usually
start by asking how
many people in this room are
artists and make records, or
have ever released a record.
So still a good number of you.
So if you've released a record,
how many of you have
then gone and bought your record
to make sure that what
comes off the services sounds
like what you sent them?
So that's about normal.
About a third of the
hands, maybe less.
And that's exactly the same
with people who do that as
their day job-- or night
job, depending on
the hours you keep.
It's not something people
really think about.
They finish their record, they
master it, like oh, it's done.
Send it off, and you're done.
And of course, now with all
the digital services--
and we'll get into lots
of them specifically--
and there are a few that happen
to be housed in this
building or down
in San Bruno--
there are lots and lots of
different ways that music gets
out into the world.
And so, the idea is to give some
context to know, what are
all these possibilities?
How do they compare?
And do they actually impact the
consumer's experience when
they listen to your music?
So that's the idea.
Now along the way, I can usually
get away with a lot of
sort of vagaries, because
I'm talking to
producers and engineers.
All right, this is the question
just for you guys.
How many people in this room
know more about digital audio
theory than me?
There's going to be-- come on.
It's everybody in the room.
But seriously, how many people
work directly with digital
audio in the room?
OK, I'm going to be vague.
I'm going to be slightly
inaccurate, and I would
welcome corrections
along the way.
So I've done the presentation,
I think, 12 times now.
11 of those times were for
producers and engineers, and
once was a few weeks ago, which
is when Neil came and
saw me at Fantasy Studios
in Berkeley.
And that was for a room full
of people from the tech
community, including people from
Google and YouTube and
Google Play Music, as well as
SoundCloud and Apple and
Rhapsody and Arteo, and a
few other companies, and
Fraunhofer, who developed
the MP3 and AAC codecs.
And I got my butt kicked.
And I'm fine with that.
I would love to get my butt
kicked, because every time I
give this presentation, I know
more, and I can kick butt back
a little bit, which
is the point.
And I think what happens is
people get into their little
rabbit holes on what
they work on.
So I make records, and I want
to make great sounding
records, but I don't want to
follow it through the food
chain down to the consumer,
because that's not what I do.
Now two years ago, I started
my own record label, so now
that became part of what I do.
And I used to think, I'm going
to start a label because the
labels suck.
They don't know what
they're doing.
It turns out I don't know what
I'm doing, and it's really,
really difficult, and
there's a lot to it.
So every part of this process
of getting music into the
hands of people who listen to it
is unbelievably difficult,
incredibly technical, and
fraught with peril for the
audio along the way.
So we'll talk about some
of the specifics.
So what I want to do first,
though, is put recording into
perspective, OK?
So for thousands and thousands
of years-- and now we start my
very fine PowerPoint--
there has been music.
OK, for who knows how many?
Let's say 10,000?
Is that a good number?
This is where I get vague,
and everybody in the
room backs me up.
So we're going to say 10,000
years, there's been music in
the form of songs that have
been written by somebody.
And then, they would perform
their song for somebody in
their village or something
like that.
And the only way music could
propagate would be either they
would go to the next village, or
they would teach their song
to somebody and then they would
go to the next village,
or people from the next village
would come hear them
and go back.
Right?
So there's your music industry
for the first 9,900 years.
Fair enough?
OK, about 100 years ago--
a little bit more-- but
basically, about 100 years
ago, there started to be
consumer recordings of audio.
And there were a few things
before this, but let's say the
wax cylinder was the first
viable format.
So you have the Edison cylinder
where people would
come into a room.
They would make lots of noise.
That noise gets collected
by a horn.
It would get scratched
on to this disc
that's spinning around.
And then, you could take
that disc, and go
play it back elsewhere.
So all of a sudden,
you have created
what is called recording.
Recording, especially back then,
was technically just
a delay process, right?
So you perform the music, and
then you capture it for a
second, and then you can
carry it around.
And then later on at any point,
you can play it back.
So now, you can get rid of some
of the space and time
constraints of everybody
come to your concert.
Now you can record your concert
and send it out.
Now this caused a huge uproar.
And in researching this for this
presentation, and also I
teach a recording class where I
try to give a little bit of
a history, there's some amazing
quotes from John
Philip Sousa and people like
that about how recording was
going to destroy not
music, but society.
Destroy it.
You have to be in the room
with the musician.
So I think we've all kind
of gotten over that.
I mean, I would hope everybody
here enjoys going to concerts
and things like that.
But we've gotten over the fact
that we're going to completely
destroy society.
Music isn't the only thing
that's destroyed.
And it's just one
of many things.
OK, so that's 100 years.
That's it.
Then about 50 years ago, mainly
with technology out of
Germany in the '40s, and then
also some techniques developed
bouncing from one tape machine
to another tape machine, you
started to be able to not just
capture a live performance,
but you had what we call
overdubs, which is basically,
you make a recording, and then
you record some more stuff to
go with it.
So now, you can record at
different times, and all of
those things add up to
make a recording.
A lot of the early Beatles
recordings
were examples of bouncing.
They would record the band, then
they would play the band
back while recording something
else, combine those together.
So that was a technique.
The German tape machines allowed
you to actually have
multiple tracks that sit
side by side.
So you record on a couple of
tracks, then you record on
another track.
So things we're sort
of familiar with.
But that was basically the '40s,
into the '50s--
but even in the '50s, most
commercial recordings were
live recordings, to mono or
possibly starting to get into
three-track tape, but eventually
going to be mono
going out into the world.
But once you start having these
multi-track tapes, then
you have to mix those
things together.
So this created something
in the music
industry that didn't exist.
It used to be there were only
recording engineers who
captured things.
Now all of a sudden, you needed
people who could take
all the stuff that was captured,
combine it together,
and make it something that
could go off into
the world to be heard.
So that's the mix of
the recording of
the song with overdubs.
And then, once you actually
had consumer formats--
whether it was the cylinders or
onto LPs or 45's or 78's or
cassettes or eight-track
tapes, up into CDs--
you needed to have some sort
of standard as to how the
music would have to be put onto
these media to then be
distributed.
So you would get your mastered
mix of the recording of a song
with overdubs.
Now in a room full of engineers,
that kills.
That's really funny, because
it's a font joke.
Mastering makes things loud.
That's the idea so-- all right.
I'm sorry.
It's the wrong crowd.
OK, so this is now what
the artist sends off
into the world, OK?
This is what a record is.
But it's much more
than that, right?
Music, pre-recording, was
nothing more than art.
There was some commerce
involved, but it
was basically art.
It was musicians and composers
who would have a piece of
themselves that they would want
to capture, and then let
other people hear it, and
recreate the emotion they were
trying to create when they
performed the music live.
So let's say that really, the
recording is more like this.
And you don't have to read it.
The point is I needed to get a
lot of text on the screen for
later on for one of my very
clever, inaccurate analogies.
The idea being that we need to
keep in mind that this is art,
and this is the difference
between looking at an art book
and going to a museum, OK?
There are differences.
And the idea of live performance
versus recording
is one stage of this
difference.
But there's also a huge
difference depending on how
that recording gets to you
at the end of the day.
And when we actually get to the
listening portion, I think
someone once said it's stuff
that you can't unhear.
You'll hear the difference
between some of these file
formats and bit rates and things
like that, and you'll
decide for yourself whether
it makes a difference.
My theory is I think it does.
OK, so now we're going to
go through part of the
presentation, which is a little
more technical, which
means it's a little dumbed
down for most of the
people in this room.
But there are a couple
important things.
So the first thing is,
the difference
between sound and audio.
And I'm sure most people in this
room know this, but the
idea that's important is that
all sound is analog, period.
An analog meaning infinitely
variable, OK?
Until you get down to the
molecular quantum level, any
sound in the air is infinitely
variable acoustic pressure
waves that travel around
the room, right?
Everybody cool with that?
Now, you can buy a digital
microphone or a digital pair
of headphones, and that isn't
actually what they are.
They are analog microphones
and analog headphones that
happen to have converters
built into them.
So they are two things in one.
But they are an analog device.
There's no such thing as
a digital microphone.
The only way you can record
something is to put something
in the air in the way of the
pressure wave so it moves
because of the pressure wave,
and then using lots of
different technologies for how
you design your microphone.
You turn that into a
voltage is the most
common way to do it.
Then, you can digitize
the voltage, OK?
So this would be the
simplest sound.
It's a sine wave.
It's information at only
one frequency.
But the idea is while it's a
sound wave, you zoom in, you
zoom in, you zoom in.
It never pixelates, right?
It's smooth all the way down.
So the idea of digitizing--
and this is where, feel free
to take a nap or something
real quick.
So obviously with digital
systems, you don't have the
luxury of looking at something
infinitely many times a
second, right?
You have to have a clock.
You have to decide how many
times you're going to look.
So for the producers and
engineers I talk to, this is
actually really helpful.
I know it's very simplistic,
but it's just the easiest
visual representation
of what sampling is.
So the idea is, time
across the bottom,
voltage up and down.
And every time there's
a vertical
line, that's a sample.
So how many times a second--
let's say that's a second, and
then we count the number of
lines, and that's how many
times a second we're
looking at it.
And each time we look, we say,
how big's the voltage?
And we write it down
using a number.
And how many bits we get to
write down that number are our
horizontal lines.
Everybody's good with
that, right?
So the idea being that if you
look at this particular grid
superimposed on the sine wave,
we almost never go directly
through an intersection.
So we are always wrong.
We are always rounding.
And obviously, anyone in the
room who really knows digital
theory knows that that's OK.
There's quantization error,
but you make up for it,
and you can reconstruct
things quite well.
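Here's a minimal sketch of that grid picture in Python. The function name and parameters are hypothetical, purely for illustration, not any real audio API: each sample reads the "voltage" of a sine wave, and the bit depth decides how finely that reading can be written down.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, bit_depth, duration_s=0.001):
    """Sample a unit-amplitude sine wave, rounding each reading to the
    nearest level the bit depth can store: vertical grid lines are
    samples, horizontal grid lines are the storable values."""
    levels = 2 ** bit_depth          # how many horizontal lines we get
    step = 2.0 / levels              # spacing between representable values
    out = []
    for i in range(int(sample_rate * duration_s)):
        v = math.sin(2 * math.pi * freq_hz * i / sample_rate)  # true voltage
        out.append(round(v / step) * step)                     # always rounding
    return out

# A 4-bit capture is "always wrong" by up to half a step; bumping the
# bit depth to 16 shrinks that worst-case rounding error dramatically.
coarse = sample_and_quantize(1000, 44100, 4)
```

The worst-case rounding error is half a step, which is exactly why upping the bit depth (more horizontal lines) makes the capture more accurate.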
You'll also know this sample
rate is way higher than we
actually need to capture
this sine wave.
You only need just over
two samples per
cycle, and you're good.
So that's fine.
And I'm not saying that this is
not a good sample for this
particular sine wave.
But as a visual
representation, it's important.
The idea being, though, if we
want to be more accurate, we
can do two things--
we can up the sample rate, and
we can up the bit depth.
So now, this is sort of the
aha moment for a lot of
engineers who've got little
pop-up menus for sample rates
and bit depths, and they don't
actually know what they do
other than bigger is better,
so I'll record more stuff.
Now there are diminishing
returns.
In terms of actually building
audio hardware, it's very hard
to build something that will
work equally well at every
single sample rate.
And I do lots of listening tests
for just my studio for
making records, and I found
that there's a lot of gear
that works great at 96
kilohertz, and up at 192, it
doesn't really work so well,
because some things are
getting stressed, and it's just
not optimized for it.
So it's not always that higher
sample rate is better.
But in a perfect system, a
higher sample rate will be
more accurate more
of the time.
Right?
I mean, I think that's
fair enough to say.
And the same thing
with bit depth.
And in some ways, bit
depth is more
important than sample rate.
Now the other thing is you could
very easily make the
theoretical argument that 44.1
kilohertz is fine, because
human hearing goes up to
around 20 kilohertz.
And I know everyone probably
already knows this, but
basically, take your sample
rate, divide it by 2.
That's the highest frequency
you can capture
at that sample rate.
Fair enough?
So 44.1, you get
down to 22.05.
Wow.
22.05.
There you go.
Sorry.
My math just went
out the window.
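That divide-by-two rule (the Nyquist limit) is simple enough to sketch in a couple of lines; the function name here is just for illustration:

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency a sample rate can theoretically represent:
    you need just over two samples per cycle."""
    return sample_rate_hz / 2

# CD rate: 44100 / 2 = 22050 Hz, the 22.05 kHz figure above
print(nyquist_limit(44100))   # 22050.0
print(nyquist_limit(96000))   # 48000.0
```

As the next point in the talk explains, real systems can't actually use all of that ceiling, because no filter can cut off perfectly right at it.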
But the problem is, to make that
work, you need a perfect
filter that cuts off everything
above that
frequency, but doesn't touch
anything below it, right?
That filter cannot be built.
It doesn't exist, especially
as an analog filter.
So this is part of why higher
sample rates are really
important for capturing
things--
to get an accurate picture at
20k, you kind of need to leave
it alone out to 40 or 48,
something like that.
So if you start working at 96,
and you can either use very
gentle analog filters or you
can start getting into
over-sampling and digital
filters, but you can do things
way past where we hear that
are brutal, and they don't
affect what goes on down
where we do hear.
Now there are also people who
argue that we respond to
frequencies above 20k.
We're not getting into that.
We're not getting into, we
should be tuning everything to
436 instead of 440.
There are lots of holistic
arguments about lots of
things, and I try and keep
things more real and in
numbers, because then I don't
have to argue about them for
12 hours, and not
get anywhere.
So I try and keep it that way.
So anyway, this is basically
what I try and impart about
sampling, even though you
guys know most of this.
So then we start talking
about the actual
consumer formats, OK?
Now there are two types of
digital audio files.
Again, I'm sure you guys know
this, but there's lossless
audio and there is
lossy audio.
OK, lossless audio is take a
PCM-encoded wave file at some
sample rate and some bit depth,
and you keep all the
numbers, period.
That's it.
That's all lossless is.
It's AIFF, WAV-- it used to be
Sound Designer II.
OK, so those are lossless files.
Now if you want to get into
the analog versus digital
debate, they're all
lossy, right?
We've thrown away some
information.
But we're not there.
Let's say that our capture
is awesome.
Let's say we're working
at 96k, 24-bit.
We've got lots of information.
If we keep all that information,
it's a lossless file.
Lossy is--
and again, I'm just going to go
through the presentation.
You guys know all of
this already, which
is why it's so great.
So lossy is the difference
between zipping a file, and
using something where when you
unzip your 25-page paper
you've just written, it's
missing a bunch of letters and
there's stuff spelled wrong.
And again, for a lot of
producers and engineers, they
don't actually understand
this concept.
They assume that lossy
compression is still OK,
because you end up with a PCM
audio stream at the other end.
But it's reconstructed, and
stuff is thrown away to
actually make those files.
And the reason being that if
you zip an audio file, you
save maybe 20%.
If you use FLAC, which is
optimized for audio, you can
maybe save 50% of the space.
But that's it.
So if you do some quick math,
and you're looking at a CD,
let's say, which is at 44.1,
16-bit, you're talking about
10 megabytes for every minute
of stereo music.
Those are big files.
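That 10-megabytes-per-minute figure is easy to sanity check. A sketch, with a hypothetical helper name: uncompressed PCM size is just samples per second times bytes per sample times channels times seconds.

```python
def pcm_bytes_per_minute(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM size: rate x bytes-per-sample x channels x 60 s."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60

cd_minute = pcm_bytes_per_minute(44100, 16)  # CD: 44.1 kHz, 16-bit, stereo
print(cd_minute)   # 10584000 bytes, i.e. roughly 10 MB per stereo minute
```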
You guys spend a lot of time
trying to get files from one
place to another quickly
and efficiently.
Those files are too big,
especially up to a few years
ago with the data pipes
going to phones,
all the mobile devices.
There's no way you're going
to send that much audio.
So this is why the lossy
codecs actually exist.
So very briefly, Fraunhofer,
which is based up here,
developed first the MP3 lossy
codec, and then more recently,
the AAC codec.
These are based upon
the way you hear.
If you know anything about the
way your brain processes the
information from your ears, your
ears have just got lots
of hairs in them.
And Julie will probably
talk more about
this than I need to.
But you basically are splitting
things into
different frequencies.
All of that information comes
up into your brain.
Your brain then processes it,
and decides, I don't need to
listen to that, not going to pay
attention to that, I hate
that, screw that-- oh,
that's important.
And then that's what you hear.
So there's lots and lots of
information that's thrown
away, which is why in a
crowded room, you can
concentrate on a conversation
with somebody, because you
start to mask things out.
And the same is true when you're
listening to music.
There are lots of things
that can be masked out.
So through a lot of research,
they decided, what can we
throw away, right?
The idea being that if we take
care of getting rid of some of
this information, then all of a
sudden, we're dealing with a
much smaller file.
And if you compare file sizes, a
decent bit rate MP3 is maybe
10% of the size of the
uncompressed audio file.
Yet in some listening tests, you
might be able to actually
do pretty well against the
file it was encoded from.
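The roughly-10% figure follows from the fact that a lossy file is sized by its bit rate rather than by sample count. Another illustrative sketch (hypothetical helper name), comparing a modest MP3 bit rate to the CD-quality PCM it came from:

```python
def lossy_bytes_per_minute(bitrate_kbps):
    """A lossy stream's size depends only on its bit rate."""
    return bitrate_kbps * 1000 // 8 * 60

pcm = 44100 * 2 * 2 * 60            # CD-quality PCM, bytes per stereo minute
mp3 = lossy_bytes_per_minute(128)   # a modest MP3 bit rate
print(round(100 * mp3 / pcm))       # about 9 percent of the original
```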
OK, so this is where I get very
inaccurate, and people
actually got mad at
me about this.
But that's OK, because I'm up
here and you're back there,
and you'd have to jump
over the screens.
So this is the way I explain
lossy encoding to people.
So if we go back to our
paragraph of lots and lots of
text, if I take out some of
the vowels, everybody can
still read this just as fast
as they used to, right?
The idea is your brain is
predicting what should be
there as much as it's
taking the input of
what actually is there.
So if we look at the word
"mastered" in that first line,
as soon as you get to the M and
you see the "stered" after
it, your brain has decided
there's probably an A there.
There's room for an A there.
There's an A there.
It fills in the blank.
If you have a tiny little smudge
on the page, your brain
is all about it.
That is an A. Absolutely.
Whereas on its own, that
smudge is nothing.
It's a smudge.
So that's the basic idea:
finding what we can throw
away, and still be able to
read as fast as we can?
Or, listen and enjoy the music
without having to figure out
what it was supposed
to sound like?
So the idea being that if I only
take out those vowels, we
don't save a whole
lot of space.
If I take out all the vowels,
now we're really starting to
save some space, and we can
compact it down, but I can no
longer read this, OK?
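The vowel analogy is easy to play with in code. This is a toy "lossy codec" for text, purely illustrative: drop the vowels inside words and the line stays readable, drop every vowel and it gets much harder to reconstruct.

```python
import re

def drop_vowels(text, keep_word_initial=True):
    """Toy lossy compression for text: remove vowels, optionally keeping
    word-initial ones so the reader's brain can fill in the blanks."""
    def squeeze(word):
        head = word[0] if keep_word_initial else ""
        tail = word[len(head):]
        return head + re.sub(r"[aeiouAEIOU]", "", tail)
    return " ".join(filter(None, (squeeze(w) for w in text.split())))

line = "the mastered mix of the recording of a song with overdubs"
print(drop_vowels(line))                           # still readable
print(drop_vowels(line, keep_word_initial=False))  # smaller, much harder
```

Just as in the talk: the gentler setting saves less space but reads fine; the aggressive one compacts further at the cost of intelligibility. Somewhere between the two is the threshold.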
So somewhere is a threshold.
The problem is when you're
reading, you have very
discrete chunks of data.
You either know what that
word is, or you don't.
Maybe you can fill in a word
from the context around it,
but that's kind of as
far as you can go.
When you're listening to music,
at some point it just
sounds bad, and you don't
really want to
listen to it anymore.
Sometimes it sounds so bad that
it's kind of crazy and it
sounds like it's under
water, and more
like whales than music.
But until you get to that point,
it's very hard to say,
yeah, OK, we compressed too
much, because you could put
someone in the room, and
especially if they know the
song, they'll fill in some
blanks on their own, and
they're like, yeah,
I like this song.
It's all good.
So the problem with audio is you
go from this analog sine
wave-- which no matter how
far we zoom in, is still
infinitely varying.
We capture it, we compress
it, we send it off, we
reconstruct, but we're starting
to reconstruct
something that's a little
more stepped.
Now again, this will get
smoothed out by things in both
the circuitry and also by your
ears, so there are lots of
things working to help you out
in reconstructing this
waveform along the way.
But you compress too much, and
then you start getting to
things that start to not really
sound like sine waves,
or they've got so many harmonics
on them that you
don't hear them as a
sine wave anymore.
And at that point, you're
listening to something
different than what
you started with.
And I think that is more akin
to someone who kind of sucks
at art, copying paintings, and
selling it to you, and like,
yeah, I'll put that
on my wall.
Now for $10 and I
can download it?
Maybe that's a trade-off
you're willing to make.
But in terms of taking the art
that this artist has made, and
saying, this is my record, and I
love it, and it makes my mom
cry, at some point you're going
to send them such a low
bit rate file that their mom's
not going to cry anymore.
And that's a drag, because at
that point you've lost the
point of the music, right?
It's art coming through
speakers.
It's emotion coming
through speakers.
So what can we do as record
makers, and then what can we
do as people who get that music
out into the world to
help people listen to it?
And the great part is, I would
assume that everybody in this
room listens to music
recreationally.
Let's start with the hands of
people who don't listen to
music ever.
OK, so not only are we in charge
of making this music
and getting it out there, but we
also consume it, so we want
to make products that we
actually like, which with a
lot of things, people don't
actually buy their own
products, whereas this is sort
of the ultimate consumer
product, because everybody's
into it one way or another.
So going back to the actual
consumer formats.
Within the lossless category,
you've really
only got two choices.
You have CDs, which are dying a
very quick death, which are
set at 44.1 16-bit
audio, right?
Then you've got what
is called high res.
And this is a term that people
can argue about.
All it means is anything better
than 44.1 16-bit, OK?
So when the Beatles re-released
their catalog, I
dunno, six years ago, something
like that, there was
a version you could
buy on a USB stick
which was 44.1 24-bit.
That is high res audio, because
it's higher than a CD.
So that's what the term means
out in the audio world.
Now for me, I like to think of
high res being up at 96k or
something like that.
But in terms of consumer audio,
that's what you get.
Now in terms of buying high
res audio, there are very,
very few options.
There's HDtracks, who will sell
you things to download,
and there's this crazy
Java file.
OK, has anyone bought anything
from HDtracks in this room?
So a few people.
Is there anybody who thinks that
it's so easy to download
and play back this stuff that
everybody should be doing it?
OK.
Got a couple.
So there are a lot of things
involved, and I'll talk a
little more about what
I have set up here to
play this stuff back.
It's hard to get the high res
music, and it's hard to play
it back properly.
It's easy to play
it back wrong.
Anybody can do that.
Just throw it in iTunes or any
other music player, it'll play
back wrong, and you're
all good.
But you're getting into
transcoding, and things that
you don't really want
to get into.
But anyway, that's what you've
got for the two viable sort of
ways you can get
lossless audio.
There are a couple others
that, once we start
listening--
excuse me-- once you start
listening that I'll actually
show you, which are
kind of cool.
There's high res streaming
starting to
happen, adaptive streaming.
It's really awesome.
OK, then we get into the lossy
formats, and those files are
basically MP3 and AAC,
which are the
two Fraunhofer codecs--
AAC having not necessarily
superseded MP3, but just
coming after.
I think Robert from Fraunhofer
would argue that
it supersedes it.
But obviously, there's tons
of stuff still coming out
on MP3 as you go.
Depends how you encode
things like that.
Then there's Ogg Vorbis, which,
other than Wikipedia, I don't
know much about.
Is it that it's open source?
OK, so it's the open
source encoder.
There you go.
But of course, there are open
source MP3 decoders, which
skirt Fraunhofer's license.
Because if you get the lame
encoder, you're not paying
them, either.
So I don't know.
That's vague.
Yes?
AUDIENCE: It's totally
patent free, as
well, but that's debatable.
ANDREW SCHEPS: The Ogg Vorbis?
OK, so Ogg Vorbis is
patent-free, which I guess
would be the main difference.
Because if you can build
yourself an MP4 encoder that's
open source, you're
getting around--
anyway.
Robert and I had a very long
conversation about this, and
he was awesome.
He was very, very
good about this.
I thought he was going to kill
me, but he was great.
OK, so if we actually start
looking at the services
themselves, this is where for
the producers and engineers
it's a big, big deal, because
this is the stuff where they
don't necessarily understand
things.
I mean, they understand, but
it's the stuff you know but
you don't know.
So the CD and high res are both,
I'm going to say, WAV.
You can buy it as FLAC, but
that's just compressed WAV.
There are AIFFs and things
floating out there.
But WAV is the most robust and
the most prolific form of
uncompressed audio.
Everything else is not WAV.
OK, so it's either--
all right, first of all, who
here is from the Play Music?
OK, I need an answer, because
I have scoured your website,
and it says it plays up
to 320 kbps files.
So what format, and what
does the up to mean?
I can't-- is that
an NDA thing?
AUDIENCE: [INAUDIBLE]
ANDREW SCHEPS: So it's MP3s.
AUDIENCE: [INAUDIBLE]
ANDREW SCHEPS: And I'm assuming
it's scaled, so as
you test bandwidth, you go--
I'm going to guess,
128, 256, 320?
AUDIENCE: 192, 256.
ANDREW SCHEPS: OK.
So three tiers topping
out at 320.
OK, I couldn't--
and this is part of
the problem of
looking for this stuff.
And I don't think--
and you can correct
me if I'm wrong--
I don't think anyone is
intentionally being obscure
about this.
Maybe you are.
Are you being intentionally obscure?
Are you obfuscating?
I love that word.
Maybe you are a little bit.
OK.
So--
yes, sir?
AUDIENCE: If people in front
can move maybe towards the
back of the room, we're
going to be playing stuff
ANDREW SCHEPS: Yeah, that's
going to hurt.
I think what we can do actually
is, what's going to
happen is at about 10 to 5:00,
Julie's going to speak,
because she's got
a presentation
about what she's doing.
And we're technically sort of
4:00 to 5:00, but we also have
the room to 6:30.
So what I'd love to do is we've
got 15 minutes, I'll
finish going blah blah blah.
We can maybe do some questions
where you guys kick my ass.
Can I say ass on this?
It's internal, right?
You can kick my ass.
And then, we'll break
for Julie to speak.
And then, we'll do the
listening, and people who have
to go can go, but then we can
also shove people into the
middle of the room, because
you guys are
going to get killed.
I mean, I'm not going to have
it crazy loud, but still,
you're going to get killed
a little bit.
OK, so finding all of
this information.
OK, how many people
from YouTube?
Do we have anyone?
OK, so we got a couple.
Finding out the information on
what happens with the audio on
YouTube was not that difficult,
but it was also a
little odd in that-- so does
everybody in the room know why
there are two bit rates, and
everybody in the room know
when you get which one?
OK, there you go.
So here's the problem.
It's tied to the video rate.
There's no setting that says,
give me good audio.
There's only the setting that
says, give me good video.
So basically--
and you can correct
me if I'm wrong--
720 and 1080 give you 384.
Everything else gives
you 128, OK?
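The tiering he's describing can be written down as a lookup (a sketch using only the numbers quoted in this talk; the tiers themselves are an assumption and may not match YouTube's current behavior):

```python
# Audio bit rate tied to the selected video quality, per the numbers
# quoted in this talk (an assumption, not YouTube's documented behavior):
# 720p/1080p get 384 kbps audio, everything else gets 128 kbps.
AUDIO_KBPS_BY_VIDEO = {
    "240p": 128,
    "360p": 128,
    "480p": 128,
    "720p": 384,
    "1080p": 384,
}

def audio_bitrate_kbps(video_quality: str) -> int:
    """Look up the audio bit rate that rides along with a video quality."""
    return AUDIO_KBPS_BY_VIDEO.get(video_quality, 128)

print(audio_bitrate_kbps("1080p"))  # 384
print(audio_bitrate_kbps("360p"))   # 128
```

The point of the sketch is that there is no key for "good audio"; the only lever is video quality.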
Here's the problem--
a lot of people can't afford to
make videos for every song
on their record, and a lot of
people who buy records and
then really like a song and want
to upload it to YouTube
don't make a video that's
HD for that song.
So you upload static art work,
or you upload lyrics, or you
upload a picture of your
dog-- or cats.
Cats are the internet, right?
So it's kittens.
But unless it's awesome footage
of a kitten, nobody is
going to switch to HD.
Nobody.
And it doesn't default to HD, so
nobody hears your music at
384, which is, in terms of pure
bit rate, the highest of
the lossy formats available,
period, and nobody hears it.
Yet from numbers I've seen,
and I'm sure my NDA won't
cover this because I haven't
even signed one, but for
numbers I've seen, 80% of music
discovery happens on YouTube.
Somebody says, hey, have you
heard so-and-so, and I go, I
don't know, let me search for it.
And you put it in, and you
listen to it on YouTube.
So 80% of the time, people are
being introduced to music with
one of the lowest bit rates on
the board, when the highest
rate on the board is actually
there, though not available
for most of the videos,
because people aren't
bothering to upload HD video.
And should we just
finish up the
YouTube thing right now?
OK, and this is something
I'm hoping--
I know I'm speaking with some
of you tomorrow, but I would
love to get--
my email address is my name,
andrew@scheps.com.
Hunt me down, find me,
because I'd love to
discuss this stuff.
Because another thing is going
through all of the YouTube
documentation, there's nothing
that I could find about audio
upload guidelines.
OK, so there are no audio
upload guidelines on the
YouTube site.
Zero.
The problem is, of course, what
you're ending up with are
128 and 384 AACs, but most of
the time, people are uploading
lossily compressed audio.
So you're transcoding.
Is there anybody in the room who
disagrees that transcoding
is the worst sounding thing you
could ever do to a piece
of audio between two
lossy formats?
Because we'll fight later.
OK, there are amazing sounding
lossy encoded files.
384 AAC-- I would defy most
people to sit in a room and do a
double-blind test between 384
AACs properly encoded and CDs.
I would defy anybody to not tell
the difference between
a 384 AAC transcoded
from any other lossy format.
It sounds terrible.
This is one of the things
we're hoping to
move forward with.
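Why a lossy-to-lossy transcode compounds the damage can be illustrated with a crude stand-in (plain quantization, not a real psychoacoustic codec, so this only shows that two different lossy stages drift farther from the original than one):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.uniform(-1.0, 1.0, 100_000)  # stand-in for audio samples

def lossy_stage(x, step):
    """Crude stand-in for one lossy encode: snap samples to a coarse grid."""
    return np.round(x / step) * step

gen1 = lossy_stage(signal, 0.010)  # "encode" once
gen2 = lossy_stage(gen1, 0.013)    # transcode: a second, different lossy stage

err1 = float(np.mean(np.abs(gen1 - signal)))
err2 = float(np.mean(np.abs(gen2 - signal)))
print(err2 > err1)  # True: the transcoded copy is farther from the original
```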
So anyway, this is one
of the problems with
comparing the services.
But the big problem that a lot
of the people I speak to
normally have is they don't know
how to compare the 44.1
and the 256, and zero
consumers know how.
256 is way more than
44, right?
I rest my case.
But when you're trying to
actually educate people about
just what this is, you need to
come and sit in a room, and
have me go blah blah blah,
and show you a chart.
So the idea is that, again, as
with any scientific thing,
you've got to look
at the units.
And the kilohertz and bit depth
is totally different
from kilobits per second.
Now the cool thing is that all
of the lossy formats are
actually very transparent
with their bit rate.
OK, this is, again, where
I make records.
I don't work with computers
all the time.
I'm rounding.
There's no 1024.
The numbers are very round,
because it's easy for us
people to understand.
All right, so basically, I
take your bit rate, I put
three zeroes on the end, and
that's how many bits per
second I get to represent my
stereo piece of art that makes
my mom cry.
Then actually do the math--
44,100 times 16 times 2, and
we're at 1.4 million on a CD.
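That arithmetic is easy to check directly:

```python
# Bits per second of uncompressed PCM: sample rate * bit depth * channels.
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels=2):
    return sample_rate_hz * bit_depth * channels

cd = pcm_bits_per_second(44_100, 16)      # CD audio
hires = pcm_bits_per_second(192_000, 32)  # 192/32, the highest mentioned here
mp3_320 = 320 * 1000                      # 320 kbps: "put three zeroes on the end"

print(cd)             # 1411200 -> the "1.4 million" for a CD
print(hires)          # 12288000 -> roughly 12.3 million
print(cd // mp3_320)  # 4 -> a CD carries about 4x the bits of a 320 kbps stream
```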
Now obviously, the codecs
behind the lossy encoders are
very smart.
So it's not like you just take a
percentage, and that's how
much worse it sounds.
I absolutely get that.
But we're talking about a very
big difference, and then you
look at 192/32, which is the
highest I've seen coming
off of HDtracks.
And you're up to 12.3 million.
OK, the problem being in the
grand scheme of things that
that's really not a whole lot
compared to the analog we
started with.
So again, we're not going to go
the analog versus digital
debate, but how many people
here like vinyl?
How many people actually look
to see if the vinyl's done
from the analog masters instead
of digital remasters?
Get some old Blue Note.
Even just compare it to some
of the reissued Blue Note.
And it's kind of astonishing.
It's like you're there.
OK, so this is where we stop
talking about numbers.
And now, I want to go through
this study very quickly.
This is sort of an
older study.
Because of course, the thing
is, does anybody care?
If nobody cares, then we don't
need to care, right?
If this doesn't make a
difference, and it's all just
a bunch of numbers,
I don't care.
The idea is I want people to
spend enough money on the
music that I work on that the
artists I work with cannot
take a day job so they can
keep making records.
And I want to be able to afford
to keep making records,
and not necessarily take a day
job, but if you've got
something for me, we'll talk.
OK?
That's the idea.
OK, we're not all looking
to be on MTV Cribs,
because we're not.
OK, but if people don't care,
then by all means, make the
files tiny, because then
everything else about the
consumer experience
is awesome.
Instant on, very fast, move it
from one place to another, fit
25 bazillion songs on anything
that fits in your pocket.
That's all good.
OK, now Harman who were actually
nice enough to send
up this pair of speakers we're
going to listen to later, this
study is from a little
while ago to be fair.
But they decided, we need to
actually know if people care.
Because they don't care what the
outcome is, but they need
to know the answer to that
question because they make
equipment for people
to listen to music.
That's what they do.
So they need to know, do
we need to be really
concentrating on stuff that
plays back lossless audio, or even
high res audio?
Or should we be building better
MP3 hardware decoders
in, and just deal with that?
Should we actually limit
the bandwidth?
When we're starting to talk
about wireless technology--
I mean, if you look at Sonos
and RedNet and a lot of the
really cool networked audio
and wireless audio
technologies--
where do we need to
cap our bandwidth?
These people need to know what
people like, but they don't
actually have a horse in the
race, because they're just
going to build the gear
to play it back.
So Dr. Sean Olive, who works
there, is a pretty amazing
guy, and he's got labs
that have all
kinds of stuff in them.
They've got stuff that looks
like it's out of an amusement
park, so when you're A-B-ing
speakers, they hydraulically
move into the same place.
You don't have the differences
in placement when people
change speakers and
things like that.
So what he decided to do was
get young people, because
there's a lot of sort of
anecdotal evidence that young
people not only don't care--
but this is the crazy one to me,
and if you know anything
about neurology and cognitive
listening, it's even crazier--
but that kids these days have
only heard MP3s, so they
actually prefer them.
Again, if anyone wants to
discuss that later, I will
talk about that for hours,
because that's the rabbit hole
I've been down for the
last two years.
But I'll just say that
that is pretty much
categorically not true.
So this study from a little
while ago was
meant to prove this.
So they got a bunch of young
kids these days, or in those
days, both high school and
college age students.
The only thing that's really
important here-- well, there
are two things.
One is that, for whatever
reason, they were mostly male
students, as opposed to female
students, studying audio,
which is kind of a drag
at all times.
So that's just the
way it works.
The other thing is you see this
last column, this level
of training--
all this is is that these
students were involved in a
recording program, or they had
taken a comparative listening
class or a critical listening
class, or something like that.
So they were aware of audio
quality as a thing, as opposed
to just being someone off the
street who really has never,
ever thought about it, OK?
So that's the breakdown.
Here is what they did.
And I--
all it means is they knew what
they were doing, and it's
scientific.
OK, so it's true double-blind
listening.
These kids don't know what
they're listening to.
They come back multiple times,
and they listen.
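The shape of a double-blind preference trial can be sketched in a few lines (a generic simulation, not Harman's actual protocol): the listener only ever reports "A" or "B", and the hidden random assignment is decoded afterwards.

```python
import random

def run_blind_trials(n_trials, listener_picks_cd_prob, seed=42):
    """Simulate double-blind preference trials: the hidden A/B assignment
    is random per trial, the listener reports only 'A' or 'B', and the
    report is decoded back to a stimulus afterwards."""
    rng = random.Random(seed)
    cd_preferred = 0
    for _ in range(n_trials):
        order = rng.sample(["CD", "MP3-128"], 2)  # hidden assignment of A and B
        pick = "CD" if rng.random() < listener_picks_cd_prob else "MP3-128"
        reported = "A" if order[0] == pick else "B"  # all the listener says
        cd_preferred += order[{"A": 0, "B": 1}[reported]] == "CD"
    return cd_preferred / n_trials

print(run_blind_trials(10_000, 0.70))  # close to 0.70, the rate in the study
```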
OK, now this is between 128k
MP3, which was what everybody
was selling when they
did this study.
And you think, my god, that's
the Dark Ages, but it's
really, what, four years ago?
Maybe five?
Maybe five.
That's what you could buy.
So between that and CD.
So we're not talking high
res HDtracks downloads.
70% of the time, those stupid
kids liked the CD.
And this isn't even a
what sounds better.
This is a which do you want
to listen to?
Which one do you want to hear?
All right, the important part of
this is going back to this
sort of threshold of where
does my mom cry, is what
happens emotionally?
So part of one of my theories
is, if you go back to that
huge block of text, and you take
out a bunch of vowels, at
some point it's harder
work to read.
So while you will still
understand the words, and
enjoy the story maybe, you
will be less emotionally
invested because you're
doing stuff.
The same thing is true, I
believe, when listening to
lossy audio, because while your
brain might throw stuff
away, it's expecting it, and
your brain gets pissed when
the stuff doesn't show up.
So you can create anxiety, you
can create depression at very
low levels, but at the same
time, it's also filling in the
blanks for you, right?
You're taking away
lots of acoustic
things from the music.
That's one of the first things
to go are reverb tails and
acoustic cues.
So your brain is recreating.
Therefore, it becomes more of
an active process to listen.
Now while that may not be that
much of an issue, one of the
anecdotal things that really
sent me down this road is that
my daughter had a friend in high
school who was interning
with me in my studio.
And great drummer, really
musical kid, listens to music
all the time.
And he showed up at the studio
in the afternoon to work on
something, and he came in, and
he said, man, been listening
to music all day and
I'm exhausted.
And I don't know how many people
that sounds absolutely
crazy to, but that to me is
crazy, because I would wake up
in the morning and put on
records or cassettes--
even that I had recorded from
a microphone in front of a
speaker, so not the highest
quality audio in the world--
but I would listen for 15 hours,
and my parents would
yell at me, and then I would
listen to headphones in bed
for a while.
Even recently, I've gone to
friends' houses who have these
amazing set-ups, and we listen
to vinyl all day.
And as my wife can attest, I was
down at this guy's house
for 15 hours, and I got home at
1:30 in the morning and put
on a record.
I was not exhausted.
When I listen to some of the
streaming audio services,
though, I get tired.
I get a headache.
I grind my teeth.
And it's not an instantaneous
thing.
It is not an, oh my god, that's
killing me and making
my ears bleed.
But it is, in terms of a
long-term commitment, and I
would also argue in terms of a
long-term connection between
people who hear the music
and the artist.
And one of the most important
things with artists is that
people actually connect with
them on an artistic level.
And that happens by them
experiencing some of the
emotion that went
into the song.
And it could be as simple as a
lyric, which means you're in
pretty good shape
no matter what.
But it could be because of
the chord changes and the
instrumentation and the
subtleties of the performance.
And when we start listening, you
will, I believe, start to
hear some kind of not
subtle differences.
We put the B back in subtle with
some of the things that
change when you listen back to
back between some of the
lossily encoded music and
the lossless music.
In terms of when you get to the
second verse of the song,
do you feel like, musically,
I've already heard
this, let's move on?
Or do you feel like, god, what's
next in the story?
And man, there's a
new guitar part.
And these are subtle things.
So if you love an artist, then
it doesn't really matter.
You will love them even
if it sounds terrible.
But what if it's somewhere
in the middle?
What if you're kind
of on the fence?
What if the audio quality
actually determines where your
threshold moves as you're
listening as to whether you're
going to listen to the next song
on that record, or even
make it to the end of
the first song?
And I know that part of people
not listening all the way
through to songs and skipping
around all the time is just
due to changes in consumer
habits, and we're all
multitasking more, and
things like that.
But for the people here who
listen to vinyl, I think you
may not always flip it to Side
B, but how often do you lift
the needle in the middle
of Side A-- unless you're
DJing a party--
because you're just kind of
tired of it, and now I
want to move on?
You'll generally have the
experience of Side A. So
you're getting 20 minutes
straight of something.
When you're just listening
online, that
doesn't happen so much.
There's a lot of skipping
around, and a lot of moving.
But what I've got here--
I went to a few of the
different labels.
I've got 18 songs and a bunch of
different genres, and I'll
put up just a list of them.
And you guys will DJ.
And also, we can talk
about anything.
If anyone has questions or wants
to point out stuff I've
got wrong, I absolutely want
that to happen, as well.
And we can do that while we
listen, things like that.
And I have them in as many
formats as I could possibly
have them in, including--
oh, we didn't make
it to this slide.
Sorry.
Google Play Music, I've got
my playlist from you guys.
So hopefully because I'm on your
ridiculously fast, free
Wi-Fi, we'll be getting 320
the whole time I'm sure.
But also, then I want to show
you something called
OraStream, which is
adaptive based on
bandwidth, which is awesome.
And we'll talk about
other stuff.
Roundabout.
OK, really quickly.
The way I'm playing the stuff
back is I'm using my Mac.
I am playing out of a program
called Decibel, which is just
a very, very simple
music player.
And the only thing that it does
is it switches the sample
rate of the hardware
to match the files.
So that way, we're not doing
any sample rate conversion in
software on the way out of
the computer; we get it out to
the converter at its
native sample rate.
It also crashes a lot.
It's a $30 program.
But it generally works.
I'm using this Prism Orpheus,
which is a one-rack-space
eight-channel audio interface.
So it's amazing for recording,
but I'm using it because it
gives me a volume knob
on the front.
I'm just using it stereo
going out.
The reason I'm using it, as
opposed to something a little
more simple, is because some
of my source material is at
192, so I need a box that'll go
up to 192 without putting
something else in the middle.
I've tried as hard as I can to
make sure that all of these
different files are from
the exact same master.
So the same--
remember my font joke
from earlier?
Sometimes, that happens multiple
times to a release.
Roundabout is one
of the examples.
Sorry, we will listen
really quickly.
But I needed to say that
Roundabout is one of the
examples of something where it
is actually from a different
master, the high res version,
because it was from a DVD
audio release from, I don't
know, eight years ago--
way back in the stone ages when
that was a format for
about eight minutes.
So that is actually a
different master.
But still, it's a pretty
astounding difference.
Now I will also say--
and we can stop this, but anyone
who was at the talk I
gave at Berkeley knows that
at some point, Robert from
Fraunhofer made me stop playing
things off YouTube
because he said it's unfair
because it's all transcoded,
and made him look bad.
And I said, OK, that's fine, but
I wasn't sure if everybody
in the room kind of understood
what had just happened, that
we just took the biggest player
in music discovery out
of the discussion completely
because it wasn't fair to the
people who developed the codec
that encode the music that's
on this service.
So I will play--
and that said, I play official
videos if I can find them.
But there aren't always
official videos.
So let's listen to some
Yes, and would you
like to pick a format?
Do you want to go low to
high, high to low?
AUDIENCE: High to low.
ANDREW SCHEPS: High
to low, OK.
So we'll actually go down
through CDs, because you'll
hear a little bit of the
difference between the master.
So this is the 96/24 taken off
the DVD-A, or whatever it was.
AUDIENCE: Quick question
for you.
Are you relying on the
digital-to-analog converter
in your MacBook?
ANDREW SCHEPS: No, I'm going
FireWire to the Orpheus, and
the Orpheus is the
D to A converter.
And it's a great sounding
converter.
The Prism converters are--
some people say that they're the
best converters out there
for music recording.
In the UK, it's almost
exclusively what's used for
all the orchestral
scoring guys.
They'll have 80 channels of
the Prism converters.
And then, we're just going
straight into an amplifier to
these speakers.
And that's it.
Yeah?
AUDIENCE: What are you doing
to match levels?
ANDREW SCHEPS: I'm fudging it.
OK, so this is not a
scientific test.
This is an anecdotal test.
Unless I unplug the monitor,
which we can do as well,
you're going to know what
you're listening to.
So I'll try and match levels
as best I can from up here,
but it does vary a little bit.
So I'll always make the high res
stuff louder, because then
you'll like it better.
AUDIENCE: How much power are
you using to drive the
amplifiers?
ANDREW SCHEPS: It says
it's 4 by 350.
So each speaker is bi-amped, so
we get 700 watts a side.
So I'm barely cranking it.
You let me know how
loud to go.
And I apologize again.
Yeah?
AUDIENCE: [INAUDIBLE]
volume [INAUDIBLE]
digital in this thing?
ANDREW SCHEPS: In this?
No, it's actually an analog
control on the output, which
is bizarre.
That's what they tell me.
You can hook it up in lots
of different ways.
There's an audio
path within it.
The way it is supposedly
hooked up is as analog.
But if it is digital,
I have to be able to
turn it up and down.
I don't have a choice.
There have been times when I
actually had an analog control
room section instead,
but it was a lot of
gear to bring up here.
So we're going to use that.
Again, everything is
going through that.
Everything is constant except
the files themselves.
AUDIENCE: Is it worth turning
off the air conditioning, or
will that not matter because
of the volume?
ANDREW SCHEPS: I think we'll get
over the top of it, yeah.
I mean, again, this
is not the most--
here's the crux of this.
And I do want to get to
music for those of
you who have to leave.
But the crux of this is that
you could set up audiophile
double-blind A-B tests--
A-B-X tests-- and be really
precise about this, and see
what you can tell the
difference of.
But I think especially as
we jump from ends of the
spectrum, it's not subtle.
It's huge differences, and then
it's a question about
whether it matters to you.
I mean, who cares if you can
hear the difference?
If you like them both,
then fine.
Then you're good with
the small files.
I'm not trying to evangelize one
particular type of file,
or to convince anybody that you
have to listen this way,
or you're missing out
on the music.
My theory is that once you get
to a certain point, you're no
longer kind of interfering in
the emotional response.
But in terms of an audiophile
short-burst listening test,
this is more fun than anything
else, because it takes a lot
of work to actually find all
these stupid files and put
them in one place.
So that's the fun of it is I
wasted days of my life so that
we can sit here and DJ.
OK, so that said, let
me know how loud--
OK.
[MUSIC PLAYING]
ANDREW SCHEPS: All right,
and here's CD, which--
again, a different master, but
it's more to set for when we
listen to the other formats.
So this will be the
same master as
all the other formats.
OK, so that's pretty
different.
But it's also a different
master.
So let's for fun, because
it is fun.
This is when I'm glad I'm
behind the speakers.
Sorry.
Let's just listen to some more
stuff, and then we can talk
more, because--
AUDIENCE: What resolution were
you playing that at?
ANDREW SCHEPS: Well, that
would have been--
is it 128 AAC?
Because there was no
high def video.
AUDIENCE: OK, so it was an
old upload [INAUDIBLE]?
ANDREW SCHEPS: I guess, yeah.
I mean, or it's a static artwork
upload, so they didn't
bother uploading it in HD.
AUDIENCE: [INAUDIBLE].
ANDREW SCHEPS: Yeah.
OK, so let's do Coltrane.
So this is the same
master, OK?
There have been reissues and
things like that of this, but
I know for a fact because I got
this from Blue Note that
this is the same master
in all formats.
OK, so where do we
want to start?
You guys tell me.
So that's A. We'll do A, B, C.
What do you think about that?
Or do you just want A, B?
AUDIENCE: A, B, A, B
ANDREW SCHEPS: Just A B?
Well, hold on.
A, B or A, B, C?
A, B. OK.
AUDIENCE: Are you sure
[INAUDIBLE].
ANDREW SCHEPS: Yeah,
that happens a lot.
And this is why, again, we had
to stop going to YouTube as
any of them, because a lot of
them are either swapped, or
depending on the transcoding
start to collapse into mono.
Like the Beatles stuff
is mono, but it's
not the mono mixes.
So yeah, that happens,
but that's--
AUDIENCE: [INAUDIBLE]
resolution.
You can't have very high good
placement [INAUDIBLE].
ANDREW SCHEPS: Oh yeah.
Yeah, I mean, with the CD.
OK, so that was A and B. That
was YouTube versus 192.
And so again, it was the
low resolution possibly
transcoded, even though
that was an
official Blue Note upload.
But the problem is--
I mean, I'm sure you guys know,
working at kind of a big
company, that at some point,
someone told the people at
Blue Note, OK, now we're going
to start doing our official
YouTube uploads.
And here are all the assets,
and go ahead and do it.
And that definitely filtered
down to an intern who had to
sit in front of a computer
uploading for three weeks,
because nobody who really
knows what they're doing is
going to spend longer than it
takes to just point them
to the assets.
So their official
uploads could've
been completely destroyed.
I mean, it's easier
sometimes--
and this happens at HDtracks
a lot, where they're sent
something that they're told is
96/24 so that they can sell
it, but the person who actually
sent them the files
didn't know how to get over the
2 gig file size limit, and
the album was too big, so they
just ripped a CD and sent it.
And it happens.
And then HDtracks gets in a lot
of trouble, because there
are a bunch of crazy audiophiles
at home doing FFTs
of this stuff.
And also, depending on how it
was recorded, there isn't
necessarily anything
above 20k.
But if they don't see stuff
at 40k, they're like,
that's not high res.
So there are lots of problems
in the supply chain, as well
as just the file formats, which
is, again, why this is
not meant to be a scientific
test, and
more of just an anecdote.
Now if you want, we can stay
away from YouTube, because it
is, unfortunately, the
most problematic.
But--
AUDIENCE: Which one
[INAUDIBLE]
ANDREW SCHEPS: A was YouTube,
and B was 192.
Now an interesting thing to me--
with these speakers, I
added some low end to tune
this room very quickly
before I came in.
There's some thumping on that
side that I'm hearing on the
192 which I don't really
hear in the MP3.
So you don't always--
like, oh my god, it's
just so much better.
Sometimes you uncover other
things along the way.
AUDIENCE: Do you have a
non-YouTube [INAUDIBLE]
ANDREW SCHEPS: Yeah.
We can do--
well, your stuff would be 320.
So we could do 320, or
we could do Amazon if
you want to do that.
Let's do Amazon.
Well, I just told you,
it's Amazon.
I'll leave it up.
OK, but here's Amazon.
AUDIENCE: Can you do it, and
then we vote which is which?
ANDREW SCHEPS: Yeah.
Yeah.
Let me just play you some of
the Amazon of the Coltrane,
and then we'll go to a different
song, and I won't
say a word.
Now I'm crazy, so I did some
FFTs of some of this stuff.
And one of the things that
Amazon does-- because they're
only selling 256 MP3s, that's
what they sell--
and presumably to help
their encoder--
because they're not getting
24-bit files, either--
they actually pretty much cut
off everything above 15k.
So that's band-limited to 15k on
the way into the encoder,
because if you don't have to
bother encoding from 15 to 20,
you've got that much more
room to encode below.
So that's their decision.
Again, now I'm right between
the speakers, so for me the
imaging is a pretty
obvious thing.
The 192 is the only one where
things are either on the left
or in the right.
And Rudy Van Gelder, who
recorded this album, did not
have a pan pot.
It was a patch cord.
It was either in the left or
the right, and that's it.
So as soon as you get anything
that isn't discretely on one
side or the other, you know it's
part of the process of
the encoding that has
made things shift.
And that's another way that a
lot of the encoders work.
And I don't know specifically
about the ones you use, because
you're writing your own
encoders, but if you mono up
stuff, it makes it much
easier to encode.
It's one audio stream, and it's
identical in both channels.
So you can save a lot of space
doing it that way.
And I'm sure that's part of the
pre-encoding of a lot of
this stuff, especially at
the lower bit rates.
So that'll happen.
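One quick way to check whether an encode has folded discrete stereo toward mono is to measure how much of the energy is left in the side (L minus R) channel; this is an illustrative metric, not any encoder's actual algorithm:

```python
import numpy as np

def side_energy_ratio(left, right):
    """Fraction of total energy in the side (L-R) channel: 0.0 means the
    channels are identical, i.e. the stereo has collapsed to mono."""
    mid = (left + right) / 2
    side = (left - right) / 2
    total = float(np.sum(mid**2) + np.sum(side**2))
    return float(np.sum(side**2)) / total if total else 0.0

rng = np.random.default_rng(2)
l = rng.standard_normal(1000)
print(side_energy_ratio(l, l.copy()))  # 0.0 -> folded to mono
print(round(side_energy_ratio(l, rng.standard_normal(1000)), 1))  # ~0.5
```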
And it's not that big a deal
on modern pop stuff because
stuff is everywhere, but any of
the Beatles stuff, all the
old Motown stuff, the Blue
Note stuff, that is all
discrete stereo, and it will
change it completely.
AUDIENCE: [INAUDIBLE]
are you thinking about
[INAUDIBLE]
ANDREW SCHEPS: No.
No, I refuse to.
So here's my theory, is that I
need to make my records sound
as good as I can make them
sound regardless of what
happens afterwards.
So then, when I realized what
was happening afterwards, I
asked the Recording Academy to
let me come and talk about it.
And they said, sure, we've
been trying to figure--
because they've had this
Quality Sound Matters
initiative officially for a
little over a year, but
unofficially for the
last 10 years.
And they've had ideas about--
we're going to get buses and put
awesome sound systems in
them, and we're going to drive
them around and play this
stuff for people, and trying
to come up with ways to let
people hear what the difference
is so that you can
start to understand.
So when I came up with this
presentation as a way to do
it, they were all over it
and have allowed me
to come and do it.
So my idea is to find
out what's actually
important, and change it.
I refuse to live with the crap,
and just say, I got to
make it work on earbuds, because
in five years, it
won't be earbuds.
And the pipes will be bigger,
and you guys will flip a
switch, and it's going to be
either uncompressed or barely
compressed.
And so now, I've changed my
whole workflow to cater to
something that goes away.
And it's one of--
not to talk about your
neighbors-- but it's one of
the biggest problems I have with
Apple conceptually, is
that they will talk a lot about
what they want to get
from the labels and from the
artists in terms of their
ingestion, and they want
24 bit, and they
want the high res.
But if I master specifically for
their encoder right now,
in three weeks, if they say,
bandwidth is awesome.
We're going to start
selling 320 AACs.
Well, now it's a new encoder,
or they just
update their encoder.
All of a sudden, I'm making
decisions based on
things that go away.
And I think it's a very big
difference between the record
making process and the consumer
distribution world,
and you can't make records for
the consumer distribution
world other than a lot of the
analog limitations we used to
have to deal with.
Like you can't pan your bass off
to one side if there's a
lot of low end, and
still cut vinyl.
OK, there are physical
limitations to things, which
I'm fine with.
And AM radio--
they shave off the top and the
bottom, and it's mono.
OK, that's fine, I know what's
going to happen.
But in terms of taking some
sort of encoding algorithm
that's constantly being
updated-- otherwise, some
people in this room would
be out of a job--
I can't work for that because
it's a moving target.
So my idea is if I make it sound
great, it will survive
the process better.
And that, I've actually
found is true.
Like this Blue Note stuff
sounds so amazing and so
natural that you can start to
hear things get hashy and it's
a little more annoying and a
little brash, and the panning
isn't as wide.
But musically, it's still pretty
awesome, and it's OK.
And it survives better.
And strangely, a lot of the
urban music survives better,
because there's lots of
separation between the
instruments.
Things are very discretely
encompassed in terms of their
frequencies and things
like that.
They're not sharing
a lot of space.
You don't have 15 microphones
on a drum kit that are all
making noise.
So that actually translates
better.
And strangely, there is zero hip
hop or R&B that I was able
to get, other than the
Esperanza Spalding
record, in high res.
It doesn't exist.
CD is as high as it goes.
They turn in masters that are
44.1/16, because they're
building it on a laptop.
And they're actually building
their tracks with MP3s.
AUDIENCE: [INAUDIBLE]
compressed not in bytes but
making the lowest part of the
music-- the softest
one-- high.
So if people start doing
that [INAUDIBLE]
what's the point in
going to high res?
ANDREW SCHEPS: Well, I would
argue that even something that
doesn't have a whole lot of
dynamic range, you will still
absolutely hear the difference
when you have a very lossily
encoded file.
You start to destroy things
other than just the dynamic
range, right?
There's frequency content,
there's panning content,
there's the mono versus
stereo content,
there's depth of field.
There are all of the cues that
are being taken away, all the
acoustic cues and reverb tails
and things like that.
And that will affect it even
if it's super loud.
I mean, there's this whole thing
called the loudness war,
which maybe you know about,
but they just like--
I won that war, OK?
I mixed "Death Magnetic,"
which was the album that
everybody said was the poster
child for things
being way too loud.
OK, so I won.
Therefore, the war is over, we
don't have to worry about it.
[LAUGHTER]
ANDREW SCHEPS: I spent weeks
reencoding for iTunes and
Amazon at that time
to make those
files work lossily encoded.
So what happens is you start
to get rid of dynamic range
and things like that, is you
start to break the encoders.
The encoders need some
room to work.
So I'm making it very difficult
for that to work.
And one of the things we found
that worked great was turn the
mix down 0.7 dB, period.
Just let there be headroom
that we never even use,
because it's brick
wall right there.
We never get up to that last
0.7, but all of a sudden, all
of the encoders sounded about
100 times better.
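The 0.7 dB trim is just a linear gain of 10^(-0.7/20), about 0.92, applied to every sample:

```python
import numpy as np

def trim_db(samples, db):
    """Apply a broadband gain change of `db` decibels."""
    return samples * (10 ** (db / 20))

mix = np.array([1.0, -1.0, 0.5])   # a brickwalled mix peaking at full scale
headroom_mix = trim_db(mix, -0.7)  # the 0.7 dB trim described above

peak = float(np.max(np.abs(headroom_mix)))
print(round(peak, 4))  # 0.9226 -- headroom the encoder gets to work in
```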
When we got to give them 24-bit
files for the last
Chili Peppers record-- that was
right at the beginning of
the mastered for iTunes project
at Apple-- and the big
crux of that project is give
us 24-bit files instead of
16-bit files.
That made a huge difference.
So in terms of what you feed
the encoder, it isn't just
about the source material in
terms of a sonic thing.
Because I think there are lots
of hardcore and punk albums
that, from a sonic, audiophile
point of view, sound terrible.
But they are so super exciting
that people love those bands
and they want to
listen to them.
And if you do a 128 MP3 of that
album, what used to be
hashy and exciting is now just
hashy and noisy, and I think
there are lots of people who
wouldn't get into the band as
much as they would even if they
buy it on a cassette,
which doesn't have anything
above 12k on it, or
something like that.
So there are two very different
aesthetic paths you
can take when you talk
about the music.
And the problem is, it's
not like with TV.
Right, with TV, who is going to
argue that a high def set
looks worse than an SD set?
Because you see it, and
it's easy to A-B.
Some people like the artifacts
and you're used
to things like that.
And if you have a bad digital
set that pixelates,
there can be issues.
And if you look at bad material
on an HD set, it
looks terrible.
OK, so all those arguments
are true.
But let's say you have a
well-captured still image, and
you show it on these
two different TVs.
One of them has way more
information in it, and it
just looks a hell of
a lot better; the
other one does not.
Whereas with audio, people don't
trust what they hear.
People think you have to be
trained to like something
better when you just talk
about audio formats.
And people believe what
they're told, period.
I mean, nothing influences your
opinion about things more
than me telling you how
great it is, right?
If someone's about to play you
something by a certain band
and you like them, and they say,
I can't stand this band,
check it out, you will
not like that band.
If they say this is my favorite
band in the whole
world, you're going to try
really, really hard to like
that band because you
like that person.
So there's so much that goes
into liking music that has
nothing to do with any of
this, but it also has
everything to do with it,
because I really believe that
there are just thresholds.
And for every person listening
to a new piece of music,
there's a threshold of,
am I going to like it?
Am I not going to like it?
And the more you can give them
something that sounds true to
whatever the artist decided
was done, the lower that
threshold will be, and the
easier it is to connect.
So regardless, let's listen to
some stuff, unless you want to
keep talking.
AUDIENCE: So when did you do
that, the Death Metallic?
ANDREW SCHEPS: The Metallica
mix? "Death Magnetic"?
That was--
I don't know, six years
ago, seven years ago?
AUDIENCE: What made
you [INAUDIBLE]
ANDREW SCHEPS: What made
me destroy it?
AUDIENCE: Yeah.
ANDREW SCHEPS: OK.
That is a conversation
that is not--
I mean, really the only thing
I would say about that is I
have nothing to say
about that.
The idea that me as an engineer
could mix a record in
such a way that was destroyed,
but everybody would be OK with
it and let it out into the
world is just crazy.
There is a band involved, there
are producers involved,
there are plenty of people
involved who
said, this is awesome.
Now during the process, whether
or not I made quieter
mixes to A/B and let them
hear differences and whatever,
I may have done, but
it's irrelevant.
It's irrelevant.
What happens is at the end of
the day, that album sounds the
way it does because that's what
the band and the producer
thought was great.
And there's some people who
really don't like the way it
sounds, but there are a lot of
people I've talked to who
think it sounds awesome.
It's super aggressive.
It's not the most hi-fi thing
in the world, but a lot of
stuff I do is not hi-fi.
But I hope that it's emotionally
awesome, and makes
you love it, and makes you want
to either kick a hole in
the wall or cry or call your
mom or whatever it is that
we're trying to get across.
So this discussion in terms of
what you do with that file
afterwards is also very
different from the audio
quality in the
audiophile sense.
There are lots and lots of
records that if you go to one
of the big consumer electronics
shows where they
have a million dollar set-up
where a speaker this size will
cost you $85,000 each, and
has iridium tweeters.
And you've got a stand for the
turntable that costs more than
your house-- that
kind of thing.
You can only listen to
audiophile stuff on there, right?
And so what are you
going to hear?
You're going to hear a few jazz
records and Steely Dan,
and that's kind of it.
And those are great records,
and they're also amazing
sounding records.
But if you put on something like
the Metallica record on
there, at that point, maybe
some of that's wasted.
But it's not because you're
putting on a low bit rate MP3.
OK, another thing just
anecdotally--
and we will listen more.
I'm sorry.
I will talk about
this for days.
But while I was putting all
these files together, I had
this massive folder of files,
and I'm keeping things
organized, and making sure
things are named.
And I was just listening
on my laptop speakers.
First of all, I'm letting the
OS do sample rate conversion
in real time.
Right, whatever QuickTime has,
that's what happens, so it
can play back at whatever sample
rate the stuff was set
to, which is probably 44.1.
And I'm just listening to the
first 25 seconds of each song,
making sure they're all
the right song.
I can tell the difference
in my laptop speakers.
So I bring this set up because
it's cool, and we've got a
room this big.
And if I played stuff on my
laptop, no one can hear it.
So this helps.
But if you have any sort of
decent kind of system-ish that
has some good DSP on the back
end to make it sound pretty
good, and it's got a little bit
of power so some of the
dynamics come through, I think
you absolutely will hear the
difference.
And even more than that, you'll
feel the difference.
One of them is just more
fun to listen to.
But that's a discussion that
could go for weeks, and
there's not necessarily
a right answer.
But the good thing is, I won the
war, so the war is over.
So now we can all make
quiet records again.
Yeah?
AUDIENCE: So there's a new
standard from the ITU to set
record loudness levels.
Are you following that at all?
ANDREW SCHEPS: Well, what
those are, as far as I
understand-- and correct me if
I'm wrong-- is what's used
in the Apple Sound Check, as
well, where you scan a record
to say how loud it is, and then
it uses it to even out
the level if you take advantage
of that in whatever
playback system you're using.
Is that--
OK.
So basically--
AUDIENCE: [INAUDIBLE]
a little different
from the ITU's standard.
So there's different, competing
implementations.
ANDREW SCHEPS: Again, I don't.
I mean, if we got into
my mix process--
which I could talk for
a different set
of hours about that--
my mixes are what sounds good.
And sometimes, the level of the
mix really doesn't matter.
But a lot of times, it does.
And I mix on analog equipment
which has voltage rails.
So as I hit that rail, I
don't just cut it off.
It smooshes it off, and it takes
a while to smoosh it off
completely.
And different amounts of
that smooshing differ.
And it's just because I'm in the
analog world, so clipping
and harmonic distortion are your
friend until they're not
your friend, and something
catches on fire.
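The "smoosh" of analog voltage rails, versus digital hard clipping, is often modeled with a tanh curve. This is a generic soft-saturation sketch, not a model of any particular console; the rail value and curve choice are illustrative assumptions.

```python
import math

def hard_clip(x, rail=1.0):
    """Digital-style clipping: the waveform is cut off flat at the rail."""
    return max(-rail, min(rail, x))

def soft_clip(x, rail=1.0):
    """Analog-style 'smoosh': tanh bends the signal toward the rail
    gradually, adding harmonic distortion instead of a hard edge."""
    return rail * math.tanh(x / rail)

# Below the rail the two are close; pushed into the rail, the soft
# version rounds the peak off instead of truncating it.
for x in (0.2, 0.9, 1.5, 3.0):
    print(f"in={x:3.1f}  hard={hard_clip(x):.3f}  soft={soft_clip(x):.3f}")
```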
So when I'm mixing something
like the AFI record I just
mixed or Black Sabbath record,
those mixes are going to be
loud because they don't really
sound right until they're loud.
But when I make something
like [INAUDIBLE]
which is on my label,
or a jazz record--
I mixed a Jeff Babko
record last year--
those end up being much quieter
mixes, because I want
it to be more open, and
the dynamic range
really helps the music.
So for me, it's much
more a feel thing.
And then I find out later that
I've kind of screwed up, and
the mastering guy gets angry.
And then I will send the quieter
mixes and say, if you
get it to sound as good as the
one that you say is too loud,
then we're good.
But if it doesn't feel as good,
then we have to go with
my screwed up mix.
So I'm not the best
person with that.
There are a lot more technical
mixers than me who adhere to
things more than I do.
I'm kind of a disaster
with that.
Yeah?
AUDIENCE: What's your
take on [INAUDIBLE]
Pandora, [INAUDIBLE]?
ANDREW SCHEPS: OK.
So streaming, I mean, the
filetypes are the same, right?
And on that chart, I had bit
rates for the streaming files.
So I have no problem with
streaming versus download.
I mean, there's a whole other
conversation which is about
making the music business
still exist.
And that's actually a really
important conversation, and
encompasses way more
than just this.
This is the esoteric,
I-think-this-makes-a-difference part.
Then there is the recording
album credits part, which is a
discussion I'm hoping we're
having tomorrow a little bit--
implementation of that, getting
consumers to interact
directly with artists more,
because that's what creates
the relationships that last so
that I don't have to go get
another day job.
That's my goal in all of that.
In terms of just the audio,
though, the streaming and not
is exactly the same thing.
So actually let me plug
the monitor back in.
And let me show you one other
thing, which is a technology.
It's called OraStream.
Does anyone in here know
about OraStream?
So we've got one, because you
were there last time.
Does anyone here know about
the MP4 SLS format?
It's another Fraunhofer
encoding format.
So it's meant to be an archival
strength format.
So what it does is it will
wrap audio in its own
metadata, and preserve whatever
the native bit rate
and sample rate is
of that audio.
But one of the byproducts it
has is you can do what they
call truncating of the stream
to produce in real time any
bit rate stream you want.
So what OraStream have done is
they've come up with all of
the server side and back end
technology to do pinging of
your connection in real time,
and to granularly scale.
So the Google Play Music--
you've got three bit rates.
You check out how fast people
are able to get the stuff, and
you give them the fastest when
you think they can get without
any buffering, right?
Because buffering sucks.
No one wants their
music to stop.
But that's what you do, right?
And you will skip between
those levels.
So if, when you start playing a
song, you're in a black hole--
even though you're listening
on your cell phone, you're
in a parking garage--
you're going to start off
at a very low bit rate.
Now, are you constantly pinging,
and do you up the bit
rate as soon as you can?
Or do you wait for
the next song?
AUDIENCE: You wait for it.
ANDREW SCHEPS: You wait
for the next song.
OK.
And this is technology-- these
guys, I mean, they probably
had meetings here, I don't know,
with anyone in the room.
But originally, it's a few
guys from Singapore who
developed the technology.
And they were hoping someone
else would just license it,
because they thought it was
awesome, and why wouldn't
people want to do this?
So what they do is they're
pinging constantly, and the
bandwidth will change.
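The difference between stepped tiers and granular scaling can be sketched like this. The tier ladder, safety margin, and the 4,608 kbps cap (96 kHz/24-bit stereo PCM) are illustrative assumptions, not anyone's actual numbers.

```python
# Hypothetical tier ladder for a stepped service.
TIERS_KBPS = [128, 192, 320]

def pick_tier(measured_kbps, safety=0.8):
    """Stepped adaptation: pick the highest fixed tier the connection
    can sustain with some headroom, so playback never stalls."""
    budget = measured_kbps * safety
    best = TIERS_KBPS[0]
    for rate in TIERS_KBPS:
        if rate <= budget:
            best = rate
    return best

def granular_rate(measured_kbps, cap_kbps=4608, safety=0.8):
    """Granular scaling: any intermediate rate is possible, up to the
    source's native rate, so quality dips and recovers smoothly."""
    return min(cap_kbps, measured_kbps * safety)

print(pick_tier(260))      # 192: jumps between fixed steps
print(granular_rate(260))  # 208.0: lands exactly on the budget
```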
And it plays back in HTML5
using a WebSocket, and it
plays back on iOS and Android
via an app, because MP4 SLS
isn't supported directly
in the OS of
anybody's computer yet.
So let me just quickly go
to my account here.
And for audiophile people,
by the way, this
is an awesome service.
So as a listener, it's like a
Dropbox that can stream your
audio to you.
So you can get a
free 1 gigabyte
account, I think it is.
Or you can pay $5 a
month for 5 gig.
You can pay a little bit more
for 10 gig or 50 gig, or
something like that.
You upload your lossless music
to the service, and you can
immediately stream it anywhere
in the world on any platform.
So it's their version
of a cloud iPod.
But here is what's
awesome about it.
So let's see, where are all
of my playlists?
Here, let's stream something
that's kind of--
oh, here we go.
Come on.
OK, so I've got some of
the same songs here.
But here's what's important:
see it right up at the top,
below the scroll.
What do you call it?
What's the official name for
that, the progress bar with
the thing in it?
You know, it's the position
bar thing.
OK, so watch what happens.
So everybody heard the song come
out from under the water,
and start sounding good?
Here's one that is not
a hi-fi recording.
Oh, this is a band from Austin
who are the most exciting show
I've ever seen.
And I signed them to my label.
They made a record
in two days.
I mixed it in one day.
It's psychedelic rock stuff.
It's not the most hi-fi
thing in the world,
but this is at 96/24.
And again, just watch the bit
rate if you can see it.
So we're going to start off at
128 because there's a cache.
OK, so we're just
streaming 96/24.
And if you do the math and
figure out the bit rate, the
number will always be a little
lower, because the last part
of the decoding happens at the
WebSockets, so you don't
actually need to give
the full bit rate.
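"Doing the math" on that stream is straightforward: uncompressed PCM bit rate is sample rate times bit depth times channel count.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM bit rate: rate x depth x channels, in kbps."""
    return sample_rate_hz * bit_depth * channels / 1000

print(pcm_bitrate_kbps(96000, 24))  # 4608.0 kbps for 96/24 stereo
print(pcm_bitrate_kbps(44100, 16))  # 1411.2 kbps for CD quality
```

The observed stream rate sits a little below these numbers because, as he notes, the last part of the decoding happens client-side.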
So the drawback is if you
compare a 256 stream from MP4
SLS to a 256 encoded MP3 or AAC,
the MP4 SLS will not sound as
good, because it's not optimized
for that bit rate.
But I've never had to listen
to 256 with this.
Wandering around on the 4G or 3G
that I get off AT&T, I'm CD
quality all the time.
And as you go from the cell
network onto your
wi-fi, it jumps up.
And it's seamless, and
it works in real
time, and it's awesome.
So this is another example of,
I think, where stuff can go
where you still get the
convenience of things having
to start playing immediately,
which I totally get.
You don't want to start
streaming CD quality audio to
people on crappy cell
connections.
But if you can hit Play
immediately, then realize
they're not on a crappy cell
connection and be CD quality
within the first few bars of a
song, and when they jump on a
wi-fi network, be up at
audiophile quality,
that's pretty cool.
So hopefully, this is
sort of where some
things will get headed.
And it's one of many
possibilities.
But if anyone's interested in
talking to the OraStream guys, please
get in touch with me, because
they've set it up where now
it's a lockbox service for
people who want to just upload
their own stuff.
I can sell my artists' albums
through there, download as
individual apps.
So they have a business model,
but they're also always
looking for partners.
When Neil Young released his
last record, and everyone has
heard of the Pono system that
he's touting, which is a
hardware-based high
res audio system?
Warner Brothers wanted to
stream his record for a week
before it came out, because
that's what record labels do
now-- give you a free stream.
And he said, yeah, that's fine,
as long as it streams at
192/24, which of course, that's
not going to happen.
So they got the OraStream guys to
do it, and they actually did it.
And they were streaming about
5 terabytes an hour all over
the world of people who
wanted to listen.
And if they were on their mobile
browser, they were
probably getting maybe
CD quality.
But if they were on a computer
hooked up to a stereo, they
could listen to his
album at 192/24.
And again, granularly scaling,
so if there's any little blip
in the traffic, or if your buddy
starts streaming a movie
down the hall, you granularly
dip, so it's
not a stepped dip.
So in terms of the listening
experience, it's a lot less
intrusive, because you dip
down and come back up.
Anyway, so that's OraStream.
Yeah?
AUDIENCE: There's one format that
you haven't mentioned a
single time.
I was wondering [INAUDIBLE]
DSD?
ANDREW SCHEPS: OK, so DSD,
just really quickly, is
basically 1 bit encoding
at a megahertz level.
So instead of taking this grid
and putting it over, many,
many, many more times a second
than you would with PCM
encoding, you say, what's
the voltage?
Is it higher or lower
than last time?
And you use your 1 bit--
this is the dumb version--
say, yeah, it's higher, it's
higher, it's higher, it's
higher, now it's lower,
it's lower.
So you're basically tracing
the waveform very, very
quickly as it goes.
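The "dumb version" he describes is essentially delta modulation. Real DSD uses delta-sigma modulation at 2.8224 MHz, which shapes the noise far more cleverly; this toy just traces a waveform with 1 bit per sample and a fixed step.

```python
import math

def one_bit_encode(samples, step=0.01):
    """Toy delta modulation: each bit says only whether the input is
    higher or lower than a running estimate, which we nudge by `step`."""
    estimate = 0.0
    bits, decoded = [], []
    for s in samples:
        bit = 1 if s > estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
        decoded.append(estimate)
    return bits, decoded

# A slow sine wave: the 1-bit stream traces it closely as long as the
# waveform never moves faster than one step per sample.
wave = [0.5 * math.sin(2 * math.pi * t / 1000) for t in range(1000)]
bits, traced = one_bit_encode(wave)
print(max(abs(a - b) for a, b in zip(wave, traced)))  # within a couple of steps
```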
The only problem is-- the reason
I don't mention it is
because until about a week
ago, there was no viable
consumer format.
And now there is one site that
is actually selling DSD audio
files that you can download.
And it's even more cumbersome
to get a player to work.
Now in terms of audio quality,
listening to DSD versus high
res PCM encoding, I haven't
gotten to do an A/B test, but a
lot of people love it, think it
sounds absolutely amazing.
It's a very different
way to encode music.
It's awesome.
I try to only cover established
consumer formats
during this, because that's
what's out there.
And there's no way
I can distribute
anything DSD right now.
It's impossible.
AUDIENCE: And it would be
hard for you to edit it.
ANDREW SCHEPS: It's
almost impossible.
There's one system that allows
you to do multi-track editing,
and it's really expensive,
and their software sucks.
So I can edit, but it
would not be good.
So again, obviously, there's
always the ability to work
versus what would be best.
[APPLAUSE]
