[MUSIC PLAYING]
RAYMOND: Dave is largely
responsible for the ARM
architecture.
I'm going to give you the
briefest possible intro to ARM.
It's why this is not this
big and burning a hole
in your pocket.
That's it.
With that, Dave.
DAVE JAGGAR: Thanks
so much, Raymond.
I'm going to skip through the
first slides fairly quickly,
because I hope it's
fairly well-known stuff.
On that introduction, just
a wee detail: ARM
is in a lot of
products these days.
It is dominant in cell phones,
but in a lot of other products
as well.
ARM was formed in 1991--
late 1990, start of
1991-- with 12 engineers
from a British
company called Acorn.
We started an intellectual
property company
with no intellectual property.
This is not something
to try at home.
We got $2.5 million from Apple.
Why all that came together
will become clear in a moment.
Chip designers-- ARM
doesn't make chips,
but everyone else
does, pretty much.
And ARM makes a royalty.
Just recently the 150
billionth chip shipped.
But none shipped
from 1991 to 1995.
And about 23 billion last year.
So that's about 20 for
everyone on the planet.
More than 60% of the world
now has access-- uses ARM every day.
And that's about the
same as having access
to basic sanitation.
730 chips per second
are manufactured 24/7.
They're a tiny company
compared to Google, of course.
Everyone is.
But $1.83 billion turnover
in 25 years is not too bad.
And about $1.1 billion
is from royalties.
6,000 employees, nearly 1,700
people manufacturing ARMs.
And they were bought
by SoftBank in 2016--
whether it's for
better or for worse.
The Pope was
inaugurated in 2005.
And there's a photograph.
And the change that
perhaps ARM had:
in 2013 it kind of
looked like that.
I think maybe the Pope thinks
the whole world is covered
in purple and green splotches
from all the flashes going off.
Shipments [INAUDIBLE]
150 billion chip growth curve.
That's to the end of 2018.
So that's why it's not
quite 150 billion.
I was largely responsible
for the yellow and orange
and the start of the red.
ARM stopped reporting those
products as individuals.
And that's why it
becomes that beige color.
That's the combination of
those three after that time.
And then the Cortex M is
largely the development
of the yellow stuff that I did.
And it's a direct descendant.
And then the power--
the big chips that are on your
phone-- that's the Cortex A
series-- the purple at the top.
So you can see a lot of ARMs
are going into everything other
than the main processor
in a cell phone too.
Annual shipments look like that,
approaching 25 billion a year.
Quarterly shipments
are really jumpy,
because you fill the pipeline
with products in about the end
of second and third quarter.
And the fourth quarter
and the first quarter
are pretty quiet after
Christmas for production.
So a little bit
about the background.
This processor was
originally developed
by a company called Acorn.
And that's the name:
ARM is Acorn RISC Machine.
They had a lot of success
in the early '80s.
They were kind of
like a British Apple.
They had an educational computer.
And they sold so many
of them that they then
decided to do this computer,
which was a follow-on.
Unusually, they decided to
develop everything themselves,
right down to the
keyboard and the mouse.
They had multiple operating
systems, networking,
file systems, and
the core processor,
and three support chips-- the
memory controller, the I/O
controller, and the
video controller.
Kind of unusual.
Unfortunately, it was never
particularly successful.
It was probably just
overtaken by the IBM PC,
like most things.
In parallel in 1990, Apple
were developing the Newton--
the first PDA
handwriting recognition.
You can roughly imagine
it as a large cell phone
without any connectivity.
Probably why it
wasn't successful
is it didn't have
any connectivity.
But the advanced technology
group led by Larry Tesler
were building this Newton.
And they really wanted a
low power 32-bit processor.
Jony Ive's first job
at Apple was actually
to design the Newton.
And because Apple and Acorn
competed in the UK market,
they decided to spin
out this company.
The other bit of
serendipity was the timing:
the company started just
as the world was going
digital.
So just to cast your mind
back, about 1990 a lot
of applications that
were developed on PCs
wanted to be put into
sort of portable products
that ran on batteries.
So as far as that was
concerned, the ARM processor
came online just about the
right place at the right time.
At high school, my math teacher
said I'd never be an engineer.
This is kind of
ironic, because one
of the reasons I'm on a
tour of the US at the moment
is Dave Flynn and I--
another senior engineer at ARM--
were awarded the James Clerk
Maxwell medal from the IEEE.
So I think maybe I can talk to
my teacher with my head up now
and say, actually, I'm
probably a decent engineer.
But because of that I actually
did computer science instead.
So I did every single
paper at my university
in New Zealand,
which was unusual.
And again, it was a
bit of serendipity.
We just seemed to have all
the right people with--
we only had 10 teaching staff.
But we just seemed to have
the right bits of technology.
We had especially Tim Bell.
If you know anything
about text compression,
the original bible was written
by Bell, Cleary, and Witten.
And Tim Bell was one
of our lecturers.
We also had compiler technology.
We had the source to
an operating system--
a good one.
But we had almost no
hardware expertise.
So I spent a lot of time
in the engineering library
learning about hardware.
John Mashey was
from MIPS computers.
He has an esteemed career.
And he gave a guest
lecture in 1988
and really taught me that
there was such a thing called
a computer architect.
And I worked with lots
of old mainframes.
If you look at anything I
design and squint at it
a little bit, you get a PDP-11.
So.
So for my master's thesis I
looked at this ARM processor.
And the word
"interesting" is in quotes
because it's a slightly
crazy architecture.
But it is interesting.
So my thesis was called "A
Performance Study of the Acorn
RISC Machine."
And I wrote a C Compi--
sorry, I wrote a C Compiler.
I wrote an instruction set
simulator called ARMulator.
I put MIPS-style floating
point on the side of it.
And put a complete
software stack
on top of that-- compilers,
assemblers, the SunOS sources,
and ran the whole lot.
And I really think I learned a
lot in those couple of years,
because you really haven't
lived until you've debugged
a complete stack like that.
There were actually a couple of
bugs in the SunOS source
when it was compiled
for a new machine,
all the way down to a bunch of
bugs, of course, in my code.
And later I modeled
a 16-bit ARM.
And this comes in later as a
replacement for the teaching
simulator.
A couple of days after
handing in my thesis I
saw my first copy of Hennessy
and Patterson's "Computer
Architecture--
A Quantitative Approach."
And I remember standing in the
university new book section
at the library going, well,
you could have told me sooner.
RAYMOND: If only I had known.
DAVE JAGGAR: I could have just
saved myself so much time.
But yeah.
So inspiration number
one was John Mashey
from MIPS and Silicon Valley.
And I was very fortunate to give
a talk in Stanford last week.
And John was there.
So after all these years--
30 years-- I was able to sort of
repay a little bit of the
thank-yous.
He walked in with a stack
of overhead projector slides
about this tall, all messy,
straight out of his attache
case, and said, I won't have
time to present all this,
and then did it.
And it was like drinking from
a fire hose for 90 minutes.
It was fantastic.
And I guess at the end
of that I knew what
I wanted to be when I grew up.
He also came out
with this taxonomy
of a complex instruction
set computer and a reduced
instruction set computer, and
a continuum of how you grade
these things as far as
what a processor might
be on that continuum.
And I've never really
entered into this argument.
ARM is not a pure RISC.
Our CTO, Mike Muller said
that way back in 1992.
It's kind of on the spectrum.
I found out recently that
mushrooms are closer to animals
than they are to plants.
And I think it's
kind of like that.
It's just a different thing.
And you shouldn't try
and rule it in too much.
Maybe it's a MISC
for miscellaneous.
And maybe you have
to backronym the M.
And we'll call it
an M for modest.
It was a modest little chip,
mainly because it was designed
not to have any on-chip caches.
It was designed to
connect directly to DRAM.
And that has a ton of
architectural implications,
good and bad.
And it took a long time to
sort of undo some of those.
Probably the single
biggest mistake they made
is they had a limited
26-bit address bus.
They artificially
limited the address bus.
And so this thing could only
address 64 megabytes of memory.
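As a quick sanity check of that limit -- the Python below is purely illustrative arithmetic, not from the talk:

```python
# A 26-bit address bus can name 2**26 distinct byte addresses.
ADDRESS_BITS = 26
addressable_bytes = 2 ** ADDRESS_BITS
megabytes = addressable_bytes // (1024 * 1024)
print(megabytes)  # 64
```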
But it was low cost, it was
low heat, and it was a low
parts count.
It had to fit in
a plastic package
back then with no
heat sink, no fans.
It was probably no smaller
than the early MIPS machines.
We used to say it was.
But with hindsight, when
I learned more about
how the MIPS machines
were laid out,
it was probably no simpler either.
Anyway.
The implications of
having no caches on-chip
meant a long cycle time.
And this meant that the
whole instruction pipeline
of the machine stalls
whenever you access memory,
because the machine's
trying to load
an instruction every cycle.
That's what RISC machines do.
They're always trying
to load an instruction.
And as soon as you
need to access memory
for a load or store, you have
to stop fetching instructions.
And the whole pipeline stalls.
This is kind of unusual.
Because that whole
pipeline stalls,
you've actually got
time to do other stuff
while you're accessing memory.
And they went ahead and
baked a lot of the stuff
into the instruction set.
Really quite
unusually, in an ARM
a single instruction can
do a shift and an ALU op
in a single cycle.
I don't know any other
machines that do that.
It also has load and store
multiple instructions,
which allow you to get
fast DRAM access for data.
This is not how a computer
pipeline should look.
But this is how ARM2 looks.
A trained computer
architect will look at this
and see an anaconda
that has swallowed a goat.
So you've got that empty,
empty, empty, huge bulge
that just doesn't look
right, empty, empty, empty.
And there's just everything's
done in the execute stage
of the ARM pipeline.
It's really not pipelined
at all-- the back end
of the machine.
Because it's so
simple, that thing
just has to loop while it's
accessing the single memory
system.
A little bit on code.
It's cute and fun to write
assembly code for this thing.
For example, the top
instruction multiplies
register 1 by 5, which
is kind of a handy thing
to do in one cycle.
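The shift-plus-ALU trick can be sketched like this. The ARM encoding would be something along the lines of `ADD R1, R1, R1, LSL #2` -- that mnemonic is my reconstruction, not from the slide -- and the Python below just checks the arithmetic:

```python
def multiply_by_5(r1):
    # ARM folds a barrel shifter into the ALU path, so one instruction
    # can compute r1 + (r1 << 2), which is r1 * 5, in a single cycle.
    return r1 + (r1 << 2)

print(multiply_by_5(7))  # 35
```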
You can do other
things like move bytes
around in and out of registers
to do bit field and certain
[INAUDIBLE] in a single cycle.
The ALU and shifter
combination was also
used to form addresses
for loads and stores.
And that meant you could do
quite complex addressing modes.
So the C programmers
in this world
will understand the
R5++ nomenclature.
You can do auto increment
and auto decrement
built into every instruction.
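A toy model of that post-increment addressing, in the spirit of C's `*p++` -- the register and memory dictionaries here are my own illustration, not ARM-accurate semantics:

```python
def load_post_increment(memory, regs, rd, rn, offset):
    # Post-indexed load: use rn as the address for the load,
    # then write rn + offset back into rn, all in one instruction.
    regs[rd] = memory[regs[rn]]
    regs[rn] += offset

regs = {"r0": 0, "r5": 100}
memory = {100: 42}
load_post_increment(memory, regs, "r0", "r5", 4)
print(regs)  # {'r0': 42, 'r5': 104}
```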
Register 15 was the
program counter.
And register 14 was
the return address.
So to return from a
subroutine, you just
put register 14 back into 15.
And that returned.
The last one is a
conditional return
from function call
all in one hit.
So if something is equal
to 0-- that's the EQ bit.
You load multiple
increment after.
Register 13 is
the stack pointer.
And the exclamation
mark means, I'm
going to update
the stack pointer
after I've done this operation.
It loads register 4, 5, 6, and
7, and the program counter.
So it does a return
all on one instruction.
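That load-multiple return can be modeled in a few lines -- a sketch with made-up addresses and simplified semantics:

```python
def pop_multiple(memory, regs, base, targets):
    # "Load multiple, increment after" with writeback (the '!'):
    # load each target register from successive words, then put
    # the advanced address back into the base register.
    addr = regs[base]
    for reg in targets:
        regs[reg] = memory[addr]
        addr += 4
    regs[base] = addr

# One instruction pops r4-r7 and the program counter: a full return.
regs = {"r13": 1000, "r4": 0, "r5": 0, "r6": 0, "r7": 0, "pc": 0}
memory = {1000: 11, 1004: 22, 1008: 33, 1012: 44, 1016: 0x8000}
pop_multiple(memory, regs, "r13", ["r4", "r5", "r6", "r7", "pc"])
print(hex(regs["pc"]), regs["r13"])  # 0x8000 1020
```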
So that's all kind
of cute and fun.
The trouble is, the
instruction set also
defined all the bottom ones.
And they had to work as
well, because they really
didn't have a concept
of, we shouldn't
allow people to do this.
We'll just let them do whatever
the pipeline can achieve.
So if you want to
multiply by 33,554,431,
you can do that
in a single cycle.
It's just not
particularly useful.
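That oddly specific constant is 2^25 - 1, which falls out of the same shift-plus-ALU path -- something like a reverse-subtract `RSB R1, R1, R1, LSL #25` (my guess at the encoding) computes (r1 << 25) - r1:

```python
def multiply_by_33554431(r1):
    # (r1 << 25) - r1 == r1 * (2**25 - 1) == r1 * 33554431:
    # one shift and one subtract, so one cycle on ARM2.
    return (r1 << 25) - r1

print(2 ** 25 - 1)  # 33554431
```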
You can branch to
that funny address.
You can load a byte with funny
offsets, like 75-byte offsets.
The load instruction underneath
that-- that LDR R15--
that takes the program counter
and rotates it by 15 bits.
And then it adds it to
the program counter.
Then it accesses that
memory address and loads
that into the program counter.
Now this is almost
completely useless.
But it still had to work
in every implementation we
did after that, because
the programmers used
weird instructions
like this early on.
The last one loads register
13 and updates register 13
as the stack pointer.
So after that instruction, it's
not clear what register 13 is.
And so this was a lesson
about architecture
versus implementation.
They had a tight
little implementation
that worked well for DRAM.
But the world was
moving very quickly
towards high level languages.
And Steve Furber, the original
implementer of the design,
said this recently.
"We expected to get into
the project finding out why
it wasn't a good idea to do it.
And the obstacle just never
emerged from the mist.
We just kept moving
forward through the fog."
Now we've all been
in that situation
with a new design of
something where we just really
don't know what we're doing.
If we're honest, we're just
progressing through the fog,
trying to work out
where things are.
But I really love this
very honest description.
It explains why the architecture
was more or less missing.
Because to have a
chip architecture
you really need
to have visibility
over lots of implementations.
You really need to be
able to look forward
several implementations
to design
a good architecture that's not
going to be costly to implement
in the future.
I'll skip this slide.
It unpacks that
a little bit more
for those that are interested.
It basically says why those
initial interesting things
in the pipeline
become hazardous.
So if ARM2 was M for modest, as
soon as you add on-chip caches,
M has to begin something else.
And it's muddled or messy or
some other such adjective.
Just beyond this, on-chip
caches became affordable.
And as soon as you do that
to the ARM architecture,
the whole architecture
starts to look
a bit strange and silly
and hard to implement.
But they went ahead and made
one of these things anyway.
Acorn did ARM3. It was an ARM2
with a 4-kilobyte instruction
and data cache.
It had no write buffer.
And this was due to
self-modifying code.
They had a habit of writing
the instruction stream directly
ahead of executing it.
This was mainly for
bit blit graphics.
But they would just spit
out instructions, and then
expect that to be loaded
straight into the pipeline
and executed.
That means that you really
can't have anything buffered
in the write stream, because
it needs to come straight back
into the cache.
And it means you can't
have separate instruction
and data caches, because
they become incoherent.
My thesis predicted that this
would be a 40% performance
loss.
And that number probably
got me the job at ARM.
So yeah.
They really had no idea of
Gordon Moore and Moore's law.
I don't know whether they
just ignored it or just missed
the memo.
I guess it was
somewhere in the fog.
So the next generation
process would have given them
twice as much silicon.
But they didn't exploit it.
So just a quick summary.
At the start of 1991, there was
this joint venture between ARM
and Apple with 12 engineers.
Sophie Wilson, who designed the
original ARM instruction set,
stayed at Acorn.
She did not join ARM.
Steve Furber took the
professorship of computing
at Manchester University.
So he was gone.
Al Thomas-- this name is going
to be important in a moment.
He was the ARM3 cache designer.
He took over all
CPU design at ARM.
And we had no patents.
No coverage at all.
Acorn had never filed any
patents on this technology.
And the money from Apple.
And we had fab space from
VLSI Technology in San Jose.
Robin Saxby joined as CEO.
He brought a lot of experience.
We'll talk about
Rob in a minute.
And then later on there
was a layout engineer
and the office manager.
Simon Segars, the current CEO,
is the tall
one in the picture.
I'm the other one
in the picture.
And then Dave Flynn
joined just after me.
And he's the co-recipient
of the medal.
Robin brought a lot
of experience to ARM.
I'm a hoarder of emails.
I never throw away an email.
And I have all my emails.
And you can look back and
see that he completely
predicted this domination.
He has a bunch of sayings there.
And you'll hear him
say those things a lot.
We did work hard.
We did have fun.
So ARM started in this
tiny little barn--
in a 17th century barn-- in a
town called Swaffham Bulbeck--
can that name be
any more English--
about eight miles
Northeast of Cambridge.
We added about 10
staff per year.
And we had almost no money.
We were almost
always going bust.
In fact, Brexit
is kind of funny,
because we would not
have been in business
if it wasn't for the
European government funding
that we received.
So if England hadn't
been part of Europe,
ARM wouldn't exist.
Acorn and Apple commitments:
we had to do an ARM6, an
ARM7, and an ARM8 for Acorn,
plus floating point
and a video controller.
They really wanted high
performance workstation
processors.
Apple really wanted
something that
would fit in a thing that
looked a wee bit like a phone.
And that was Robin's
balancing act for years--
I think that's him practicing
on the unicycle there.
First thing ARM did-- this
was just before I joined--
was the ARM6 family for Apple.
The nomenclature is if it's a
single digit like a 6, that's
just a processor core.
Can't really use it by itself.
60 is a processor
core bonded out.
And this is the very
first ARM development
card with an ARM60 right here.
At the same time
as I got the metal,
Dave Flynn presented
this to me as a gift.
I'm so proud to have the very
first ARM development card.
I've since promised it without
asking Dave yet-- sorry, Dave,
if you're watching this--
to give it to the Computer
Museum in San Jose,
because it's kind of a
start of the revolution.
So that's an ARM60,
which is ARM6 bonded out.
An ARM600 or 610 would
have caches on it.
If there was a 6000--
later we started
going up to four digits--
that would be an SoC.
So that's how the naming worked.
They put a write buffer
on this for Apple
when they pushed it out
to a 32-bit wide address
bus for Apple.
And hey, the write buffer
produced a 40% performance
increase.
So that was handy.
I've only ever had one job.
I worked at ARM for nine
years, and then I retired.
So straight out of
university I joined.
I sent them my thesis by
post and heard not a thing.
Not a single sausage was
heard in New Zealand.
And of course, postage back then
from New Zealand to England,
I didn't actually know
the means by which
my parcel would
travel, whether it
was on an airplane or a boat.
So I waited patiently.
And on the 2nd of
May I sent an email
to Jamie Urquhart, who was
running the VLSI group,
asking if they'd got my thesis,
and asking if they had any jobs.
Jamie came back and
said contact HR,
which made me kind of
think probably not.
But on the 3rd of May I heard
from somebody called Lee Smith
at advancedRISCMachines.co.uk.
And he said the
following: I have your CV.
I've been impressed by it.
And he was currently looking
for a software person to start
around the end of June.
So this was another
piece of good timing.
I got a telephone interview
on the 10th of May.
And I had a job offer
on the 17th of May.
And I arrived in the
UK on the 20th of June.
As part of those emails
going backwards and forwards,
I had the following paragraph.
Lee said, "Over
the past few days
it has come to my attention
that our understanding of ARM
at the software level
is insufficient."
This really troubled me.
I couldn't quite understand
how that sentence could exist.
These were ARM.
How did they not
understand their processor
at a software level?
He was actually talking about
doing HDL models--
Verilog and VHDL
models of the ARM.
And I went on to do
some of that too.
So I joined about two months
after ARM600 taped out.
Robin Saxby lived
two hours away.
And he didn't want
to move his family up
to Cambridge at the time.
This was a start-up,
so he didn't want
to disrupt his whole family.
So we ended up renting
an apartment together
in Cambridge.
We have a lot in common,
including the same birthday,
20 years apart.
And we're both
Cambridge outsiders.
So we got on really well.
We still do.
We see each other a lot.
I had a very modern software
development background then.
I was used to symbolically
[INAUDIBLE] C and Unix.
Acorn's way of hand-coding
things-- and they
used a lot of interpreted BASIC--
was kind of archaic to me.
I certainly knew that
the ARM processor
was too slow to compete
with the big boys.
I knew that we had a decent
modest implementation.
But the architecture was
pretty much non-existent.
And ARM really didn't understand
the concept of architecture
back then.
And I knew that we didn't
have John Mashey's experience.
So day one was write an
instruction set simulator.
Day two, I handed
in my thesis code.
That was the easiest day's
work I've ever done, because I
pretty much had that written.
[LAUGHTER]
[INAUDIBLE] Actually, I
spent about three months
fighting x86
compilation back then.
And then as I said,
Dave Flynn and I
developed this development card.
I did the software,
he did the hardware.
And then we did Verilog and VHDL
models by wrapping that C code.
I was made the head
of technical marketing
because I was the only one
in technical marketing.
So therefore I was
the head and the body.
Because I knew how
to benchmark code
and had a good
experience with this,
I was just flying around
the world benchmarking
code for people.
Just so you know what a
high tech startup looked
like in 1991, we called
Cambridge once an hour
at five minutes to the hour
with a 2,400 baud modem
to send and receive all the
email for the entire company.
So if you had an
important email to send
and it was 10 to the hour,
you were typing very quickly
to catch that dial-up.
We had no wireless.
Wireless really hadn't
been invented then.
10-megabit Ethernet everywhere.
A few Sun workstations.
A few Acorn-based workstations.
But all pretty crude.
So a summary is, we had
two low volume customers
with very different needs.
We had one CPU designer.
We had a modest ARM6.
We had the ARM600 with cache,
MMU, and write buffer.
We had some software tools.
But we had no
experienced architect
or complete CPU design team.
We had no development cards.
We had no HDL models.
We had no general
purpose operating system.
No way to debug an ARM6
if it was buried in an SoC.
And as I said before,
no volume customers.
But most shockingly,
we had no patents.
Shockingly, in 1992, as I
said, Sophie stayed at Acorn.
Steve went to Manchester,
took a professorship.
And Al Thomas passed away
halfway through 1992.
The other half of the company--
about half the company--
were working on Acorn parts,
another quarter on software
tools, and the remaining quarter
were support sales marketing.
It turned out, 12 months
after leaving university,
I was the only
one in the company
that really had an in-depth
knowledge of the ARM.
And I had absolutely no
clue about processor design.
So I was really thrown
in at the deep end.
I did point out
that maybe perhaps
it would be a good idea
if we had some patents.
So they immediately
made me the chairman
of the patent committee.
RAYMOND: Were you the
entire patent committee?
DAVE JAGGAR: Yeah.
I was the chair of
the patent committee.
So I had to walk
around and bribe people
into writing things
up as patents.
So I was the entire CPU team.
I understood bits
of their design,
because it was written
in C. Other bits
to instantiate it into
their timing simulator
I did not understand at all.
We needed a follow-on
processor quickly.
I did have a lot of background
with software architectures
in general.
And this was really
the rebirth of ARM.
Back then RISC was very popular.
All the big guys were doing
the RISC processor in some way.
Intel had the i960
and the i860 going on.
Motorola had the 88K.
And all the old MIPS and
Sun really started all this.
But everyone followed.
Down the bottom there was a
bunch of small embedded cores.
And in between there
actually wasn't much.
The Motorola 68k really
owned that market back then.
There was a little bit
of X86, but not much.
I remember Robin rented
my room Monday to Friday.
And we had some pretty
candid talks every evening.
I need to rewrite this line.
I think we convinced
each other pretty quickly
we couldn't compete
with the big fish,
and we should just
go somewhere else.
Richard Feynman has
that term, there's
plenty of room at the bottom.
And I really like that term.
There's plenty of
room at the bottom.
I think there's
still plenty of room
at the bottom of this market.
So that's what we--
we started to go down into
the embedded side of things.
And RISC was the
buzz of the industry.
It was much better than CISC.
So we kept calling it a RISC.
But I'm really
sticking up for the MISC.
M is now for the embedded
instruction set computer.
I did a really quick spin of
the ARM6, made it go faster.
And there's a big critical
path I knew about.
I learned about
transistors real quick.
You can't have
big stacks of them
if you wanted to
run at low voltage.
So you rearrange a few things
to get it down to 3.3 volts.
I put a tiny bit of debug in.
I kept the return address
after the processor was reset--
I removed the reset
wire from that latch.
The hardware guys
go nuts when you
do this, because they point at
it and say things like hi-Z.
And I didn't even
know what hi-Z was.
Sounded like an energy
drink that hadn't even
been invented then.
But what that lets you do
is reset the processor,
and then at least you knew where
it was when you pressed reset.
That's how crude our
debug was at the start.
And I filed a very
narrow patent on that,
because it's quite
an unusual thing
to do to not reset
part of your chip
when you hit the reset button.
And that was, I think--
that was my first patent.
We called it ARM7.
Those changes gave it enough
to give it a new name.
I then went on and
started to get into DSP.
So at this time we were
looking at MP3 code
for doing digital audio
players like the iPod.
And so we added a
faster multiplier.
And I did proper
integrated debug
so that we could debug
the processor when
it was buried under an SOC.
I think I'll skip this.
I did multiply properly
to get us into DSP.
Simon Segars, who's
the current CEO,
was freed up from the
video controller.
And he did most of the ARM7DMI.
It's really great to have a CEO
with a technical background.
It was very well received.
The debug interface
really revolutionized
a lot of the design tools.
Because I'm a software
guy, interfaces
kind of come naturally to
me-- well-defined interfaces.
And so that really
started the ARM ecosystem
where people could write a
debugger once and interface
to a lot of different
chips, because it
was a proper interface
at that level.
I was traveling a
lot at this time,
doing a lot of benchmarking.
The performance was great.
The power consumption was great.
The die size was fantastic.
I was spending a lot
of time in America.
So the weather was much
better than England.
But code density bit
us, and it bit us hard.
We were trying to replace
eight and 16-bit controllers.
And obviously the reason you're
putting a new microprocessor
in your product is
you want-- either it's
a brand new product,
or you're trying to put
a bunch of new features in.
And we ended up having
code size that was bigger
than the original products.
We originally thought
we would be smaller.
But it turned out being bigger.
And of course, the
way memory works
is you don't go from 12
kilobytes to 13 kilobytes.
If you go from 12
kilobytes to 17 kilobytes,
you then probably need a
32-kilobyte memory system.
That's the first problem.
We blew the memory budget
such that they really needed
to double their memory size.
The other problem is a 32-bit
RISC instruction set computer
wants 32 bits every
cycle to hit full speed.
It wants to swallow
instructions as fast as it can.
And a 32-bit wide
memory system then
was two or even four chips.
So this was painful for
everyone to maybe quadruple
the size of their memory system.
What really drove
this home to us--
a lot of people think that
the chip that this became
was for Nokia.
It actually wasn't.
It was for Nintendo.
And back then games
cartridges plugged in.
And they were basically
a bit of plastic,
a tiny little bit of brass,
and a stack of memory.
And so if we made the wrong
cartridge twice as expensive
or four times as expensive,
that really ate all their profit
at Nintendo.
So this was against
the industry.
Now Mark Horowitz-- the quote
here is his-- was at Stanford last week.
So I'm slightly OK that I've
told this joke to his face.
But it's unusual to see
the word "ridiculed"
in a technical document.
But the thinking at
the time was very much
this, that you shouldn't
try and do coding density.
You should do simple decode.
And that's absolutely
the correct thinking
for a high performance
workstation.
And it's just the wrong
thinking for embedded.
So simple decode, simple
decode, simple decode
was the way everyone thought.
And you'd be ridiculed if you
tried to do anything else.
And so to swim against that
tide was hard work back then.
But any instruction set with
fixed-length instructions is wasteful.
And as we saw on that
code slide earlier,
not all combinations
are very useful.
So if you can get rid of
them somehow, it's good.
So on a train from Nintendo
to a ski weekend at Matsumoto
in 1994, and
literally on a napkin,
I started writing the
16-bit instruction set.
It was pretty much the same
one that I used in my thesis.
I'd learned a few
more tricks by then.
And so I crippled
the C compiler.
And what I did was I
made the C compiler only
produce 32-bit instructions that
weren't too complicated that I
knew I could compress down
into 16-bit instructions.
Because I was
not using the full power
of the instruction set, the
programs actually got bigger.
It was still
32-bit instructions,
but they had gaps in them.
And I knew that I could
then take all those
and squish them down to 16 bits.
So when the program size
only went up by about 40%,
I smiled, because I
knew I could halve that
immediately back down to
70% when I re-encoded them
in 16 bits.
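The arithmetic of that bet works out neatly. Here's a worked example with an illustrative program of 100 ARM instructions -- the instruction counts are mine; only the 40% and 70% figures come from the talk:

```python
arm_insns = 100
arm_bytes = arm_insns * 4            # 400 bytes of 32-bit instructions

crippled_insns = 140                 # restricted compiler emits ~40% more
crippled_bytes = crippled_insns * 4  # 560 bytes: 40% bigger, as observed

thumb_bytes = crippled_insns * 2     # same instructions, re-encoded in 16 bits
print(thumb_bytes / arm_bytes)  # 0.7 -- back down to 70% of the original
```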
The real light bulb
moment, though,
was when I realized that
this processor should
have two instruction sets.
Now at the time,
remember, we're talking
about reduced instruction
set computers.
You should have one instruction
that does one thing at all times.
So having a machine that
has two completely different
instruction sets and
encodings, and two instructions
that do exactly the same
thing, was really weird.
It's about as unRISC as
you can possibly get.
So I called this thing Thumb,
because that's the useful bit
on the end of your arm.
It's a second instruction
set, more compact
than the original one.
I re-encoded the instructions.
As I said, programs end
up being about 70% of the size.
And if you're running
from narrow memory,
the code runs faster, because
you get a 16-bit instruction
every cycle instead of needing
two cycles of memory bandwidth
to get a 32-bit instruction.
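A rough fetch-cycle count shows why. This assumes a 16-bit-wide memory where each bus access returns 16 bits -- a simplified model of my own, ignoring caches and data traffic:

```python
BUS_WIDTH_BITS = 16

def fetch_bus_cycles(instruction_bits, n_instructions):
    # Each instruction needs instruction_bits / BUS_WIDTH_BITS accesses.
    return (instruction_bits // BUS_WIDTH_BITS) * n_instructions

arm_cycles = fetch_bus_cycles(32, 100)    # 100 ARM instructions
thumb_cycles = fetch_bus_cycles(16, 100)  # 100 Thumb instructions
print(arm_cycles, thumb_cycles)  # 200 100
```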
I added some support
for 16-bit data.
I left in the ARM
instruction set,
so you can still do full speed
if you want to, especially
from on-chip memory.
I also defined
something called TOM.
Tom Thumb, right?
A 32-bit data path with only
the 16-bit instruction set.
And that's what's called
Cortex M0 and M1 today--
the other really
big volume chips.
And I also defined and put
all the hooks in TOM 16
with a full 16-bit data path.
We never did that, and
we really should have.
A lot of people don't know--
Unix runs really nicely
on a 16-bit machine.
It started life on
a 16-bit machine.
And one of my few
regrets is that bit.
So Thumb really put us on this
different curve, this red curve
where we could have more
performance and less cost.
And depending on
how you encoded
your program-- if it
was an important
bit of code, you
encoded it in the 32-bit
instruction set.
If it was a less important
bit of code-- for example,
all the GUI--
you ran all that in 16-bit code.
So you had the best
of both worlds.
It was really on
a different curve.
And it was really
the breakthrough
for ARM and embedded.
I left all the original
stuff in because it
was a really easy
sell to say you've
got the best of both worlds.
And I never would have
got away with replacing
the entire instruction set.
Remember, by this
stage I'm only a couple
of years out of university.
So although it's exactly
what I was doing,
I put a back door in that
we later used for Thumb-2.
I put in a prefix instruction.
And no one spotted
that, fortunately.
It was smart
politically, because it
looked like a relatively
small change for the chip.
And for those who called
it architecturally ugly,
I said, yeah, it's ugly.
But gee, it works well.
Sophie Wilson, who was the
original architect that
stayed at Acorn, she hated it.
She wrote to ARM's board
and said, to be brief,
I don't like Thumb.
As a short-term hack
it might be survivable.
As a long-term
architectural component,
it is in my view a disaster
of enormous proportions.
It represents a backward step.
Now the first chip
sold 30 billion units.
So maybe not quite as
backward as she was expecting.
But it was a big deal.
There was an emergency
board meeting.
Robin Saxby's bonus was cut
by 20% if he chose to do this.
They really tried to stop it.
Steve Furber was called
in from Manchester
as the judge and jury.
And narrowly sided with ARM.
Steve recently said, "ARM
addressed the code density
issue with an imaginative leap.
They introduced the Thumb
16-bit instruction set."
So it went from a backward
step to an imaginative leap.
So that's a pretty good U-turn.
And this is why I say my
part in ARM's downfall.
It was downward in
market position.
But it was very much
upward in success.
I will say it's much harder to
simplify something like this
than you think.
Looking back on it, it
looked so easy at the time.
It was just, how do I take
this big complex problem
and make a simple solution?
And RISC in general
is a little like that.
It's often hard to look across.
It still looks like an
Anaconda that swallowed a goat.
But there's this little
Thumb decode in the front.
There was fresh air in there.
And I could slip
the decoder in so
that we just decoded
16-bit instructions
to 32-bit instructions.
And the rest of
the pipeline just
thought it was being
fed 32-bit instructions.
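To make that concrete, here's a wee sketch of the idea: one 16-bit pattern expanded into the 32-bit word the rest of the pipeline expects. I've used the Thumb format-3 MOV and its ARM MOVS equivalent as I remember the encodings; a real decoder covers every format, so treat this as illustrative rather than a reference:

```python
# Sketch of the "decoder slipped in at the front" idea: expand a 16-bit
# Thumb instruction into the 32-bit ARM instruction the rest of the
# pipeline already understands.  Only one pattern is handled here --
# the format-3 "MOV Rd, #imm8" -- and the bit patterns are from memory.

def expand_thumb_mov(halfword):
    """Map Thumb 'MOV Rd, #imm8' (001 00 ddd iiiiiiii) to the ARM
    'MOVS Rd, #imm8' word the execute stages expect."""
    assert (halfword >> 11) == 0b00100, "not a format-3 MOV"
    rd   = (halfword >> 8) & 0x7
    imm8 = halfword & 0xFF
    # ARM: cond=AL, data-processing immediate, opcode=MOV, S=1
    return 0xE3B00000 | (rd << 12) | imm8

# MOV r1, #42 in Thumb is 0x212A; the expanded ARM word is 0xE3B0102A.
print(hex(expand_thumb_mov(0x212A)))
```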
I fixed quite a
lot of other things
that were wrong with
the architecture.
I hid a lot of the ugliness.
And I really thought
no one noticed.
But in the latest version
of Patterson and Hennessy
there's the statement
at the bottom.
"In many ways, the
simplified Thumb architecture
is more conventional than ARM."
So someone actually noticed that
I did a bunch of cleaning up
in there.
And Lee, the guy that originally
gave me the job,
just said last year,
"Thumb was essential
to our success."
That's his summary of it.
32-bit ARM sealed the deal,
getting to 2/3 of the code size
took 10 years,
but they could see
we were on a trajectory
to an asymptote.
Nokia were driving
round Finland with a van
full of equipment testing
cell phones at the time.
They looked at Thumb,
realized how much
it outperformed the competitors,
and were sold on it.
And so Ericsson and Motorola
were the other big names
in phones.
Then they had to follow.
And so we ended up selling
an ARM license to Motorola.
So this was-- wow.
We've actually sold a
license to the big guys.
Texas Instruments loved it.
They combined it with
a lot of their DSPs.
The chip was called MAD--
microprocessor and DSP.
And I think it's fair
to say it really rewrote
the rulebook on what an embedded
processor should look like.
MIPS followed fairly
quickly with MIPS 16.
The latest RISC-V, if
you're familiar with it,
out of Berkeley
and Stanford, has
the optional compressed
16-bit "C" instruction
set that you can bolt
on for embedded control.
These are the two big patents.
Notice that MISC is not
actually a backronym.
It's right on the patent
titles back then--
multiple instruction sets.
Multiple instruction
set mapping.
So MISC isn't a backronym really.
Multiple Instruction
Set Computer.
They were filed early 1994.
I'm the inventor.
No, I do not get all the money.
Everyone asks that.
That would be nice.
But ARM is the assignee.
The patent people
in the audience
might like to read this
one at their leisure.
We had some narrow patents
and some wide patents.
ARM7TDMI, the processor
that came out of this,
was never cloned successfully.
The little guys-- ARM
2, 6, and 7-- when
they had less patent coverage,
or almost no patent coverage,
were cloned a lot.
I was flying a lot by this time.
I was just selling this thing
and benchmarking this thing.
We're still a pretty small
company-- maybe 40 people.
And all the big names
getting into printers,
getting into hard
drives, getting into
headless terminals and
all this sort of stuff.
Cars, of course.
The printer and camera
guys really liked it.
We had some weird customers.
NKK Steel, who were just a big
steel company, took a license.
I still don't know why.
We had some on the eurofighter.
That scared me.
I didn't want to be anywhere
near a eurofighter, cause
I knew how many bugs
we'd seen over the years.
But anyway.
There was one on
the eurofighter.
And I accidentally
visited the NSA.
They wanted me to put a
backdoor on the processor.
I thought I was honestly
visiting the National
Semiconductor of America.
National Semiconductor
used to be a firm.
And "of America"
used to be a thing
you put on the
end of your title.
My boss gave me a bollocking
for why I didn't return
with any business cards.
And later I worked
out what the NSA was.
That became that
Skipjack Clipper program
that came along much later.
In parallel we had a big
project running at ARM--
ARM8 and the ARM810 were using up
about half our resource
trying to do a fast processor
for Acorn as best we could.
We had a single combined
instruction and data cache
for the self-modifying
code problem.
But we didn't put Thumb
and debug on that.
And the floating point
was difficult too.
But we did that to
their specification.
But it used up an enormous
amount of resource.
So that's what most of
the company were doing.
So ARM7TDMI was
really successful.
I was traveling a lot--
I'm starting to think
about how to go faster--
when Digital
Equipment Corporation,
who were the third or fourth
biggest computer
company in the world then,
came along and said,
we'd like to do a
fast ARM for Apple.
Now Digital had about four--
well, they had
exactly four that I
know of-- reduced instruction
set programs going on
at the company at that time.
They had the Hudson RISC.
They had Titan.
They had Prism.
And lately, Alpha.
And Alpha was
originally called EV,
because their programs kept
getting canned, because VAX
was everything at Digital.
And if your program had
nothing to do with VAX,
when the cutbacks came,
it was just canned.
So the Prism architecture was a
beautiful little architecture.
But it got canned.
So they started a new
architecture, which
they called Extended VAX--
EV.
And it didn't get
canned, even though it
had nothing to do with VAX.
It just had VAX in the title.
And I really learned about that.
I thought, well,
that's kind of--
hide that from the board.
Later, by the way, the
marketing people got hold of it.
And they called
it the Alpha AXP.
And the joke in
the engineers was,
AXP stood for Almost
Exactly Prism.
They blew the doors
off the industry,
they were running
at 200 megahertz
when everyone else was about 66.
It actually turned out to
be too late to save Digital.
But probably the best
design team on the planet.
Quite a lot of these
people are still active.
I went to Texas for eight weeks
and wrote the ARM ARM-- the ARM
Architecture Reference Manual.
I just cleaned up the
whole architecture
and said, don't do this.
We promise not to halt and
catch fire if you do do this.
We promise not to get
privileged if you do do this.
Otherwise, don't do this.
And I learn a whole
lot about how to design
a chip from these guys.
They were a very
friendly bunch of people.
I didn't downplay Thumb.
But I didn't talk it up either.
I basically said, you
guys do the high end
where you've got 32-bit memory
systems, 32-bit on-chip caches.
We'll stay at the low end.
And that could be
our differentiation.
We all agreed that
was quite a good idea.
So the StrongARM
processor came out.
They basically cut
an Alpha in half.
It was so fast that
Apple started rewriting
their self-modifying code.
But it was-- did I say Apple?
Acorn started rewriting their
self-modifying code, which
was the nail in ARM8's
coffin, really,
at the ARM company.
But nothing could
save Acorn by then.
It was just too late for them.
But I snuck back to Cambridge
having learned everything
about how the
StrongARM was designed,
and told ARM we
should do an ARM8E.
And this was that lesson
about don't call anything new
because it may well
get canned by the board
if it's not in line with
the product roadmap.
So I called it ARM8E, even
though it had absolutely
nothing to do with ARM8.
It was the StrongARM pipeline.
A direct rip-off.
I added Thumb and debug to it.
And a tiny little design team,
again including Simon Segars.
And it was launched
at ARM as ARM9TDMI.
There's the pipe.
It's starting to look a lot less
like an Anaconda full of goat.
It's pretty streamlined,
that machine.
Digital taught us
how to do that.
Those two chips together are
still responsible for about 80%
of ARM shipped today.
So they've been
tremendously successful.
That TOM32 machine did get built
as that Cortex M0 and Cortex
M1.
The little arms have no 32-bit
instruction set at all anymore.
Then we decided to do--
it was silly for ARM and Digital
to be designing chips together.
And we particularly--
I particularly--
wanted to do floating
point properly.
And they had a lot of
floating point experience.
So we decided to do a joint
design center in Austin, Texas.
We employed just about
everybody in England
that could spell microprocessor
backward, let alone design one.
So we really needed to
tap into another talent pool.
America was a lot more
expensive for salaries
than Cambridge was.
But we had to bite that bullet.
So we did this design
center in Austin, Texas.
So I went to Austin
in late 1996.
My oldest daughter
Catherine was born.
She's here today too.
She's just finished
her EE degree.
So that's what I've been
doing in between by the way,
is raising my children.
But this program ran into some
huge unforeseen problems--
unavoidable problems.
First of all, I'd
noticed on the ARM9
we didn't have a
great debug strategy.
And they were booting
operating systems.
Windows CE, the
Symbian operating system,
and Linux were all running
on ARM at the time.
And our first silicon ran
about 10,000 instructions
and fell over.
And so we spun the silicon.
Got the silicon back.
It ran about 10,000 more
instructions and fell over.
And they did this four
times from memory.
And this is a very expensive
long loop to be going around.
Digital had enough
performance that they
were booting the operating
system on the netlist.
They had enough compute
in the Digital company
that they were getting about
100 instructions per second.
And they were booting Unix
up to the command prompt.
I really thought that was
cool, and really wanted
to exploit that somehow.
The next thing that happened
was that Digital sued Intel.
And Intel looked at the
price of the lawsuit
and the price of Digital and
went, let's just buy Digital.
No one at the Digital
design center in Austin
wanted to work for
Intel, so they all quit.
And they didn't want
to work for ARM.
They wanted to do
their own startup.
And then we were using Compass
design tools at the time.
They were bought by Avanti.
And I believe overnight the
licenses just stopped working.
So we had no design flow.
So obviously big problems
are a big opportunity.
I went back to the emergency
board meeting in Cambridge.
Do we stop this now and fire the
four or five people we've hired
and apologize profusely?
Or do we change gear and
do our own design center?
That's what we decided to do.
They gave me headcount for 50.
I only ever used about 20.
But I didn't get to spend
much more time at home
with my family.
So we had a new chip, a new
team, new tools, new flow,
new country.
So obviously I had to get
infrastructure, buildings,
admin, the time
zones are a pain.
There was a hiring frenzy.
I borrowed the support
people from elsewhere.
I just didn't have time
to put all that together.
But this was really
a startup in Austin.
And I didn't want to back
off on the deliverable.
I wanted ARM10 to be about
twice as fast as ARM9.
By the time you
add floating point
to that and new support
for operating systems,
it ended up being about four
or five times more complex
than ARM9.
I was really worried
about the ARM9 long loop
around booting code.
I really wanted to find
some way of getting
much better validation
in these silicon chips.
We didn't do superscalar
on ARM10.
But we set up for the next
chip to be superscalar.
And that group in
Austin went on to do
the start of all those
Cortex A series that
were all in the phones.
There was another group in
Sophia Antipolis in France
that had it also.
They ping-ponged back and forth
between the two designs.
But the ARM10 was a decent chip.
It had an 8-stage pipe.
It ran fast.
It did indeed become
twice as fast as the ARM9.
We fixed up everything we could
that we knew about in ARM9.
And so ARM10 was
very successful.
We did brand new floating
point from the ground
up with little
short vectors in it.
That architecture is still
in use today in ARMv8.
So that architecture is
23 years old already.
So that probably says
it was half decent.
That floating point also got
back-ported to ARM9 and ARM7.
So it really was a
broad architecture.
We put in some proper
software hooks.
By this time people
were actually
debugging code that
was running on the arm.
So we'd sort of
come full circle.
People were making
something powerful enough
to actually develop
on the machine.
We completely reworked our
validation methodology.
We started including random
instruction set generation
to just throw random
instructions at the core to see
if we can make it blow up.
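The idea is simple enough to sketch. This toy fuzz loop is my illustration, not ARM's actual methodology: two placeholder "models" stand in for the architectural simulator and the design under test, and the loop reports the first instruction word on which they disagree:

```python
# A toy version of random instruction-set testing: throw random words
# at two models of the core -- a reference model and the design under
# test -- and flag the first disagreement.  The two "models" here are
# stand-ins; in reality they were an architectural simulator and the
# transistor-level design.

import random

def reference_model(word):
    # placeholder: pretend the architectural result is a function of the word
    return (word * 2654435761) & 0xFFFFFFFF

def design_under_test(word):
    return (word * 2654435761) & 0xFFFFFFFF  # a correct DUT agrees everywhere

def fuzz(iterations=10_000, seed=1234):
    rng = random.Random(seed)
    for i in range(iterations):
        word = rng.getrandbits(32)
        if reference_model(word) != design_under_test(word):
            return i, word      # first divergence: the instruction to debug
    return None                 # no mismatch found

print(fuzz())  # None -- the two models agree on every random instruction
```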
I had a small
brainwave about code
that I wrote way
back for my thesis.
I brought it up to
ARM10 specification.
And I made that code record
an instruction trace.
When we booted an
operating system,
I saved every instruction
that got pulled into the core.
And I recorded every data
transfer to and from the core.
And I played that back
to the transistor model.
And wherever they
were different,
we sat around the
table and worked out
whether it was
their fault or mine.
It was about 50/50.
But we got it to
the point where we
could make this instruction
trace of the three operating
systems booting.
And we could run that on
a simple Sun workstation.
And as soon as we
saw a problem, we
could stop and go and look,
really pinpointed where
the problems were.
And of course, we'd fix
the transistors manually.
Run a regression test up
to that point to make sure
we hadn't broken
anything with the fix.
And then kept on booting.
So we were able to boot whole
operating systems that way.
And we ended up finding
every bug in ARM10 that way.
Worked beautifully, along
with the random instruction
generation.
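The record-and-replay flow can be sketched like this. The lambdas standing in for the fast and slow models are hypothetical; the real versions were an instruction-set simulator and the transistor-level model, and the trace held every instruction and data transfer, not simple integers:

```python
# Sketch of the trace-replay flow described above: record everything
# the fast model consumes while booting, then feed the same trace to
# the slow transistor-level model and stop at the first divergence.

def record_trace(fast_model, stimulus):
    """Run the fast model and log (input, result) pairs."""
    return [(word, fast_model(word)) for word in stimulus]

def replay(trace, slow_model):
    """Replay a recorded trace; return the index of the first
    divergence, or None if the slow model matches throughout."""
    for i, (word, expected) in enumerate(trace):
        if slow_model(word) != expected:
            return i
    return None

fast = lambda w: w + 1
slow_ok = lambda w: w + 1
slow_buggy = lambda w: w + 1 if w != 7 else 0   # diverges on input 7

trace = record_trace(fast, range(10))
print(replay(trace, slow_ok))      # None -- models agree throughout
print(replay(trace, slow_buggy))   # 7 -- pinpoints exactly where to look
```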
ARM10200 was very successful.
And as I said, it was the start
of that Austin design center.
In the year 2000 we had
that silicon back again.
We knew what we were
going to do for Rev 1.
There's always a few tweaks.
You don't get it perfect
when you type it out.
You get something
that's very close,
and then make some
silicon, bring it up.
The Austin office
was about 45 people
by then, a pretty
experienced team.
And they've gone on to be a
wonderful set of CPU designers.
I bailed.
I was from New
Zealand, remember.
So I bailed back to
New Zealand in early
June 2000 just after
nine years at ARM.
Technically I was on sabbatical.
And ever since I've been working
on much more powerful CPUs.
I spent the next few years
actually wading through patents
because there was a
lawsuit over ARM7TDMI.
But I was pretty happy
with what I'd achieved.
I did work hard, but had fun.
We got a few things wrong.
We backed-- we backed--
the people we backed
were mostly wrong.
And the people we didn't
back were mostly right.
So we really got it--
we were one inverter away
from success, one NOT
gate away from success.
If we had backed any of
the other games consoles,
we probably would
have been fine.
If we had backed
the Palm Pilot,
we probably would have
been a little better off.
We didn't see Nokia coming.
I personally did not see
cell phones coming at all.
I looked at the possibility of
the cell phone infrastructure
and thought, wow.
They're really going
to dig up every road
and put aerials on
top of buildings.
And this just
seemed so unlikely.
But I actually thought
the Iridium cell ph--
the satellite stuff was
going to work better.
I often look back at my life.
And I don't know if you know the
movie "Slumdog Millionaire."
It's quite well-known.
It's the Indian fellow
who's had a hard life.
But just through serendipity
he just happens to know--
he only asks the questions he
knows the answers to somehow.
He doesn't know much.
But he knows the answers to
the questions he's asked.
And I always feel
that my career has
been a little bit like that.
If any of the things on
the bottom were missing,
I just don't think much of
this would have come together.
Certainly Lee-- ARM was
25 years old in 2015.
And Lee wrote me
this lovely email
saying that he had been asked
as one of the four founders
that were still at
ARM what their most
significant milestone was.
And he said it was hiring
me on the telephone.
I love the quote.
He said, memorable moments--
starting with returning to the
Barn, the beautiful old Barn,
at quarter to 9 to phone
you in New Zealand.
Robin Saxby, [INAUDIBLE]
was just leaving the pub
and offered to buy me a pint--
of beer, obviously.
If I'd ever accepted and
missed the interview,
history might have
been very different.
Ended with picking me up,
taking me to Mike Muller's place
for a shower.
I'd been on an
airplane for 24 hours.
I'm glad he did that.
And then into the
Barn and out for lunch
to a curry house in Bottisham--
another cute wee town.
I'll never forget your comment
when your food arrived.
"Gee, Mom.
I flew halfway around the world
to eat lamb and potatoes."
Great time with great people.
But yeah, a lot of
serendipity in there.
And that's-- Robin's 70 and
I'm 50 in that photograph.
A couple of years ago.
We still get together
for our birthdays.
So with a few
minutes to go, I've
got time for any questions.
Sorry if that was
a little rushed.
But it's hard to pack nine
years into 45 minutes.
RAYMOND: You did an amazing job.
You can go to the
microphones for questions.
AUDIENCE: So what
would you recommend
for somebody who is interested
in learning about CPU design
and implementation nowadays,
even as just a hobby?
Or even just any
silicon chip in general.
DAVE JAGGAR: Are there
any ARM snipers?
No.
I would Google RISC-V and
find out all about it.
They've done a fine
instruction set, a fine job.
And they're explaining it well.
Berkeley and
Stanford are behind this.
There are obviously
commercial companies
like [INAUDIBLE] doing things.
But it's the state of the art
now for 32-bit general purpose
instruction sets.
And it's got the 16-bit
compressed stuff.
So you're learning about
that, learning from the best.
Still.
AUDIENCE: All right.
Thank you.
AUDIENCE: Hello.
So it seems like if you were
programming in the '80s,
you would know a lot more
about what was going on
down below, at the lower levels.
And now things are so
complicated that if someone's
coming out of
school, they're not
going to be able to really
understand everything
that's going on below them.
So do you think that's
sort of making it harder
for us to have a full view?
Or maybe that's just the
way that things are now.
And we're just going
to have to accept that?
What do you think about that?
DAVE JAGGAR: I certainly
agree with you.
There's so much going on.
I mean, I'm still very active.
What was I doing the other day?
I have a-- first of all, let's
talk about the Raspberry Pi.
That whole program was trying
to address exactly what you're
talking about.
It's giving something
simple enough
where you can look at a
software stack top to bottom.
Well, that's still complicated.
Even I look at the
boot process and go,
man, this is hard work
to keep in your head.
So there's a lot going on.
I absolutely agree with you.
I personally hate programming
languages like Python,
because I look at inserting
something into the list.
And just know how many bazillion
instructions are going on
to support that piece of code.
I just can't quite get my head
around doing all that stuff.
I know it's productivity.
I think the best
we can probably do
is things like a Raspberry Pi.
I was recently looking at
the hostapd code because
it didn't work at 5 gigahertz.
And you can burrow down into
that a bit and learn a lot.
I think, to come back to
that statement about fog too.
When I started out
I remember being--
I think scared is
the right word.
When you're in that fog
and you know nothing,
and you really feel like
you're a dumb idiot.
And you go, other people
understand this, but I don't.
I've sort of embraced
that over time and gone,
I know tomorrow I will
know more than I do today.
I always feel like I'm
kind of groping around
in a dark room trying
to find the furniture.
But I think that's
also the thing
is not to be afraid
of that situation
and know that you're
a bright person
if you're in this
room, let's face it.
Other people
understand this stuff.
But not to be afraid to grope
around in the dark like that
and just try and get one
more piece of information
than you got yesterday.
And then slowly start--
stuff comes together, and
you can build on that.
But yeah.
It's complicated now.
I mean, look at--
well, I don't want to say
Android on top of Linux
on top of ARM.
But man, there's a stack.
There's a stack
of code in there.
I mean, I've hacked around
in that quite a lot.
And it takes a lot of
understanding, even
with my background.
So yeah, it is.
It's complicated.
It's hard.
Maybe there will be--
with machine learning--
maybe there'll
be another big revolution.
I'm pretty sure it's coming.
Where we really look at
what an algorithm is now
in the modern world,
and reinvent hardware
to support that top down.
So I really think that's coming.
I've got a pretty good idea
of how that will shake down,
I think.
But yeah.
Yeah.
AUDIENCE: How do you
think it will shake down?
DAVE JAGGAR: Pardon?
AUDIENCE: How do you
think it will shake down?
DAVE JAGGAR: If-- and this is a
really interesting experiment.
I think everyone should
do this at some point.
Open a messenger
session to a friend,
have them use a different
service provider to you,
send them--
hit the 1 key.
And run everything in
between on a simulator.
And then just watch how
much data gets sucked in
and sucked out to send
the one key through all
that networking,
all the fonts, all
the graphics and everything.
What's going on is--
and I think it's--
there's this one wonderful
analogy on YouTube.
It's a comedian.
And he says, the difference
between male brains
and female brains-- and this
strikes a chord with me.
He basically says, men's brains
put everything in little boxes,
and the boxes mustn't touch.
And female brains go
[BUZZING SOUNDS] all the time.
And I really think
we have to design
hardware that's much
closer to [BUZZING SOUNDS]..
I think a lot of engineering has
got its data in blocks.
And we call them buffers.
And we have these interfaces
where you call a piece of code,
and that passes
back a nice buffer.
And then that code must
never touch that code.
And that code must-- and
they're all separated.
I really think we're going
to end up with a machine
where you put the
data on the top.
And the data is going
to fall out the bottom.
And it's working in a
much more integrated way.
If you YouTube that comedian,
you'll sort of understand.
I'm not telling the
story very well.
But it really comes
across as I really
think we have to be thinking
in a much more holistic way
than generally engineers
have in the past.
I think it's a
limited way that we
think when we partition data.
And it means that, of
course, think about--
let me give an easy example.
Think about
inserting a character
into the middle of a string.
So this should be
kind of easy, right?
If I'm a character of
a string, and you're
the next character
of the string,
and I want to put a
character in between us,
I say, well, I'm going to just
not hold your hand anymore.
You're going to hold his hand.
And away we go.
And it's easy.
It's all local.
And we understand
exactly what's going on.
When you convert
that into a computer,
you've got a 64-bit address.
I've got a 64-bit address.
I might be 0.
You might be 1.
But I'm stored on
a 64-bit number.
I have absolutely no idea
of what the locality of you
is in the program related to me.
But if you built
this in hardware--
and it's easy to do
when you think about it--
if I want you to move
along the array,
I just pull on a wee
line to you that says,
increment your index,
and slot that new guy in.
That's really easy to do.
If I want to delete
you from the queue,
I just say, remove yourself.
And you say, everyone north
of you, decrement your index.
And everything will
sort of close up.
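The hand-holding picture is just linked-list insertion-- a purely local operation-- versus the array version, where the whole tail has to shuffle along. A minimal sketch:

```python
# The hand-holding picture above as code: inserting into a linked list
# touches only the two neighbouring links, while inserting into an
# array shifts every element after the insertion point.

class Node:
    def __init__(self, char, nxt=None):
        self.char = char
        self.next = nxt

def insert_after(node, char):
    """'I stop holding your hand; you hold his' -- purely local."""
    node.next = Node(char, node.next)

def to_string(head):
    out = []
    while head:
        out.append(head.char)
        head = head.next
    return "".join(out)

head = Node("a", Node("c"))
insert_after(head, "b")        # touches only two links
print(to_string(head))         # abc

# Array version: every element after the insertion point moves.
s = list("ac")
s.insert(1, "b")               # O(n) shuffle of the tail
print("".join(s))              # abc
```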
You get all that in a program.
It's called the
data flow graph.
That's all there.
And we sort of throw that away
with the stack of software
we put on top.
That's the crazy thing.
All the layers that we've put
in between, all the differences
between hardware and
software and assembler
and linkages and
operating systems.
With layers and
layers and layers.
And you actually lose the
meaning of the program.
And the hardware then works
very hard to try and put
that meaning back together.
Anyway.
Sorry that was a long
answer to a simple question.
RAYMOND: That's a great answer.
One thing I want to
say that I was
really glad
to hear you say was,
you'll be smarter tomorrow.
DAVE JAGGAR: Yeah.
RAYMOND: One of the things
I always tell myself,
gets me through
every day is, you
know those smart gals and guys?
They're just meat
and bone like you.
DAVE JAGGAR: They are.
Yeah, yeah, yeah.
RAYMOND: Thank you.
AUDIENCE: Great story.
Thank you.
I was particularly struck
by one of the quotes
on the slide where Robin
Saxby says that you will never
manufacture chips.
DAVE JAGGAR: Yeah.
AUDIENCE: And I was
wondering if you
could talk more about those when
you decide not to take a path.
I mean, was that a
courageous decision?
DAVE JAGGAR: He is incredibly--
there's this buzz word-- global.
He was always global.
He said, we have this
partnership business model.
We'll do this.
And they do that.
And we're not going to compete.
And the very broad--
and it's nowhere
near this defined--
but the very broad thought
was, if we design the chip once
and sell it three
times, we can afford
to sell it for about
half or a third
of what it would cost
them to develop it.
They're getting a deal.
We are getting a business.
And there's just no need
for us to sell any product.
Our product is just
going to be design.
And it was a very successful
intellectual property company.
I mean, as I said, it's
tiny compared to Google.
But it really has
no real product.
And so his foresight was
very strong about that.
We challenged it a few times.
We should make a few
embedded controllers
to go on development cards.
No.
We should make some
SOCs as demonstrators.
No.
And we did do some
SOCs in the end.
But we never bought
any fab space.
Always done through partnership.
And that clear distinction
was incredibly beneficial.
And he was absolutely
rigid in that decision,
and absolutely right
in that decision.
AUDIENCE: It's slightly
different from say, what
Qualcomm has done, for example.
DAVE JAGGAR: That whole industry
is sort of on its-- on its--
AUDIENCE: Ear.
DAVE JAGGAR: On its ear now.
So now there are fabs that
don't design anything.
So yeah, absolutely.
The TSMCs and the global
foundries of this world,
you can just buy fab space.
So there's this other product
family in it, knitted in
very well with what
ARM-- you know,
you've got a designer
of a chip, somebody
that integrates the rest
of the IP and then fabs it.
And so they're all quite
separate things now.
But yeah.
The TSMC and global
foundries of this world
are almost exactly
the other way around.
AUDIENCE: Right.
DAVE JAGGAR: Yeah.
Again, they don't compete.
AUDIENCE: Thank you.
DAVE JAGGAR: Yeah.
RAYMOND: We're over.
But I'm gonna allow
one more question.
I'm going to claim host privilege.
One more question.
AUDIENCE: It's an
incredible talk.
DAVE JAGGAR: Thanks.
AUDIENCE: Very briefly,
Spectre and Meltdown.
DAVE JAGGAR: Yeah.
AUDIENCE: So how much has
that changed your thinking?
And do you feel like there's
a future for CPUs where they
solve the problem in some way?
Or will there be more
secure CPUs that have
completely rigorous,
predictable performance,
and others that have
variable performance,
but a risk of side channels?
Thank you.
DAVE JAGGAR: The
answer is, in 1996
I wrote a patent that
said if you bring anything
speculatively into the
chip, make sure you take it
all the way back out again.
I guess they lost that patent
down the back of the couch,
right?
AUDIENCE: They did.
DAVE JAGGAR: I always
hassle them about that.
Again, that's thinking
like a software guy.
Bringing that stuff
in speculatively.
You've got to take
it out again, guys.
You can't leave it
in the processor.
How to handle that
in the future.
I think we're all basically
nice engineers that
just don't expect people to
stab us in the back with a--
we're just-- and now we're
all a little less innocent
and probably looking at how
can we break this thing.
But we're always going to be
chasing our tails, you know.
It's impossible to find
every single backdoor
into the processor.
We're always going to be
chasing our tails as far
as trying to spot where some
sneaky little person might--
and so they should, by the way.
You know, if they don't do it,
someone that's really nefarious
will.
But I don't know whether there's
a good solution in the--
my patent, while smart, was
a whole lot easier back then
when the chips were
a whole lot simpler.
But I think now the
side channel attacks.
[INAUDIBLE] better known.
Or we're better
able to handle them.
Yeah.
AUDIENCE: Thank you.
DAVE JAGGAR: Yeah.
RAYMOND: All right.
With that, thank you
all, and thank you, Dave.
[APPLAUSE]
