[MUSIC PLAYING]
DAVID MALAN: Recall that an algorithm
is just step-by-step instructions
for solving some problem.
Not unlike this problem here wherein I
sought Mike Smith among the whole phone
book of names and numbers.
But up until now, we've
really only focused
on those step-by-step
instructions and not
so much on how the data we
are searching is stored.
Of course, in this version of that
problem, it's stored here on paper,
but in the digital world, it's of course
not going to be paper, but 0's and 1's.
But it's one thing to say that the
numbers and maybe even the names
are stored ultimately as 0's and
1's, but where and how exactly?
There's all those transistors
and they're flipping on and off,
but with respect to each other, are
those numbers laid out left to right,
top to bottom, are they
all over the place?
Let's actually take a
look at that question now
and consider how a
computer leverages what
are called data structures to
facilitate implementation of algorithms.
Indeed, how you lay out a
computer's data inside of its memory
has non-trivial impacts on
the performance or efficiency
of your algorithms: whereas the algorithm
itself can be correct, as we've seen,
it is not necessarily efficient.
Both space and the representation
underneath the hood of your data
can also make a significant impact.
But let's simplify the world first.
And rather than focus on, say, a
whole phone book of names and numbers,
let's focus just on numbers, and
much smaller numbers that aren't even
phone numbers, but just integers, and
only save seven of them at a time.
And I've hidden these
seven numbers, if you will,
behind these seven yellow doors.
And so by knocking and
opening, one of these doors
will reveal one number at a time.
And the goal at hand, though, is
to find a very specific number,
just like I sought one specific
phone number before, this time I want
to find the number 50 specifically.
Well, where to begin?
I'll go with the one closest
to me and knock, knock, knock--
15 is the number.
So a little bit low.
Let's proceed from there to see 23.
We seem to be getting closer.
Let's open this door next and-- oh,
we seem to have veered down smaller,
so I'm a little confused.
But I have four doors left to check.
So 50 is not there and 50 is not
there and 50 is, in fact, there.
So not bad.
Within just six steps, have I
found the number in question.
But of course, to be fair,
there were only seven doors.
So if we generalize and say that
there were n doors, where n is just
a number, well, that was roughly n
doors I had to open
just to find the one that I sought.
So could I have done better?
You know, my instincts like yours
were perhaps to start at the left
and move to the right, and we seem
to be on a good path initially.
We went from 15 to 23,
and then darn it if 16
didn't throw a wrench in the
works, because I expected it,
perhaps naively, to be bigger
and bigger as I moved right.
But honestly, had I not told you
anything-- and indeed I did-- then
you wouldn't have known anything
about these numbers other than maybe
the number 50 is actually there.
I told you nothing as to
the magnitude or the size
of any of the other numbers,
let alone the order,
but in the world of the
phone book, of course,
we were able to take for
granted that those names were
sorted by the phone company for us--
from left to right, from A to Z.
But in this case, if your data is just
added to the computer's memory one
at a time in no
particular order, the onus
is on you, the programmer or
algorithm, to find that number
you're interested in nonetheless.
Now what was left here?
And indeed 4 is even smaller than 50.
So these seven doors were by
design randomly assigned a number.
And so you could do no better.
I might have gotten lucky.
I might not have gone
with my initial instincts
and touched the number 15 at left.
I might have, effectively blinded, gone
and touched 50 and just gotten lucky,
and then it would have
been just one step.
But there's only a one in seven chance
I would have been correct so quickly,
so that's not really an
algorithm that I could
reproduce with the same
efficiency again and again.
So how can I do better?
And how does the phone company
enable us to do better?
Well they, of course, put in a
huge amount of effort upfront
to sort all of those names and
associated numbers from left
to right, from A to Z. And so
that's a huge leg up for us,
because then I can assume I can do
divide and conquer or so-called binary
search, dividing that
phone book in two as
implied by "bi" in "binary,"
halving the problem again and again.
But someone's got to do that work
for us, be it the phone company
or perhaps me with these numbers.
So let's take one more
stab at this problem,
this time presuming that
the seven doors in question
do, in fact, have the numbers behind
them sorted from left to right,
small to big.
So where to find the number 50 now?
I have seven doors behind
which are those same numbers,
but this time they are
sorted from left to right.
And no skipping ahead thinking that,
well, I remember all the other numbers,
so I know immediately where 50 is.
Let's assume for the moment
that we don't know anything
about the other numbers other than
the fact that they are sorted.
Well, my inclination is not to start
at the left with this first door,
much like my inclination ultimately
with that phone book was not to start
with the first page, but the middle.
And indeed, I'm going to go here
to the middle of these doors and--
16.
Not quite the one I want.
But if the doors are sorted now, I know
that that number 50 is not to the left,
and so I'm going to go to the right.
Where do I go to the right?
Well, I have three doors left, I'm
going to follow the same algorithm
and open that door in the
middle and-- oh, so close.
I found only, if you
will, the meaning of life.
So 42, though, is not
the number I care about,
but I do know something about
50-- it's bigger than 42.
And so now, it's quite simply
the case that-- aha, 50 is there,
it's going to be in that last number.
So whereas before it took me up to
six steps to find the number 50,
and only then by luck
did I find it where
it was because it was
just randomly placed,
now I spent 1, 2, 3 steps in total,
which is, of course, fewer than six.
And as these numbers
of doors grow in size
and I have hundreds
or thousands of doors,
surely it will be the case just like
the phone book that halving this problem
again and again is going
to get me to my answer
if it's there in logarithmic
instead of linear time, so to speak.
But what's key to the success of
this algorithm-- binary search--
is that the doors are not only sorted,
but they are back-to-back-to-back.
Now I have the luxury of feet
and I can move back and forth
among these numbers, but even my steps
take me some amount of time and energy.
But fortunately, each such step just
takes one unit of energy, if you will,
and I can immediately jump wherever
I would like one step at a time.
But a computer is purely electronic,
and in the context of memory,
doesn't actually need to take any steps.
Electronically a computer can
jump to any location in memory
instantly in so-called constant time.
So just one step for it
might take me several.
And so that's an advantage a computer
has and it's just one of the reasons
why they are so much faster than
us at solving so many problems.
But the key ingredient to laying
out the data for a computer to solve
your problems quickly is that you need
to put your data back-to-back-to-back.
Because a computer at
the end of the day,
yes, stores only 0's and 1's, but
those 0's and 1's are generally
treated in units of, say, eight--
8 bits per byte.
But those bytes, when
storing numbers like this,
need those numbers to be
back-to-back-to-back and not just
jumbled all over the place.
Because it needs to be the
case that the computer is
allowed to do the simplest of
arithmetic to figure out where to look.
Even I in my head am sort of
doing a bit of math figuring out,
well where's the middle?
Even though among few doors you
can pretty much eyeball it quickly.
But a computer's going to have to
do a bit of arithmetic, so what
is that math?
Well if I have 1, 2, 3, 4,
5, 6, 7 doors initially,
and I want to find the middle one,
I'm actually just going to do what?
7 divided by 2, which
gives me 3 and 1/2-- that's
not an integer that's that
useful for counting doors,
so let's just round it down to 3.
So 7 divided by 2 is 3.5
rounded down to 3 suggests
mathematically that the number of the
door that's in the middle of my doors
should be the one known as 3.
Now recall that a
computer generally starts
counting at 0, because all 0
bits represent 0 in decimal,
and so this is door 0, 1, 2, 3, 4, 5, 6.
So there's still seven doors, but the
first is 0 and the last is called 6.
So if I'm looking for
number 3, that's 0, 1, 2, 3.
And indeed, that's why I jumped
to the middle of these doors,
because I went very
specifically to location 3.
Now why did I jump to 42 next?
Of course, that was in the middle
of the three remaining doors,
but how would a computer know
mathematically where to go,
whereas we can just
rather eyeball it here?
Well if you've got 3 doors divided
by 2, that gives me, of course, 1.5--
let's round that down to 1.
So if we now re-number
these doors, it's 0, 1, 2,
because these are the only three doors
that exist, well, door 1 is-- counting 0, 1--
the 42, and that's how a computer
would know to jump right to 42.
Of course, with just one door
left, it's pretty simple.
You needn't even do any of
that math if there's just one,
and so we can immediately
access that in constant time.
In other words, even though my human
feet are taking a bit of energy
to get from one door to another, a
computer has the leg-up, so to speak,
of getting to these doors even
quicker, because all it has to do
is a little bit of division,
maybe some rounding,
and then jump exactly to
that position in memory.
And that is what we call constant
time, but it presupposes, again,
that the data is laid out
back-to-back-to-back so that every one
of these numbers is an equal
distance away from every other.
Because otherwise, if
you were to do this math
and come up with the
numbers 3 or 1, you
have to be able to know where
you're jumping in memory,
because that number 42 can't be down
here somewhere-- it has to be in order,
exactly where you expect.
And so in computer science and in
programming, this kind of arrangement,
where you have doors-- or really
data-- back-to-back-to-back, is known
as an array.
An array is a contiguous block of
memory wherein values are stored
back-to-back-to-back-to-back--
from left to right conceptually,
although of course, direction has
less meaning once you're inside
of a computer.
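To make that concrete, here is a
minimal sketch of binary search in
Python, assuming the values live,
sorted, in a Python list playing the
role of our array of doors (the full
set of seven numbers here is assumed):

    def binary_search(values, target):
        """Return the index of target in the sorted list values, or None."""
        low, high = 0, len(values) - 1
        while low <= high:
            middle = (low + high) // 2  # integer division rounds down: 7 // 2 == 3
            if values[middle] == target:
                return middle
            elif values[middle] < target:
                low = middle + 1   # the target can only be to the right
            else:
                high = middle - 1  # the target can only be to the left
        return None

    doors = [4, 8, 15, 16, 23, 42, 50]  # sorted, as in the second demo
    print(binary_search(doors, 50))     # prints 6, after checking 16, then 42, then 50

Each pass of that loop halves the
distance between low and high, which
is precisely the logarithmic behavior
we saw with the phone book.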
Now it is thanks to these arrays
that we were able to search
even something like a phone book so quickly.
After all, you can imagine
in the physical world,
a phone book isn't all
that unlike an array,
albeit a more arcane version
here, because its pages are indeed
back-to-back-to-back-to-back from
left to right, which is wonderful.
And you'll recall when
we searched a phone book,
we were already able to describe
the efficiency with which
we were able to search it-- via
each of those three algorithms.
One page at a time, two pages
at a time, and then one-half
of the remaining problem at a time.
Well it turns out that there's
a direct connection even
to the simplification
of that same problem.
If I have n doors and I search them
from left to right, that of course
might take me as many as six, seven total
steps or n if the number I'm seeking
is all the way at the end.
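For reference, that left-to-right scan
might look like this minimal sketch in
Python (the exact arrangement of the
numbers here is just illustrative):

    def linear_search(values, target):
        """Check each value left to right; return its index, or None if absent."""
        for i in range(len(values)):
            if values[i] == target:
                return i
        return None

    doors = [15, 23, 16, 8, 42, 50, 4]  # unsorted; exact arrangement illustrative
    print(linear_search(doors, 50))     # prints 5: found only on the sixth step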
I could have gone two doors at
a time, although that really
would have gone off the rails
with the randomly ordered numbers,
because there would have been no logic
to just going left to right twice as
fast; I would have been missing
every other element, never knowing
when to double back.
And in the case of binary
search, my last algorithm where
I started in the middle
and found 16, and then
started in the middle of that
middle and found 42, and then
started in the middle of the
middle and found my last number,
binary search is quite akin to what
we did by tearing that problem in half
and in half.
So how did we describe the efficiency
of that algorithm last time?
Well we proposed that my first algorithm
was linear, this straight line in red
represented here by the label n, because
for every page in the phone book,
in the worst case you might need
one extra step to find someone
like Mike Smith.
And indeed, in the case of these doors,
if there's just one more door added,
you might need one more step to
find that number 50 or any other.
Now I could, once
those doors are sorted,
go through them twice as fast,
looking two doors at a time,
and if I go too far and find, say, 51, I
could double back and fix that mistake.
But what I ultimately did
was divide and conquer.
Starting in the middle, and
then the middle of the middle,
and the middle of the
middle of the middle,
and that's what gives
me this performance.
This so-called logarithmic
time-- log base 2 of n--
which if nothing else means that we
have a different shape fundamentally
to the performance of this algorithm.
It grows so much more slowly in time
even as the problem gets really big.
And even off the screen here,
imagine that even as n gets huge,
that green line would not
seem to be going very high
even as the red and yellow ones do.
So in computer science,
there are actually
formal labels we can apply to this sort
of methodology of analyzing algorithms.
When you talk about upper bounds on
just how much time an algorithm takes,
you might say this--
big O, quite literally.
That an algorithm is in
a big O of some formula.
For instance, among the formulas
it might be are these here--
n squared, or n log n,
or n, or log n, or 1.
Which is to say, you can
represent running times somewhat simply,
mathematically, using n-- or really
any other placeholder-- as
a variable that represents the
size of the problem in question.
So for instance, in the
case of linear search,
when I'm searching that
phone book left to right
or searching these doors left
to right, in the worst case,
it might take me as many as n
steps to find Mike or that 50,
and so we would say that
that linear algorithm is
in big O of n, which is just a fancier
way of saying quite simply that it's
indeed linear in time.
But sometimes I might get lucky,
and indeed in the best case,
I might find Mike or 50 or
anything else much faster,
and computer scientists also have
ways of expressing lower bounds
on the running times of algorithms.
Whereby in the best case,
perhaps, an algorithm
might take at least
this much time.
And we use a capitalized omega to
express that notion of a lower bound,
whereas again, a big O represents
an upper bound on the same.
So we can use these same formulas,
because depending on the algorithm,
it might indeed take n squared steps
or just 1 or a constant number thereof,
but we can consider even linear
search as having a lower bound,
because in the best case,
maybe Mike or maybe 50
or any other input
to the problem just so
happens to be at the very beginning
of that book or those doors.
And so in the best case, a lower bound
on the running time of linear search
might indeed be omega of
1 because you might just
get lucky and take one
step or two or three
or terribly few, but
independent of the number n.
And so there, we might express
this lower bound as well.
Now meanwhile there's
one more Greek symbol
here, theta, capitalized here, which
represents a coincidence of upper
and lower bounds.
Whereby if it happens to be
the case for some algorithm
that you have an upper bound and
a lower bound that are the same,
you can equivalently say not both of
those statements, but quite simply
that the algorithm is in
theta of some formula.
Now suffice it to say,
this green line is good.
Indeed, any time we achieve logarithmic
time instead of, say, linear time, we
have made an improvement.
But what did we presuppose?
Well, we presupposed in both
the case of the phone book
and in the case of those doors that
they were sorted in advance for us.
By me in the case of the
doors and by the phone
company in the case of the book.
But what did it cost
me and what did it cost
them to sort all of
those numbers and names
just to enable us ultimately
to search logarithmically?
Well let's consider that in the context
of, again, some numbers, this time
some numbers that I
myself can move around.
Here we have eight cups, and on these
eight cups are eight numbers from 1
through 8.
And they're indeed sorted from smallest
to largest, though I could equivalently
do this problem from largest
to smallest so long as we all
agree what the goal is.
Well let me go ahead and just
randomly shuffle some of these cups
so that not everything
is in order anymore,
and indeed now they're fairly jumbled,
and indeed not in the order I want,
so some work needs to be done.
Now why might they arrive in this order?
Well in the case of the phone
book, certainly new people
are moving into a town every day, and
so they're not themselves coming
in alphabetical order,
but in seemingly random order,
and it's up to the phone
company to slot them
into the right place in a phone book
for the sake of next year's print.
And the same thing with those doors.
Were I to add more and more
numbers behind those doors,
I'd need to decide where to put
them, and they're not necessarily
going to arrive from my input
source in the order I want.
So here, then, I have some
randomly-ordered data,
how do I go about sorting it quickly?
Well, let's take a look at
the first problem I see.
2 and 1 are out of order, so let me
just go ahead and swap, so to speak,
those two.
I've now improved
the state of my cups
and made some
progress. Still, 2 and 6
seem OK, even though maybe there
should be some cups in between.
So let's look at the next pair now.
We have 6 and 5, which definitely are
out of order, so let's switch those.
6 and 4 are the same, out of order.
6 and 3, just as much.
6 and 8 are not quite
back-to-back, but there's probably
going to be a number in-between, but
they are at least in the right order,
because 6, of course, is less than 8.
And then lastly we have 8 and 7.
Let's swap those here and done--
or are we not?
Well I've made improvements with every
such swap, but some of these cups
still remain out of order.
Now these two are all set.
2 and 5 are as well,
even though ultimately we
might need some numbers between them,
but 4 and 5 are indeed out of order.
3 and 5 just as much.
6 and 5 are OK, 7 and 6 are
OK, and 8 and 7 as well.
So we're almost done there,
but I do see some glitches.
So let's again compare all
of these cups pairwise--
1, 2; 2, 4-- oops, 4,
3, let's swap that.
Let's keep going just to be safe.
4, 5; 5, 6; 6, 7; 7, 8.
And by way of this process, just
comparing cups back-to-back,
we can fix any mistakes we see.
Just for good measure,
let me do this once more.
1, 2; 2, 3; 3, 4; 4,
5; 5, 6; 6, 7; 7, 8.
Now this time that I've gone all
the way from left to right checking
that every cup is in order, I can safely
conclude that these cups are sorted.
After all, if I just went from
left to right and did no work,
why would I presume that if I
do that same algorithm again,
I'd make any changes?
I wouldn't, so I can quit at this point.
So that's all fine and good, but perhaps
we could have sorted these differently.
That felt a little tedious and I
felt like I was doing a lot of work.
What if I just try to select
the cups I want rather than deal
with two cups at a time?
Let's go ahead and randomly shuffle
these again in any old order,
making sure to perturb what
was otherwise left to right.
And here we have now another
random assortment of cups.
But you know what I'm
going to do this time?
I'm just going to select
the smallest I see.
2 is already pretty small, so
I'll start as before on the left.
So let's now check the other cups to
see if there's something smaller that I
might prefer to be in this location.
3, 1-- ooh, 1 is better, I'm going
to make mental note of this one.
5, 8, 7, 6, 4-- all right, so 1
would seem to be the smallest number.
So I'm going to go ahead and
put this where it belongs,
which is right here at the side.
There's really no room
for it, but you know what?
These were randomly placed,
so let me just go ahead
and evict whatever's there,
too, and put 1 in its place.
Now to be fair, I might have
messed things up a little bit,
but no more so than I might have when
I received these numbers randomly.
In fact, I might even get
lucky-- by evicting a cup,
I might end up putting it in the right
place so it all washes out in the end.
Now let's go ahead and select
the next smallest number,
but not bother looking at
that first one anymore.
So 3 is pretty small, so
I'll keep that in mind.
2 is even smaller, so I'll forget
about 3 and now remember 2.
5 is bigger, 8 and 7 and 6 and 4--
all right, 2 now seems to be the
next smallest number I can select.
I know it belongs there, but 3's
already there, so let's evict 3
and there you go, I got lucky.
Now I have 1 and 2 in the right place.
Let's again select the
next smallest number.
I see 3 here, and again, I don't
necessarily know as a computer
if I'm only looking at
one number at a time
if there are, in fact,
anything smaller to its side.
So let's check-- 5, 8, 7, 6, 4-- nope.
So 3 I shall select, and I got
lucky, I'll leave it alone.
How about the next smallest number?
5 is pretty small, but-- 8,
7, 6-- 4 is even smaller.
Let's select this one, put it
in its place, evicting the 5
and putting it where there's room.
8 is not that small,
but it's all I know now.
But ooh-- 7 is smaller,
I'll remember this.
6 is even smaller, I'll
remember that, and it feels
like I'm creating some work for myself.
5 is the next smallest, 8's in the way.
We'll evict 8 and put 5 right there.
7 is pretty small, but 6 is even
smaller, and still smaller than 8,
so let's pick up 6, evict
7, and put 7 in its place.
Now for good measure, we're
obviously done, but I as the computer
don't know that yet if I'm just looking
at one of these cups or, if you will,
doors at a time.
7's pretty small, 8 is no
smaller, so 7 I've selected
to stay right there in its place.
8 as well, by that same logic,
is now in its right place.
So it turns out that
these two algorithms
that I concocted along the way
actually do have some formal semantics.
In fact, in computer
science, we'd call the first
of those algorithms that
thing here, bubble sort.
Because in fact, as you compare two cups
side-by-side and swap them on occasion
in order to fix transpositions,
well, your largest numbers
would seem to be bubbling
their way up to the top,
or equivalently, the smallest ones
down to the end, and so bubble sort
is the formal name for that algorithm.
How might we express this more
succinctly than my voice over there?
Well let me propose this pseudocode.
There's no one way to describe
this or any algorithm,
but this was as few English words
as I could come up with and still
be pretty precise.
So repeat until no swaps the
following-- for i from 0 to n minus 2,
if the i-th and (i+1)-th elements
are out of order, swap them.
Now why this lingo?
Well computational thinking is
all about expressing yourself
very methodically, very
clearly, and ultimately
defining, say, some variables or terms
that you'll need in your arguments.
And so here what I've done
is adopt a convention.
I'm using i to represent an integer--
some sort of counter--
to represent the index of each
of my cups or doors or pages.
And here, we are adopting
the convention, too,
of starting to count from 0.
And so if I want to start
looking at the first cup, a.k.a.
0, I want to keep looking up,
up to the cup called n minus 2,
because if my first cup is cup 0,
and this is then 1, 2, 3, 4, 5, 6, 7,
indeed the cup is labeled
8, but it's in position 7.
And so this position more generally, if
there are n cups, would be n minus 1.
So bubble sort is telling me to start
at 0 and then look up to n minus 2,
because in the next line
of code, I'm supposed
to compare the i-th element and
the (i+1)-th, so to speak.
So I don't want to look
all the way to the end,
I want to look one shy to the end,
because I know in looking at pairs,
I'm looking at this one as well
as the one to its right, a.k.a.
i plus 1.
So the algorithm
ultimately is just saying,
as you repeat that process again and
again until there are no swaps, just
as I proposed, you're swapping any two
cups that with respect to each other
are out of order.
And so this, too, is an example more
generally of solving small, local problems
and achieving ultimately a
global result, if you will.
Because with each swap of those cups,
I'm improving the quality of my data.
And each swap in and of itself doesn't
necessarily solve the big picture,
but together when we aggregate all
of those smaller solutions have we
assembled the final result.
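Indeed, that pseudocode translates
almost line for line into Python; here
is a minimal sketch, assuming the cups
live in a Python list:

    def bubble_sort(values):
        """Repeatedly swap adjacent, out-of-order pairs until a pass makes no swaps."""
        n = len(values)
        swapped = True
        while swapped:              # repeat until no swaps
            swapped = False
            for i in range(n - 1):  # for i from 0 to n - 2
                if values[i] > values[i + 1]:  # i-th and (i+1)-th out of order?
                    values[i], values[i + 1] = values[i + 1], values[i]  # swap them
                    swapped = True

    cups = [2, 1, 6, 5, 4, 3, 8, 7]  # jumbled, as in the demo
    bubble_sort(cups)
    print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]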
Now what about that
second algorithm, wherein
I started again with some random
cups, and then that time I
selected one at a time the number
I actually wanted in place?
I first sought out the smallest.
I found that to be 1 and I put
it all the way there on the left.
And I then sought out
the next smallest number,
which after checking the remaining
cups, I determined was 2.
And so I put 2 second in place.
And then I repeated that
process again and again,
not necessarily knowing in advance
from anyone what numbers I'd find.
Because I checked each
and every remaining cup,
I was able to conclude safely that I had
indeed found the next smallest element.
And so that algorithm, too, has a name--
selection sort.
And I might describe
it in pseudocode similar
in structure but with
different logic ultimately.
Let me propose that we do
for i from 0 to n minus 1,
where again, n is the number of cups,
and 0 is by convention my first cup,
and n minus 1, therefore, is my last.
And what I then want to do is find
the smallest element between the i-th
element and the last one,
at n minus 1.
That is, find the smallest element
between wherever you've begun
and that last element, n minus 1.
And then if-- when you've
found that smallest element,
you swap it with the i-th element.
And that's why I was picking
up one cup and another
and swapping them in place-- evicting
one and putting one where it belongs.
And you do this again
and again and again,
because each time you're incrementing i by 1.
So whereas the first iteration of this
loop will start here all the way left,
the second iteration will start
here, and the third iteration
will start here.
And so the amount
of problem to be solved
is steadily decreasing until
I have 1 and then 0 cups left.
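And here is that selection sort
pseudocode as a minimal sketch in
Python, again assuming a Python list
of cups:

    def selection_sort(values):
        """On each pass, select the smallest remaining value and swap it into place."""
        n = len(values)
        for i in range(n):             # for i from 0 to n - 1
            smallest = i
            for j in range(i + 1, n):  # find the smallest between i and n - 1
                if values[j] < values[smallest]:
                    smallest = j
            values[i], values[smallest] = values[smallest], values[i]  # swap (evict)

    cups = [2, 3, 1, 5, 8, 7, 6, 4]  # jumbled, as in the second demo
    selection_sort(cups)
    print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]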
Now it certainly took some
work to sort those n cups,
but how much work did it take?
Well in the case of bubble sort,
what was I doing on each pass
through these cups?
Well I was comparing and
then potentially swapping
each adjacent pair of cups, and then
repeating myself again and again.
Well if we have here
n cups, how many pairs
can you create, which you
then consider swapping?
Well if I have 8 cups, I would
seem to be able to make 1, 2, 3, 4, 5, 6,
7 pairs out of those 8,
so more generally n minus 1.
So on each pass here, it would seem
that I'm comparing n minus 1 cups.
Now how many passes do I
need to ultimately make?
It would seem to be roughly
n, because in the worst case,
these cups might be
completely out of order.
Which is to say, I might indeed
do n things n minus 1 times,
and if you multiply that out, I'm
going to get something on the order of n squared.
But what about selection sort, wherein I
instead looked through all of the cups,
selecting first the
smallest, and then repeating
that process for the
next smallest still?
Well in that case, I
started with n cups,
and I might need to
look at all n, and then
once I found that, I might
instead look at n minus 1.
So there, too, I seem to be summing
something like n plus n minus 1
plus n minus 2 and so
forth, so let's see
if we can't now summarize this as well.
Well let me propose more mathematically,
that, say, with selection sort,
what we've done is this.
In looking for that smallest cup, I
had to make n minus 1 comparisons.
Because as I identified the
smallest cup I'd yet seen,
I compared it to no more
than n minus 1 others.
Now if the first selection of a cup took
me n minus 1 steps but then it's done,
the next selection of the
next smallest cup would
have taken me only n minus 2 steps.
And if you continue that
logic with each pass,
you have to do a little bit less
work until you're left with just one
very last cup at the end, such as 8.
So what does this actually sum to?
Well you might not remember
or see it at first glance,
but it turns out, particularly if
you look at one of those charts
at the back of a textbook, that
this summation or series actually
aggregates to n times (n minus 1),
all divided by 2.
Now this you can perhaps multiply
out a bit more readily as
in n squared minus n all divided by 2.
And if we factor that out,
we can now get n squared
divided by 2 minus n divided by 2.
Now which of these terms, n squared
divided by 2 or n divided by 2,
tends to dominate the other?
That is to say, as n
gets larger and larger,
which of these mathematical
expressions has the biggest effect
on the number of steps?
Well surely it's n squared, albeit
divided by 2, because as n gets large,
n squared is certainly larger than n.
And so what a computer scientist
here would typically do
is just ignore those
lower-ordered terms, so to speak.
And they would say, with a figurative
or literal wave of the hand,
that this algorithm is on
the order of n squared.
That isn't to say it's
precisely that many steps,
but rather as n gets really
large, it is pretty much
that n squared term that
really matters the most.
Now this is not a form of proof, but
rather a proof by example, if you will,
but let's see if I can't convince
you with a single example numerically
of the impact of that square.
Well if we start again with n squared
over 2 minus n over 2 and say n
is maybe 1 million initially-- so not
eight cups, not 1,000 pages in a book,
but 1 million numbers or
any other element itself.
What does this actually sum to?
Well 1 million squared divided
by 2 minus 1 million divided by 2
happens to be 500 billion minus 500,000,
which of course is 499,999,500,000.
Now I daresay that is pretty
darn close to big O of n squared.
Why?
Well if we started with,
say, 1 trillion, then
halved it and ended up with roughly 500
billion, that's still pretty close.
Now in real terms, that does not
equal the same number of steps,
but it gives us a general sense it's
on the order of this many steps,
because if we plugged in
larger and larger values for n,
that difference would
not even be as extreme.
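If you'd like to check that identity
and that arithmetic for yourself, a few
lines of Python bear both out:

    n = 1_000_000
    print(sum(range(1, n)))     # (n - 1) + (n - 2) + ... + 1, the long way
    print(n * (n - 1) // 2)     # the closed form: n times (n - 1), divided by 2
    print(n * n // 2 - n // 2)  # n^2 / 2 - n / 2: all three print 499999500000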
Well why don't we take a look now at
these algorithms in a different form
altogether without the physical
limitation of me as the computer?
Pictured here is, if you will, an array
of numbers, but pictured graphically.
Wherein we have vertical
bars, and the taller
the bar, the bigger the
number it represents.
So big bar is big number,
small bar is small number,
but they're clearly,
therefore, unsorted.
Via these algorithms we've
seen, bubble sort and selection sort,
what does it actually look
like to sort this many elements?
Let's take a look.
In this tool, I proceed
to choose my first algorithm,
which shall be, say, bubble sort.
And you'll see rather slowly that
this algorithm is indeed comparing
pairwise elements, and if--
and only if they're out of order,
swapping them again and again.
Now to be fair, this
quickly gets tedious,
so let me increase the
animation speed here.
And now you can rather see that
bubbling up of the largest.
Previously it was my 8 and my 7 and 6.
Here we have 99, 98, 97, but
indeed, those tallest bars
are making their way up.
So let's turn our attention next to
this other algorithm, selection sort,
to see if it looks or perhaps
feels rather different.
Here now we have
selection sort each time
going through the entire list looking
for the smallest possible element.
Highlighted in red for
just a moment here is
9, because we have not
yet-- oh, now we've found
a smaller element, now 2, and now 1.
And we'll continue looking through
the rest of the numbers just
to be sure we don't find
something smaller, and once we do,
1 goes into place.
And then we repeat that process,
but we do fewer steps now,
because whereas there are n total bars,
we don't need to look at the leftmost
now because it's sorted, we
only need look at n minus 1.
So this process again will repeat.
We found 2.
We're just double-checking that
there's not something smaller,
and now 2 is in its place.
Now we humans, of course,
have the advantage
of having an aerial view, if
you will, of all this data.
And certainly a computer
could remember more than just
the smallest number it's recently seen.
Why not for efficiency remember
the two smallest numbers?
The three smallest numbers?
The four smallest numbers?
That's fine, but that argument
is quickly devolving into--
just remember all the original numbers.
And so yes, you could
perhaps save some time,
but it sounds like you're
asking for more and more space
with which to remember the
answers to those questions.
Now this, too, would seem
to be taking us all day.
Even if we down here
increase the animation speed,
it now is selecting those
elements a bit faster and faster,
but there's still so
much work to be done.
Indeed, these comparison-based sorts
that are comparing things again
and again, and then redoing that work in
some form to improve things, still
just tend to end up on the order of--
bingo-- n squared.
Which is to say that n
squared or something quadratic
tends to be rather slow.
And this is quite in contrast
to our logarithmic time before,
but that logarithm thus far
was for searching, not sorting.
So let's compare these
two now side by side,
albeit with a different tool that
presents the same information
graphically sideways.
Here again we have bars, and
small bar is small number,
and big bar is big number, but here,
they've simply been rotated 90 degrees.
On the left here we have selection
sort, on the right here bubble sort,
both of whose bars are
randomly ordered so that neither
has an edge necessarily over the other.
Let's go ahead and play all
and see what happens here.
And you'll see that
indeed, bubble's bubbling up
and selection is improving
its selections as we go.
Bubble would seem to have won because
selection's got a bit more work,
but there, too, it's
pretty close to a tie.
So can we do better?
Well it turns out we can, so long as
we use a bit more of that intuition
we had when we started
thinking computationally
and we divided and conquered,
again and again.
In other words, why not, given
n doors or n cups or n pages,
why don't we divide and conquer
that problem again and again?
In other words, in the
context of the cups,
why don't I simply sort for you the
left half and then the right half,
and then with two sorted halves, just
interweave them together for you.
That would seem to be a little
different from walking back and forth
and back and forth and swapping
elements again and again.
Just do a little bit of
work here, a little bit
more now, and then
reassemble your total work.
Now of course, if I simply
say, I'll sort this left half,
what does it mean to
sort this left half?
Well, I dare say this
left half can be divided
into a left half of the left half,
thereby making the problem smaller.
So somehow or other, we could leverage
that intuition of binary search,
but apply it to sort.
It's not going to be in the end
quite as fast as binary search,
because with sort, you have to
deal with all of the elements,
you can't simply tear
half of the problem
away because you'd be leaving
half of your elements unsorted.
But it turns out there
are many algorithms that
are faster than selection and
bubble sort, and one of those
is called merge sort.
And merge sort leverages
precisely this intuition of dividing
a problem in half and in half, and to
be fair, touching all of those halves
ultimately, but doing it in a way
that's more efficient and less
comparison-based than bubble sort
and selection sort themselves.
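Here is a minimal recursive sketch of
merge sort in Python; notice how it
halves the problem again and again, and
pays for its speed with the extra space
of that merged list:

    def merge_sort(values):
        """Sort a list by recursively sorting each half and merging the results."""
        if len(values) <= 1:        # a single element (or none) is already sorted
            return values
        middle = len(values) // 2
        left = merge_sort(values[:middle])   # sort the left half...
        right = merge_sort(values[middle:])  # ...and then the right half
        merged = []                 # the extra space that merge sort pays for
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:          # interleave the two sorted halves
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])     # append whatever remains of either half
        merged.extend(right[j:])
        return merged

    print(merge_sort([2, 1, 6, 5, 4, 3, 8, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]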
So let me go ahead and play all
now with these three sets of bars
and see just which one wins now.
And after just a moment,
there's nothing more
to say-- merge sort has already won, if
you will, even though-- now bubble has
finished, and now selection has too.
And perhaps this was
a fluke-- to be fair,
these numbers are random,
maybe merge sort got lucky.
Let's go ahead and play the test
once more with other numbers.
And indeed it again is done.
Let me play it a third and
final time, but notice the pattern
now that emerges with merge sort.
You can see if you look closely
the actual halving again and again.
And indeed, it seems that
half of the list gets sorted,
and then you reassemble
it at the very end.
And indeed, let's zoom
in on this algorithm
now and look specifically
at merge sort alone.
Here we have merge sort,
and highlighted in colors
as we do work is exactly the elements
you're sorting again and again.
The reason so few of
these bars are being
looked at at a time is because, again,
logically-- or recursively, if you will--
we are sorting first the left half;
but no, the left half of the left half;
but no, the left half of the
left half of the left half, and so
forth. And what this
really boils down to
ultimately is sorting,
eventually, individual elements.
But if I hand you one element
and I say, please sort this,
it has no halves, so your work is
done-- you don't need do a thing.
But then if you have two
halves, each of size 1,
there might indeed be
work to be done there,
because if one is smaller than the
other or one is larger than the other,
you do need to interleave
those for me to merge them.
And that's exactly what
merge sort's doing here.
Allow me to increase the animation
speed and you'll see as we go,
that half of the list is
getting sorted at a time.
It's not perfect and it's
not perfectly smooth,
because half
of the other elements are still there,
but now we're re-merging the two halves.
And that was fast--
it finished faster
indeed than bubble and selection
sort would have,
but there was a price being paid.
If you think back to our vertical
visualization of bubble sort
and selection sort, they were
doing all of their work in place.
Merge sort seemed to be getting a
little greedy on us, if you will,
in that it was temporarily putting
some of those bars down here,
effectively using twice as much space
as those first two algorithms, selection
and bubble.
And indeed, that's where merge
sort gets its edge fundamentally.
It's not just a better algorithm,
per se, and better thought-out,
but it actually additionally consumes
more resources-- not time, but space.
By using twice as much space-- not
just the top half of the screen,
but the bottom--
can merge sort temporarily put
some of its work over here,
continue doing some other work,
and then reassemble them together.
Neither selection sort nor bubble
sort had that advantage.
They had to do everything
in place, which
is why we had to swap so
many things so many times.
We had far fewer spots in
which to work on that table.
But with merge sort,
spend a bit more space,
and you can reduce that amount of time.
Now all of these algorithms assume
that our data is back-to-back-to-back--
that is, stored in an array.
And that's great, because that's
exactly how a computer is so inclined
to store data inherently.
For instance, pictured here
is a stick of memory of RAM--
Random Access Memory.
And indeed-- albeit a bit of a misnomer--
that R in RAM, random, actually
means that a computer can jump
in an instant, or constant time,
to a specific byte.
And that's so important
when we want to jump
around our data, our cups, or our pages
in order to get at data instantly,
if you will.
And the reason it is so conducive to
laying out information back-to-back,
contiguously, in memory becomes clear
if we consider one of these black chips
on this DIMM-- or Dual In-line Memory
Module-- with, if you will,
an artist's rendition at hand.
That artist's rendition
might propose that if you
have some number of bytes in this
chip, say 1 billion for 1 gigabyte,
it certainly stands to
reason that we humans could
number those bytes from 0 on up--
from 0 to 1 billion, roughly speaking.
And so the top left one here might
be 0, the next one might be 1,
the next one thereafter
should be 2, and so we can
number each and every one of our bytes.
And so when you store a number on
a cup or a number behind a door,
that amounts to just writing those
numbers inside of each of these boxes.
And each is next to the other,
and so with simple arithmetic,
a bit of division and
rounding, might you
be able to jump instantly to
any one of these addresses?
There are no moving parts here to do
any work like my human feet might
have to do in our real world.
Rather the computer can jump
instantly to that so-called address
or index of the array.
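That arithmetic is simple indeed; here
is a hypothetical sketch in Python,
assuming each value occupies 4 bytes
starting at some base address:

    BASE = 0  # hypothetical address of the array's first byte
    SIZE = 4  # hypothetical size of each value, e.g., a 32-bit integer

    def address_of(index):
        """Where does the index-th value live? Pure arithmetic, no walking."""
        return BASE + index * SIZE

    print(address_of(0))       # 0: the first value
    print(address_of(7 // 2))  # 12: the middle of seven values, in one jump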
Now what can we do when
we have a canvas that
allows us to layout memory in this way?
We can represent any number of types.
Indeed in Python, there are
all sorts of types of data.
For instance, bool for a Boolean
value and float for a floating point
value, a real number with a decimal.
An int for an integer
and str for a string.
Each of those is laid out in memory
in some particular way that's
conducive to accessing it efficiently.
But that's precisely why,
too, we've run into issues
when using something like a float,
because if you decide a priori to use
only so many bytes, the bytes to
the left and to the right,
above and below, might end up getting
used by other parts of your program.
And so if you've only asked for,
say, 32 or 64 bits or 4 or 8 bytes,
because you're then going to
be surrounded by other data,
that floating point value or some other
can only be ultimately so precise.
Because ultimately yes,
we're operating in bits,
but those bits are physically
laid out in some order.
So with that said, what are the options
via which we can paint on this canvas?
Surely it would be
nice if we could store
data not necessarily always
back-to-back in this way,
but we can create more
sophisticated data structures
so as to support not only these
types here, but also ones like these.
Dict in Python for dictionary,
otherwise known as a hash table.
And list for a sort of array that
can grow and shrink, and range
for a range of values.
Set for a collection
of values that contain
no duplicates, and tuples, something
like x, y or latitude, longitude.
These concepts-- surely
it would be nice to have
accessible to us in higher
level contexts like Python,
but if at the end of the day all we
have is bytes of memory back-to-back,
we need some layers of
abstraction on top of that memory
so as to implement these more
sophisticated structures.
So we'll take a look at
a few in particular-- int
and str and dict and
list-- because all of those
somehow need to be built on top of
these lower-level principles of memory.
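In Python itself, each of those types
is just a line of code away, even
though underneath the hood each must
ultimately be built atop this same
memory (the values here are just
illustrative):

    b = True                            # bool
    f = 3.14                            # float
    n = 42                              # int
    s = "Mike Smith"                    # str
    d = {"Mike Smith": "617-555-0100"}  # dict, a hash table of keys and values
    l = [4, 8, 15, 16, 23, 42]          # list, an array that can grow and shrink
    r = range(7)                        # range of values, here 0 through 6
    st = {15, 16, 23}                   # set, which tolerates no duplicates
    t = (42.36, -71.06)                 # tuple, e.g., latitude, longitude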
So how might this work and
what problems might we solve?
Let's now use the board as
my canvas, drawing on it
that same grid of rows
and columns in order
to divide this screen
into that many bytes.
And I'll go ahead and divide this board
into these squares, each one of which
represents an individual byte, and
each of those bytes, of course,
has some number associated with it.
That number is not the number inside
of that box, per se, not the bits
that compose it, but rather
just metadata-- an index
or address that exists implicitly,
but is not actually stored.
This then might be index
0 or address 0, this
might be 1, this 2, this
3, this one 4, this one 5.
And if we, for artist's
sake, move to the next row,
we might call this 6 and
this 7, and so forth.
Now suppose we want to store some
actual values in this memory,
well let's go ahead and do just that.
We might store the actual
number 4 here, followed by 8,
followed by 15 and 16, perhaps
followed by 23, and then 42.
And so we have some random
numbers inside of this memory,
and because those
numbers are back-to-back,
we can call this an array of size 6.
Its first index is 0,
its last index is 5,
and between there are six total values.
Now what can we do if we're ready to
add a seventh number to this list?
Well, we could certainly
put it right here
because this is the next
appropriate location,
but it depends whether that
spot is still available.
Because the way a
computer typically works
is that when you're
writing a program, you
need to decide in advance
how much memory you want.
And you tell the computer by
way of the operating system,
be it Windows or macOS,
Linux, or something else,
how many bytes of memory you would like
to allocate to your particular problem.
And if I only had the
foresight to say, I
would like 6 bytes in
which to store 6 numbers,
the operating system might have
handed me that back and said,
fine, here you go, but the
operating system thereafter
might have proceeded to allocate
subsequent adjacent bytes, like 6
and 7, to some other
aspect of your program.
Which is to say, you might have
painted yourself into a bit of a corner
by only in code asking the operating
system for just those initial 6 bytes.
You instead might have
wanted to ask for more bytes
so as to allow yourself
this room to grow,
but if you didn't do that in
code, you might just be unlucky.
But that's the price
you pay for an array.
You have this wonderfully
efficient ability
to search it randomly,
if you will, which
is to say instantly via arithmetic.
You can jump to the beginning
or the end or even the middle,
as we've seen, by just doing perhaps
some addition, subtraction, division,
and rounding, and that gets
you ultimately right where
you want to go in some constant
and very few number of steps.
But unfortunately, because
you wanted all of that memory
back-to-back-to-back, it's up to you
to decide how much of it you want.
And if the operating system, I'm
sorry, has already allocated 6, 7,
and elsewhere on the board to
other parts of the program,
you might be faced with the
decision to just say, no,
I cannot accept any more data, or
you might say, OK, operating system,
what if I don't mind where I am in
memory-- and you probably don't--
but I would like you to find
me more bytes somewhere else?
Rather like going from a one-bedroom
to a two-bedroom apartment
so that you have more room, you might
physically have to pack your bags
and go somewhere else.
Unfortunately, just like in the
real world, that's not without cost.
You need to pack those bags and
physically move, which takes time,
and so will it take you and
the operating system some time
to relocate every one of your values.
So sure, there might be plenty of
space down here below on multiple rows
and even not pictured, but it's going
to take a non-zero amount of time
to relocate that 4 and 8 and 15 and
that 16 and 23 and 42 to new locations.
That might be your only option
if you want to support more data,
and indeed, most programs would want to--
it would be an unfortunate situation
if you had to tell your user or
boss, I'm sorry, I ran out of space,
and that's certainly foolish
if you actually do have more space--
it's just not right there next to you.
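In code, that relocation is literally
an element-by-element copy; here is a
minimal sketch in Python, simulating
fixed-size blocks of memory with lists
(Python's own list type does this
bookkeeping for you under the hood):

    def grow(block, extra):
        """Allocate a bigger block elsewhere and copy every value over."""
        new = [None] * (len(block) + extra)  # ask for a new, larger block
        for i in range(len(block)):          # pack your bags: one copy per value
            new[i] = block[i]
        return new

    old = [4, 8, 15, 16, 23, 42]  # a completely full block of 6 slots
    bigger = grow(old, 1)         # n steps of copying, just to gain one slot
    bigger[6] = 50                # now there's room for a seventh value
    print(bigger)                 # [4, 8, 15, 16, 23, 42, 50]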
So with an array, you have
the ability physically
to perform very sophisticated, very
efficient algorithms such as we've
seen-- binary search and
bubble sort and selection sort
and merge sort, and do
so in quite fast time.
Even though selection sort and
bubble sort were big O of n squared,
merge sort was actually n
times log n, which is
slower than log n
alone, but faster than n squared.
But they all presuppose that you
have random access to elements
arithmetically via their indexes
or addresses, and that you can have
with your computer's
memory using arrays,
but you need to commit to some size up front.
All right, fine.
Let's not ask the operating system for
6 bytes initially, let's say, give me 7
because I'm going to
leave one of them blank.
Now of course, that might buy
you some runway, so to speak,
so that you can accommodate
a seventh element if and when it arrives,
but what about an eighth?
Well, you could ask the operating
system from the get-go, don't get me
6 bytes of space, but give me
8 or give me 16 or give me 100.
But at that point, you're
starting to get a little greedy,
and you're starting to ask for
more memory than you might actually
need anytime soon, and
that, too, is unfortunate,
because now you're being wasteful.
Your computer, of course, only
has a finite amount of space,
and if you're asking for more
of it than you actually need,
that memory, by definition,
is unavailable to other parts
of your program and perhaps even others.
And so your computer
ultimately might not
be able to get as much work done because
it's been holding off to the side
just some empty space.
Empty parking spaces you've
reserved for yourself or empty seats
at a table that might potentially
go unused, it's just wasteful.
And hardware costs money.
And hardware enables
you to solve problems.
And with less hardware available,
you can solve fewer problems at hand,
and so that, too, doesn't
feel like a perfect solution.
So again, this series of
trade-offs, it depends
on what's most important to you--
time or space or money or development
or any number of other scarce resources.
So what can we do instead
of an array?
How do we go about getting the dynamism
that we so clearly want here?
Wouldn't it be
nice if we could grow these great data
structures, and better
yet, even shrink them?
If I no longer need
some of these numbers,
I'm going to give you back that
memory so that I can use it elsewhere
for more compelling purposes.
Well it turns out that
in computer science,
programmers can create even
fancier data structures
but at a higher level of abstraction.
It turns out, we could start
making lists out of our values.
In fact, suppose I wanted to add some
number to the screen, but, for instance,
these two spots were
blocked off by something else.
But you know what?
I do know there's some room
elsewhere on the screen,
it just happens to be available here.
And so if I want to put the
number 50 in my list of values,
I might just have to say, I
don't care where you put it,
go ahead and put it right there.
Well where is there?
Well if we continue this indexing--
this is 6 and 7 and 8 and 9, 10, 11, 12,
13, 14, and 15-- then 50 happens to
end up by chance at location 15,
because it's the first byte available;
not only these two, but maybe
even all of these were taken
for some other reason
ever since you asked for
your first six. That's OK,
so long as you can somehow link
your original data to the new.
And pictorially here, I might be
inclined just to say, you know what?
Let me just leave a little breadcrumb,
so to speak, and say that after the 42,
I should actually go down
here and follow this arrow.
Sort of Chutes and Ladders
style, if you will.
Now that's fine and you can do that--
after all, at the end of the day,
computers will do what
you want, and if you
can write the code to
implement this idea,
it will, in fact, remember that value.
But how do we achieve this?
Here, too, you have to come back
to the fundamental definition
of what your computer is doing and how.
It's just got that chip of memory,
and those bytes back-to-back,
such as those pictured here.
So this is all you get-- there is no
arrow feature inside of a computer.
You have to implement
that notion yourself.
So how can you go about doing that?
Well, you can implement
this concept of an arrow,
but you need to implement it
ultimately at a lower level or trust
that someone else will for you.
Well, as best I can tell, I do know
that my first several elements happened
to be back-to-back from 4 on up
to 42 in locations 0 through 5.
Because those are contiguous,
I get my random access
and I can immediately jump from
beginning to middle to end.
This 50 and anything after it needs
to be handled a little better.
If I want to implement this
arrow, the only possible way
seems to be to somehow remember
that the next element after 42
is at location 15.
And that location, a.k.a.
address or index, just has
to be something I remember.
Unfortunately I don't have quite
enough room left to remember that.
What I really want to do is not
store this arrow, but by the way,
parenthetically go ahead
and store the number 15--
not as the index of that
cell, but as the next address
that should be followed.
The catch, though, is that I've
not left myself enough room.
I've made mental note
in parentheses here
that we've got to solve
this a bit better.
So let's start over for the
moment, and no longer worry
about this very low level, because
it's too messy at some point.
It's like talking in 0's and 1's--
I don't want to talk
in bytes in this way.
So let's take things up
in abstraction level,
if you will, and just agree to agree
that you can store values in memory,
and those values can be
data, like numbers you want--
4, 8, 15, 16, 23, 42, and now 50.
And you can also store somehow
the addresses or indexes--
locations of those values.
It's just up to you
how to use this canvas.
So let's do that and clear
the screen and now start
to build a higher-level concept.
Not an array, but something
we'll call a linked list.
Now what is a linked list?
A linked list is a data structure that's
a higher-level concept in abstraction
on top of what ultimately is
just chunks of memory or bytes.
But this linked list shall enable
me to store more and more values
and even remove them simply
by linking them together.
So here, let me go ahead and represent
those same values starting with 4,
followed by 8 and 15, and then
16 and 23, and finally, 42.
And now eventually I'm going to want
to store 50, but I've run out of room
but that's fine, I'm going to go ahead
and write 50 wherever there's space.
But now let's not worry about that
grid, rows, and columns of memory.
Let's just stipulate that
yes, that's actually there,
but it's not useful to
operate at that level.
Much like it's not useful to continually
talk in terms of 0's and 1's.
So let me go ahead and wrap these
values with a higher-level idea
called a node or just a box.
And this box is going to store
for us each of these values.
Here I have 4, here I have
8 and 15, here I have 16,
I have 23, and finally, 42.
And then when it comes
time to add 50 to the mix,
it, too, will come in this box.
Now what is this box?
It's just an artist's rendition
of the underlying bytes,
but now I have the ability to draw
a prettier picture, if you will,
that somehow interlinks
these boxes together.
Indeed, what I ultimately
want to remember
is that 4 comes first and 42 comes
last-- but then wait, if I add 50,
it shall now come last.
So we could do this as an artist quite
simply with those arrows pointing
each box to the next, implying
that the next element in the list,
whether it's next door or far away,
happens to be at the end of that arrow.
But what are those arrows?
Those are not something that
you can represent in a computer
if at the end of the day all you have
are blocks of memory and in them bytes.
If all you have are bytes--
and, therefore, patterns
of 0's and 1's-- whatever
you store in the computer
must be representable with those 0's
and 1's. And among the easiest things
to represent, we know already, are
numbers, like indexes or addresses
of these nodes.
So for instance, depending on
where these nodes are in memory,
we can simply check that
address and store it as well.
So for instance, if the 4 still
happens to be at address 0,
and this time 8 is at address 4,
and this one at 8, and this one at 12,
and this one at 16, and this one at 20--
just by chance back-to-back-to-back,
4 bytes or
32 bits apart-- well, 50 might
be some distance away.
Maybe it's actually at
location 100, that's OK.
We can still do this.
Because if we use part of
this node, part of each box
to implement those actual arrows, we
can actually store all the information
we need to know how to get
from one box to another.
For instance, to get from
4 to the next element,
you're going to want to coincidentally
go to not number 4, but address 4.
And if you want to go from
value 8 to the next value, 15,
you're going to want to go to address 8.
And if you want to go from
15 to 16, the next address
is going to be 12, followed
by 16, followed by 20.
And herein lies the magic--
if you want to get from 42
to that newest element that's
just elsewhere at address 100, that's
what gets associated with 42's node.
As for 50, it's the dead end.
There's nothing more
there, so we might simply
draw a line through that
box saying, eh, just
store in it all 0 bits or some
other convention equivalently.
So there's so many
numbers now on the screen,
but to be fair, that's all that's
going on inside of a computer--
just storing of these bytes.
But now we can stipulate
that, OK, I can somehow
store the location of each node in
memory using its index or address.
It's just frankly not all that
pleasant to stare at these values;
I'd much rather look at and
draw the arrows graphically,
thereby representing the same idea
of these pointers, if you will,
a term of art in some languages
that allows me to remember
which element goes to which.
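To make that idea concrete, here's a minimal sketch in Python-- my own illustration, not anything drawn on screen-- where each object reference plays the role of one of those stored addresses:

```python
# A minimal sketch of a linked-list node. In Python, object references
# play the role of the numeric addresses drawn on the screen.
class Node:
    def __init__(self, value):
        self.value = value  # the data we actually care about
        self.next = None    # the "arrow" to the next node; None is the dead end

# Stitch together 4 -> 8 -> 15 -> 16 -> 23 -> 42, remembering only the head.
head = None
tail = None
for value in [4, 8, 15, 16, 23, 42]:
    node = Node(value)
    if head is None:
        head = node       # the very first node
    else:
        tail.next = node  # point the previous dead end at the new node
    tail = node
```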
And what is the upside of
all this newfound complexity?
Well now we have the ability to
string together all of these nodes.
And frankly, if we wanted to
remove one of these elements
from the list, that's fine,
we can rather snip it out.
And we can simply update what
the arrow is pointing to,
and equivalently, we can update
the next address in that node.
And we can certainly add to this list
by drawing more nodes here or perhaps
over here and just link them with arrows
conceptually, or more specifically,
by changing that dead end to
the address of the next element.
And so we can create the idea of
the abstraction of a list using
just this canvas of memory.
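Continuing the sketch above, adding and removing then amount to nothing more than updating those next references:

```python
# Append 50: the old dead end (42's next) now points at the new node.
node = Node(50)
tail.next = node
tail = node

# Remove, say, 15: walk to it, then point the previous node past it.
prev, curr = None, head
while curr is not None and curr.value != 15:
    prev, curr = curr, curr.next
if curr is not None:        # found the node to snip out
    if prev is None:
        head = curr.next    # removing the very first node
    else:
        prev.next = curr.next
```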
But not all is good here.
We've surely paid a price, right?
Surely we couldn't get dynamism
for addition and removal
and updating of a list
without paying some price.
What we've gained is dynamic growth,
this ability
to store as many more elements
as we want without having to tell the
operating system from the get-go how
many elements we expect.
And indeed, while we're
lucky at first, perhaps,
if we know from the get-go we
need at least six values here,
they might be a consistent
distance apart--
4 bytes or 32 bits.
And so I could do arithmetic
on some of these nodes,
but that is no longer, unfortunately,
a guarantee of this structure.
Whereas arrays do guarantee you
random access, linked lists do not.
And linked lists instead require
that you traverse them in linear time
from the first element potentially
all the way to the last.
There is no way to jump
to the middle element,
because frankly, if I do that math as
before, 100 bytes away is the last,
so 100 divided by 2 is 50--
rounding down, keeping me at 50,
puts me somewhere over
here, and that's not right.
The middle element is
earlier, but that's
because there's now no support for
random access or instant arithmetic
access to elements like
the first, last, or middle.
All we'll remember now for the
linked list is that first element,
and from there, we have to
follow all of those breadcrumbs.
So that might be too
high of a price to pay.
And moreover, there's overhead
now, because I'm not storing
for every node one value, but two--
the value or data I care about, and
the address or metadata that lets me
get to the next node.
So I'm using twice as
much space there, say,
at least when storing
numbers, but at least
I'm getting that dynamic
support for growth.
So again, it depends on that trade-off
and what is less costly to you.
But never fear.
This is just another problem to solve.
To be clear, we'd like to retain
the dynamism that something
like a linked list offers-- the ability
to grow and even shrink that data
structure over time without having to
decide a priori just how much memory we
want.
But at the moment we've lost the ability
to search it quickly, as with something
like binary search.
So wouldn't it be nice if we could
get both properties together?
The ability to grow and shrink
as well as to search fast?
Well I daresay we can
if we're just a bit more
clever about how we draw on our canvas.
Again, let's stipulate
that we can certainly
store values anywhere in memory
and somehow stitch them together
using addresses.
Now those addresses,
otherwise known as pointers,
we no longer need to draw, because
frankly, they're now just a distraction.
It suffices to know we can draw them
pictorially as with some arrows,
so let's do just that.
Let me go ahead now
and draw those values,
say 16 up here followed by
my 8 and 15, as well as my 4.
Over here, well I draw
that 42 and my 23,
and now it remains for me to
somehow link these together.
Since I don't need to leave
room for those actual addresses,
it suffices now to just draw arrows.
I'll go ahead and draw just a box around
16 and 8, as well as my 4 and my 15,
as well as my 23 and my 42.
Now how should I go about linking them?
Well let me propose that we no
longer link just from left to right,
but rather assemble more of a
hierarchy here with 16 pointing at 8,
and 16 also pointing at 42.
And 42, meanwhile, pointing at 23
with 8 pointing at 4 as well as 15.
Now why have I done it this way?
Well by including these arrows,
sometimes two per node,
have I stitched together
a two-dimensional data
structure, if you will?
Now this again surely could be
mapped to that lower level of memory
just by jotting down the addresses
that each of these arrows represents,
but I like thinking at
this level of abstraction
because I now can think in more
sophisticated form about how
I might lay out my data.
So what properties do I now
get from this structure?
Well, dynamism was the
first goal at hand,
and how might I go about
adding a new value?
Say it's 50 that I'd like
to add to this structure.
Well, if I look at the
top here, 16, it's already
got two arrows, so it's full,
but I know 50 is bigger than 16,
so let's start to apply
that logic and say 50
shall definitely go down to the right.
Unfortunately, 42 already has one arrow
off it, but there is room for more,
and it turns out that 50 is,
in fact, greater than 42.
So you know what?
I'm just going to slot 50 right there
and draw 42's second arrow to 50.
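What I just did by hand is exactly what an insertion routine would do in code. Here's a minimal sketch in Python, with TreeNode and insert being names of my own choosing:

```python
# A node with up to two children: smaller values left, bigger values right.
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    # Walk down from the root, comparing as we go, until a spot is free.
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

# Build the tree from the screen, then add 50: it slots in to 42's right.
root = None
for value in [16, 8, 42, 4, 15, 23, 50]:
    root = insert(root, value)
```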
And what picture seems
to be emerging here?
It's perhaps reminiscent
of a family tree of sorts.
Indeed, with parents and children,
or a tree more generally with roots.
Now whereas in our human
world, trees tend to grow up,
these trees in computer
science tend to grow down.
But henceforth, let's
call this 16 our root,
and to its left is its left child, to
its right is its right child, or more
generally, a whole left subtree
and a whole right subtree.
Because indeed, starting at 42,
we have another tree of sorts.
Rooted at 42 is a child called
23, and another child called 50.
So in this case, where each of
the nodes in our structure,
otherwise known in computer science as
a tree, has zero, one, or two children,
you can create that second dimension.
And you can preserve
not only the ability
to add data dynamically,
like 50, but-- but, but--
we also now gain back
that ability to search.
After all, if I'm
asked now the question,
is the number 15 in this structure?
Well let me check for you.
Starting at 16, which is where this
structure begins, just like a linked
list starts conceptually
at the left, I'll
check if 16 is the value you
want-- it's not, it's too big,
but I do know that 15, if
it's here, it's to the left.
Now 8, of course, is not
the value you want either,
but 8 is smaller than 15,
so I'll now go to the right.
And indeed, sure enough,
there I now find 15.
And it only took me one,
two steps, not n, to find it,
because through this second dimension
am I able to lift up some of those nodes
rather than draw them just
down as a straight line,
or as in the linked list, all
the way from left to right.
With the second dimension can I
now organize things more tightly.
And notice the key
characteristics of this tree.
It is what's generally known,
indeed, as a binary search tree.
Not only because it's a tree
that lends itself to search,
but also because each of the nodes
has no more than two-- "bi"-- children:
zero, one, or two.
And notice that to the left of
the 16 is not only the value
8, but every number that can be reached
to the left of 16 happens to be,
by design, less than 16.
And that's how we found 15.
Moreover to the right of 16,
every value is greater than 16,
just as we have here.
And that definition can be
applied so-called recursively.
You can make that claim about every
node in this tree at any level,
because here at 42, every node to
its left, albeit just one, is less.
Every node to its right,
albeit just one, is indeed more.
So, so long as you bring to bear on
our data the same sort of intuition
we brought to our phone book can
we achieve these same properties
and goals, this efficiency
of logarithmic time.
Log base 2 of n is indeed how long
it might take us-- big O of that--
to find or insert some value.
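That search, too, is only a few lines of code. A sketch, again in Python, continuing with the TreeNode and root above:

```python
def search(root, target):
    # One comparison per level: O(log n) steps if the tree stays balanced.
    if root is None:
        return False                       # a dead end: the value isn't here
    if target == root.value:
        return True
    if target < root.value:
        return search(root.left, target)   # smaller values live to the left
    return search(root.right, target)      # bigger values live to the right

search(root, 15)  # True, after visiting just 16, then 8, then 15
```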
Now to be fair, there are
some prices paid here.
If I'm not careful, a
data structure like this
could actually devolve
into a linked list
if I just keep adding,
by coincidence or intent,
bigger and bigger numbers.
The tree might just so happen to get
longer and longer and stringy
unless we're smart about how we
rebalance the tree occasionally.
And indeed, there are other
forms of these trees that
are smart, and with more code, will
rebalance themselves to make sure
that they don't get long and stringy,
but stay as shallow as possible.
But there's another price paid
beyond that potential gotcha--
more space.
Whereas my array used no arrows
whatsoever and thus no extra space,
my linked list did use one extra
chunk of space for each node--
storage for that pointer or
address of its neighbor.
But in a tree structure, if
you're storing multiple children,
you're using as many as two
additional chunks of memory
to store as many as two of those arrows.
And so with a tree structure
are you spending more space,
but potentially it's saving you time.
So again, we see this
theme of trade-offs,
whereby if you really want
less time to be spent,
you're going to have to
spend more of that space.
Now can we do even better?
With an array, we had
instant access to data,
but we painted ourselves
into that corner.
With a linked list did we
solve that particular problem,
but we gave up the ability
to jump right where we want.
But with trees, particularly
binary search trees,
can we rearrange our data intelligently
and regain that logarithmic time.
But wouldn't it be nice if we
could achieve even better, say,
constant time searches of
data and insertions thereof?
Well for that, perhaps we could
amalgamate some of the ideas
we've seen thus far into just
one especially clever structure.
And let's call that particular
structure a hash table.
And indeed, this is perhaps, in theory,
the holy grail of data structures,
insofar as you can store anything
in it in ideally constant time.
But how best to do this?
Well let's begin by
drawing ourselves an array.
And that array this time
I'll draw vertically simply
to leave ourselves a bit more
room for something clever.
This array, as always, can be indexed
into by way of these locations
here where this might be
location 0 and 1, 2, and 3,
followed by any number of others.
Now how do I want to use this array?
Well suppose that I want to
store names and not numbers.
Those names, of course, could just
be inserted in any old location,
but if unsorted, we
already know we're going
to suffer as much as big O of n time--
linear time with which to find
a particular name in that array
if you know nothing a
priori about the order.
Well we know already, too, we could
do better just like the phone company,
and if we sort the names we're
putting into this structure,
we can at least then do binary search
and whittle that search time down
to log base 2 of n.
But wouldn't it be nice if we
could whittle that down further
and get to any name we want in nearly
constant time-- one step, maybe two
or a few?
Well with a hash table can you
approximately or ideally do that,
so long as we decide in advance
how to hash those strings.
In other words, those strings of
characters, here called names,
they have letters inside of
them, say D-A-V-I-D for my own.
Well what if we looked
at not the whole name,
but that first letter, which
is, of course, constant time
to just look at one value.
And so if D is the fourth letter in
the English alphabet, what if I store
DAVID--
or really, any D name at the
fourth index in my array,
location 3 if you start counting at 0?
So here might be the A names, and here
the B names, and here the C names,
and someone like David now belongs
in this bucket, if you will.
Now suppose I want to store
other names in this structure.
Well Alice belongs at location 0,
and Bob, for instance, location 1.
And we can continue this
logic and can continue
to insert more and more names
so long as we hash those names
and jump right to the right location.
After all, I can in one
step look at A or B or D
and instantly know 0 or 1 or 3.
How?
Well recall that in a computer
you have ASCII or Unicode.
And we already have numbers
predetermined to map
to those same characters.
Now to be fair, capital A, I'm pretty
sure, is 65 in ASCII,
and we could certainly
subtract 65 from 65 to get 0.
And if capital B was 66, we could
certainly subtract 65 from 66 to get 1.
So we can look, then, at the first
letter of any name, convert it to ASCII
and subtract quite simply
65 if it's capital,
and get precisely to the index we want.
So to be fair, that's not one,
but it is two or three steps,
but that is a constant number of
steps again and again independent
of n, the total number of names.
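In code, that hash function is just a line of arithmetic. A sketch in Python, where hash_name is my own name for it and names are assumed to start with a capital A through Z:

```python
def hash_name(name):
    # Map a name to a bucket 0 through 25 via its first letter.
    return ord(name[0]) - ord("A")   # ord("A") is 65 in ASCII/Unicode

hash_name("Alice")  # 0
hash_name("Bob")    # 1
hash_name("David")  # 3
```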
Now what's nice about this is that we
have a data structure into which we
can insert names instantly by
hashing them and getting as output
that number or index 0 through 25,
in the case of an English alphabet.
But what problem might arise?
The catch, though, is that if we
have someone else, like Doug,
whose name happens to
start with the same letter,
unfortunately there seems to be
no room at this moment for Doug
since I'm already there.
But there we can draw inspiration
from other data structures still.
We could maybe not just
put David in this array--
indeed, not even treat this array
as the entire data structure,
but really as the beginning of another.
In fact, let me go ahead and
put David in his or my own box
and give Doug his own as well.
Now Doug and I are really
just nodes in a structure.
And we can use this array still to
get to the right nodes of interest,
but now we can use arrows
to stitch them together.
If I have multiple names,
each of which starts with a D,
I just need to remember
to link those together,
thereby allowing myself to
have any number of names
that start with that same
letter, treating that list really
as a linked list.
But I get to that linked list instantly
by looking at that first letter
and jumping here to the right location.
And so here I get both dynamic growth
and instant access to that list,
thereby decreasing
significantly the amount of time
it takes me to find someone--
to maybe 1/26 of the time.
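Put together, the whole structure might look like this sketch, reusing hash_name from above and using Python lists to stand in for those linked chains:

```python
# An array of 26 buckets, each a (for now empty) chain of names.
table = [[] for _ in range(26)]

def insert_name(name):
    table[hash_name(name)].append(name)    # jump straight to the right bucket

def lookup(name):
    return name in table[hash_name(name)]  # search only that one short chain

for name in ["Alice", "Bob", "David", "Doug"]:
    insert_name(name)

lookup("Doug")  # True: David and Doug share bucket 3, a two-name chain
```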
Now to be fair, wait a
minute, we're already
seeing collisions, so to speak,
whereby I have multiple inputs hashing
to the same output--
three in this instance.
And in the worst case,
perhaps everyone in the room
all has a name that starts
with D, which means really,
you don't have a hash
table or array at all,
you just have one really long
linked list, and thus, linear time.
But that would be considered a more
perverse scenario, which you should try
to avoid by way of that hash function.
If that is the problem you're facing,
then your hash function is just bad.
You should not, in that case,
have looked at just
the first letter of every name.
Perhaps you should have looked
at the first two letters
back-to-back, and put anyone's name
that starts with D-A in one list;
and D-B, if there is any, in a second
list; and D-C, if there's any of those,
in some third list altogether;
and D-D and D-E and D-F
and so forth, and actually have multiple
combinations of every two letters,
and have as many buckets, so to
speak, as many indexes in your array
as there are pairs of
two alphabetical letters.
Now to be fair, you
might have two people
whose names start with D-A or D-O,
but hopefully there's even fewer.
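That two-letter idea, sketched the same way, yields 26 times 26 buckets; hash_name2 is again a name of my own, assuming alphabetical names of at least two letters:

```python
def hash_name2(name):
    # Map a name to one of 26 * 26 = 676 buckets via its first two letters.
    first = ord(name[0].upper()) - ord("A")
    second = ord(name[1].upper()) - ord("A")
    return first * 26 + second

hash_name2("DAVID")  # 3 * 26 + 0  = 78
hash_name2("DOUG")   # 3 * 26 + 14 = 92, so David and Doug no longer collide
```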
And indeed, I'd say a hash table--
this whole structure approximates
the idea of constant time
because it can devolve in places to
linear time with longer lists of names.
But if your hash function is good
and you don't have these collisions,
and therefore ideally you don't have
any linked lists, just names, then
you indeed have a structure that
gives you constant time access,
combining all of these underlying
principles of dynamic
growth and random access
to achieve ultimately the
storage of all your values.
How, then, might a language like Python
implement data types like int and str?
Well in the case of
Python's latest version,
it allows ints to grow as
big as you need them to be.
And so it surely can't just be using
contiguous memory, once allocated,
that stays in the same place.
If you want a
number to grow over time,
well, you're probably going to need to
allocate some variable number of bytes
in that memory.
Strings, too.
If you want to allocate strings, you're
going to need to allow them to grow,
which means finding
extra space in proximity
to the characters you already have, or
maybe relocating the whole structure
so that that value can keep growing.
But we know now, we can do
this with our canvas of memory.
How the particular language does it
isn't even necessarily of interest,
we just know that it can, and
even underneath the hood, how
it might do so.
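We can see that behavior from the outside, even without knowing CPython's internals:

```python
# Python ints grow to whatever size the math requires; the interpreter
# quietly manages however many bytes that takes.
n = 2 ** 100            # far too big for a fixed 32- or 64-bit chunk

# Strings are immutable in Python, so "growing" one actually allocates
# a new, bigger string somewhere else and copies the old characters over.
s = "hello"
s = s + ", world"
```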
As for these other structures in Python
like dict or dictionary and list,
well those, too, are exactly
what we've seen here.
A dictionary in Python is really just
a hash table, some sort of variable
that has indexes that are not
necessarily numbers, but words,
and via those words can
you get back a value.
Indeed, more generally does a
hash table have keys and values.
The keys are the inputs via
which you produce those outputs.
So in our data structure, the
inputs might have been names.
The output of my hash function was
an index value, like some number.
And in Python do you have a
wonderful abstraction in code that
allows you to express that
idea of associating keys
with values-- names with
yes or no, true or false,
whether they are present-- so that you can ask
those questions yourself in your code.
And as for list, it's quite simply that.
It's the idea of an array
but with that added dynamism,
and as such, a linked list of sorts.
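A short sketch of both, with made-up entries-- and, to be precise, CPython actually implements list as a dynamic array underneath rather than a linked list, though the dynamism is just as described:

```python
# A dict is Python's hash table: keys in, values out, in roughly constant time.
phonebook = {"David": "555-0100", "Doug": "555-0111"}
"David" in phonebook              # True, found by hashing, not by scanning
phonebook["Alice"] = "555-0122"   # inserting is (amortized) constant time too

# A list offers the dynamism we wanted, with no size declared up front.
numbers = [4, 8, 15, 16, 23]
numbers.append(42)    # grows on demand
numbers.remove(8)     # elements can be snipped out, too
numbers[2]            # 16 -- and indexing is still instant
```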
And so now at this higher level of code
can you not only think computationally,
but express yourself
computationally knowing and trusting
that the computer can do that bidding.
How the data structures
are organized really
is the secret sauce of
these languages and tools,
and indeed, when you have some
database or backend system, too,
the intellectual property
that underlies those systems
ultimately boils down not
only to the algorithms
in use, but also the data structures.
Because together, they--
and we've seen this--
combine to produce not only
the correctness of answers you want,
but the efficiency with which
you can get to those answers.
