[MUSIC PLAYING]
SPEAKER 1: All right.
So this is CS50.
And this is the start of week 3.
And you'll recall that last time,
we introduced a number of new ideas
that we'll now start taking for granted
as being already under our belt.
In particular, we looked last
time at debugging more generally.
And recall that debugging is the
process of removing bugs from your code,
removing mistakes from your code.
And we proposed these
three CS50-specific tools.
In particular, help50, which
is a command line tool.
Something that you can
run at your keyboard
and actually run it
before your own command.
So that if you don't
understand some error
message from a program you're
compiling or a command you're running,
you can actually see some
helpful output from CS50 staff.
eprintf, which, by contrast, is
something you can put into your code,
like printf, and print
a message to yourself.
And nicely enough, it's going to
actually include the line number
and the file name from which it came.
So you can identify where
some output is coming.
And then lastly, debug50, which is the most sophisticated of these programs. And debug50, when you run it at the command line, followed by the name of your program in the usual way, is actually going to open up what's called a debugger.
A graphical user interface
that CS50 staff put together
that will actually let you
step through your code,
see variables, and generally
understand it better.
So we also looked at this
in the context of Ovaltine
in particular, and Little Ralphy.
And we looked at cryptography,
the art of concealing information.
And indeed, among the
challenges in CS50,
ultimately if you choose
to accept them, are
things like implementing
Caesar cipher and Vigenere
cipher, cracking passwords.
So indeed, those problems lie ahead
and draw inspiration from this domain.
And then, we focused more
technically on what a string is.
So Zamyla, a name we keep taking for granted that a computer can express, Z-A-M-Y-L-A, as a string or a phrase in a program.
But it turns out that
underneath the hood,
strings are a little more mundane.
They really are just
this grid of information.
And we might draw Zamyla's
characters boxed in like this.
But if you think of what's
inside of your computer itself,
you have a whole bunch
of memory, or RAM.
And inside of that RAM,
you have Z-A-M-Y-L-A,
but then this special character.
What is this \0 used for?
Why is there this seemingly strange
character at the end of Zamyla's name?
Anyone?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yeah, to denote
the end of the string.
Because after all, if we were to
have someone else's name here,
if the next string I got
is Maria, for instance.
M-A-R-I-A, we need to be able
to know where Zamyla's name ends
and Maria's name starts in memory.
Because at the end of the day,
C, and programming in particular,
isn't really all that sophisticated.
At the end of the day, the only
place you have to store things
is your computer's memory or RAM.
And each of these boxes represents a
byte of memory inside of your computer.
And so if all this is-- so
far as the computer cares--
is a sequence of six
alphabetical letters,
the computer needs to be told or given a hint, as with this \0, which just represents 8 zero bits, as to where Zamyla's name ends and something else begins.
And we use this as an opportunity
to introduce ultimately
command line arguments.
So in particular, we've been
writing programs thus far
with just int main void.
And void, we realized, signifies that
a program takes no command line inputs.
It can still call functions, like get_string and get_int and get_float.
But when you run the program, you can't
put any additional words at the prompt
after it if you've declared
main as taking void as input.
Nothing as input.
But we introduced instead this: int main(int argc, string argv[]), which means main can alternatively take two inputs from the so-called command line when you run it.
It can take argc and argv. And what was argc? What was argc? Argc.
Yeah, argument.
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yeah, the number
of command line arguments.
argc for argument count.
So how many words did
you type at the prompt?
And argv means argument vector, which is
another way of saying it's an
array, a contiguous block of memory
inside of which are all
of the words themselves
that you might have typed at the prompt.
So this time, we're
going to build on that.
We're going to focus
today mostly on ideas.
Mostly on algorithms.
Mostly on problem-solving and less
on the lower-level implementation
details of code.
Though, we'll look at
a couple of examples.
So this more generally is
what's called an array.
If you think of your computer's
memory as just a block of memory,
block of bytes.
And maybe it's byte 0, then byte 1.
Then, byte 2, just because you
can number them, certainly.
If you think of them that way, this
is kind of an arbitrary generic data
structure, if you will.
A generic way of just storing
things underneath the hood.
Now, what do I mean by that?
And why is this useful?
Well, consider something like this.
Consider having 8 doors,
behind which are some numbers.
And the goal at hand
is to solve a problem.
Find a number behind these doors.
Now, unfortunately, you
don't know in advance
what numbers are behind those
doors and where they are.
Indeed, that's why the doors are closed.
But you know that, say, the number
50 is somewhere behind those doors.
So what do we do to actually find this?
Well, let's go ahead and do this.
Let me go ahead and summon up
a brave volunteer if you would.
There's a CS50 stress
ball in it for you.
OK, come on up.
You want to come on up over here?
So we have not 8, but
rather 7 doors here.
Even easier.
So 7/8 the size.
And what's your name?
AUDIENCE: Derek [INAUDIBLE].
SPEAKER 1: Derek?
OK.
David, come on up.
Nice to meet you.
All right.
So here we have 7 doors.
And the goal at hand for you is
to simply find the number 50.
How do we find it?
What are you going to do?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: It's behind one
of these doors, I think.
AUDIENCE: All right.
I'll just try a door?
SPEAKER 1: Try a door.
Not the door.
So 16 turns up.
OK.
Stress ball gets one step away.
Try another.
42.
Answer to life, the universe, and
everything, but not this question.
50.
OK, very well done.
So a round of applause if
we could for Derek here.
Very nicely done.
OK.
And let's pause for just a moment.
What was the algorithm?
What was the set of instructions
that you used to find the number 50?
AUDIENCE: Just try and see if I
could find any sort of arrangement.
But otherwise, try random doors.
SPEAKER 1: OK.
So try random doors.
And indeed, given no
additional information from me,
that's the best you can do.
Just try random doors.
Or, if it's a little easier, just
go left to right or right to left.
But random is perfectly equivalent
to all of those algorithms
because you don't know
anything necessarily a priori.
So how about for two stress balls?
Find the number 50 this time,
but the additional ingredient
is that we have taken
the time in advance
to sort the numbers behind these doors.
Does that change your
thinking in any way?
AUDIENCE: Well, I don't
know whether you're
sorting increasing left or right.
SPEAKER 1: Good.
So good challenge.
He doesn't know if we've sorted
from left to right or right to left.
Left to right.
So generally, a computer scientist
when he or she says "sorted"
would say left to right.
But reasonable question.
AUDIENCE: Left to right
increasing or decreasing?
SPEAKER 1: Left to right increasing.
Yes.
It would be left to right in any way.
16.
Not 50.
So what are you going to do?
AUDIENCE: Well, let's see.
I know these have to be
less, so I can remove them.
SPEAKER 1: Good.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: 42 is still not 50.
And to the right.
Very well done for Derek.
OK, come on.
OK.
They're less impressed that time.
So what did you do, though, that second
time that was actually more effective?
I mean, frankly, at first glance it
looks like this approach was just as
fast or just as slow
as your first approach
because it still took you three steps.
AUDIENCE: So what I did was since
you told me that it was sorted left
to right increasing, I decided to
look at the middle number first
and see whether it was
greater than or less than--
SPEAKER 1: Good
AUDIENCE: --the target number.
SPEAKER 1: Good.
AUDIENCE: And since it was less than,
I knew that all of these three values--
SPEAKER 1: Wonderful.
And indeed, are smaller by design.
And so you can throw half of that
problem away, much like in week 0
when we tore the phone book in half.
You can throw half of that
phone book away as well.
So thank you very much for
finding us the number 50.
So this is something that we
did see, indeed, in week 0.
And it's what we would
generally call binary search.
Or more generally, just a
divide and conquer algorithm.
But today, what we
start to do is to assess
just how much better this actually is.
And how we can go about finding
things more programmatically.
So what Derek could have tried
is what someone would generally
call linear search.
And as the name suggests,
linear, meaning line,
you can just kind of solve a
problem from left to right.
And indeed, while Derek
did take a random approach,
it would have been
functionally equivalent
for him simply to go left to right or
right to left and find the number 50.
It's indeed there.
So let's introduce some
pseudocode for this.
And there is no one right
way to implement pseudocode.
Here is one of the most
succinct ways I could come up
with for expressing this kind
of left to right approach.
Or equivalently, right to left.
For each element in the array, if element you're looking for, return true.
And so this is a very common
paradigm in programming.
Just say something generally,
like for each element in an array.
And the implication is that you
would generally go left to right,
but it's equivalent to go right to left.
And if we indeed see at
some location in that array
that array of boxes, the element we're
looking for, we can just return true.
And yet, here I have return false.
But the indentation, so to
speak, is very different.
Why have I aligned return
false with the for loop
here all the way on the left as opposed
to it being aligned in the middle
or one line deeper?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yeah.
In order to denote that all
of the indented elements
are within the for loop.
So it doesn't strictly
matter how much you indent.
And indeed, in some
languages, C among them,
it doesn't matter that
you indent at all.
But for the sake of good
style and readability,
we've indented this
to convey to the human
especially that this "if" is only
applicable when you're inside
of the for loop.
This return is only applicable when
you're inside of this "if" condition.
And this return false
really only happens
as the very last step in this program if
nothing else actually returns a value.
So only if return true
happens does this program say,
yes, I found the element
you're looking for.
And you can think of this, therefore,
as the sort of default case.
Now, just to toss it out
there, why would this be wrong?
Let me go ahead and
temporarily change this program
to be a more familiar if/else block.
So why is this seemingly logically fine?
It's a valid puzzle piece
if you will, à la Scratch.
But this program is now incorrect.
Why?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Once you hit
it, it's automatically
going to come out of the for loop when?
Exactly.
When will this come out of the for loop?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Good.
So it's going to return false
if the first element isn't
what you're looking for.
In so far as for each
element in an array
kind of implies that you should
be looking from left to right
or right to left, the
first question you're
asking yourself when looking
in that first element
is, is this the element
you're looking for?
If it is, great.
You're going to return true.
And your answer is correct.
But if it's not the
element you're looking
for, you're executing else return false
immediately saying no, the element
I'm looking for is not here.
But when really the only question
you've asked is, is the element
I'm looking for the
very first in the array?
And so, indeed, this would
just be logically incorrect
because you might as well
not look at any of the array
if you're only looking at
that very first element.
So not what we intended.
So indeed, that original
pseudocode is just fine.
So alternatively, there's something
called binary search, which
Derek actually used intuitively
for his second approach,
where he was told in advance that the
array is sorted numerically from left
to right.
This gives him more information,
much like I had in week 0
when I searched a phone book
that I knew was alphabetical.
And you can write this algorithm
in any number of ways, too.
I kind of borrowed the spirit of
our Mike Smith algorithm from week 0
and I proposed this.
Look at the middle of the array.
If element you're looking for, return true.
We just get lucky if it's smack
dab in the middle of the array.
Else if the element is to
the left, because the element
you're seeing in the middle is too big,
then search the left half of the array.
Else if the element is to the
right, as it was in Derek's case,
search the right half of the
array, else return false.
But there is no go-to instruction here.
And there's no line numbers.
Recall from week 0 when we
searched for Mike Smith,
I deliberately numbered
all of the lines.
And then I said, go back to
step 2 or 3 I think it was.
Here, I seem to be cutting a corner.
I'm just kind of waving
my hand and saying,
search the left half of the list.
Or, search the right half of the list.
But what algorithm should we
use in either of these cases?
How do you search the
left half of a list?
How do you search the
right half of a list?
How do I do this?
Well, where do I have an
algorithm for searching?
Well, we saw one a moment ago.
I've already got in my toolkit,
so to speak, linear search.
So I could certainly use that.
But sort of cyclically,
or sort of confusingly,
I also have this other seemingly
fancier, faster algorithm
called binary search.
And yet, I'm in the middle of defining
binary search with my pseudocode.
And yet, here I am sort of trying
to define myself with myself.
You can't get away with this in English.
If you're asked to
define an English word
and you use the word in the definition,
usually a teacher or someone
in a conversation will sort of
verbally slap you on the wrist
because you're not
answering the question.
And yet somehow in programming,
you can get away with this.
Somehow, you can get away with defining
a problem with a solution like this,
and then using that same
solution inside of your solution
because something is changing.
This line here, search the
left half of the array,
and this line here, search
right half of the array,
is not equivalent to asking
the same question again.
Is this element in the array?
What changes when I either search
the left half or the right half that
makes this acceptable?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: What's that?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yes.
That's the key ingredient.
In so far as search left half
and search right half imply
search in the same way, but
search a smaller problem,
this problem will eventually stop.
You're not going to get stuck
in an infinite loop so to speak.
Because eventually-- you can't halve something forever.
You're eventually going to end
up with one or zero elements.
At which point, the
answer's going to be no.
Return false, it's not here
per the else condition.
But these sort of reflexive
calls or inductive,
or more properly here, recursive
calls-- the new buzzword
for today-- means for a function, or in
this case pseudocode, to call itself.
To use its own definition
again and again
and again, which is bad instinctively,
unless you keep doing it
on smaller and smaller problems.
And so this is a general
principle, actually,
that we're going to be able to leverage
ultimately, not only for searching but
also, as we shall see, for sorting.
So let's go ahead and
let me share actually
how this example went last year.
We're fortunate to capture
sort of memories on film.
And I thought it would
be fun to share this,
when one of your predecessors
in the class came up,
was asked the exact same
question, find us the number 50.
But it didn't quite go
as well as we hoped.
Let's take a look at 2014.
[AUDIO PLAYBACK]
-All I want you to do now is
to find for me and for us,
really, the number 50
one step at a time.
-Number 50?
-The number 50.
And you can reveal what's
behind each of these doors
simply by touching it with a finger.
Damn it.
[APPLAUSE]
[END PLAYBACK]
SPEAKER 1: OK.
So that didn't really
demonstrate anything that year.
So we tried again, this time saying, all
right, this time the array is sorted.
What are you going to do this time if
you want to find the same number 50,
but the array is now sorted?
I thought I'd share this.
[AUDIO PLAYBACK]
-The goal now is to also find us the
number 50, but do it algorithmically.
And tell us how you're going about it.
And if you find it, you keep the movie.
If you don't find it, you give it back.
-Man. [INAUDIBLE].
OK.
So I'm going to check the ends
first to determine if there's-- oh.
[APPLAUSE]
[END PLAYBACK]
SPEAKER 1: OK, now I feel a
little bad because clearly we
used to give away movies and
now we give away stress balls.
But no one really has DVD players
anymore, so it's OK probably.
And then, let me share one other
clip back from our SD days.
Many years ago when Shawn, who
has since graduated, came up.
Was asked the same question.
So Derek, you are in this
wonderful history of students
who have tried this demonstration.
This was, as you can, see before
we had touchscreen technology.
So we had sheets of
paper up on the screen.
But the idea was the same.
And Shawn is perhaps one of
our favorite memories in so far
as he, too, was asked to solve
the same problem in his way.
[AUDIO PLAYBACK]
-OK, so your task here,
Shawn, is the following.
I have hidden behind
these doors the number 7.
But tucked away in some of these doors
as well are other non-negative numbers.
And your goal is to think
of this top row of numbers
as just an array, or just a
sequence of pieces of paper
with numbers behind them.
And your goal is only using the top
array here, find me the number 7.
And we are then going to critique
how you go about doing it.
Find us the number 7, please.
No.
5, 19, 13.
It's not a trick question.
1.
At this point, your score is not very
good, so you might as well keep going.
3.
Go on.
-[INAUDIBLE].
-Frankly, I can't help but wonder
what you're even thinking about.
Only the top row.
So you got 3 left.
So find me 7.
-[INAUDIBLE].
-17.
-[INAUDIBLE].
-7.
[END PLAYBACK]
SPEAKER 1: All right.
So we don't ask people
to search for 7 anymore.
But this invites the
question, how are we
allowed to have that assumption, right?
I've been assuming in week 0 that
the phone book was alphabetized.
And therefore, I can find Mike Smith
really fast in logarithmic time.
And we were just assuming
a moment ago that we
could find the number 50 super-fast
because of divide and conquer.
Again, but only if
that array were sorted.
Indeed, Derek technically
just got lucky in so far
as he found 50 the
first time in 3 steps.
But it could have been as many as
7 steps because it was, indeed,
a random algorithm.
Or even if he had used linear search.
So suppose we want to
actually sort something.
So these are still used
on campus sometimes.
So these are blue books for exam period.
And suppose that we're at the end of
a semester and a bunch of students
have written their
names on these things.
And so A's and B's and C's and D's.
And suppose that they're handed
in at somewhat random times.
There's always that kid who hands
in his or her exam at an hour
into the 3-hour exam,
and then most of them
come in around like 2 hours 45 minutes.
And so therefore, they're all
in this pretty random arbitrary
order like this.
Actually, if I just set them down like that, they're not really random at all.
And so they're just
in some random order.
And the goal at hand, ultimately,
is for the head TF, for instance,
or the professor to actually
sort all of these blue books
and come up with an alphabetical order
so you can make sure that everyone
has actually submitted on time.
So how do we do this?
What is the algorithm
with which to do this?
Because indeed, if he
or she, the professor,
later wants to find a certain
name of the alphabet, like Smith,
it'd be nice if they don't have to sift
through all 26, or all 100, or all 500
blue books.
They can just jump
roughly to the middle,
and then divide and conquer
and find Smith there after.
Can we get one volunteer to
come up and propose how to sort?
You want come on up?
Come on up.
What's your name?
AUDIENCE: Allison.
SPEAKER 1: Allison.
All right, come on up.
So if you've ever
sorted something before,
now's your chance to
show that off, Allison.
All right.
So here are a whole bunch of blue books.
And I've just arbitrarily written
not names, but first letters
of the alphabet on them.
So go ahead and sort them.
AUDIENCE: So I start with a letter.
And if this is before
that, I just put it on top.
SPEAKER 1: OK.
So that's X and L. Gotcha.
AUDIENCE: And because Z is after that--
SPEAKER 1: There's Z.
AUDIENCE: --I put it below.
And D is here.
SPEAKER 1: OK.
So you seem to be taking them one at a
time and just dealing with the problem
then and now?
AUDIENCE: Yes.
SPEAKER 1: OK.
So that's J. What happens next?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: OK.
That's K. K, L, good.
B, that's easy.
On top.
A, even easier.
On top.
AUDIENCE: Am I going to
finish this entire thing?
SPEAKER 1: Yes.
We need 13 more, give or take.
M and--
AUDIENCE: N.
SPEAKER 1: N, O. I didn't
do a very good job there.
U, V. OK, definitely not good.
Y. That's good.
It goes toward the end.
R, a little harder.
But this is actually a
good situation for us
to be in because-- you keep
going, I'll keep talking.
So even as Allison is
sorting this, she's
dealing with each of these
blue books one at a time.
But notice what's happening
to her right hand.
There literally is more and more work,
it seems, happening in her right hand
because as she encounters R or now W,
not only does she need more thought,
she also needs to do more
flipping to actually find
the right spot for W. This is H.
She's doing the same thing again.
So it feels like the algorithm
has started to slow down.
At first, it was pretty easy
to just make a short list.
But now that the list
is getting longer, it feels like she's doing a lot more work; finding a location is getting harder and harder. Or at least taking longer and longer.
Maybe the alphabet itself is
not that much more difficult.
Now, we have I. I can
only stall so much longer.
G.
AUDIENCE: Sorry.
SPEAKER 1: No, that's fine.
E, good.
OK.
So it's getting a little easier
because we're at the beginning.
Sorted.
So what were you thinking?
What was that algorithm?
AUDIENCE: It was just take a
letter, take another letter,
and then test to see if the
letter is before or after.
SPEAKER 1: OK, good.
AUDIENCE: And then put it in the
correct place on either side.
SPEAKER 1: Perfect.
So a perfect way of
thinking about this as well.
So thank you so much.
AUDIENCE: Thank you.
SPEAKER 1: Very nicely done.
And suppose the problem's
a little harder.
I'll do this one myself here.
But suppose we have some playing cards.
And these are the type that
probably everyone here can see.
So here is a standard
deck of cards, including
the jokers, which we won't need.
How do you go about
sorting things like this?
I don't know if-- actually,
I didn't think in advance.
I should have shuffled these.
So if you want to shuffle these cards.
This is not going to end well.
OK.
So suppose we have all of these
cards nicely shuffled here.
And I have before me a big
problem, a big pile of cards.
And the goal, for whatever reason,
you wanted to start the game
or you like to be orderly,
is to sort the cards.
What's your algorithm going
to be for sorting these cards?
What might you do?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Do them in
piles based on the number.
OK, so here's the number 2.
So you want to make a
pile for all the 2's?
OK, so we can do that.
So here's a pile for all the 4's.
And let me make myself some more room.
So we can put 2's here.
4's here.
Here's a 3.
So I'm going to put this
in between the 2 and the 4.
Here's a 5.
I'm going to put that over here.
And if I jump around, here's a Jack.
So this is going to go way over there.
7, 8.
6 can go over here.
Here's another 4, so I can put it here.
And I can bucketize all of the cards.
So this is good intuition because it
makes a pretty big problem, 52 cards,
a little more tenable.
You can just now flip through
all of the cards one at a time,
or whatever you encounter first.
And just drop it in the correct bucket.
But that's, of course, not
going to get you to the end.
Let's assume we fast
forward to the ending.
And now I have these 13 piles
from Ace all the way up to King.
What do I then do?
What's step 2?
AUDIENCE: Stack them.
SPEAKER 1: So I want to stack them.
So I can take all the 2's,
all the 3's, all the 4's.
And I didn't specify
earlier, one of the things
that complicates this problem with
cards is that you have different suits.
And so if you really
want to be nit-picky,
maybe you want to keep
all the suits the same,
which is like a nested
sorting problem of some sort.
But this general idea of bucketizing
in this way is good intuition,
but it clearly takes some time.
I mean, enough time that I'm not
going to sort of drag this out
by sorting all of those cards myself.
And therein lies the point, though.
Even though in week 0 with Mike
Smith, we had that sorted phone book.
And even though a moment ago,
Derek, albeit with a few numbers,
had them already presorted,
sorting takes time.
And even when we were just sorting
the blue books alphabetically,
even though it felt pretty fast, pretty
obvious, it starts to drag eventually.
And god forbid there
were more than 26 books.
Maybe there were 260 books, or
2,600 books, or 2 million books.
Or web pages.
Or database records.
Any number of problems in this
world now require sorting sometimes.
Whether it's on Google, or
Facebook, or your own computer.
It feels like we need a good algorithm
for actually doing that sorting.
So how can we go about doing that,
not just with blue books, not
just with cards, but more generally?
Well, let's try to frame the
problem first with a simpler world.
In fact, we can generalize
blue books and playing cards
as really just numbers.
Or some piece of data that there is
a defined order or sequence to them.
And suppose I have these numbers here.
How do I actually go about
sorting this many numbers?
Well, for this-- and this
will be our last chance
for audience participation here.
I need eight brave volunteers
to actually come on up.
All right, so 1, 2.
This is what everyone's
been waiting for.
1, 2, 3, 4, 5, 6, 7, and 8.
Thank you.
Come on up.
And I will meet you over here.
All right.
Come on over here.
If you want to grab two music stands.
OK.
So let's go ahead and turn these
around like this if you could,
as though the music's
going to face the audience.
OK.
All right.
And let me go ahead and give you sheets
of paper, and then we'll resume here.
All right.
So here is our eighth music stand.
And what we have here
is 8 pieces of paper.
And each of you guys is going to take on
the role of a certain number right now.
So I'm going to go ahead and give
each of you a number in sorted order
initially, but that would be too easy.
So if you don't mind, let's go ahead and
arrange yourselves behind these music
stands in the same order
that you see on the screen.
So on the left end here near
me should be the number 4.
Then, 2.
Then, 6.
Then, 8.
Good job.
Then, 1.
Then, 3.
Then, 7.
Then, 5.
All right.
And you can put the
numbers on the music stand
there, but just move
them with you as you go.
So now the question is, here
is a seemingly random list
in just some order I came up
with somewhat arbitrarily.
And I want now to sort my list.
How do I go about doing it?
What would you propose?
The goal is to get from 1 to 8.
What are our possible approaches here?
Oh, you want-- sure, OK.
We'll just do it over here.
AUDIENCE: Take out the smallest number.
SPEAKER 1: OK.
AUDIENCE: Take it, shift
everyone else down.
Have the smallest number
in the first place.
SPEAKER 1: OK, good.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: So let's
go ahead and do that.
For those tuning in at home,
what's your name again?
AUDIENCE: Allison.
SPEAKER 1: Allison.
And?
AUDIENCE: Alex.
SPEAKER 1: Alex.
AUDIENCE: Adam.
SPEAKER 1: Adam.
AUDIENCE: Brandon.
SPEAKER 1: Brandon.
AUDIENCE: Joseph.
SPEAKER 1: Joseph.
AUDIENCE: Vicky.
SPEAKER 1: Vicky.
AUDIENCE: [? Arpin. ?]
SPEAKER 1: [? Arpin. ?]
AUDIENCE: Derek.
SPEAKER 1: Derek.
OK.
So now, because that all
just was too much to absorb.
Number 1 is now the
first smallest element
according to the proposed algorithm.
So what are we going to
do exactly with number 1?
AUDIENCE: He's going to [INAUDIBLE].
We're [INAUDIBLE].
SPEAKER 1: OK, good.
So pop back if you would.
You guys are going to shift
three places this way.
And we're going to insert number
1 at the beginning of the list.
All right, so what's the
next step just to be clear?
AUDIENCE: Number 2 is going to pop up.
SPEAKER 1: Good.
AUDIENCE: I'm going to shift down.
Number 1 is going to stay in place.
SPEAKER 1: Good.
So we find the next smallest
element, which happens to be 2.
We insert him at the
beginning of the list here.
And now, we move on to the number 3 problem. Excellent.
So pop back.
Shift.
And insert.
Good.
4, we got lucky.
So we don't have to do
any extra work here.
So that's a freebie.
Now we look for 5.
Oh, there's 5 at the end.
Good.
And now 6.
We got lucky.
7 and 8, we need to fix that.
OK, good.
So I like this.
It felt pretty fast.
Though, to be fair, the
list is pretty short.
But we were doing kind of a lot of work.
In fact, it was perfect that number 1 was all the way over here, because we shifted 3 humans.
And can you guys reset
for just a moment?
Let me propose an alternative approach.
So if number 1 is in the middle here
and really belongs all the way there
on the left, we went
to great lengths, it
seems, to shift all of these volunteers
to the left, which is a lot of steps.
Because at the end of the day,
let's assume for today's purposes,
that a computer can only
do one thing at once.
So even though all of you
humans moved pretty quickly,
that was really like one of you moved.
Then, the next of you moved.
Then, the next of you moved.
Then, the next.
So that was like 4 total steps just
to put number 1 on the end there.
Let me propose this.
1, can you pop out again?
Now, this list, so far as I'm
concerned as the computer,
is perfectly random from the get-go.
So why do I even care
that you guys move?
I haven't even looked at
the rest of these elements
yet or done anything with them.
Why don't I just evict number
4 where number 1 belongs?
If you want to go over here.
And if the list is already in random order at the beginning, well, then fine.
4, you're going to go over here.
So I've made the problem no worse,
but I have made it slightly better
by putting 1 on the end.
Now I'm going to go ahead and select
the next smallest element again.
2, I got lucky.
Now, I'm going to look for
the next smallest element.
3 indeed is the smallest.
So let me do the same thing.
3, if you could pop back.
6, you're out of there.
And notice that we're doing less work.
Now, the humans are moving a little
farther physically, but that's fine.
The computer can just move values
around in memory pretty fast.
But I've moved fewer of you.
So this is a slight optimization.
It turns out this isn't
fundamentally better as we'll see.
But it's this instinct
that you should appreciate
of trying to do the least
amount of work possible.
Because otherwise, it's going to add up.
Let's try another one.
Well, let me clarify one thing then.
So what am I doing on each
iteration of this algorithm?
As you suggested, I'm
selecting the smallest element.
But the only catch is, if I'm
now looking for the next smallest
element, that's going to be 4.
But I'm only sure that it's
4 once I get to the end
and realize, yep, that was the
smallest element and then I act.
So once we fix that problem--
if you guys want to switch--
I start again over here.
And I have to find the
next smallest element.
And let's see, 8 is pretty big.
6 is smaller.
Good.
7, not smaller.
But only once I get to the end do I
realize, oh, here is the smallest.
So in short, to find
the smallest element
as you proposed again and again, I have
to keep going through the whole list
because it might be
all the way at the end.
We, humans, have the
advantage of just kind
of looking, sort of taking
a step back and saying,
OK, this is obviously unsorted.
I know where everything is.
A computer can only look
at one number at a time,
one element of an array at a time.
So let's try a fundamentally
different approach.
If you guys could reset to
your original 8 locations.
Let me propose that we
just looked at an algorithm
that we'll call selection sort.
Selection sort, insofar as you
iteratively, again and again, select
the next smallest element.
Let's now do an approach that
we'll call bubble sort, which
has sort of the visual effect of numbers
bubbling up over time as follows.
You know what?
I'm just going to look,
not at the whole list
because it felt like that was
creating a lot of work for me.
I'm going to look at 2
numbers at a time, 4 and 2.
Are these in order or out of order?
Obviously.
So they're out of order.
So you know what?
I don't care about the
rest of the problem.
Let me just swap you two.
All right.
Now, let me take one step.
Look at 4 and 6.
In order or out of order?
In order.
So I'm going to leave it be.
6 and 8?
In order.
8 and 1?
Not in order.
So let me swap that again.
8 and 3?
Out of order.
Let me swap that again.
8 and 7?
Out of order.
Let me swap that again.
8 and 5?
Out of order.
Let me swap that again.
And now, what has happened?
Is the list sorted because I
fixed all the pairwise mistakes?
So obviously not.
But is it better, the list?
Why is it better?
Derek.
AUDIENCE: Because now you have
one less element to worry about.
SPEAKER 1: Yeah, exactly.
8 bubbled all the way up
to the top, if you will.
So now he is effectively done.
And frankly, every other
number that's smaller
bubbled one step closer to its position.
So we've taken one big pass through this
to solve at least one of the problems.
And now maximally, 7 problems remain.
So let me go back to the end here.
And let me look at pairwise.
2 and 4?
You're good.
4 and 6?
You're good.
6 and 1?
Let's fix that.
6 and 3?
Let's fix that.
6 and 7?
We're good.
7 and 5?
Let's fix that.
And now I can ignore number 8 because
I know the end of my list is sorted.
So now, I've improved it slightly.
7 has bubbled up to
where he needs to be.
I just have 6 problems
left to solve at most.
So now I look again.
2 and 4?
You're good.
1 and 4?
Let's swap that.
And notice, 1 is moving closer
to the left of the list.
4 and 3?
Let's swap.
4 and 6?
You're good.
6 and 5?
You're not.
But now I fixed 6.
And I can stop here, leaving me
with just five potential problems.
And then you can see
where this is going.
2 and 1?
Let's swap.
2 and 3?
You're good.
3 and 4?
You're good.
We already solved these problems.
So let me go back.
And now, 1 and 2?
You're good.
2 and 3?
You're good.
3 and 4, 4 and 5, 5 and
6, 6 and 7, 7 and 8.
What do I know now as the computer?
What's that?
AUDIENCE: It's sorted.
SPEAKER 1: It's sorted.
Why?
AUDIENCE: Because everything's in order
SPEAKER 1: Everything is in order.
But what does that mean?
We humans can see that, but the computer
can only look at individual values
at a time.
It can't take a step back
and say, yep, sorted.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Good.
At a lower level implementation
detail, I did no swaps.
There were no two numbers
that were out of order.
So it would just be
foolish of me and wasteful
for me to bother doing another pass.
Because assuming the numbers
are not changing on their own,
I'm just going to do the same
thing again and again otherwise.
So I can just stop.
And indeed, bubble sort
allows us to stop early like this.
Let's do one last approach.
If you could, let me toss the
starting points on the board again.
Let's reset one last time to this.
And let me propose-- let me
draw inspiration, actually,
from the approach you took
with the blue books, which was
a little different from how we began.
Which was recall that you took
the problem on your left hand
and you just dealt
with it one at a time.
You didn't go sifting
through looking for A.
You didn't go sifting
through looking for B.
You didn't go sifting through
looking for C, which would really
be the analog of what you
proposed a moment ago.
Instead, you just dealt with
each problem as it came.
And this is our third
and final algorithm
for sorting called insertion sort.
Take each problem one at a time and
just deal with it then and there.
Don't postpone.
So here's 4.
You know what?
Done.
I now claim that this list, which
we'll visually make a little separate,
is sorted.
So I'm going to mentally divide my
list into a left half and a right half.
And now I just claim trivially-- I
mean, almost silly-- this is sorted.
Because it is.
But now the right half
of my list remains.
So let's go ahead and do this.
So number 2.
OK, where does this belong?
So if you want to go ahead
and pick up the number.
The stand can stay.
Number 2 goes where?
Here's my left list.
And we'll make a little more
space here just to be clear.
It goes over to this side.
So what has to happen?
So this time, we do
need to shift you over.
So let's take 2 out.
Let's shift 4 over.
And now we won't move the stand.
We'll just assume now that
my left half is of size 2
and the right half is of size 6.
But now, the left half of
the list is still sorted,
but it costs me a little bit of work.
I had to move 4 over.
All right, let me deal with 6.
Where does 6 go?
It's already in order.
I don't need to shift anyone.
Where does 8 go?
Phew, it doesn't need to go anywhere.
It's already in order.
Uh-oh, 1.
Where does it go?
Now we incur a good amount of cost.
So let's do this.
Where do you go?
One shift, two shift,
three shifts, four shifts.
Feels-- yep-- as bad as
our original approach.
But notice I only have
4 more problems left.
I don't have to keep going
back through the list.
So where does 3 go?
All right, let's pop you out.
Shift you in there.
7.
We got a little lucky.
Let's pop you out and insert
you right where you belong.
And then 5, you belong
a few shifts over.
So let's shift three of you.
And then, done.
All right.
So let's give all of you a stress ball.
And maybe a round of applause.
And let's see if we can't tease
apart what just happened here.
All right.
And then you can exit stage
left-- right if you'd like.
All right, thank you.
Derek, you're really
racking them up here.
All right.
AUDIENCE: I'll [INAUDIBLE].
SPEAKER 1: All right.
Thank you.
All right.
Sure.
So what was the point of this exercise?
And how do we actually now get to
a decision point where one of these
is the way to go?
So in bubble sort, recall that
the key was to do something
again and again until we did no swaps.
At which point we could conclude
that the list was sorted.
So let's do exactly that,
repeat until no swaps.
And then, let's give
ourselves a counter,
just to make things a
little more methodical,
and say for i from 0 to n
minus 1-- rather n minus 2.
Why?
Well, the end of the list
recall is it n minus 1.
Because if you have a list of
size n, the beginning is at 0.
The end is at n minus 1.
And so one shy of the end
is going to be n minus 2.
So we want to do this up through
the second to last element.
If the i-th and the (i plus 1)-th
elements are out of order.
So if left hand and right hand
are out of order, swap them.
And repeat this again
and again and again.
And thanks to that
outer loop, we're going
to do it again and again and
again until we have no such swaps.
And now to be clear, why this n minus 2?
We want to make sure as we're sort
of walking down our row of volunteers
that my left hand does not actually
point at the last human on stage
because then, where would
my right hand point?
There's no person beyond that point.
So this then is bubble sort.
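That pseudocode might translate into C along these lines. To be clear, this is just a minimal sketch, assuming an array of ints; the function name bubble_sort is mine, not anything official:

```c
#include <stdbool.h>

// One possible translation of the bubble sort pseudocode into C.
// Repeatedly sweep the array, swapping adjacent out-of-order pairs,
// until a full pass makes no swaps at all.
void bubble_sort(int values[], int n)
{
    bool swapped = true;
    while (swapped)                         // repeat until no swaps
    {
        swapped = false;
        for (int i = 0; i < n - 1; i++)     // for i from 0 to n minus 2
        {
            if (values[i] > values[i + 1])  // out of order?
            {
                int temp = values[i];       // swap them
                values[i] = values[i + 1];
                values[i + 1] = temp;
                swapped = true;
            }
        }
    }
}
```

Note that the `swapped` flag is exactly the "repeat until no swaps" idea: if a whole pass changes nothing, the list must be sorted, and we stop.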
Now, let's propose pseudocode for
our other algorithm, selection sort,
wherein we walked through
the list iteratively
trying to select on each
pass the smallest element
and putting it into its place.
So for i from 0 to n
minus 1, we're going
to do this from beginning
through the end,
find the smallest element between
the i-th and the (n minus 1)-th element.
Swap the smallest with
that i-th element.
In other words, walk through the list.
Find the smallest, put it on the end.
Walk through the list, swap
it with the second to the end.
Walk through the list, do the
same, do the same, do the same,
each time grabbing the
then smallest element.
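That selection sort pseudocode, too, might be sketched in C as follows. Again, this is an illustrative sketch, not CS50's own code, with a function name of my choosing:

```c
// A sketch of the selection sort pseudocode in C: on each pass,
// find the smallest element among values[i..n-1] and swap it
// into position i.
void selection_sort(int values[], int n)
{
    for (int i = 0; i < n - 1; i++)      // for i from 0 through the end
    {
        int smallest = i;
        for (int j = i + 1; j < n; j++)  // find the smallest between the
        {                                // i-th and (n-1)-th element
            if (values[j] < values[smallest])
            {
                smallest = j;
            }
        }
        int temp = values[i];            // swap the smallest with
        values[i] = values[smallest];    // that i-th element
        values[smallest] = temp;
    }
}
```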
And then lastly, we had insertion sort,
whereby we iterated through the list,
but we pretty much dealt with
every element as we encountered it.
So let me propose this
pseudocode here.
For i from 1 to n minus 1.
So let's just assume that the
very beginning of the list
is sort of trivially sorted, no
matter what number he or she is.
Because if that left side of the list
is of size 1, it is, indeed, sorted.
And indeed, let's call the 0-th element
through the (i minus 1)-th element,
wherever we are, the sorted side.
So essentially, call the
left the sorted side.
And you can think of the right-hand
side as the unsorted side.
Remove the i-th element, whatever
element you're dealing with,
and insert it into the
sorted side in order.
In other words, just walk
through the list, one at a time,
plucking out elements, and
then forcibly insert them
into the sorted side of
the list, making room,
as our humans did, for that
element by shuffling everyone over.
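And that insertion sort pseudocode might look like this in C, as a sketch under the same assumptions as before:

```c
// A sketch of the insertion sort pseudocode in C: treat values[0..i-1]
// as the sorted side, pluck out the i-th element, shift larger
// elements rightward to make room, and insert it where it belongs.
void insertion_sort(int values[], int n)
{
    for (int i = 1; i < n; i++)          // for i from 1 to n minus 1
    {
        int element = values[i];         // remove the i-th element
        int j = i - 1;
        while (j >= 0 && values[j] > element)
        {
            values[j + 1] = values[j];   // shift everyone over
            j--;
        }
        values[j + 1] = element;         // insert into the sorted side
    }
}
```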
This then was bubble sort,
selection sort, and insertion sort.
But which to choose?
So let's introduce now an answer to why.
Not just what these things are, but
why you might use one over the other.
So the running time
of an algorithm might
be the number of seconds
an algorithm takes to run,
or the number of minutes, or the number
of steps, or the number of comparisons.
It doesn't really matter
what your unit of measure
is, so long as you're consistent,
whatever feels the most
expensive or useful to express.
So let's consider an
example, like bubble sort.
If we want to consider the
efficiency of bubble sort
and make a claim as to
whether we should or shouldn't
use it because it's good or bad, let's
consider it a little formulaically.
So a little mathy, but
pretty generically.
So if there are n elements to
store, and a computer scientist,
recall from week 0, will generally just
call n the size of his or her problem.
n is the number of numbers
to sort, or humans to sort.
How many steps is it going to
take to sort that many people?
Well, when I had 8 music stands
and 8 people here before,
I considered a pair, a pair,
a pair, a pair, a pair.
And so if there were n humans up here.
Or let's speak concretely,
8 people initially.
How many pairs of people did
I consider in my first pass?
Say again.
AUDIENCE: 7.
SPEAKER 1: 7.
Because if you pair the
first and the second,
then the second and the third,
then the third and the fourth,
that gives you n minus
1 generally speaking.
You can't make 8 pairs out of 8 people.
You can make 7 pairs if you
keep walking through the list.
We're not doing permutations.
We're just doing adjacent neighbors.
OK.
So that might generically
be n minus 1 total steps
for the first pass of my algorithm.
But the upside of that was that
our eighth volunteer, number 8,
bubbled his way all
the way up to the end.
And so with bubble sort, I didn't
need to consider him again.
The biggest number
bubbled all the way up.
And so that was done.
So how many pairs did I need to consider
the second time through bubble sort?
6.
Or more generically, n minus 2.
And then dot, dot, dot, until there
was just one pair of two people left.
So I'll just do plus
dot, dot, dot plus 1.
Now, you might recall from
high school especially,
at least my math or
physics textbooks always
had a little cheat sheet for like what
these kinds of recurrences or formulas
add up to.
Does anyone recall what this
one adds up to in a math book?
Or just mathematically?
So if we actually do this out,
it's the same thing, beautifully,
as n times n minus 1 over 2.
And that, of course, if
you multiply things out,
is just the same thing as
n squared minus n, all over 2.
And that, of course, feels like it
would just multiply out to this.
So in other words, if asked, what
is the efficiency of bubble sort,
or what is the running time of
bubble sort to sort n elements,
you might sort of
impressively say, well, it's
n squared divided by 2 minus n over 2.
But what does that actually mean?
Well, it turns out that when talking
about the efficiency of algorithms,
you should really generally
care about the component that
has the biggest order of magnitude.
The number that contributes
the most to the total cost.
And by that I mean this, which is
obviously bigger, n squared or n?
Assuming positive values of n.
So n squared, right?
Especially as n gets bigger, n squared
is going to get even bigger than n.
So you know what?
I'm just going to kind of propose,
let's just ignore the minus n over 2
because as n gets really big,
the dominating factor really
is going to be n squared.
And for that matter, the divided
by 2-- like, who cares about the over 2?
n squared is kind of the essence
of this formula right now.
So what does that mean?
Well, let's just do a concrete
example to convince you of the fact
that we can cut a little bit
of mathematical corner here
and only care about the big
number for the following example.
This is proof by example, which
is not a proof of anything.
It's just really to paint a picture
of why we do this intuitively.
So suppose that there are a million
volunteers on stage or a million
numbers that we want to sort.
Much bigger than 8.
Let's plug this in.
So if the total cost of bubble sort
is n squared over 2 minus n over 2,
let's plug that in.
So that's a million squared divided
by 2 minus a million divided by 2.
So if we multiply that out, that
is 500 billion minus 500,000.
Big numbers.
But when you subtract that out,
I mean-- my god, 499,999,500,000.
I mean, to my own eyes, that's
pretty darn close to 500 billion
in the first place.
Why don't we just call it 500 billion?
So n squared, in other words.
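And just to sanity-check that arithmetic, here's a tiny C sketch, with a hypothetical function name, that evaluates n squared over 2 minus n over 2 for any n:

```c
#include <stdint.h>

// Sanity check of the arithmetic above: for a list of n elements,
// bubble sort's comparison count is n*n/2 - n/2, i.e., n(n-1)/2.
// For n = 1,000,000 that lands just shy of 500 billion.
int64_t bubble_sort_comparisons(int64_t n)
{
    return n * n / 2 - n / 2;
}
```

For n of 8, that's 7 + 6 + ... + 1 = 28 comparisons; for n of a million, it's 499,999,500,000, which is why we round it off to 500 billion and call it n squared.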
So again, this is not a
formal proof of anything.
But this is why, especially
as n gets bigger,
that lower-ordered term,
the minus n over 2,
just matters less and less
and less in absolute form.
And so we'll generally say that
something like bubble sort,
its running time is on
the order of n squared.
Big O is actually formal computer
science notation for on the order of.
And it has a formal
mathematical definition.
But for us, we'll consider it really
to be an upper bound on the running
time of this algorithm.
So an upper bound on how long this
algorithm might take given n steps.
So big O itself is a formal notation.
We'll see it in a number
of different contexts.
Depending on the algorithms
we talk about in class,
we might say that the
running time of an algorithm
is on the order of n
squared, which is pretty bad.
Pretty slow. 500 billion steps
sounds like a lot.
Well, maybe it's a little better
on the order of n times log n.
And more on that some
other time-- later.
But on the order of
n, it was pretty good.
If there's n elements, it only
takes you roughly n steps.
Maybe it's even better big O of log n.
Or really best would be big O of 1.
It just takes 1 step, or
10 steps, or 100 steps,
but a constant number of steps
is what big O of 1 represents.
And there's any number
of other formulas we
could use or equations to express
the running time of algorithms.
But many of the algorithms
we'll look at in CS50 certainly
fall into these categories.
So let's take an example.
What algorithm has the running
time of big O of n squared?
Well, bubble sort for one thing.
And it turns out selection sort.
And it turns out insertion sort.
All three of those algorithms
actually have the same running time,
even though they're fundamentally
different in execution.
But they do share a commonality.
All of them are comparison-based.
In each case, we actually
compared values-- either adjacent ones,
or we would look at a number there,
compare it to a number here--
and then decided whether
or not to do a swap.
So they were all comparison-based.
And all of them, frankly, involved
quite a bit of movement or walking,
either on my part or the
human volunteers' part.
For instance, in bubble
sort's case, I had
to walk through the whole list,
sorting elements pairwise like this.
But that wasn't enough.
I then had to do it again
and walk across the stage.
And then I had to do it again
and walk across the stage.
And this is what n squared feels like.
And even though the problem
was getting a little smaller
as the big elements bubbled up,
mathematically, even that algorithm,
it's kind of essentially like doing
n things, looking at n people,
n times, iterating n total times, which
is roughly n times n, or n squared.
It's just going to add up a lot.
So what about selection sort?
What did I do in selection sort's case?
Here, it's even more obvious, perhaps.
To find the smallest
element in a list, I
have to look logically at every element
before I can confidently conclude,
yes, I have found the smallest element.
Once you find it, you can
plop it at the beginning.
And that's easy.
But then you have to find
the second smallest element.
And even though we humans could
kind of eyeball things and realize,
oh, 2 is the next smallest.
You don't know that as the computer
unless you look at every element.
Ah, I found 2.
And then you could put
2 where it belongs.
So again, there's this
walking back and forth.
And if you're going to walk as many as
n steps and have to do this n times--
one for every element--
that, too, is n times n.
And insertion sort, same damn
problem, even though it's
sort of logically
different, the algorithm.
In insertion sort, beautifully, I
only move one step at a time
and I never turn back.
Because I just deal with, as you did
for the blue books, each problem one
at a time, never again backtracking.
But where was all the work happening?
In the shifting of humans.
Now, technically you guys helped out
by doing the shifting yourselves.
But if I'm the computer,
it actually would
have been like me tiptoeing back,
shifting, shifting, shifting,
inserting, shifting, shifting, shifting.
So again, in the worst
case, I might have
had to shift everyone all the way
over to make room for someone.
Because in the worst case,
what's the worst possible input
you might get to a sorting algorithm?
What input would create
the most work for you?
AUDIENCE: Backwards.
SPEAKER 1: Backwards.
If the list is perfectly
backwards, every darn element
is out of place, which means you
have to traipse back and forth
and back and forth to insert or to
select or to swap all of those elements
that many times.
So in all of these cases, they
are not numerically identical.
Some of those algorithms might take
fewer steps than others in totality,
but they're all on the order
of n squared for the reason
that they all involve as many
as n comparisons, n times over.
All right.
What about an algorithm
that's big O of n?
An algorithm that only takes n steps?
We've seen one.
Derek showed us one.
What's an algorithm that takes n steps?
Derek, hopefully.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: A random search?
Kind of, sort of.
It's a little harder to
put a number to because it
depends what you mean by "random."
Because if you have random
with repetition such
that you're checking foolishly
the same elements again and again,
it can actually add up
to more than n steps.
So not random, but-- yeah.
AUDIENCE: Binary.
SPEAKER 1: Binary is even better.
Hold that answer.
You're about to be correct.
Linear search, right?
If you just blindly
go from left to right
or right to left looking
for some element,
that is on the order of n steps.
Because you are just doing
foolishly unnecessary
work if you go beyond
the end of the list
or beyond the beginning of the list.
You only need n steps for linear search.
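Linear search, for reference, might look like this in C. It's a minimal sketch; the function name and the index-or-minus-1 return convention are my own choices:

```c
// A sketch of linear search: scan left to right, at most n steps.
// Returns the index of target if present, or -1 if absent.
int linear_search(const int values[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (values[i] == target)
        {
            return i;   // best case: found on the first look
        }
    }
    return -1;          // worst case: n steps, so big O of n
}
```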
But what if you want to instead find
an algorithm that's logarithmic?
What algorithm that we used
might be logarithmic in nature?
Binary.
So binary search, as Derek proposed,
once we knew the list was sorted
was an alternative to that.
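And binary search on an already-sorted array, halving the range on each step, might be sketched like so, with the same caveats as before:

```c
// A sketch of binary search on a SORTED array: halve the search
// range on each comparison, so on the order of log n steps total.
// Returns the index of target if present, or -1 if absent.
int binary_search(const int values[], int n, int target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        int middle = low + (high - low) / 2;
        if (values[middle] == target)
        {
            return middle;
        }
        else if (values[middle] < target)
        {
            low = middle + 1;   // search the right half
        }
        else
        {
            high = middle - 1;  // search the left half
        }
    }
    return -1;                  // not found
}
```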
And constant time.
What about this?
What's an algorithm, maybe
independent of today,
that allows you constant
time, running time?
It could be anything.
How about hello world?
The "say" block,
or printf, like that.
If you can just say a
whole phrase at once.
Or maybe just adding two numbers
together is like one step.
No matter how big the numbers
are, it still just takes one step
to add them all together.
It depends on some of the
lower-level implementation details.
And even printf or the "say"
block has multiple characters.
So we'll actually come
back to this in the future
as to what it really means to be a step.
But anything that takes a
constant number of steps.
Checking an "if" condition,
returning a value.
Statements, generally, might be
broadly speaking just constant time,
but we'll see contradictions
to that before long.
So let's take a look
at two other symbols.
This one being a capital omega symbol.
And this omega symbol is the
opposite, really, of big O. Big O
is an upper bound on the running time.
In the worst case, how
many steps might it
take to find an element
in a list of n elements?
n steps.
So big O of n.
Bubble sort, selection
sort, insertion sort?
They're on the order of n squared.
But what about the lower bound?
Especially when you get lucky
in the best case so to speak,
how few steps might you
solve problems with?
So for instance, if we
consider the same formulas,
but this time with capital
omega, what's an algorithm
that no matter what
takes n squared steps,
even if, for instance,
the list is sorted?
So suppose our 8 volunteers here
were already 1, 2, 3, 4, 5, 6, 7, 8.
Which of those algorithms would
still have required n squared steps?
So not linear search or binary
search because those are much lower.
But insertion, bubble
sort, selection sort.
Well, what about selection sort?
Even if the list is already sorted,
how do I find the smallest element?
I'm lucky and it's right here,
but I don't know that yet, right?
To know that, I have to confidently
walk the whole list and realize,
well, that was a waste of time.
1 is indeed the smallest element.
It's sorted.
How do I find the next smallest element?
Start at the left.
But you can ignore number 1.
So it's a minor savings.
OK, 2.
Feels pretty small, but got to check.
Dammit, that was the smallest.
And so selection sort, even in the best
case when the list is already sorted,
is going to take at least
on the order of n squared
steps because of that naivete of it.
Because of that underlying
design of just looking
for the smallest element
with no optimizations means
the algorithm's always going
to take us that many steps.
But it doesn't have to be that way.
What about bubble sort?
Suppose that all 8 humans are
sorted, 1 all the way through 8.
And here I am comparing the first two.
1 and 2?
Not out of order.
2 and 3, 3 and 4, 4 and 5,
5 and 6, 6 and 7, 7 and 8.
OK, everything looks good.
And in particular, what did I not
do while walking from left to right?
I didn't do any swaps.
So it would be foolish of me
algorithmically to do any of that
again because the answer
is not going to change.
And so done.
It took me n steps to sort
n elements with bubble sort
if the list is already sorted.
So it would be omega of n because in
the best case, it takes you n steps.
Could a sorting algorithm
be constant time
or logarithmic, log n or order of 1?
Or omega of 1 rather?
Can you sort n elements in
logarithmic time or constant time?
What's the intuition there?
These are kind of like now the
fundamental meaning of computing
and what you can actually do.
What does it mean for me to
ask, can you sort n elements
in log n time or constant time?
Well, that effectively
is like saying, can you
sort n elements without even
looking at some of those elements?
Because both log n and
1 are-- mathematically,
and if you don't recall, take it
on faith today-- less than n
in general.
So can you possibly,
intuitively, in the real world,
sort n elements in some random
order in fewer than n steps?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Good question.
Could you make the whole
list a big integer?
You could, that and other tricks.
But to do that, you're going to
need to touch or somehow manipulate
all of those values at least once
to kind of do that, I would think.
And so you've already spent n steps.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: OK.
So if you have a like guessing
algorithm-- yes, this is sorted.
That's constant time.
You're just not always correct.
So I should clarify, can you
sort correctly n elements
without necessarily looking at
all n-- the short answer is no.
There is like a fundamental
limit of computation.
And also, just human intuition here.
You can't possibly claim to me and
convince me, in a court of law even,
that these n elements are
sorted if you can't even
testify to having looked
at all n of those elements.
It's just not possible given
the computational model
we've been discussing where a computer
can only look at one element at a time.
So sorting is out of the question.
But an algorithm that
takes omega of one step?
For instance in the best case, when
might an algorithm need only one step?
How about something like linear
search or even binary search?
The running time of those algorithms
was big O of n and big O of log n
respectively.
Linear search, binary search.
What's the lower bound on those
two algorithms' running times?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yeah, it's omega of 1.
But why?
Derek wasn't so lucky.
But AJ, last year, touched
the number 50-- voila, 1 step.
And that was actually, while
sort of bad pedagogy for us,
was a perfect manifestation
of yes, in the best
case the lower bound on your running
time might indeed just be one step.
Or two steps, but some
constant number of steps.
So we have a range of these options.
Now, sometimes these ranges coincide.
If your big O notation, so to speak,
and your omega notation, so to speak,
are one and the same value, you
can claim that the algorithm
has a running time in capital Theta.
So theta just means when big
O and omega are the same,
you can instead use this formula and
sort of convey two thoughts at once.
But for now, the takeaway
is that as best we've
seen for something like sorting,
the best we've done so far, at least
in the worst case, is
big O of n squared.
And this doesn't feel very good,
especially since sorting and searching
are all around us.
Anyone with an iPhone or an Android.
I mean, there's so much data that
we're carrying around in our pockets
these days, let alone in the cloud
and on our laptops and desktops,
that it's just generally
good practice to keep sorted.
In your phone book, it
would be pretty tedious
if all of your friends and family were
in random order in the contacts list.
And you had to like scroll through
the whole darn list, big O of n,
to find someone.
No, instead we have a
search box, of course.
And most of us probably wouldn't
blindly scroll through the whole list.
But even that search box, if you start
typing in D-E-R to look up Derek,
or S-M-I-T-H to find Smith.
Well, how is iOS?
How is Android finding Mike
Smith or finding Derek?
Those pieces of software themselves
have algorithms that underneath the hood
are hopefully leveraging smarter
algorithms than we've even seen here.
Because especially as you have more
and more friends, more and more family
members, even big O of n squared
starts to feel kind of slow
as we can see here.
Let me go ahead and open up an example.
So let's take a look at a demonstration.
Here is a visualization that someone
on the internet kindly put together.
And it focuses on sorting numbers.
But rather than just show the numbers
textually as 1, 2, 3, 4, 5, 6, 7, 8,
it instead shows them as bars.
So the shorter the bar,
the smaller the number.
The taller the bar,
the bigger the number.
And let me go ahead and
increase the speed of this,
just so it doesn't get too tedious.
And let's take a quick
look at bubble sort.
Now, notice what's happening.
The two red bars adjacently
are the ones that are
being compared and potentially swapped.
And you'll notice that just like in
our human demo, what is happening over
there on the right-hand side?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Yeah.
So the biggest elements seem to be
bubbling up, if you will, to the top.
Even though there's still a lot of
work to be done to the left of those,
we indeed have the ability here to see
the list getting sorted progressively.
But you know, I'm already getting bored.
This algorithm doesn't feel very fast.
It's very pretty.
It's very correct, it would
seem, but it's a little slow.
Let's try another one to see
if it's any better than that.
That's the spoiler.
That's what it should look like.
Let's try our friend selection
sort-- though the array on screen
is already sorted from before.
Let me stop this.
Let me click on Randomize Array.
And now click Selection Sort.
And what's happening here?
This is like me walking across
the audience again and again
and again and again.
You don't see any of
this bubbling effect,
but you do see the smallest
element being selected again
and again and again.
So we're kind of cleaning up
that mess one step at a time.
It's still going to take some
steps, but you can perhaps
see it a little better
and a little faster here.
And spoiler alert, when we skip forward
to the end, that's what it looks like.
If we now randomize it one last
time and do insertion sort,
this one kind of feels
and looks different.
It's kind of mostly
fixing things, but then it
kind of goes in and fills in the gaps.
Because indeed, it's moving forward
one at a time, grabbing the element,
and then shifting the element as
needed to make room for whatever number
it's encountered.
So that's why sometimes
it only takes a few steps
because you find where it belongs.
That time it took a while to
find the space that it belongs.
But generally, the list is
getting more and more sorted.
And once we do get to
the very end, this one
we can actually feel
being a little faster.
Which isn't necessarily a testament
to it being fundamentally better.
All of these remain on the
order of n squared steps.
And now indeed, it's starting to
drag, kind of like our blue books.
And now we're actually complete.
But what if we were to
compare these things?
Bubble sort on the left,
selection sort in the middle,
and then a third algorithm that we've
not seen this time called merge sort.
So insertion sort is roughly
the same as all of these.
So this visualization lets us try three.
To start this, I have to click on them.
So there's a little
bit of a fudge factor
here where I have to click on all
three programs quickly in succession
to see them.
But let's see if we can't
do fundamentally better.
Merge sort is our new
friend on the right.
Oh my god.
And these are different visualizations,
but they're the same algorithms.
And it's not that they're
just poorly implemented.
It's arguably that
they're poorly designed.
They are all correct,
but merge sort clearly
seems to have some upsides
over selection sort.
And if you'll take it on faith for
now, just because we can only fit three
on the screen, insertion sort, too.
You see the bigger elements
bubbling up to the top.
You see us selecting.
Again, it turns out this person
implemented selection sort
by selecting the biggest element.
But that's fundamentally equivalent
to selecting the smallest element.
And that's because you can see
them all stacking here in this way.
But my god, I wonder,
can we start this again?
Let's try.
Let's see if we can lap it.
Come on.
Come on.
Come on.
Come on.
Oh, so close.
All right.
Had I not talked as much.
So merge sort.
Where is this magical algorithm?
What are we actually doing wasting
time on these other things?
Well, let's consider exactly that, but
we need sort of a secret ingredient.
And let me revisit Mike
Smith for just a moment.
So you'll recall, this
was the pseudocode
that we had from Mike Smith in week 0.
And we decided when Mike
is earlier in the book
or when Mike is later in the book,
to just go to an earlier step.
Thereby, inducing some kind of loop.
But suppose we changed that
language to be this instead.
Don't go to a specific line number.
Don't go to some earlier step.
But just more generally,
say, search for Mike
in the left half of the book or search
for Mike in the right half of the book.
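That revised pseudocode is really recursive binary search. Here's a minimal sketch in C; the function name and signature are illustrative, not from any lecture code.

```c
#include <stdbool.h>

// Recursive binary search over a sorted array: "search the left half
// or search the right half," rather than jumping to an earlier step.
bool search(const int a[], int lo, int hi, int target)
{
    if (lo > hi)                // empty range: target isn't here
        return false;
    int mid = lo + (hi - lo) / 2;
    if (a[mid] == target)
        return true;
    else if (target < a[mid])
        return search(a, lo, mid - 1, target);   // left half of the book
    else
        return search(a, mid + 1, hi, target);   // right half of the book
}
```

Notice there's no explicit loop at all; the "go to an earlier step" idea has become the function calling itself on a smaller range.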
This is borrowing an idea that
I proposed a moment ago when
we were searching for something else.
And what word did we
ascribe to this kind
of approach of this cyclical
use of your own definition?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Recursion.
So recursion is actually
this secret ingredient
that we can leverage, both
intuitively and in code,
to implement this new and improved
fourth algorithm, merge sort.
And indeed, we'll use this as
demonstrative of a class of algorithms
that doesn't behave in quite the
same way fundamentally as selection
sort, insertion sort, and bubble sort.
Indeed, you can do better
than big O of n squared.
You can do much better, as you felt
a moment ago with visualization.
And merge sort is one such
algorithm with which we can do that.
And amazingly, the pseudocode
is this simple.
When you're given n elements, if
n is less than 2, just return.
If you have fewer than 2 elements, the
list is of size 1, which is sorted,
or size 0, which is also
meaninglessly sorted.
So you're done.
So that's kind of a base
case, like a default scenario.
Let's just make sure that no
matter what, we eventually
quit, even if the list is super-small.
But then, the magic is hidden
in those last three lines.
Else, sort the left
half of the elements,
then sort the right
half of the elements,
which is completely punting so to speak.
Just kind of recursively
calling yourself.
But the key detail here is that then
you have to merge the sorted halves.
You're not just looking for Mike to the
left and looking for Mike to the right.
You're doing something on the
left, something on the right,
sorting each half, but then you have
to kind of weave things together.
And we'll see how
expensive that actually is.
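Those few lines of pseudocode map almost line for line onto C. Here's a sketch under the assumption that we're sorting an array of ints in place, using temporary space for the merge; the names are my own, not from distribution code.

```c
#include <string.h>

// Merge sort on an int array, mirroring the pseudocode:
// if n < 2, return; else sort each half, then merge the sorted halves.
void merge_sort(int a[], int n)
{
    if (n < 2)                       // base case: 0 or 1 elements
        return;

    int mid = n / 2;
    merge_sort(a, mid);              // sort the left half
    merge_sort(a + mid, n - mid);    // sort the right half

    // Merge the two sorted halves into temporary space (the extra RAM).
    int tmp[n];
    int i = 0, j = mid, k = 0;
    while (i < mid && j < n)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];

    memcpy(a, tmp, n * sizeof(int)); // copy the merged result back
}
```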
So let's take an example.
And I mocked this up with some
very simple animation so we
could walk through it step-by-step.
But consider the following.
So here is that same
list from before where
we had a whole bunch of volunteers.
But rather than call up humans,
we'll just do this visually.
Let me consider the fact that
this is really just an array.
So I'm just going to box
everything like this.
And this is actually key that we
have random access to these elements.
We can jump right to
the elements we want
and code ultimately, too, if we cared.
And now, let's focus
on the left-hand side.
So I've wrapped 4, 8,
6, 2 in a solid line
and left everything else
dash because I want to focus
our attention on sorting the left half.
So again, to rewind
this is the algorithm.
And the goal at hand is
to walk through visually
an example whereby we sort 8 elements.
And we do that by sorting the
left half, sorting the right half,
and then somehow merging
these things together.
So how do we do this?
Well, let's consider the left half.
All right.
How do I sort a list of 4 elements?
What does it mean to sort the left half?
Well, what algorithms do I have in
my toolkit for sorting elements?
Yeah, selection sort, insertion
sort, bubble sort, but-- merge sort.
And that's where this
gets weird quickly.
You want to use your same algorithm to
sort the left-hand side of this list.
And yet, we haven't even
finished defining this algorithm,
but that's going to be OK, as we'll see.
So let's sort the left half.
How do you sort the left half of a list?
Well, you sort a list of size 4.
What does that mean?
How do you sort a list of size 4?
Sort the left half of that.
So let's focus on 4 and 8.
How do you sort a list of size 2?
Merge sort it.
So how do you sort a list of size 2?
Sort the left half.
OK, beautifully.
Done.
Done.
Sort the right half.
Nice, done.
I mean, I'm doing real
work, or claiming to.
But now I have to do something more.
Now I have to merge the two halves.
So I'm taking a step back.
I have two lists now, one
of size 1, one of size 1.
I have to merge these
two things together.
Which comes first, 4 or 8?
4.
So let's create some extra space for it.
Let's put 4 here and let's put 8 here.
So here's a key detail
with merge sort, kind
of cheating, at least at the
moment, I'm using some extra space.
So I hope you won't mind.
To achieve this, I'm using
more memory, more RAM,
that I wasn't allowing myself
for selection sort, bubble
sort, or insertion sort.
OK.
So now it's going to be
easy to get lost quickly.
So what have I just done?
I have sorted a list of size 2
by sorting the left half, done.
Right half, done.
Merging it together.
So now I have a sorted list of size 2.
Rewind the story.
What step comes next?
Sort the right half.
That was of size 2.
So if you kind of mentally buffer
all of the to-dos that we're doing,
and then pluck them off as we go,
you'll see where the story goes.
Now I want to sort the right
half of the original list.
Sorry, the right half of the
left half of the original list.
How do I do that?
Sort the left half, done.
Sort the right half, done.
Merge them together.
What comes first, 6 or 2?
OK.
So let's give ourselves a
little more memory for that
so we have a place to put it.
2.
And then 6.
All right.
So it doesn't look like all
that much progress, frankly.
My left half of the list is now 4,
8, 2, 6, which is not really sorted.
But we'll get there.
What step comes next?
I've just sorted the right half of
the left half of my original list.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: You're missing a step.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Merge, that third step.
Once you've sorted the
left half of something,
then you've sorted the
right half of something.
Now you have to merge them or
weave them together somehow.
So the left half of the list in
the story at the moment is 4 and 8.
The right half is 2 and 6.
So here's where mentally
you can kind of bring out
your fingers to think through this sort
of conceptually, or in pseudocode.
Here's the left list.
Here's the right list,
or the right half.
So how do I merge these two together?
Well, it turns out--
and you can see it now.
Before it was idiotic that we
just were merging lists of size 1.
It's pretty easy.
Here now, we have something
substantive to talk about.
So I have one list here, one list here.
Which element from these two
lists obviously comes first?
So 2.
And so what I'm going to do is then
2 is going to have to come down here.
And then I can advance my
right hand so to speak.
And now, compare 4 and 6.
What's going to come next?
4.
Now I can advance my left hand.
I compare 8 and 6.
Which comes next?
6.
Now, my right hand, its job is done.
I have only one more to do.
And this.
So what's key in the merging
is that you essentially
have this left hand and
this right hand concept.
Each of them kind of steps forward
over time, but you never go back.
You never double back.
So it's this sort of linear relationship
with your left and right hand
as you're merging the lists together.
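That left-hand/right-hand intuition is just two indices that only ever move forward. A sketch of the merge step alone, assuming the two sorted halves live in separate arrays (again, the names here are my own):

```c
// Merge two already-sorted arrays into out[], one "hand" per array.
// Each index only ever advances; it never doubles back.
void merge(const int left[], int nl, const int right[], int nr, int out[])
{
    int i = 0, j = 0, k = 0;
    while (i < nl && j < nr)
    {
        if (left[i] <= right[j])
            out[k++] = left[i++];    // advance the left hand
        else
            out[k++] = right[j++];   // advance the right hand
    }
    while (i < nl)
        out[k++] = left[i++];        // right hand is done; copy the rest
    while (j < nr)
        out[k++] = right[j++];       // left hand is done; copy the rest
}
```

Because each hand advances at most nl + nr times in total, the whole merge is linear in the number of elements.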
So pictorially, let's see that.
We compare 4 and 2.
2 goes first.
Then, 4.
Then, 6.
Then, 8.
Where am I in the story?
I mean, we've been droning
on for some time now.
I have just successfully sorted what?
The left half of the original
list, which means the next step is?
AUDIENCE: Sort the right.
SPEAKER 1: Sort the right
half of the original list.
So I can box that in to
draw attention to it.
How do I sort a list of size 4?
Well, that's really
just 2 lists of size 2.
Let's sort the left half.
How do I sort a list of size 2?
Well, that's really just a
list of-- 2 lists of size 1.
Done.
1 is sorted.
Done.
7 is sorted.
Now I have to merge them together.
1 comes first, then 7.
OK, now sort the right half of the
right half of the original list.
OK.
Sort the left half.
Done.
Right half.
Done.
Merge them together
into some extra space.
3, and then 5.
Now where am I in the story?
Now I have sorted the left
half of the original right half
and the right half of
the original right half,
so now I have to merge them together.
So 1, 3, 5, 7.
Almost there.
Now we're sort of rewinding in time.
Now we're at the very first
chapter in the story--
sort left half, sort right half, merge.
I have that one final merge step.
So what comes first from the
left half and the right half?
1, 2-- even if you've
lost track of the story,
the answer's going to be
1, 2, 3, 4, 5, 6, 7, 8.
But 1, 2, 3.
But notice what's
happening here in effect.
Whatever code is implementing this,
it's as though my right and left
fingers are just moving from left to
right, plucking off the element that
is indeed then the smallest.
That was a heck of a lot of work.
And frankly, that story seemed to take
longer to tell than any of the others.
But what happened?
And in fact, I've left deliberately,
with this little animation
of sorts, on the screen the
remnants of where we came.
And it looks like I
really cheated, though.
I seemed to first say,
hey, I need some extra RAM.
And I took it.
And then I didn't
mention I need more RAM.
I didn't mention I needed even more RAM.
So this seems to have taken like four
times as much total memory than any
of our previous algorithms, but
that's not actually the case.
I left this here pictorially
so we could see the history.
Technically, you do need more RAM to
implement this algorithm, merge sort.
But we could have cut corners.
Instead of merging down
here, we could have just
merged back into our original chunk.
And so we could have just bounced
between two separate arrays,
but this is just a little more
clear in terms of remnants.
So how many steps did this take?
Well, how many times did I
divide my original list in half,
in half, in half effectively?
I had 8 elements here.
And that was really like, if
you divide it up differently,
two bigger halves and two bigger halves.
But if you divide it again,
that's like two really big halves,
and then you get one big list.
So it would seem that starting
here, I divided it 1, 2, 3 times.
I did some splitting.
Left half, right half,
left half, right half.
And any time you split something
in half, in half, in half,
in half, what's the running time of
anything involving like halving again
and again and again?
It's like binary search.
Log n.
Log n.
So any time you see in
computer science more
generally, certainly CS50 in
upcoming weeks, this division
and conquering again and
again and again where
you're dividing something in half,
odds are logarithmic running time
is somehow at play.
But this algorithm, merge sort, surely
cannot run in big O of log n time.
Because again, you can't sort n
elements in less than linear time
because you'd be guessing
that everything is sorted.
So log n is not the final
answer here, but there
is something logarithmic happening.
However, even though we did
a left half/right half thing,
a left half/right half thing,
a left half/right half thing,
sort of a total of three
times, from here to here to here
to here, at which point we were done.
On every row that's on
the screen if you will,
what did I do with my
left and right hand?
I sort of always did n steps.
You can really see it up here.
How did I merge this last thing?
I had to touch all 8 elements
by walking through the list.
How did I merge these two left halves?
Well, I had to do this.
And then I had to do this.
So if you kind of consider the remnants
of my finger touching the screen,
every time I did a divide and conquer,
I had to touch every element in order
to merge them together.
So I did log n things.
And every time I did that, I
incurred n steps of merging.
So log n things times n is going
to give me big O of n log n.
So this was among the options on our
sort of menu of possible running times,
at least that we'll focus on right now.
This is bigger than log n, of course,
because you're multiplying log n by n.
But it's smaller than n squared.
Because if log n is less than n, then
surely n times log n is less than n times n.
And so the running
time here of merge sort
is indeed going to be big O of n log n.
Any questions on this here?
If I may, let me point
out one other approach
for seeing this same thing as follows.
You can actually glean
this kind of detail
more formally, especially if you're the
mathy type, from the pseudocode itself.
We didn't have to walk through this
whole verbal exercise or visualization
thereof.
What if we just go back to basics and
just analyze our own pseudocode,
or our own C code
eventually, line by line?
So here's the original algorithm.
How many steps does this take to
just check if n is less than 2?
Return.
We're done.
Big O of what?
There's just one step.
Or maybe, it's two, like
the if and then the return.
But it's constant.
It's one or two.
We can debate that, but it's constant.
It has nothing to do with the size of n.
You're just making a finite
number of decisions there, 1.
So you know what?
The running time, which I'm going
to formulaically say is t of n.
So the running time when
your input is of size n,
is just going to be like on the
order of constant time, big O of 1,
whenever n is less than 2.
This is not a very bold claim.
It's sort of like I'm plucking off
the easiest part of the question,
but it's at least correct.
If you have a small
list, it's constant time.
The harder question is when we
analyze the rest of the algorithm.
How many steps does it take to
sort the left half of elements?
How many steps does it take to
sort the right half of elements?
And how many steps does
it take to sort-- rather,
to merge the sorted halves?
So we have three questions to answer.
And let me propose this.
You can kind of punt.
And this is the beauty of recursion.
You can sort of answer the
question with the question.
The running time to sort n
elements is really the running time
to sort n over 2 elements, the left
half, plus the running time to sort n
over 2 elements, the right
half, plus the merging step,
which I claimed per my sort
of left/right finger intuition
is big O of n steps linear.
And now this holds if n is
greater than or equal to 2.
Now, unfortunately, this is like a
recurrence if you will, mathematically.
This really is a cyclical definition.
We really need to be able to
generalize this and explain
it a little more formulaically.
But indeed, take a guess.
If you have your old physics
textbook or your math textbook,
what does this recurrence actually
equal if you multiply it all out
and prove it mathematically?
What's the running
time total of something
where you do some function of n over
2 plus some function of n over 2
plus some number of linear steps where
each of those inner running times
is recursively defined?
And just to be clear, the
reason this is like non-obvious
and should be kind of over
most heads, at least if you're
less familiar with this world, is that
I'm saying that the running time, t, is
a function of the running time, t.
Like again, that's defining
a word with the word itself.
But it turns out there are ways to
prove this inductively, mathematically.
And indeed, if you actually multiply
this all out in your textbook,
you will get something on
the order of n times log n.
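For the mathy type, unrolling that recurrence sketches why n log n falls out. Dropping constants, and assuming n is a power of 2:

```latex
T(n) = 2T\!\left(\tfrac{n}{2}\right) + n
     = 4T\!\left(\tfrac{n}{4}\right) + 2n
     = 8T\!\left(\tfrac{n}{8}\right) + 3n
     = \cdots
     = 2^k\,T\!\left(\tfrac{n}{2^k}\right) + kn
```

At $k = \log_2 n$ the base case kicks in, leaving $T(n) = n\,T(1) + n\log_2 n$, which is $O(n \log n)$: log n levels of halving, each incurring n steps of merging.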
So where has that left us?
We've introduced a whole number of
algorithms here, the last of which,
merge sort, gives us this ability
to actually solve problems
much faster than any of the others.
But in each case, we are able to
apply some general formulaic analysis,
asymptotic notation so to speak--
big O, big Omega, and big Theta, that
describes in general terms how much
time these kinds of things take.
So let's now translate one of today's
key ingredients to actual C code
so that we can see how we can
leverage this programmatically.
So here in CS50 IDE, I've got
an example, sigma0 and sigma1.
Both of which at the end of
the day do the same thing,
but they do it differently.
And in sigma 0 here, we
have the following code.
A sigma function that takes as input
an integer m as its sole argument
and it has a return value of type int.
So it's going to return a number.
Main takes no command line arguments.
So no argv, no argc this time.
But inside of main is code that's going
to pester the user again and again
until he or she gives us a positive
integer as per this condition
down here.
And then, notice down here I'm
going to call a function sigma,
passing in whatever
the user's input was.
10, for instance.
Or 11.
Or some other value.
Storing the answer in answer,
and then just printing it out.
And what's nice about
the fact that frankly you
can't see anything else on the screen
is that for all intents and purposes,
sigma, the function at the
moment, is a black box.
It is a function whose purpose in life
is to add all of the numbers from 1 up
until the user's input and return it.
And we don't know yet or need to
care how it's actually implemented.
If we do take a look inside that
black box down lower in the file,
you'll see this code here.
Sigma takes an input, m, as an int.
Returns an int.
And what does it do inside?
Well, I've got what I'll
call a counter-variable.
Called it sum.
Initialized it to 0.
And then using a for loop,
iterating from 1 up to m,
because I want to sum all the
numbers from 1 up through the value
the human has typed in.
I just want to keep adding
to the sum that value.
So it's not just sum plus plus.
If I did sum plus plus as we keep
seeing syntactically in examples,
that would just add 1.
I want to add in whatever i is.
Whether it's 1, or 2, or 3,
as I'm counting all the way
up through the user's input.
And then I return sum.
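The iterative version described here can be sketched as follows; this approximates sigma0.c and may differ in details from the actual distribution code.

```c
// Sum 1 + 2 + ... + m iteratively, as in sigma0.c.
int sigma(int m)
{
    int sum = 0;
    for (int i = 1; i <= m; i++)
        sum += i;    // add i itself (sum++ would only add 1 each time)
    return sum;
}
```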
But the neat thing
about sigma is that it
affords an opportunity for recursion.
This isn't necessarily a better
opportunity or more correct,
but we could view the
world a little differently.
Sigma1.c is identical code in main,
but the implementation of sigma
is fundamentally different.
And here's where it's kind of
mind-blowing, at least at first glance.
You can implement a
function, like sigma,
in such a way that the
function calls itself.
Much like the running time for merge
sort is defined in terms of itself.
Much like searching
or sorting a list, it
can be recursively defined as
searching or sorting a smaller list,
but with the same steps.
We implement this code in C as follows.
On this line here, if m is
less than or equal to 0,
it's really just a sanity check.
If the human was messing
with me and somehow
was able to inject a negative value
or a 0 value into this function,
I don't want to get stuck in
some random, like infinite loop.
I want to catch that.
So I'm just going to say less than or
equal to 0, just to be super-defensive
when programming, and return 0.
It's not correct to say
that, but I just want--
and I'll document this in
my comments, for instance.
Return 0 if the user does not cooperate.
I need that base case so
that stuff stops eventually.
And then, here's kind
of the amazing thing.
In my "else" condition,
is one line of code
that implements the
entirety of this function.
The result of asking me the
question, what is sigma of m,
is really the same thing as
saying m plus sigma of m minus 1.
In other words, if the goal is to count
up all of the numbers between 1 and m,
logically that's the same as counting
up all the numbers from m to 1.
And so if I view it that
way, if you give me m,
I am going to sort of on my cheat
sheet write down the number m,
and then say plus whatever the
summation is from m minus 1
all the way down to 1.
And you know what?
I have a function, a piece of code, that
gives me exactly that answer if I just
pass-in that slightly smaller value.
So on first glance, it might
look like sigma calls sigma.
That can't possibly be a good thing.
But it doesn't do it forever.
Sigma might call sigma might
call sigma might call sigma.
But at what point in the story
does sigma stop calling sigma?
AUDIENCE: [INAUDIBLE].
SPEAKER 1: When it's counted.
When m, the input, is what?
Negative.
Or we shouldn't hit negative ever.
We should first hit?
AUDIENCE: 0.
SPEAKER 1: 0.
So once m minus 1 gets so small, 1 minus
1, that it equals 0, this line of code
is going to kick in.
And it's going to say return 0.
Thereby, short circuiting
this seemingly infinite loop,
which does, in fact, terminate
eventually once the input
gets small enough.
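Put together, the recursive version looks something like this; it approximates sigma1.c and may differ in details from the actual distribution code.

```c
// Sum 1 + 2 + ... + m recursively, as in sigma1.c.
int sigma(int m)
{
    // Base case: be defensive if the input is 0 or negative,
    // so the recursion is guaranteed to stop eventually.
    if (m <= 0)
        return 0;

    // Recursive case: sigma(m) is just m plus sigma(m - 1).
    return m + sigma(m - 1);
}
```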
Now, probably you shouldn't bother
implementing code like this.
If anything, we'll talk
about this in future weeks.
Calling function after function after
function after function actually
can take up a lot of memory.
It can actually cause your
computer to crash in some cases.
And we'll talk about when and why.
So this isn't necessarily
the best technical decision,
but it's illustrative of the capability
we have in programming to map intuition
from our binary search or our
merge sort into actual code.
And indeed, there's going to
be data structures that we soon
see far fancier than arrays.
We're going to see things
like linked lists, and trees,
and tries, and hash tables.
And other data structures
still that are going
to allow us opportunities to implement
algorithms on top of those data
structures so to speak,
to answer questions
using an ingredient like this one here.
But of course, we're left then
fortunately with a better algorithm
than something like bubble sort.
And indeed, this is fairly
well-known now it would seem.
And I thought we'd conclude with
a look at this interview that
was conducted with a certain senator
some years ago who was running
for the presidency and was asked
by Google's own Eric Schmidt,
the CEO at the time, a set
of interview questions.
One of which we thought was
appropriate note to end on here.
[AUDIO PLAYBACK]
[APPLAUSE]
-Now, Senator.
You're here at Google.
And I like to think of the
presidency as a job interview.
Now, it's hard to get
a job as president.
You're going [INAUDIBLE] now.
It's also hard to get a job at Google.
We have questions.
And we ask our candidates questions.
And this one is from Larry Schwimmer.
[LAUGHTER]
You guys think I'm kidding?
It's right here.
What is the most efficient way to
sort a million 32-bit integers?
[LAUGHTER]
-Well--
-I'm sorry.
Maybe--
-No, no, no, no.
I think the bubble sort
would be the wrong way to go.
-Come on.
Who told him this?
OK.
I didn't see computer
science in your background.
-We've got our spies in there.
[END PLAYBACK]
SPEAKER 1: All right,
so that's it for week 3.
We will see you next time.
SPEAKER 2: So you want
to know the full story?
SPEAKER 3: As far back
as you can remember.
SPEAKER 2: All right.
Well, let me tell you.
The first time I met David Malan,
I knew he was nothing but trouble.
SPEAKER 4: There's a David
J. Malan here for you.
SPEAKER 5: Who?
SPEAKER 4: He says he's in
your Computer Science 50 class.
Wants to talk about his grade.
SPEAKER 5: Are you kidding me?
I mean, who does he
think he is, Bill Gates?
All right.
Send him in.
