[VIDEO PLAYBACK]
- Represented ultimately in C.
[END PLAYBACK]
DOUG LLOYD: So what motivated you to
make this the ultimate challenge in C?
This is right around the point
where we say goodbye to C and we
transition to another language.
What motivated this as the endpoint?
DAVID MALAN: You know
that's a good question.
It's been so many years
that I'm not sure I
could put my finger on the one reason.
But what I do really like about
this portion of the class,
and this problem set wherein students
have to implement the fastest spell
checker possible.
By implementing a dictionary
according to an API that we give them.
A set of four functions they
need to fill in the blanks for.
Is that it's a really open ended
opportunity to design something.
And we do give them a menu of sorts.
We encourage them to do a hash table, a
trie, barring that even a linked list,
or an array.
Albeit less compelling than the more
sophisticated and efficient of the data
structures.
And it's nice.
Because students reach
this decision point
where they have to decide non-trivially,
are you going to go down this road
and commit to it.
Or go down this road and commit to it.
And I would say many students find
the hash table a little easier.
We do give them a decent
amount of linked lists
sample code that they can weave into it.
But a trie is also a nice
challenge for students.
And it's not all that hard.
It's just a little more difficult
to think about, perhaps.
Because in each of your
node you have an array.
So it's an amalgam of even more things.
But it's great fun too.
Because we have the
so-called big board wherein
students are invited to challenge
each other, literally with a prescript
called challenge.
And benchmark their code in terms
of how much RAM they're using
and how many CPU cycles in
order to spell check big files.
DOUG LLOYD: Yeah.
For my own students I
usually set a beatable time.
And I have special prizes
for anybody who can get ahead
of my time on the big board.
We're really trying to get
them to think about design.
DAVID MALAN: I too post a code
that I know to be very slow.
Knowing that I'll get beat by the
best of our brightest students.
DOUG LLOYD: I'm not sandbagging.
DAVID MALAN: I mean, I iterate.
DOUG LLOYD: I'm not
CS50 on a [INAUDIBLE]
and I can't get to the
top of the big board.
DAVID MALAN: Well what's fun too about
this assignment in this gamification,
if you will.
And it's opt in.
Students don't have to post
their code or the result publicly
if they don't want.
But we do encourage them that even if
they're the bottom of the big board.
Hey, you made it.
That means your code's correct.
And there's nowhere to
go but up from there.
DOUG LLOYD: Right.
DAVID MALAN: But we've had
fun anecdotes of students.
One student will be just
above his or her roommate.
They go to dinner.
They come back.
And, oh my God.
The roommate now bested me.
And it motivates students
to put even more time
and more thought into the
design of their program.
So that they can inch
back above whoever it is
they're playfully competing against.
DOUG LLOYD: Yes.
It's probably the problem set where
we see the most multiple submissions.
DAVID MALAN: Yes.
That's true.
DOUG LLOYD: I have always
been the kind of person who
found hash tables to be more intuitive.
But I know many of my students actually
think that tries are more intuitive.
I think I've always gotten
tripped up by tries.
Yeah, I think it's a lot of pointers.
And so I always appreciate
it when a student
is willing to throw him or herself
into the challenge of having
to manage so many pointers.
DAVID MALAN: We do tend to find
that the tries under perform.
In fact, my implementation is a trie.
And there's actually some
non-trivial memory effect.
You just waste so much
memory with a trie.
You then suffer from cache
misses underneath the hood,
though we don't discuss this to
this level of detail in CS50.
So in reality, hash
table with a good hash
function, which students are welcome
to go find online if they'd like.
To just use that as a
building block or a black box.
Seems to historically best most other
implementations in this challenge.
DOUG LLOYD: And that's
actually something
that we can show them
on the big board too.
Because in addition to
showing them their runtime.
We actually show them how much
RAM their data structure consumes.
And you can usually tell by looking
for who implemented a hash table
and who implemented a trie.
DAVID MALAN: Indeed.
It's a lot of fun.
And what I'm so proud of, honestly.
Is when students, as you've long
said, who just six, seven weeks prior
had never programmed before.
Or they were just
understanding how binary works.
They're implementing a hash table,
a trie, pretty sophisticated data
structures.
And this is atypical.
I mean we always say that CS50
is an amalgam of sorts of what
most universities call CS1 and CS2.
And this really is the CS2
side of things, just seven,
eight weeks into the semester.
Students are implementing
data structures in CS50
that they might not otherwise
encounter for another term or two,
more traditionally.
DOUG LLOYD: Yeah.
We certainly message,
and I certainly hope,
that students who complete problem set
five, the misspellings problem set.
That really feel a
sense of accomplishment.
It is a really big task.
And it's always very gratifying to
see students light up in office hours
when their spell checker is finally
matching the staff solution.
DAVID MALAN: I think
there's some relief.
And seriously, when you get to
Python just a week or two later.
It's just magical how
you can then whittle down
this entire concept of a hash table or
a trie to just a single line of code.
Instantiating a dictionary
in Python, a similar data
structure, an associated array in other
languages, or an object in JavaScript.
You just get these things for free.
But what's key is that students
understand them going into them.
They're not just taking for granted
that these constructs exist.
DOUG LLOYD: Because it is
such an appreciation for what
is under the hood.
DAVID MALAN: And for
those wondering what
those milk cartons are
doing onstage, also
discovered in the nearby dining hall.
But it's a nice way I
think of representing--
DOUG LLOYD: That's like
our major prop warehouse.
DAVID MALAN: --hash buckets.
You'll see that we
cut some corners here.
We don't try to carry off 26 of those.
We just do ABC. . .
Z. And then I'm using,
I think, blue books.
For instance, that you might have for
examination periods where students
might have their names on them.
And you can use most anything for this.
But it's a way of very
physically and demonstrably
showing how you could hash on alphabetic
values and toss them into the program.
DOUG LLOYD: And we actually do
that ourselves, behind the scenes.
Up until last year when we were
doing all of our quizzes on paper.
When we finished grading them, we would
hash them by TFs name to more quickly--
DAVID MALAN: You did.
Lesson learned.
And one of my favorite photos actually,
and things that happened in reality.
Was that one of our first
CS50 hack-a-thon's years ago,
did one of the TFs just completely on
their own decide to write a poster.
That just says hash yourselves.
And there were three lines for checking
into the event, A through L and M
through whatever, and so forth.
It was a playful, geeky
thing to message with.
