BENEDICT BROWN: This is CS50.
I'm David Malan.
And you can tell because
I'm wearing all black.
[LAUGHTER]
So, when I'm not being David
Malan, I'm Benedict Brown.
I'm a lecturer here in the
computer science department.
And it's really my pleasure to
welcome you all to CS50 at Yale.
[SCATTERED CHEERING]
[LAUGHTER]
And we've been doing this--
this is now our fourth year.
And it was great the first year.
And it's been getting better
and better every year since.
So I'm really thrilled to be here.
This is my second year with it.
And we're looking forward
to a fantastic year.
You will see me throughout the semester,
sometimes in the lab for office hours,
sometimes when you come see me
in my office for office hours,
which I encourage you all to do.
I also work a lot behind the
scenes with the organization
of the class with the staff
and supporting the staff,
making sure that they are in a
position to help you and work with you
in sections and in office hours.
And the other person who
is really working overtime
all semester long on
the course here at Yale
is Natalie Melo, who I would like
to bring up to introduce herself.
And then she'll also introduce you to
many of our core staff that are here.
And Harvard's David Malan will also
be making an appearance, I believe.
NATALIE MELO: [CHUCKLES]
Hi, everyone.
[APPLAUSE]
Oh!
[LAUGHS]
[APPLAUSE CONTINUES]
Hi, everyone!
Welcome!
Happy Thursday!
Woo-hoo1 [LAUGHS]
[SCATTERED CHEERING]
I'm Natalie Melo.
I'm the course manager for CS50 at Yale.
You'll see me a lot at office hours.
I might have my own office hours
by appointment this semester.
But you'll see me around quite a bit.
So when you see me, I hope you smile.
[LAUGHS] But I'd like to introduce
my amazing staff members.
So come on up.
[CHUCKLES] Woo-hoo!
[APPLAUSE]
So we have a full team out
here ready to help you out,
help you learn computer science--
people from all different
backgrounds, some not even
majoring in computer science.
They only took CS50, and they
still wanted to help out.
So feel free if you see
any of them around campus--
say hi.
Wave to them.
If they see you, and
they remember your face,
they'll obviously wave to you, as well.
So, yeah-- let's give a round
of applause to our staff.
[APPLAUSE AND CHEERING]
And now I'd like to hand
it off to Professor Malan.
DAVID MALAN: [LAUGHS] OK.
NATALIE MELO: OK.
DAVID MALAN: [LAUGHING] OK.
[APPLAUSE]
(WHISPERING) OK.
[APPLAUSE CONTINUES]
Thanks, guys.
OK. (WHISPERING) We're good.
[LAUGHTER]
(LAUGHING) Thanks.
(WHISPERING) Thanks.
All right.
So this is, indeed, CS50--
An Introduction to the
Intellectual Enterprises
of Computer Science and
the Art of Programming.
And what's perhaps striking is that so
many of the folks who just stepped up
here alongside Natalie and Benedict
were exactly where you are seated here
just last year.
And so, let's get one
detail out of the way.
68% of you have never--
if last year is any indication--
studied computer science before.
So if you're in this
classroom right now,
and you're feeling, [SUCKING IN AIR]
I'm not so much a computer person,
or why am I shopping this class?
Odds are, the person to the left of
you and maybe even to the right of you
is feeling that exact same thing.
And if I can say, back in my day,
when I first got to college, even
I shied away from computer science,
even though it's kind of in vogue
these days.
Certainly, I daresay, it's
still a field to beware.
An courses like this and others
are a bit daunting, I think,
if you don't think of
yourself as a computer person.
And you certainly use computers and
mobile phones and all that everyday,
but you don't really
know what's going on.
And god forbid, something
goes wrong, it's
not really in your wheelhouse to fix.
And so ultimately in this
class, it's about building
comfort and confidence, skills, and
a foundation in computer science.
And we'll see what
that looks like today.
And let me start with this message that
you'll see in the course's syllabus
as well, that what ultimately matters
in this course truly is just this,
"is not so much where you end
up relative to your classmates,
but where you, in week 11, and up
relative to yourself in week 0."
As we'll soon see, computer
scientists and programmers
tend to start counting from 0.
And it's about a 12-week semester.
And so by week 11--
the 12th-- will you hopefully feel quite
the delta versus where you feel today.
So what is, then, computer science?
And what are we going
to get out of this?
So I dare say, we could distill it--
maybe oversimplify it-- as just this--
inputs and outputs, right.
It is problem solving, but using
computation, using computers,
using hardware, using thought to
actually solve those problems.
And to solve problems, we take inputs.
Now what might the
inputs to a problem be?
I might want to take
attendance and start counting.
So the people in this
room are the inputs.
And the output is the
total number of people.
Or you have some
ingredients in the kitchen.
And the goal is to produce dinner.
And so your inputs are those
ingredients, and the output, of course,
is dinner.
So that's really what
problem solving is.
And inside of this so-called black box
is where computer science really fits.
And giving you the tools
and the ideas with which you
can take inputs and produce outputs
that are of interest to you.
But to do that, we have to represent
inputs and outputs in some way.
Right, I'm using
English words right now.
And if you're taking
math classes, you're
probably using math symbols and numbers.
If you're taking chemistry classes,
you might have yet other symbols
to play with, as well.
But so we have to represent
inputs in this class
and in this world of
computer science in some way.
So how might I represent information?
If I'm doing something like attendance--
1, 2, 3.
It's not uncommon to do something
old school, like, 1, 2, 3, 4.
And then we have little tricks just
to make it a little more compact--
5, 6, 7, 8, 9, 10.
So we can just use hash marks.
And I could do that on my hands,
too-- not for the whole class.
But how high can I count
with just one hand?
You say five, right?
1, 2, 3, 4, 5.
Not really a trick question.
But I daresay, I can actually count
higher than that on this hand, right.
I'm pretty naively just
treating each of my fingers
at the moment like hash
marks on the board.
They're just straight lines
representing people in this room.
But I'm not really permuting
them or combining my fingers
in any interesting ways.
They're just down or up, but from
left to right, or from right to left,
in your case.
So how might I be more clever?
Well, what if this is just 0?
And what if this is 1?
And what if this is 2?
So just one finger up
still-- so not one, but two.
What if this is 3?
What if this-- offensively-- is 4?
[LAUGHTER]
(CHUCKLING) What if this is 5?
5, 5.
And 6, and 7.
In other words, if I
actually take into account
the ordering in which
I have my fingers up,
I can actually count up notably higher.
Off the top of one's head,
anyone want to ballpark
how high I can count on one hand?
30-- yeah, 31, actually.
And we'll come back to
why that is before long.
But 31-- that's pretty painful
to even imagine or physically do.
But if you do kind of permute
your fingers up and down
in different patterns, we can do even
better than just 5 people on one hand.
So it turns out that computers
aren't all that dissimilar to how
they represent information,
because we, in our human world,
tend to use numbers like 1 and 2 and 3,
and we immediately see that and think,
probably, 123.
But why is that?
All you see on the screen, technically,
are just like three symbols, right?
They might be typed.
They might be drawn or painted.
But these are three symbols, or
glyphs, if you will, that just have
meaning that we humans ascribe to them.
And we know that that is 123
because it's been ingrained in us.
But we know this because we've had an
alphabet for quite some time that's
actually pretty expressive, right.
I grew up knowing 0s and 1s
and 2, 3, 4, 5, 6, 7, 8, 9--
technically called the decimal
system, "dec" meaning 10--
10 because there's 10
digits in this alphabet,
so to speak, for representing numbers.
But it turns out computers
are more simple than this.
And they actually don't have access,
necessarily, to all of these numbers.
They actually-- anyone know what
alphabet they actually do use?
Yeah, 0s and 1s, otherwise
known as binary--
"bi" implying 2, because it has
just 2 digits at its disposal--
0 and 1.
But how in the world using just
0 and 1 can you count to 2,
let alone 3 or 4 or 31
or much higher than that?
Well, it turns out, it's
actually pretty familiar,
if you think back to grade school,
how you might have first learned math.
Odds are, if you're like me,
that you saw number like 123--
and it is 123 and not
just 1-2-3, because this
was the so-called 1's place, this
was the so-called 10's place,
this was the so-called 100's place.
And then, none of us do this
anymore, but this would be 100 times
1, plus 10 times 2, plus 1 times 3.
And that, of course, is
100, plus 20, plus 3.
And then we get, mathematically, 123.
And even though it's obviously
the same pattern of symbols,
it now has some mathematical
meaning because we
ascribe meaning to these columns.
Well, it turns out computers work
in fundamentally the same way.
But they just don't have
2s and 3s and 4s and 9s.
They just have 0s and 1s.
So they have to be a
little more clever how
they use those 0s and 1s--
those binary digits, or "bits."
"Binary digits" is where
the word "bits" comes from.
And so they just change
what these columns mean.
And so instead of having 1, 10, and 100,
which it turns out are powers of 10--
it's technically 10 to
the 0, which gives you 1.
10 to the 1 gives you 10.
10 to the 2 gives you 100, and then
1,000, and 10,000, and so forth.
Computers actually use powers
of not 10, but you might guess--
2s, because they only have
2 digits in their alphabet.
So this is the 1's place, and
this would be the 2's place,
and this would be the
4's place, for instance.
And now I only have 0s
and 1s at my disposal.
So how do I represent the number
we humans know is the number 0?
Well, if I need to put a number in these
columns, I'm just going to put a 0.
Or without fundamentally changing the
meaning, I can actually do three 0s.
But, just as in the human world,
we don't need, necessarily,
all of those left-most 0s.
We just need the one.
How do I represent 1 in binary using
just 0s and 1s, if I have three of them
at my disposal?
0, 0, 1.
Why?
Well, I need zero 4s, zero 2s, but I do
need one 1 to represent the number 1.
And here now, can you see why my
fingers went up in the order they did?
How do I represent the
number 2 in binary?
With what pattern?
Yeah, 0, 1, 0.
And again, if that's not
obvious, that's fine.
You can just literally do the math.
Well, I don't need a 4 to
represent the number 2.
I do need a 2.
But and once I have a 2, I
don't need any more numbers.
So I can have zero 1s.
So here's where we began.
This with 0, all fingers down.
This was 1.
This was 2.
And 3 is going to be
what on the board now?
0, 1, 1.
I just need to go from 2 to 3.
And then when I almost
made a gesture, that
was because my middle finger could
have been up, and my other two fingers
were down.
And if we continue this story, how
high can you count with 3 bits,
or, equivalently, 3 fingers?
What's the biggest decimal
number you can represent?
7, because if you just put a
1 in each of these columns,
that gives me a 4 plus a 2 plus 1,
which gives me the number we humans know
has 7.
So that's it.
Like, all this talk for
all these years that you
might have gleaned that computers
only speak binary or 0s and 1s--
it's still fundamentally the same thing
that we all learned in grade school.
And now, just take for granted,
it's just computers only
have 0s and 1s at their disposal,
but they treat them a little bit
differently.
And what's nice is that
we're doing this in chalk
and we're doing this
in fingers, but there's
a very nice mapping between 0s
and 1s to the physical world.
All of us who have phones that
need to be charged at night,
or all of us that have laptops that are
plugged in now or at night or in class,
are plugging them into the wall.
And even if you're not
an electrical engineer,
you probably generally know that
there's like electrons flowing in
and out of things to charge things up.
So there's some kind of input--
a physical or electronic input--
that's either there and
plugged in, or it's not.
And we can actually
simulate this with lights.
The lights are all on right now.
So I daresay, this room,
SSS114 represents a 1.
It's just on.
But if I go ahead and hit a
button, which in the real world
is sometimes literally labeled "0,"
if you've ever noticed, now it's off.
This room is representing a 0.
So this is kind of a
dramatic way of representing
0s and 1s with all of these lights,
but we can do it on a simpler form.
Like all of us have phones.
And those phones have probably built
into software little flashlights
these days.
So if I just tap this button, my
phone here is now representing,
arguably, the number 1.
And if I turn it off,
it's representing 0.
So using one phone or one
flash or really one switch
that I'm throwing,
digitally, by touching it,
I can turn a light bulb on and
off or represent a 1 or a 0.
And can I borrow someone's phone?
Yeah, OK.
That helps.
You don't have to
unlock it for me though.
[CHUCKLES] So if I go
ahead now, and now I
have two bits, two
phones, or two switches
at my disposal, what am I representing?
I think 0, 0.
The light bulbs aren't off.
If I go ahead and turn on this
one, what am I now representing?
1.
And now?
3, good.
And now, if I turn this one off, too.
So thank you so much for volunteering.
And so if, now, I just had more
switches, and in turn more light bulbs,
I could keep counting higher and higher.
So what's actually inside of a computer?
It's just many, many-- billions, even--
of tiny little switches these days.
And those switches happen
to be called transistors.
So if you've ever heard
that word, it's just
like a light switch, or
the switch on a lot bulb,
that either is turning something
on or turning something off.
And as soon as you have that ability
to turn something on, like a 1,
or turn something off, like a 0, you
can clearly count as high as you want,
so long as you have more
and more of those switches.
I have just three, or I have just five,
but we can certainly throw hardware
at it by adding more and more.
So let me pause there for just a second
because that was a bit of a mouthful.
Are there any questions on what
we've now defined as binary?
And recall that the reason
we got into the weeds
here was we need a way to
represent information inputs,
so we just need some kind
of pattern or symbology.
All right.
So that's all fine and good
that we can represent numbers.
But what would you also
like that your computer do?
Back in the day, a computer was, by
definition, really just a calculator
doing mathematical things.
But of course, these days we use
computers and phones for so much more.
What might you also want to represent?
Well, what about just English messages,
or any language for that matter?
In English, I might want
to represent the letter A.
Well, if we had been focusing
on math and numbers here
and patterns, how do we make
this leap conceptually now
to represent letters, so you can
send text messages and emails
and write essays and the like?
How might we solve this problem?
Because again, the only physical input
is, like, my power cord into the wall.
Yeah?
AUDIENCE: A is the first
letter of the alphabet,
so you could represent it as 1.
DAVID MALAN: Yeah.
So that's pretty good intuition.
A is the first letter
of the English alphabet,
so why don't we just start
numbering those letters?
So if A is 1, sure.
And capital A is 1.
Capital B, presumably, would be 2.
And C would be 3, and so forth.
And that's actually spot on.
That's exactly what computers do.
They actually give themselves
a little bit of breathing room,
so it's not actually starting at 1.
Actually-- weirdly-- starts at 65.
But then, the answer
is exactly the same.
B is 66.
C is 67, and so forth.
And so just some years ago, decades
ago, a bunch of humans in a room
essentially needed to decide
that, hey, everyone, can
we all agree that capital A shall
be represented in a computer as 65.
Now, in turn, 65 is represented how?
Well as a pattern of 0s and 1s.
But that's fine.
We don't need to keep
talking about 0s and 1s
if we can just kind of
talk at a higher level,
like our familiar decimal numbers.
So C is going to be 67.
And D is going to be 68, and so forth.
And then there's a whole
mapping, so to speak,
between lowercase letters
and numbers, as well.
What they are doesn't really matter--
just that there is an
agreed upon standard
is actually what's important here.
So it turns out this standard, this
mapping that humans came up with,
which is called ASCII--
American Standard Code for
Information Interchange.
Not interesting what it stands for,
but we all agreed some time ago.
But it turns out that ASCII
actually has some limitations.
It has some limitations, in that you
can only express so many letters.
Here, for instance, is a summary of just
some of the letters we might represent.
And just to hammer this
home, if this is the mapping,
as you were proposed here,
albeit starting at 65,
and you received on a computer
somehow some series of bits--
0s and 1s-- somehow, through Wi-Fi or
ethernet or whatever-- and the bits
you received represent
the decimal numbers
72, 73, and then 33,
what message did you just
received from someone on the internet?
72, 73, 33.
Yeah?
AUDIENCE: Hi.
DAVID MALAN: Hi, so 72 is H. 73 is I.
33, it's actually not on the board.
Anyone want to guess what it is?
So, well said.
It is an exclamation point.
So "HI!" is indeed correct.
And that's because we just agreed
some time ago to have that mapping.
So 72, 73 is HI, followed
by, if you actually
looked at what the dot, dot, dot
meant in an online reference,
you would see that, indeed, it
represents an exclamation point.
Meanwhile, though, underneath the
hood, so to speak, like actually
in the computer's memory--
and more on memory in
the weeks to come--
some pattern of switches,
effectively, is actually
storing that pattern of 0s and 1s.
There's a bunch of light bulbs on.
There's a bunch of light bulbs off.
And they're in groups.
And it turns out that computers
usually use groups of eight
because it's convenient for today's
purposes-- eight bits-- eight 0s or 1s.
You don't have to write all of them
if the leading things to the left
are 0s, so we don't bother typically,
but computers typically use eight bits.
So if you were to represent, in
ASCII, any letter of the alphabet,
you would have 1, 2, 3, 4,
5, 6, 7 bits, historically.
But it turns out, for convenience,
it's often represented as eight.
And there's something called extended
ASCII, but the point is, it's finite.
There's only a fixed number
of bits, or 0s or 1s,
used to represent every
letter of the alphabet.
So what's the implication of that?
If you're only using
eight bits-- and we don't
have to get too into the
weeds of math or arithmetic--
but if you only have eight
bits or eight fingers,
how many different combinations
of fingers up and down
can you come up with?
AUDIENCE: 255.
DAVID MALAN: 200-- well,
a total pattern's 256.
And why is that?
Well, if each of these sort of hangman
placeholders is either a 0 or 1,
there's two options here, times two
more options here, times two, times two,
times two.
And if you just multiply out
two by itself eight times,
you actually do get 256.
So with just eight bits, you can
count from 0 all the way up to 255
if, again, you start counting at 0.
So it's finite.
It sounds like a lot, though.
But if you consider a typical
keyboard, say a US keyboard,
you'll notice that there's
actually a lot of symbols not
on our American keyboards, at least.
And indeed, if you're from abroad
or you speak another language
and you have a different
type of Mac or PC keyboard,
there's probably symbols you use
every day with family or friends that
just aren't on the keyboards that
you might see in the computer lab
or on most American's keys.
And, indeed, ASCII was pretty
biased-- like, American Standard Code
for Information Interchange.
Hey, the operative word
there was "American."
And so what, apparently, can we not
represent with a typical keyboard, let
alone ASCII?
What kind of symbols, if
you speak other languages?
Yeah?
AUDIENCE: A Chinese character.
DAVID MALAN: Yeah, like a Chinese
character, which is not obvious,
certainly on this type of keyboard.
Yeah?
AUDIENCE: Upside-down question mark.
DAVID MALAN: Upside-down question
mark that you might use in Spanish,
for instance, or another language.
Yeah?
What's that?
AUDIENCE: Accents.
DAVID MALAN: Accents on A's and E's and
vowels and other characters, as well.
So we're just scratching
the surface, right.
There's a lot of characters
you might see in your own life
or in any language you've studied
that just don't fit that model.
So just instinctively
then, what is the solution?
There's a problem.
ASCII is not expressive enough.
So what's the solution to representing
accents and other characters?
AUDIENCE: Use more bits.
DAVID MALAN: Yeah, use more bits, right.
And computer scientists might say,
throw hardware at the problem.
And it's kind of true, literally,
like just use more light bulbs,
use more switches, use more memory.
But indeed, that's the solution.
So along came something
called Unicode some years ago,
whereby you can actually use not
just one byte or eight bits--
so eight bits is a byte.
It's just a convenient term to describe
eight bits like the ones on the board.
And with Unicode, a
successor to ASCII, can you
use one or two or three or four bytes?
And no one pointed this out,
but I just flew past this.
Besides being able to represent
things like accents on a keyboard,
turns out that these are very
much in vogue these days.
Right, most of us in this
room probably use emojis,
perhaps use them a little too much.
But what is it?
You're actually not
sending images, per se.
Well, you might be.
If you're downloading GIFs
or animations or whatnot,
those might be proper images.
But if you're just sending emojis from
the built-in Android or iOS or Mac
or PC keyboards, you're actually
sending a keyboard signal, so to speak.
It's not just eight bits.
It's more.
Because on my American
keyboard, I don't typically
see keys labeled with
all of these faces.
But Unicode increases exponentially
just how many symbols, accents,
and upside-down question marks, and
other punctuation-- and, it turns out,
these fun things like
emoji that you can actually
represent inside of a computer.
As an aside, if you've ever
seen or you will eventually
see the expression, UTF-8,
this is just the name
for the very specific standard.
And we'll come back to this
toward the end of the semester
when we start using this.
So does anyone want to guess
what the decimal number is
being used underneath
the hood of a computer,
so to speak, to represent
this, which, according to Apple
is the most popular iOS emoji by far?
I couldn't find a statistic for Android.
I mean, actually, out of curiosity,
how many people have recently
sent this emoji?
OK.
So I guess the data holds out.
Proof by example.
So anyone want to guess
what the value is?
Is it 1?
Is it 65, 64?
Is it 255, 256?
Well, turns out there's a hell
of a lot of emoji these days.
It is number 128514.
That is the decimal
number that represents--
what is it called?
Smiling face with joy,
or crying face of joy--
something like that.
And there's a lot of other emojis, too.
And they have different
skin tones and the like.
So that just involves using more bits--
more 0s and 1s.
So this is only to say that we have
had, in the past, and we have had now
and in the future, like, a way
of representing information.
And we just evolved over time.
And in fact, this is
the pattern of bits.
If you send a text message
or an iMessage right
now to someone with that emoji,
you're not sending a picture, per se.
You're sending, effectively, this
pattern of bits to their phone.
And their phone, upon receiving
that sequence of bits,
realizes, oh, that sequence of bits
represents-- the number 128514.
Let me display to the user the
128514th emoji, if you will,
that it knows how to display.
Well, that's a bit of an exaggeration.
This is just how many
there actually are.
So we've got numbers that we
can represent from binary--
just 0s and 1s and electricity.
We've got letters, and
even things like emojis.
But what else?
Well, it's super common, of course,
to use computers to send actual images
or to look at photographs,
take photographs, look
at pictures on websites.
And so here is just a single dot,
enlarged for the projector's sake.
And we're going to call it a pixel.
A pixel is just a single
dot that has some color.
How, in a computer, might I go
about representing colors now?
Again, if you unwind
the story, all we've got
is like electricity coming
out of the wall and switches
to turn it on and off.
What could we do?
How would you go about defining a color?
Yeah?
AUDIENCE: You would have three
different colors for RGB.
DAVID MALAN: Yeah
AUDIENCE: Then, I guess that you'll
send 0s and 1s for red, 0s and 1s
for green, 0s and 1s for blue.
DAVID MALAN: Exactly.
That's spot on.
And there's actually
two different answers.
There's different ways
you can combine colors.
But that's actually the most common one.
RGB, if you've ever heard that
acronym, it stands for red, green blue.
And if you also regress a
little bit, if you think back
to grade school or primary school or
whenever we all played with like finger
paints, right, odds are
you combined paints,
which is a little different
from combining waves of light.
But with different colors
of paint, can you obviously
make, well, as I
recall, just like brown.
By just combining everything makes
sort of brown by combining things.
But if you think about colors
being represented ultimately
as some combination of red and green
and blue, or RGB as you proposed,
you can imagine taking those colors
and overlapping them and seeing
what color actually emerges.
And so if we call this
RGB, each of those colors
uses one byte or eight
bits, so which is to say,
you pick a number between 0 and 255--
because, again, that's
how high we can count
using a byte, which is eight bits.
So you use eight bits to represent
red, eight bits to represent green,
eight bits to represent blue.
So if you have the value 0, 0, 0,
that's no red, no green, no blue.
And it turns out, that's how you
represent black, is just 0, 0, 0.
But if you throw big values
at red and green and blue--
and this is where it diverges
from the paint metaphor--
then you have lots of red,
lots of green, lots of blue.
And it turns out that makes white--
255, 255, 255.
And we'll come back to this.
Because when we actually introduce
web programming later in the semester
and you actually visually create
web pages that have colors in them,
the onus will be on you to
decide, well, what color
do I want to make my
text or the background?
And in the web, even though you
can just say, make this page red
or make this text blue, you can
also specify more precisely--
give me this much red, this
much green, this much blue.
And the browser will display
exactly that for you.
So if here we have three swatches
of color red, green, and blue
and we actually take 72 amount of red,
73 amount of green, 33 amount of blue--
and then if you're just kind of
animate it in your mind's eye,
overlapping all three
of those colors, that
is how we end up getting this
murky shade of yellow here.
Underwhelming, perhaps.
But we can make a lot of
other colors, as well.
But notice the resemblance of those
values to something we saw earlier.
What numbers that I just propose?
Yeah, it's also HI!
And so, somehow, HI!
is represented by computers as yellow.
So it turns out, context matters.
Like, if you're using a text messaging
application or Microsoft Word or Google
Docs or whatever, and your computer
sees the pattern 72, 73, 33, odds are,
because of that context and
how that program is written,
it's going to display to you H-I-!.
But if you open up another program--
Photoshop, for instance,
if you've heard of it.
Even if you've never used it,
it's like a photo editing program.
Well, odds are, if that program
sees the same pattern of bits--
72, 73, 33, among others, it's going to
say, oh, I should show the user not HI!
but rather one yellow pixel--
probably among many others
if you're indeed editing
a photograph in it.
So it's just context
that actually matters.
So it turns out you can
see this if you just zoom
in on some of the things we see here.
And you might need to do
this with special software
because most phones won't
let you zoom infinitely.
But here is just a copy of that
same laughing-with-joy emoji.
And if I zoom in a little bit,
it still looks pretty good.
If I zoom in a little bit
more, starting to get a little
pixelated, so to speak--
a little blocking.
If I really punch in on
it, now you can actually
see what is going on with this image.
It turns out that an image is just
a whole bunch of dots or pixels--
thousands of them-- tens of thousands
of them, depending how big the image is.
And each of those pixels or dots just
has some color associated with it.
Each of those colors
is 3 bytes or 24 bits.
So just ballpark, if you
had an image like this,
how would you go about
estimating how big the file is?
If this is an emoji and you know
that every pixel is 24 bits,
what kind of intuition would you use
to estimate how big this emoji is
on disk, so to speak?
How big of a file it would be
to actually send a copy of?
Yeah?
Oh, sir?
AUDIENCE: Multiply the
number of pixels with--
DAVID MALAN: That's it--
multiply the number of pixels.
I mean, literally, if
I pointed at them--
1, 2, 3, 4, 5, 6-- count them
all up, all of those squares,
and multiply by that value.
Of course, we're zoomed in so there's
even more of them, but that's it.
So all this stuff with
which you might be
sort of familiar on a day-to-day
basis, it's actually not
even all that complex, so long as
you start so-called first principles
and you understand from the ground
floor up how you solve one problem,
and from there go about solving others.
Now some of you with iPhones might be
familiar with animojis too-- the sort
of silly feature on iPhone X
or your friends' phones that
allows you to do Snapchat-style
things on iPhones, that
actually creates animations or videos.
And I will admit that Natalie
and I have, in the past,
spent far too much time over
dinner like playing with animojis,
though it was only once and only once.
But that footage is somewhere.
But that results in
video files, essentially.
But what is a video?
Whether in the context of animojis or
just movies on TV or movies on Netflix
or movies in the
theater, what is a video?
Yeah, it's just a sequence of images
flying past the screen, quickly.
If you've ever heard of 24 frames per
second or 30 frames per second or FPS--
it doesn't matter if you
haven't-- that's all a movie is.
And back in the day, they
called them "moving pictures"
because they were just pictures,
again and again and again,
flying in front of your face so
quickly that it looks like animation.
But each of the frames
in a movie-- and you
can see it, if you just pause a movie
or a video, you're seeing an image.
And it might even be a
little blurry because
of compression and fancy techniques
humans use these days to save space.
But it's just an image-- one of 24,
one of 30 or so, for that given second.
So we can consider where we started the
story, like electricity from the wall.
Then we had 0s and 1s that
we can represent digitally,
so to speak, as binary.
Then we have letters
of the alphabet, ASCII.
From there, we can go
to colors if we want.
From colors, we get images.
From images, we get movies.
And all of that, by simply
deciding how to represent
the inputs to some problem.
And the outputs, presumably,
come in that same format.
Whatever the problem is you're solving--
you're outputting an image, a
video, a number, a string of text--
that ultimately being
how you solve a problem.
So let me pause here, too, see
if there are any questions.
All right, well-- yeah?
In the back.
AUDIENCE: How do you know
where [INAUDIBLE] rather,
where a string breaks?
Because you can't just put a
0 in if you, like, close it.
DAVID MALAN: Where a string breaks?
AUDIENCE: Yeah, where a binary
string-- like, where it ends,
where a number ends.
So that would be 72,000, or,
well, just a string of 0s and 1s
that somewhere ends
rather than eight bits.
DAVID MALAN: Ah, really good question.
And I only alluded to this earlier.
Humans do tend to use
units of eight bits.
And so, unlike schemes like
Morse code, if you're familiar,
where there are disparate
lengths based on the character,
in ASCII and in Unicode, you were always
using eight bits at a time, or maybe
16, or 24, or 32.
So the boundaries are
always at an eight-bit mark.
The ninth bit onward
represents the next byte.
But there, too, it's context-sensitive.
Because in some programs, you might
interpret four bytes in a row-- byte,
byte, byte, byte--
as four separate characters.
Or you might interpret it as a really
big value for a really big emoji.
And it's totally context-sensitive.
The software, the program you're using,
has to know how to interpret that data.
And so when I mentioned, we'll see
things like UTF-8 and these fancy terms
later in the semester, because
when you're making web pages
and you're writing software,
you sometimes have to decide,
I'm just going to get from my users
a string of bits from the internet.
How do I interpret it?
So it depends.
Other questions on inputs?
Yeah?
AUDIENCE: Why use the binary system
instead of the decimal system?
DAVID MALAN: Oh, why use the binary
system instead of the decimal system?
Let me toss that back to someone
for just someone's intuition.
Why do you think?
Like, why did we solve a problem
in just a different way here?
Yeah.
AUDIENCE: The computer's
only outputs [INAUDIBLE]..
DAVID MALAN: Exactly.
It just maps very conveniently
to the physical world.
It turns out that just having
a switch that's on or off,
or electricity that's
flowing or not flowing,
it's just a very nice unit of measure.
It's either doing something or not.
And humans built things on top of that.
There are systems like ternary, where
you actually have three states--
sort of one here, one
here, and one here.
But it turns out that that can be a
little more error-prone, especially
if there's a bit of noise.
And if you do take a class in
like electrical engineering
or if you've used a battery
in the human world recently,
you know about things like voltages.
Well, generally, voltage is
also a very noisy medium.
And so you roughly want your 9-volt
battery to be close to 9 volts
or close to 0 volts when it's dead.
You don't really want to in the middle
because then your hardware is not
going to perform correctly.
So it's just convenient, really.
Yeah?
AUDIENCE: What happens if you
were to write a message saying,
hi, exclamation point, I'm 65 years old?
With a 6 and a 5, right?
Like, how does it know within
that framework [INAUDIBLE]??
DAVID MALAN: A really good question.
If you want to send a message
that mixes letters and numbers,
it turns out that you are
technically sending only characters,
as we're going to start calling
them in a couple of weeks.
So this is just a website I knew
the URL of-- asciichart.com.
It's looked like this for 20 years.
And it's provided courtesy of
Appropriate Solutions, Inc.
for some reason.
But this is just a mapping of
decimal numbers to the ASCII letters
that they accord to.
And if you look at the chart,
you'll see in the middle--
65 is A, 66 is B. But if you
look farther to the left,
how would you actually
represent in a text message
or whatnot, as you propose, HI!
I AM 6-5 YEARS OLD--
well, how do you represent that 6?
As the pattern 54.
And 65 is actually the pattern 53.
So in the context of sending text,
even if the symbols you're sending
look like human numbers,
they are still represented
in this same system called ASCII.
Great question.
Any others?
Yeah, over here.
AUDIENCE: How would
you do a yellow letter?
DAVID MALAN: How would you do a?
AUDIENCE: A yellow letter?
DAVID MALAN: Oh, really good question.
How would you implement a yellow letter?
OK, a lot of combinations here.
So I to give a simplistic
answer, but it totally
depends on the piece of software.
So a program, such as the ones you
will all be writing before long--
you would have to decide,
well if I want to allow a user
to colorize all of the
letters in his or her message,
well, I minimally need to use 8 bits
for every letter he or she is sending.
But to represent the color,
I might need 8 plus 8--
I might need 24 more bits
to represent the color
of the letter the human is sending.
So you know what, I, this programmer,
am just going to use 32 bits--
8 for the letter, 24 for the color--
to represent every letter that
I'm sending in this message.
But you can do it any
number of other ways.
It's totally up to the
designer of the software.
All right, so from there, if we now
stipulate we have a way of representing
information-- or more generally, just
inputs to problems and the outputs
to those problems-- this course, and
computer science more fundamentally,
is really about what's
in between those two--
the thought processes, the
instructions, the so-called algorithms,
step-by-step instructions
for solving problems,
that fits here inside
this proverbial black box.
So what might be an algorithm that
we want to solve something with?
Well, consider something like this.
It's a little old school, these days.
But this is, or was, a phone book.
And it has a whole bunch of names
alphabetically from left to right.
And it has a whole bunch
of phone numbers in it.
And even though this is
an old-school technology,
honestly, if you think about your own
iPhone or Android phone or whatnot,
and you open up the contacts
application, odds are,
those contacts are sorted
from top to bottom, A to Z,
either by first name or last name.
So it's pretty much just
the digital version of this.
So if we can solve
this physical problem,
we could then apply that same
logic as a programmer to solving
a problem in an actual Android phone
or iPhone or just address book,
more generally.
So if I want to solve a
problem, and here's the input--
represented on paper
as numbers and names
alphabetically from left to right-- and
I want to find someone like Mike Smith.
And the whole book is
alphabetized by last name.
Maybe I'll start at the beginning.
And I'll look at the first page and
look for Mike Smith-- not there.
So I keep going.
And I keep going.
And this is a step-by-step process.
Literally, page by page I'm looking--
should be looking-- for Mike Smith.
And when I find him,
I'll eventually call him.
This process, this
algorithm, is it correct?
Yeah, it's correct, because if
Mike's here, I will find him.
And I think intuitively we'd
all agree that if the problem is
to find Mike Smith, this
algorithm, though slow,
is correct because it
will eventually find him.
But it's slow.
It's inefficient.
All right.
So from grade school, I also remember
counting by 2s, so, 2, 4, 6, 8, 10, 12.
All right, I'm going through faster.
And I'm being a little
sloppy, to be fair.
But is this algorithm more efficient?
Yeah, I mean, it's
literally twice as fast.
But is it correct?
OK, why not?
AUDIENCE: You could skip some of
the pages.
DAVID MALAN: Yeah, I could get unlucky.
And like, once I get to the S's, oh,
dammit, like, just by bad coincidence,
Mike get sandwiched between two pages.
So the algorithm is incorrect.
But do I have to just throw it
away altogether, or can I fix this?
Like, what could you, the human, do?
If you're flying through and like,
oh, dammit, I passed Mike Smith,
do you just give up?
Obviously not.
You just double back.
Right, so there is a fix.
So I would argue this algorithm
is twice as fast, minus one step.
Because if I go ever so slightly too
far to hit the T section, for instance,
or SN instead of SM, then I might
have to double back one page.
And that's not bad, right.
If this is like a thousand pages,
OK, doubling back by one page
really just a drop in the bucket.
But none of us are going to solve this
problem in either of those ways, right.
If you're going to solve this problem,
what are you going to do these days?
Yeah, just go like roughly to the middle
where the S is, if it's not obviously
labeled on the outside.
And OK, I look down.
It's roughly in the middle.
I'm in the M section.
And I know, alphabetically,
Mike is to the left--
Mike Smith.
And so what do I know
about this input now?
Where is Mike?
Yeah, it's on the right--
Mike Smith, yeah.
So on your left, so, Smith over here.
And now, just to be dramatic, we can--
[CHUCKLES] dammit.
[LAUGHS]
[LAUGHTER]
Tear the problem-- thank you-- in half.
[LAUGHS]
[APPLAUSE AND CHEERING]
We can tear the problem in half,
like throw half of the problem away.
And what we're left with,
really is the same problem.
It's the same problem
in the sense that I can
use the same logic, the same algorithm.
But I've taken a 50%
bite out of the problem.
And that was way bigger, because
the first algorithm took one bite.
The second algorithm took two bites.
This time it took like 500
bites out of the problem
all at once, in one instant.
And now I go do the same
thing-- roughly in the middle.
Darn, I went a little too far.
Now I'm in the T section.
But again, I can apply the
same-- much easier now-- logic,
throw that quarter of the problem
away, and be left with like 250 pages.
And I can repeat and repeat and
repeat, until hopefully, logically, I'm
left with let's just say--
(CHUCKLING) the injury law page--
"let us win and fight for you"--
that is not Mike Smith, but
it is, in fact, one page.
And so how quickly did
I actually get there?
Well, if you imagine starting
with like 1,000 pages,
the first algorithm might
take 1,000 steps at worst.
Like, if Mike's all the
way at the end or maybe--
S isn't in the Z section, so
let's say that first one was going
to take like 750 steps, give or take.
Now the second approach
would take maybe 300 steps--
325 steps or whatever, because
I'm going twice as fast.
But this algorithm, where
I started with 1,000 pages
and went from 1,000 not to 999
999, and not from 1,000 to 998.
I went from 1,000 to 500, to 250, to
125, to 60ish, to 30ish, to 15, to 7,
to 3, to 2, to 1.
And I'm just kind of
rounding to clean it up--
to 1, I mean, that was pretty darn fast.
And in fact, if we did that
perfectly arithmetically correctly,
how many times can you divide
roughly 1,000 by 2 by 2 by 2?
It's actually only about 10 times.
So with just 10 dramatic tears of
the page, could I find Mike Smith?
As opposed to taking maybe 1,000
steps or maybe even taking 500 steps,
I whittled it down to 10, by
just leveraging the intuition,
daresay, that all of us are
generally familiar with already.
Now as an academic class, let's
try to apply some thought process
to just what this performance looks like
and see if that recurs down the road.
But first let's express this
a little more precisely.
So pseudocode, I claim, is just a
way of programming using English.
There's no formal definition of this.
You just use succinct, clear
English that gets your point across.
And you might generally
number your steps just so
that you can refer to it verbally.
So here might be the pseudocode
for representing that algorithm,
if I wanted to repeat it or feed it
to like a robot to implement as well.
Pick up the phone book.
Open to the middle of the phone book.
Look at names.
And then I have to start
making some decisions.
If Smith is among names--
call Mike.
That's a success.
That's the output.
Else, if Smith is earlier in
the book, open to the middle
of the left half of the book, as I did.
And then go back to step
2, focusing on the middle
of the left half at that point.
You don't have to technically tear it.
Else, if Smith is later in the book,
do the same thing but to the right.
Else-- and the fourth
thing is important--
if he's just not there at all
and you get down to one page
and he's just not on it,
you have to do something.
You can't just hang or sort of do
the spinning beach ball or hourglass
like Mac OS or Windows might
do if a programmer didn't
anticipate a fourth scenario.
That forth scenario is,
he's just not there.
So you should give up.
So let's call out some
fundamental building blocks here.
Everything highlighted in yellow at the
moment looked to be verbs in English.
They're actions.
And moving forward, we're going to
start calling these things functions--
functionality that a human might have
or eventually that a computer might
perform on that human's behalf.
If we highlight these next words in
yellows-- if, else if, else if, else--
those are going to be called conditions
or branches, the proverbial fork
in the road, if you will.
We can call it any number of things.
It's just a decision point on which
you need to make a judgment call.
And then there's these,
the actual questions
you're answering in order to
decide to go left, so to speak,
or right, or down the middle road.
And those questions might
be, is Smith among names?
Is Smith earlier?
Is Smith later in the book?
Those, more fancifully, are going
to be called Boolean expressions
after someone named Boole.
But they're just questions
that have yes/no or true/false
or, heck, 1 or 0 answers, right.
It isn't really what we
call any of these terms.
We can even distill
them down to 1s and 0s,
and that's great because
computers can do exactly that.
Lastly something here highlighted
in yellow, go back to step 2--
that might just be a loop.
It's some kind of cycle
that we're inducing.
And indeed, that's going to be another
construct that we have in programming,
not using English or
pseudocode, but rather using
more traditional languages.
But before we do that,
let me propose this,
how do we think about just how
fast or slow that algorithm was?
Well, we don't actually need to do
any kind of actual strong measurements
or actually precise measurements.
But let me just propose a relationship.
So here's a general graph.
Let's actually call this like time
in some unit of measure-- seconds
or whatever.
This is number of pages,
or more generally, instead
of calling this pages,
let's just genericize it
as the size of the problem.
So if this chart represents along the
horizontal axis how many pages are
in a phone book and the vertical axis
represents how much time it takes
to find Mike Smith or anyone else--
the performance of the problem--
what shape did the first algorithm have?
Took me a thousands
steps in the worst case.
And what's the relationship between
the length of the phone book
and the amount of time?
Yeah, it's just-- it's linear.
It's a straight line.
Why is that?
Well, a handy way of
thinking about it, I think,
is if like the phone
book company, Verizon
or whoever, adds one more
page to the phone book
next year because a bunch of
new families move into town,
well, that's going to add
a page to the phone book.
And suppose you're looking for a
family whose last name starts with Z,
they're all the way at the end.
And that first algorithm, you might
have to turn 1,001 pages to get to them.
So one more page--
one more second.
And so the relationship
is a straight line.
One more page-- one more second.
One more page-- one more second.
And you get a straight line.
The second algorithm, where I
was flying through the phone book
two pages at a time, what's the shape
of that algorithm going to look like?
Yeah, but steeper, I heard.
Yeah, so steeper at least, or lower
in the way I'm drawing it here.
For instance, if this line
is straight up this way,
the other algorithm is
going to be half as high.
Now how do you think about that?
Well, for instance, if this is the
number of pages in the phone book,
the first algorithm might take this
many seconds, whatever that is.
The other algorithm
should take half as long.
So I just eyeballed it.
Hopefully that point is roughly
half as tall as the other point.
And that relationship holds
no matter how far out you get.
And this one's a little less intuitive,
but that third and final, daresay,
intuitive algorithm-- dividing and
conquering the phone book in half
and half and half--
has a fundamentally
different shape that's
actually a curve that
doesn't ever flatten out,
but it almost looks like it is because
it starts to grow, so to speak--
it starts to rise so terribly slowly as
the problem gets really and really big.
How do you think about that intuitively?
Well if Verizon next year
consolidates two towns,
like New Haven and another town over,
just for whatever bureaucratic reasons,
the phone book might go from 1,000
pages to 2,000 pages overnight.
But how many steps is that third
algorithm actually going to take?
Just one additional step.
So even if this is 1,000
pages and this is 2,000 pages,
it just barely increases-- maybe by
one second or one additional page turn.
And that holds true the
farther and farther out you go.
And this is only to say that, even if
computer science doesn't necessarily
feel like something with which you're
all that familiar or necessarily going
to be comfortable, it really
is about just harnessing
some of this intuition
and these heuristics
that we take for granted,
but kind of formalizing them
so that you can really and powerfully
solve problems that you're actually
interested in and not just
some of the smaller examples,
like counting and representing numbers.
Like, the intuition
isn't all that far away.
But of course, with this phone book or
with the contacts in your Android phone
or your iPhone, they're sorted already.
Right, I made a pretty
serious assumption
that that phone book was sorted.
And in fact, that's perhaps
not necessarily fair.
Could we ask for one
volunteer to come on up here?
She's got to be comfortable
appearing on camera.
But you don't have to say too much.
Yeah, what's your name?
AUDIENCE: Katarina.
DAVID MALAN: Katarina.
Come on up.
Nice to meet you.
Yup, right this way is fine.
All right, David.
Nice to meet you.
AUDIENCE: Nice to meet you.
DAVID MALAN: So behind
door number one here
we have a whole bunch
of pieces of paper.
And I would like you--
there's no prize at the
end of this, sadly--
but I would like you to
find for us the number 50.
Nope, that's the number 8.
AUDIENCE: Are they in order?
Is it--
DAVID MALAN: That's the number 42.
That's the number 4.
50!
All right.
Applause for Katarina if we could.
[APPLAUSE]
OK, just wait one sec.
[APPLAUSE CONTINUES]
So, why did you ask me
if they were in order?
AUDIENCE: That way I knew if it
would be to the right or the left.
DAVID MALAN: Yeah, and
so, indeed, they weren't.
These numbers are all
pretty random in this case.
So, she actually did as well as
anyone could do with the board.
And maybe you could
have been gotten lucky,
but not by any kind of
formal process or algorithm.
You just took three,
four steps that time
because you had no other
information to go on.
But if we instead do give you that
assumption-- now, dramatically,
behind door number 2--
[LAUGHTER]
--we have seven more pieces of paper.
But I do tell you this time
they're sorted from small
to large, like a phone book,
like a contacts in a phone,
where are you instinctively
going to start, probably?
AUDIENCE: Probably in the middle.
DAVID MALAN: All right, so
start in the middle, sure.
Because you don't know how small
or how big the numbers get.
But find me the number
50 with this set of data.
(WHISPERING) That's not the middle.
AUDIENCE: The middle.
DAVID MALAN: [LAUGHS]
[LAUGHTER]
AUDIENCE: This is the middle?
DAVID MALAN: That's a bug, if you will.
OK, 16, not 50.
But what do you now know?
AUDIENCE: That this is to the right.
DAVID MALAN: Exactly.
We don't have to even bother
looking at any of these numbers.
So find me 50.
AUDIENCE: So in the right.
DAVID MALAN: In the
middle of the right half.
Nope, still 42.
But where are you going to go next?
AUDIENCE: The only way, so this--
DAVID MALAN: And why that
way and not this one?
AUDIENCE: Because this
is going to be smaller.
DAVID MALAN: Good, because
42 is smaller than 50, so--
excellent.
A round of-- [LAUGHS]
[APPLAUSE]
Congratulations.
[APPLAUSE CONTINUES]
So, it seems to be a very
powerful thing if you're
allowed to assume from the
get-go that your data is sorted.
Well, that's great.
But someone how to do that sorting.
Right, Verizon sorted the phone book,
or Apple or Google sorted your contacts,
so that you can then benefit by looking
up Mike Smith quickly, or all of us
are familiar with like
auto-complete these days,
where you just start typing characters.
It turns out that phone
that you're using,
when you're searching for someone,
is probably using this same algorithm
that Katarina did.
It looks roughly in the
middle of your contacts.
And then if you're looking
for Smith in your phone,
your phone is going to look only at the
contacts below or only at the contacts
above, so that it can find things much
more quickly for you, logarithmically,
so to speak--
but more on that down the road.
But if we need to
actually sort the humans,
it kind of invites the question,
well, that's all fine and good
if searching is fast--
but how fast is sorting?
Right, how did I, before class, or
Colton, who lent a hand before class,
sort those numbers?
And indeed, just to prove that
this was not fixed, these numbers--
same numbers, it turns out--
but they are, in fact, sorted.
But they didn't start that way.
So actually, why don't we do this?
Rather than answer it in the abstract,
can we get eight volunteers this time?
OK, 1, come on down--
2, 3, 4, 5, 6, 7--
no one from this side?
Really?
OK, 8.
Come on down.
All right.
And if you could just
greet Colton over there.
He's going to hand you a prop.
And go ahead and take a seat
wherever you would like.
OK, Take a seat wherever.
All right.
And just to make this
a little more personal,
should we just fly through the line?
Do you want to just say your name
and your college and pass it on?
AUDIENCE: I'm Victoria.
And I'm in Trumbull.
DAVID MALAN: Victoria.
AUDIENCE: I'm Will.
I'm in Pauli Murray.
AUDIENCE: Elena, I'm in Hopper.
AUDIENCE: Andy, Berklee.
AUDIENCE: Adam, Morris.
[FAINT CHEER]
AUDIENCE: Tieger, Pearson.
[SCATTERED CHEERING]
AUDIENCE: I'm Valentina.
I'm at the School of Management.
AUDIENCE: I'm also Victoria.
I'm in Ben Franklin.
[SCATTERED CHEERING]
DAVID MALAN: Excellent.
Glad to have you all up here.
OK.
Oh, key step now-- if you could go
put on what you have been given.
All right.
So hopefully, you did
not sort yourselves.
Good, because then we'd have
very little to talk about.
So I see a 7 an 8, a 4, a 3, a 5, a
2, a 6, and 1, all the way at the end.
So this invites a question.
How do we go about sorting these folks?
And now notice, it's actually
deliberate that we gave them chairs,
because it turns out in a
computer, you have stuff called
memory-- and more on that before long.
But memory is in
specific, physical places,
just like our humans are in
specific, physical places.
And for now, we're going to have
each of them representing a number.
And numbers might take
eight bits or more.
So each of them represents one or more
bytes, but in a very specific location.
So I see that you're
clearly out of order.
So why are they out of order?
What does it mean for these
humans to be out of order?
Like, identify the problem, not
just intuitively, but precisely.
Yeah, sure.
AUDIENCE: We randomly sat down.
DAVID MALAN: You randomly sat down.
OK, what does "random" mean, though?
AUDIENCE: It means you're not
actually sorting as you go.
Like, you couldn't search and
sort, but we just sat down.
DAVID MALAN: OK, you just sat
down, so you're not sorted.
And just could you or someone clarify
what would it mean to be sorted?
Yeah?
AUDIENCE: Like my location doesn't
tell you anything relative to me.
So like, if we were
actually in order, you'd
know that anything to the
right of me is bigger than me.
DAVID MALAN: Good.
So what I'm hearing here is, rather
than use words with which we're already
familiar, like random or
such or sorted, now you've
given us a more formal definition.
Everyone to your left should
be smaller, presumably.
And everyone to your
right should be larger.
And because we have a 3 to your
left and a 2 to your right,
it's not sorted because the 2 should
obviously be somewhere to your left.
So let's just start
at the beginning here.
So we see 7 and 8.
Are they in order?
Are 7 and 8 in order?
AUDIENCE: Which way we go, yes.
DAVID MALAN: Yeah.
The goal is, to be clear--
good qualification--
from smallest to largest, as before.
So 7 and 8 are in order.
I still have some problems ahead
of me, but they're in order.
And there's really no
work to be done here.
What about 8 and 4?
Are they in order?
AUDIENCE: No.
DAVID MALAN: No, so would it improve
the situation if we had them kindly
swap seats?
AUDIENCE: Yes.
DAVID MALAN: OK, so yeah, if
you wouldn't mind, 4 and 8.
All right.
And now let me keep going.
8 and 3, are they out of order?
Yeah, so if you wouldn't mind as well.
And let's see, number 8 and 5,
if you wouldn't mind, yet again.
I'm sorry, you're going to
be doing this frequently.
8 and 2.
[CHUCKLES] How about 8 and 6?
And sure, how about 8 and 1?
OK, thank you.
So, problem solved, right?
AUDIENCE: No.
DAVID MALAN: No.
But it is better.
In what sense is the
problem better off now?
Yeah, 8 is where she should
be all the way at the end.
So in some sense, I can kind of
sort of ignore, if we could here,
number 8, because that
problem is solved.
So the problem has gotten smaller.
It didn't get halved, per se.
This problem is not as easily solved
as finding Mike Smith in a sorted phone
book.
But we did solve or chip
away at this problem.
And honestly, if we
repeat this process I
think it's going to kind
of play itself out, right.
Because 7 and 4, if you
wouldn't mind swapping.
And then 7 and 3, if you
wouldn't mind swapping.
And 7 and 5, 7 and 2, 7 and 6, 7 and 1--
problem's getting better.
But notice, I'm using the
same algorithm, right.
I'm just looking at two
people side-by-side.
And if they're out of order, swap them.
And it's taking me some time.
I've got to do it again.
But notice, I don't have
to go all the way back.
I can kind of focus now, if you
will, only on these six problems.
So 4 and 3, if you could swap.
4 and 5.
Oh, oh, oh.
That was a bug.
OK, you're good.
4 and 5.
OK, but 5 and 2-- out of order.
5 and 6-- nope.
OK, 6 and 1.
6 and 1-- out of order.
And now 6 is in place.
And let's go ahead and just
kind of fast forward just
so we can get the dramatic reveal.
What should you look like if we just
expedite this algorithm in the end?
[LAUGHS]
OK, correct.
But what was that algorithm?
Right, and so there's a difference.
Like, all of us have a nice
intuition for solving the problem.
But if you really sat
down and tried to explain
to like your younger sibling,
who's really young cognitively
and just needs to be
told exactly what to do
or you tell a robot these days
exactly what it needs to do,
you really need to
distill this intuition
into something a little more formal.
And that was just one way.
But it's probably not the only way.
In fact, could we do one thing?
Could you just randomize yourself?
And I'm not technically giving
you a definition of that,
but odds are we'll just kind
of make a mess of the problem
and we somehow have an
intuition for that too.
But just to point out that there's
not one solution to a problem.
And we won't even touch
on all of them today.
You know, I claim there's an
easier way, or there's at least
another way to solve this problem.
You know what, I could just
look through the numbers--
my goal being smallest to largest--
let me just find the smallest element
and not walk back and forth like I was,
swapping things, which felt like
it was messy and it worked out.
But it just felt like a lot of labor.
So, all right, 6 is pretty small so far.
7, you're bigger than that so--
oh, 5 is better.
Ooh, 2 is really good.
3, 2, 1.
OK, I found the small--
oh, I should double check.
Yep, I found the smallest element.
So if you wouldn't mind standing up.
Where does she belong?
Yeah, all the way over.
Yeah, could you?
So I can just move my data around.
And if number 1 wouldn't
mind coming here.
Now, I've coincidentally
made part of the problem
better by moving a bigger number up.
But I really just needed to make room.
But I could have done that differently.
And actually, would you mind
swapping back for just a second?
I don't have to just summarily
evict number 6 to make room.
What else could we have done that
might have been a little more polite?
Or if you imagine being
in the theater or a movie.
Yeah, I saw this.
So if number 1, could you sit up?
And then if you guys wouldn't mind,
could you just kind of scoot down?
This would be another way to do.
Which of those ways is better?
The first way or the second way?
You say first, why?
AUDIENCE: It uses less memory.
DAVID MALAN: Uses less--
well, I still see eight
people and eight seats.
So I don't think I've
added any seats or people.
AUDIENCE: The number
of operations is less.
DAVID MALAN: Operations.
The number of operations
would seem to be less,
because even though
you're independent beings,
there was just more
physical effort going on.
And frankly, I think my first
instinct of just swapping 1 and 8
was just a little more efficient.
But it's no more correct
than this approach.
So again, there's this tension between
correctness, perhaps, and efficiency,
such to gain a bit more
efficiency, you just
have to think a little harder about the
problem and try to avoid doing work.
And so that's kind of the great irony.
By pouring more time into writing
your code or writing your algorithm,
can you over the long run really solve
problems and save time significantly?
So if we could, a round of
applause here for our volunteers.
[APPLAUSE]
And if you want to go off that way.
[APPLAUSE CONTINUES]
You're welcome to keep the shirts.
We just need the numbers back.
And so where are we
going to go from here?
So again, we can we started the story by
talking about data and representation--
just 0s and 1s--
and all of a sudden we're
solving problems algorithmically.
It turns out those
algorithms have fancy names.
The first one, most people
call "bubble sort"--
"bubble sort" in the sense that the
big numbers kind of sort of bubble
up from left to right, making their
way to their correct position.
The other one is called
"selection sort,"
typically, because what I was doing
was selecting the smallest element.
And even though we
didn't finish that story
because it would get pretty
tedious pretty quickly,
the next step would have
been, let me find the number 2
and put him or her in place.
Then let me find 3, put it in
place, and so forth, iteratively,
again and again selecting
the next smallest element.
And thus far, we've done
this very physically.
We've done this very verbally, very
much in pseudocode, if you will,
like the text here on the board.
And so this class is ultimately going
to be about expressing these same ideas,
but using languages-- language is
called Scratch, which we'll soon see,
which is actually a
graphical programming
language via which you program
by dragging and dropping
things that look like puzzle
pieces that you can only
lock together and interconnect and
make things like a cat move up,
down, left, right, or meow on the screen
if it makes logical sense to do so.
But then very quickly,
we're going to transition
to a more traditional but more textual
language called C, where you actually
type code that is just
English-like syntax that's
even more succinct and more terse
than my pseudocode code before,
but it has a formal definition.
And it's way smaller than English.
So in some sense, even though
the logic might take time
to get comfortable with and
remembering the syntax might take time,
there's far fewer words and
symbols in a language like C
than there is in English or
whatever your native language is.
So realize that even though it might
look like the proverbial Greek to you
at first, it will get more
familiar quite quickly.
From there, we'll introduce
languages like Python,
with which you might be familiar at
least in name-- very much in vogue
and allows you to solve
problems in the data sciences
and medical sciences
and arts and humanities.
If you have data that
you want to manipulate
or images that you want to analyze
or text that you want to search,
we'll introduce JavaScript,
a language that's
used on browsers typically these days
to actually make more interactive user
interfaces and make better
experiences for users
and help him or her actually navigate
data or actually access information
or content that you
want to provide to them.
SQL or "Sequel," this
is a database language.
Once you've got some really juicy data
or lots of users, lots of customers,
or lots of medical records
that you want to search,
turns out you can use another language
called SQL to actually search that data
and store it more efficiently.
And so rather than be a course about
just one language or one way of solving
problems, well, we give
you this whole arsenal
via which you can actually solve
problems using the best tool ultimately
for the job.
We'll start by writing the
simplest of programs next time,
whereby you'll literally drag
a couple of puzzle pieces--
an orange one and a purple one.
They'll lock together.
You'll click Play.
And the cat will meow.
And we'll very quickly do more.
And a week after, you'll be doing
something very similar in spirit,
but much more powerfully.
And ironically, we're going to
take away the menus and the buttons
and all the graphical
components with which you're
very familiar on Macs and PCs, expose
you to things like a terminal window,
so to speak-- a black and
white window that's just text,
but at which you can actually
start expressing yourself
more succinctly, more powerfully.
It's just going to be a
bit of a learning curve
until you get comfortable with what
otherwise is an unfamiliar space.
In just a moment, we'll adjourn
outside where the TAs and staff have
a wonderful assortment of cake
for you, as is our tradition,
so that you can say hello
and ask any questions of us.
There's no homework
this week until next,
but we thought we would end
with something more visual.
I'm going to go ahead and dim the
lights for our final 60 seconds
here, because even though we focused
today on ideas and on problem solving,
it turns out, that as Natalie
and Benedict introduced,
there's a whole team around you and
a whole academic and social support
structure that we've captured here
thanks to last year's students
in both New Haven and Cambridge.
[MUSIC - P!NK, "WHAT ABOUT US"]
SONG: (SINGING) What about us?
What about us?
La da da.
We are searchlights,
we can see in the dark.
We are rockets, pointed up at the stars.
We are billions of beautiful hearts.
And you sold us down the river too far.
What about us?
What about all the times you
said you had the answers?
What about us?
What about all the broken
happy every afters?
What about us?
What about all the plans
that ended in disaster?
What about us?
What about trust?
What about us?
What about us?
What about us?
What about us?
What about us?
[MUSIC CONTINUES]
DAVID MALAN: All right.
That's it for CS50.
We'll see you next week.
[CHEERING AND APPLAUSE]
