[MUSIC PLAYING]
DAVID MALAN: So odds are you know
someone who is or perhaps are
yourself a computer person.
But what does that mean?
Well I propose that a
computer person is simply
someone who's really good
at computational thinking.
Thinking methodically, or more
formally, thinking algorithmically,
whereby you take inputs to some
problem, you solve that problem
and produce output, but you do so
step by step by step, along the way
defining any of the
terms or ideas that you
need to use so that anyone
else following along
can understand exactly
what it is you're doing.
Computer science itself is really
about computational thinking, then.
It's not about programming,
with which it's
often conflated, although programming
is a wonderfully useful tool with which
you can solve problems,
but computer science itself
is about solving problems themselves.
And it can be in any number
of domains, whether it's
software or hardware or artificial
intelligence or machine learning
or data science or beyond.
That's the wonderful thing
about computer science--
it's just so wonderfully
applicable to other fields.
It's all about equipping
you with a mental model,
so to speak, a set of
concepts, as well as
some practical skills via which
you can bring to bear solutions
to problems in other domains entirely.
Now what, though, does it
mean to solve a problem?
I propose that we think about
problem-solving quite simply as this.
If you've got some problem to
be solved, you have an input.
And the goal is to solve that problem
and come up with some solution, which
we'll call simply our output.
And in between those inputs
and outputs hopefully
are the step-by-step instructions
via which we can solve that problem.
Now how we're going to solve that
problem we'll have to come back to.
For now we'll consider it to
be the proverbial black box.
So we don't quite know or need to
know right now just how it works,
but that it can work, and
we'll come back to this soon.
But for now, we do need a way
to represent these inputs,
and we do need a way to
represent our outputs.
Right now I happen to
be speaking English
and you happen to be hearing
English, and so we've sort of
tacitly agreed that we'll represent our
words using this particular language.
Now computers tend not to use
English, they have their own system,
if you will, with what you might
be familiar even if you don't quite
speak it yourself, and indeed,
there are different ways
you can represent information.
For instance, let's consider the
simplest of problems-- for instance,
counting the number of people in a room.
I could do this old-school.
I don't need computers or English, I
can simply use my physical hand and say,
I'm starting with zero
people and now I'll
count one, and then two, and then
three, and then four, and then five,
and then--
I'm out of fingers, but I can at
least employ my other hand, maybe
a couple of feet and get
as high as 10, maybe even
20 using this physical system.
This is pretty equivalent,
actually, to just keeping
score counting the old-school
way with hash marks.
Right now we've not counted
anything, but as soon
as I start drawing some lines,
I might represent the number 1,
or 2 people in the room, or 3, or 4.
Or by convention I can draw a slash just
to make super clear that this is now
representing 5, and then I
can do another five of those
to count up the 10 and beyond.
Now these systems of hashes, as well as
this system of counting on my fingers,
actually has a name.
That of unary notation.
Uno implying one, simply signifying
that I only have one digit--
pun not intended-- via
which I can count things.
This takes the form of my fingers on my
hand or these hash marks on the screen.
But it doesn't get me very
far, because of course
on my one hand with five fingers,
I can only count up to five total.
And even on the board it feels pretty
inefficient, because for every number
I want to count higher,
I have to draw yet
another line on the screen, which
just continue to accumulate.
But suppose we were a little
more clever about this
and we thought about this
problem from a different angle.
Maybe I could take into account
not just the number of fingers
that I've raised, but the
order in which I've raised them
or the pattern via which I've
permuted them rather than just taking
into account how many of them are up.
If I take that approach, maybe I can
actually count even higher than five
on just one hand before resorting
to another hand or feet.
Now how could I do that?
Well instead of counting
up 0 to 1 to 2 to 3 to 4
to 5, why do I instead start
with some patterns instead?
So I'll still start with 0, I'll
still start with 1, but now for 2,
I'm not just going to
raise the second finger,
I'm instead going to
go from 0 to 1 to 2,
putting down that first finger
or thumb, simply representing 2
with this one finger.
And when I'm ready to
represent the number 3,
I'll bring that finger back up.
And so using just two fingers now have
I represented four possible values--
0, 1, 2, and 3.
From 0 to 3, of course,
is four total values,
and so I've counted as high as 3 now
using not three fingers, but only two.
How, though, can I count higher than 3?
Well, I just need to raise
an additional finger.
And so let's start at 0, to
1, to 2, to 3, to 4-- whoops--
to 5, to 6, and now to 7.
And if it didn't hurt so much, I could
continue counting as high as 8 and 9
and 10 and beyond by simply permuting
my fingers in different ways.
So how high can I now
count on this one hand?
Using unary notation with fingers just
up or down, I can count as high as 5--
from 0 to 5.
But with these same five fingers,
if I instead use not unary,
but let's call it binary
notation where we're actually
taking into account whether the fingers
are up or down and the pattern thereof,
well now it would seem
that each of these fingers
can be in one of two possible
states or configurations.
They can either be up or they can be
down, or just one of them can be up
or can be down, or just one
other can be up or down.
So if there's two possible states or
configurations for each of these five
fingers, that's like 2 times
2 times 2 times 2 times 2,
or 2 to the power of 5 for five fingers,
which is actually 32 possible patterns.
Which is to say with my
one human hand, now can I
count using not unary, but binary from
0 to 31, which is 32 possible values.
Now what's the goal at hand?
It's quite simply to
represent information.
Because if we have
inputs and outputs, we
need to decide a priori how we're
going to represent those values.
Now it turns out in computers,
binary is wonderfully well-suited.
Because after all, what's the one
physical input to any computer
you have, whether it's your
laptop or desktop or these days,
even a phone in your pocket?
Well odds are, at the end of the day--
or even perhaps earlier in the day--
you need to physically plug that device
into the wall and actually recharge it.
And somehow or other there's
electricity or electrons
flowing from the wall that are
stored in the battery in your device
if it has a battery, and that is the
input that drives your entire machine's
operation.
And so if that's the
only possible input,
it's pretty straightforward to imagine
that you might have your computer
plugged in or not,
power is flowing or not,
you've got a charge in
your battery or you don't.
And so this binary world where something
can be in two possible states--
on or off, plugged in or not--
is wonderfully conducive to now using
binary in the context of computers
to represent information.
After all, if I want
to represent a number,
I might simply turn on the lights.
And so my phone here has a light built
in, and right now that light is off,
but if I consider this then
to be off, we'll call it a 0.
And if I turn this flashlight now on,
I can be said to be representing a 1.
And so simply with a simple light bulb
can I represent two possible values
just like my finger can be up and down.
But computers don't necessarily
use lots of little light bulbs,
they use something else
that's called a transistor.
A transistor is a tiny little
switch, and that switch, of course,
can be either on or off--
letting electricity in or not.
And so when you have a
computer, inside of which
is a motherboard and CPU and
bunches of other pieces of hardware,
one of the underlying
components ultimately
are these things called transistors.
And computers today have millions
of them inside, each of which
can be on or off-- so far
more than my five fingers,
and that's ultimately how a computer
can go about representing information.
So long as it has access to some
physical supply of electricity can
it use that electricity
to turn these switches
on or off in distinct patterns,
thereby representing 0, 1, 2, 3, 4,
all the way up to 31 or even
millions or billions or beyond.
And so computers use binary.
Computers speak binary-- only
0's and 1's or off and on,
because it's so wonderfully
well-conducive to the physical world
on which they're ultimately based.
We humans, meanwhile,
tend to communicate
not only in English and
other spoken languages,
but in a numeric system called decimal.
So decimal, dec meaning 10,
has 10 digits at its disposal--
0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
whereas binary has just two--
0 and 1-- and unary, you
might say, has just one--
1, the hash mark that
I drew on the board.
But if I only have 0's and
1's at my disposal, how can
I possibly count to 2 or 3 or beyond?
Well, we need some way of mapping
these 0's and 1's to the more familiar
numbers that we ourselves know.
So how can we go about doing this?
It's one thing to describe it in
terms of patterns on your fingers,
but again, a device just has
switches that it can turn on and off.
Well it turns out even if
you don't yet speak binary
or have never really
thought about doing so,
turns out that it's not all that
dissimilar to the system of numbers
with which we grew up.
In fact, in grade school,
you and I probably
grew up learning how to represent
numbers in rather the same way.
For instance, consider this--
clearly the number 123.
But why is that?
It's not inherently the number 123.
Really, this is just three
symbols on the screen.
We of course immediately
recognize them as 1 and 2 and 3,
and we infer from that, oh, this is the
number 123, but why is that exactly?
Well if you're like me,
you probably grew up
representing numbers
in the following way,
even if it's been quite some time
since you thought about it like this.
With the symbol here, 1, 2,
and 3, if you're like me,
you probably grew up thinking
of this rightmost digit
as being in the ones place,
and this middle digit
is being in the tens place,
and this leftmost digit
as being in the one hundredths place.
And if it were a longer
number, you'd have
the one thousandths place and
ten thousandths place and beyond.
Now how do we get from 1, 2, 3 to 123?
Well, the arithmetic we were taught says
to do 100 times 1, and then 10 times
2, and then 1 times 3, and then
add all three of those products
together, giving us, of course, 100
plus 20 plus 3, which of course is 123.
Now that actually doesn't feel like
much progress, because we started at 123
and we got to 123, but
that's not quite that.
We actually started with
the pattern of symbols--
1, 2, 3, like a pattern of fingers, and
ultimately ascribe mathematical meaning
to those symbols as 123.
And it turns out computers, even though
they speak binary and therefore only
have 0's and 1's at their disposal-- no
2 or 3 or 4 or 5 or 6 or 7 or 8 or 9,
they count in exactly the same way,
and they represent inputs and outputs
in fundamentally the same way.
Consider this.
Suppose that I have 3 bits at my
disposal, 3 switches or light bulbs,
if you will.
And those light bulbs or
switches are all initially off.
I claim that I'll represent
those states as 0, 0, 0.
And together, those three 0's represent
the decimal number we ourselves know
is 0.
Now why is that?
Well, if here if we consider this
to be the ones place as before,
but the middle digit we won't
consider being in the tens place,
we're going to consider this
as being in the twos place,
and this leftmost digit as
being in the fours place.
Now why?
Well in our decimal
system, there's a reason
that we have the ones place, the
tens place, the hundredths place,
and beyond--
those are actually all powers of 10.
10 to the 0, 10 to the 1,
10 to the 2, and beyond.
Now in binary, bi meaning two, you only
have two digits at your disposal, 0
and 1, and so instead of using powers
of 10, we're going to use powers of 2.
And indeed we have.
2 to the 0 is 1; 2 to the 1 is 2, 2
to the 2 is 4, and if we kept going,
we'd get 8, 16, 32, 64, and beyond.
And so this pattern of symbols
or this pattern of light bulbs
or this pattern of switches, all
of which I'm proposing are off,
simply represents now the
number we humans know as 0.
Why?
Well we have 4 times 0 plus
2 times 0 plus 1 times 0,
which of course gives me 0 plus 0 plus
0, for a total numeric value of 0.
So in binary, 0, 0, 0 or all
three switches off represents 0,
and indeed, that's what we would expect.
But we've instead we did this.
Suppose that using binary, and
therefore these same three places--
ones place, twos place,
and fours place and beyond,
we had a different pattern of symbols.
How, for instance, might we represent 1?
Well, if we had a pattern that's
0, 0, and 1 in binary, that's
going to translate, of course, to the
decimal number we all know and agree
is the number 1.
How do I represent the number 2?
Well, if I start from
scratch again, I'm going
to represent the decimal number we know
as 2 as a pattern that's 0, 1, and 0.
A switch that's off, another
that's on, another that's off.
Why?
4 times 0 is 0, 2 times
1 is 2, 1 times 0 is 0,
and so we're left with total of 2.
And if we want to
represent 3, I don't even
need to change the value
or state of those switches,
I can simply turn this
one from off to on,
and now I'm representing the number 3.
If I want to count up to 4, I can
simply start fresh in the fours place,
in the twos place, in
the ones place here--
1, 0, and 0.
And then lastly, suppose that
I want to count up even higher.
7 can simply be 1, 1, and 1, because
that gives me a 4, a 2, and a 1.
4 plus 2 is 6, plus 1 is 7.
But what if I want to
represent the number 8?
To represent the number 8, it would
seem that I need a little more space.
And so I might need to introduce
the eights place so that I
can have a 1, a 0, a 0, and a 0, eighth
place being 2 to the 3, but to do this,
I do indeed need more hardware.
I need another switch, another light
bulb, and another physical device
inside of my computer.
Now fortunately, our computers today
have millions of these tiny transistors
and switches, so it's not all that
problematic to count as high as 8,
but the implication is, that in order
to represent larger and larger values,
do we need more physical storage?
And indeed, thematic and computer
science is exactly that constraint.
You can only do so much
computationally, perhaps,
if you have a finite amount of
memory or a finite amount of hardware
or switches.
And we'll see, then, that there is
non-trivial implications for how
much data you can represent, and
how many problems, therefore,
you can solve.
So that's it for numbers.
Using just 0's and 1's can we count
as high as we want and represent
any number of values so long as we
have enough bits at our disposal.
But numbers only get
us so far in computing.
After all, it's not just
Excel and calculators
with which we use computers.
It's of course to send
information textually,
and to use letters of the alphabet,
and compose text messages and emails
and documents and more, but how,
then, do you represent letters
if all you have at the end of
the day are these 0's and 1's
underneath the hood, so to speak?
Well to do that, we all
simply need to agree,
just like we've agreed to
speak in here English for now,
on a particular system by which to
represent letters of the alphabet.
Now we don't have all that many
choices at hand because if we only have
0's and 1's-- switches that
can be turned on and off--
that doesn't give us
terribly many options.
But what we could do as humans is just
collectively agree that you know what?
We are going to use a certain pattern
of 0's and 1's to represent capital A.
And we're going to use a different
pattern of 0's and 1's to represent
capital B. A different
pattern of 0's and 1's
to represent C and D and all the
way through Z. And you know what?
If we care about
lowercase letters, we'll
use slightly different patterns of 0's
and 1's to represent those as well.
And this is exactly what
humans did decades ago.
We standardized on
something called ASCII--
the American Standard Code
for Information Interchange,
which was the result of people
agreeing that we are going to use--
you know what?
The number 65 in decimal
to represent capital A,
and the number 66 to
represent capital B,
and the number 97 to
represent lowercase a,
and 98 to represent lowercase b,
and everything before and beyond.
And so this system, this code, ASCII,
is simply a collective agreement
that whenever you're in the context
of a text-based program and not
a numeric-based program,
any patterns of bits,
such as those that might
represent 65 in a calculator,
should instead be interpreted
in the context of Microsoft Word
or an SMS message or iMessage as
actually representing a letter.
So how bits are interpreted
is entirely context-dependent.
Depending on the program you have
opened, the same pattern of bits
might be interpreted in different ways--
as numbers or letters
or even something else.
So how, then, do we represent
so many other symbols
that aren't just A through Z?
Well it turns out that
the American Standard
Code for Information
Interchange was indeed
fairly skewed toward American
representation of letters,
particularly English.
But there are so many more characters
with accents and other symbology, let
alone punctuation marks,
and in foreign languages,
too, are there hundreds if not
thousands of distinct characters
that you need to represent
ideally in order to send that text
and write that document.
But ASCII alone couldn't handle it.
Because ASCII tended to use 7 or
maybe in some contexts 8 bits total,
and with 8 bits--
or binary digits, if you will--
can you represent how
many possible values--
2 times 2 times 2 times 2
times 2 times 2 times 2 times
2 for 8 possible bits, each
of which can be a 0 or 1.
That only gives you
256 possible letters.
Now that's more than enough for 26
letters of the English alphabet, A
through Z, both uppercase
and lowercase, but once you
start adding in punctuation and
accents and other characters,
you very quickly run out.
And so the world
produced Unicode instead,
which doesn't just use 8
bits or 1 byte, if you will--
1 byte simply meaning
quite simply a bit,
but instead introduced a variable
length encoding of letters.
And so some letters might
use 8 bits or 1 byte.
Other letters might
use 2 bytes or 16 bits.
Other letters might
use 3 bytes or even 4.
And via 4 possible bytes maximally--
2 to the 32 bits-- can you actually
represent 4 billion possible values,
which is a huge number of values.
But how does the system
of encoding actually work?
Let's consider by way of an example.
In both ASCII and
Unicode, ASCII now being
a subset of Unicode insofar as
it only uses 1 byte per letter,
you might represent A with--
capital A with 65, capital
B with 66, and so forth.
And dot-dot-dot implies we've handled
the rest as well and they all happen
to be back-to-back-to-back.
So suppose in the
context of a text message
you happen to receive a pattern of
0's and 1's that if you actually
did the math in the ones
place and the twos place
and so forth, actually worked out
to be the decimal number 72, 73, 33.
Well what text message
have you just received?
Of course at the end of the day,
computers only understand binary,
and that information is
sent from phone to phone
over the internet-- more
on that another time.
But those 0's and 1's interpreted in
the context of a text messaging program
will be interpreted not
as binary, not as decimal,
but probably as ASCII or
more generally Unicode.
And so if you've received a text message
that says, 72, 73, 33, well what might
that spell?
Well if we consider our
excerpt of a chart here,
that's 72 and 73, of course,
translates to H and an I,
and you wouldn't know it from that
preceding chart, but 33 it turns out--
just represents a very
excited exclamation point,
because that, too, has a
representation in ASCII and Unicode.
Now underneath the hood are patterns
of 0's and 1's, but we don't need
to think about things at that level.
Indeed, what we've done
here by introducing ASCII,
and in turn, Unicode,
is introduce what's
called an abstraction, which is a
technique in computer science via which
we can think about a problem more
usefully at a higher level as opposed
to the lowest level in which
it's actually implemented.
After all, it'd be much easier to
receive a message that you view
as H-I-!
than actually as a pattern
of 0's and 1's that you then
have to think about and
translate in your head
to the actual message that was intended.
So an abstraction is just a way of
taking a fairly low level if not very
complicated representation
and thinking about it
in a higher, more useful
level that lends itself
more readily to communication, or
more generally to problem-solving.
Now what about all of those
other characters on the screen?
After all, on a typical
board might you have
letters and numbers and punctuation,
but also these accented characters, too.
Well thanks to Unicode, you indeed have
as many as 4 billion possibilities,
and it's no surprise, perhaps,
that proliferating on the internet
these days are a little
friendly faces called emojis
that you might have received
yourself in text messages, emails,
or some other form.
Now these emojis here, though
they might look like smiley faces,
are actually characters that companies
like Google and Microsoft and Apple
have simply decided to
represent pictorially.
Underneath the hood, anytime
you receive one of these emojis,
you're actually receiving
a pattern of bits--
0's and 1's that in the
context of an email or text
message or some other program,
are being interpreted as
and therefore displayed
as a corresponding picture
that Google and Microsoft
and Apple and others
have decided how to present visually.
And indeed, different platforms present
these same patterns of 0's and 1's
differently depending on how
the artist at those companies
have decided to draw these emoji.
Consider this one, for instance-- face
with tears of joy is its formal name,
but more technically, this is encoded
by a very specific decimal number.
Capital A is a 65, capital B is 66,
what is this face with tears of joy?
Well it is the number 128,514.
After all, if you've got
four billion possibilities,
surely we could use
something within that range
to represent this face
and any number of others.
But underneath the hood,
if you actually looked
at a text message in which you
received a face with tears of joy,
what you've actually received is this
pattern of 0's and 1's somehow encoded
on your phone or your computer.
And so in the context of a
calculator or Microsoft Excel
might we interpret some
pattern of bits as numbers.
And in the context of a
text messaging program
or Google Docs or
Microsoft Word might we
interpret that same pattern of bits
as instead characters or even emoji.
But what if we want to represent
different types of media altogether?
After all, the world of computing is not
composed of just numbers and letters.
There's also colors and images and
videos and audio files and more,
but at the end of the day, if
we only have at our disposal
0's and 1's, how do we go about
representing all of those richer media?
Well as before, we humans just
need to agree collectively
to represent those media using
0's and 1's in some standard way.
Perhaps one of the most familiar
acronyms, even if you've never
really thought about what it is that you
might see on a computer, is this one--
RGB.
It stands for red, green, and blue.
And this refers to the process
by which many computers represent
colors on the screen.
In fact, what is a color?
Well, you might represent the color
black and white simply by a single bit.
1 might represent black and 0 might
represent white, or vice versa.
We in that case only have what's
called 1-bit color, whereby
the bit is either on or off implying
black or white or white or black.
But with RGB, you actually
use more bits than just one.
Conventionally you would
use 8 bits to represent
red, 8 bits to represent green,
and 8 bits to represent blue.
So 24 bits total, the first 8 of which
or first byte is red, second is green,
third is blue.
Now what does that mean?
Well in order to represent
a color, you simply
decide, how much red do
you want to add to the mix?
How much green and how much blue?
Combining essentially
those wavelengths of light
in order to see a disparate
color on the screen.
And if you think about red, green, and
blue now as being single bytes each,
that means you have a possible
range of values of 0 to 255.
After all, if a single byte is 8
bits, and with 8 bits, each of which
can be a 0 or 1-- so 2 times 2 times 2
times 2 times 2 times 2 times 2 times
2 gives you 256 values.
So if you start from 0, you
can count as high as 255.
Suppose that you had 72 as your amount
of red for a color, and 73 for green,
and 33 for blue.
That is to say in the
context of a text messaging
program, this same pattern of bits--
72, 73, 33 presented as decimal
represented a textural message of HI!.
But suppose you were to see
that same pattern of bits
instead in a photo-editing program like
Photoshop, or in the context of opening
an image on the screen.
Well that same pattern of bits,
or in turn, decimal numbers, just
represents some amount of
red, some amount of green,
and some amount of blue.
So 72 is a decent
amount-- a medium amount
of red out of 255 total values,
73 is a medium amount of green,
and 33 is a little bit of blue.
If you combine now all of these
three values in your mind's eye
by overlapping them, what you
get is this shade of yellow.
Which is to say that if
you wanted to represent
this particular shade
of yellow would you
use three bytes whose values
are respectively 72, 73, and 33.
And in the context of Photoshop
or some other graphical program
would that pattern be interpreted as and
displayed as this color on the screen.
Now this color is deliberately
presented here as just 1 square--
1 dot, if you will, or
more properly, 1 pixel.
Because what is an image?
Well if you've ever looked really
close on your computer screen
or on your phone or even on
your TV, you might actually
see all of the thousands
of dots that compose it.
This is getting harder to do
on today's modern hardware,
because if you have something
like a retina display,
that means these dots or pixels
or ever so tiny and ever so close
together that it's actually hard
now for the human eye to see them,
but they are in fact there.
Any image on the screen,
any photo on the screen
is really just a pattern
of dots or pixels--
a grid of dots from left
to right, top to bottom.
And every one of those dots
or pixels on the screen
probably has 24 bits representing
its particular color.
Indeed, a picture is essentially
just a pattern of color so small
that you don't really
see all of those dots,
but you see the net effect of a
beautiful photo or image or something
else altogether.
So consider how we might see these.
When Apple or Google or Microsoft or
any other company that supports emojis
presents those emojis on
the screen, we of course
see them as images because
Apple and Microsoft and Google
have decided what images
shall be displayed when
it receives a certain pattern of bits.
But how are they storing
those images and how
is your Mac or PC or phone
displaying that image to you?
Well it's hard to see where those
bits are let alone the pixels in it
at this particular size, but if I go
ahead and zoom in and zoom in and zoom
in a little more still,
you can begin to see
the individual dots or squares or
pixels that compose even this one emoji.
And so just to display this smiling
face, this face with tears of joy,
do you need 24 bits for this
pixel, 24 bits for this pixel,
24 bits for this pixel, 24 bits for
this pixel, and so on and so forth?
So if you've ever noticed that when you
download an image or download a photo
or receive one in the mail, odds
are it's not in the order of bytes.
It's probably kilobytes
for thousands of bytes,
or megabytes for millions of bytes.
How in the world does a simple photo
have so many bytes or in turn bits?
It's because every one
of the dots in that image
takes up some amount of space.
So to represent yellow might we use
a certain pattern of 0's and 1's or 3
bytes.
To represent black or
gray or any other number--
colors on the screen might we represent
those using different patterns still.
Now that's it for images,
but what about videos?
A video is really just
a sequence of images
being presented to you so quickly
that you and your human eye
don't notice that you're really just
watching image after image after image.
In fact, it's conventional
in Hollywood and elsewhere
to display as many as 24
images or frames per second,
or perhaps even 30 frames
or images per seconds.
And so even though they are
in fact distinct images,
you are seeing them so quickly,
and the characters on the screen
are moving within each
frame or image ever
so slightly, that it creates
ultimately the illusion of movement.
And if you think back
to childhood, you might
have had one of those
old-school flip books
on which was drawn some kind of
cartoon one frame or image at a time.
And if you flipped through
that flip book one page
ever so quickly again and
again and again and again,
letting the physics of
it all take its toll,
you might see a character or the picture
in the book actually appearing to move,
but it's not.
All of those pages are just images and
all of those images are not moving,
but when you see them so fast
and so quickly in succession
does it appear to create
that same movement.
So these things here,
Animojis in Apple's world,
are just videos ultimately that
track your facial movements and such,
but all they really are are a
pattern of bits, each set of which
represents an image.
And each of those images is
displayed to you on your phone
so quickly that you see
the illusion of movement.
And so here again is an abstraction.
What is a video?
Well, a video is a collection of images.
Well what's an image?
An image is a collection of pixels.
Well what's a pixel?
A pixel is simply some number of
bits representing some amount of red
and green and blue.
Well what is a bit?
It's simply a representation
digitally of something being present,
like electricity or not.
And so it's much more
powerful now to be operating
at this level of abstraction-- talking
about videos for what they are, and not
getting into the weeds of the lowest
level that at the end of the day,
it's really just some
form of electricity
that we call bits, that
we in turn call pixels,
that we in turn called images,
that we in turn call videos.
And we can operate in a number
of these layers of abstraction
in order to represent
inputs to some problem.
To create a film, to convert
a film, to display a film
might be just one of
the problems to solve.
All right, so we now have a
way to represent information,
whether it's binary, decimal, or
some other approach altogether.
So it's time to solve problems.
But how do we go about solving problems?
What is inside of this black box?
Well that's where our
so-called algorithms
are-- step-by-step instructions
for solving some problem.
And to solve these
problems, we simply need
to decide first what problem to solve.
Well let's consider a familiar one,
like that of looking up someone's name
and number in a phone book.
Now these days, the phone book, of
course, takes a more digital form,
but at the end of the day, these
two formats are pretty equivalent.
After all, here in my hand is a
phone book with maybe 1,000 pages,
on which are bunches of names
and numbers sorted alphabetically
from A to Z.
On my phone, meanwhile, while I
might have to click an icon instead,
do I have contacts that
are similarly sorted
from A to Z by first name or last
name, and touching any one of them
pulls up its number.
So this one, though, is a
little more physical for us
to see the solution to a problem,
like finding, say, Mike Smith.
Mike Smith may or may not be in
this phone book, but if he is,
my goal quite simply is to find him.
So let me try this.
Let me try opening this
book, and then one step
at a time looking down for Mike Smith.
And if I don't see him,
move on to the next page.
If I still don't see him, move on
to the next page, and next page,
and next page, and so
forth, one page at a time.
I propose that we consider
first-- is this algorithm correct?
Opening the phone book, looking
down, turning page, and repeating.
Well, at some point, if
Mike is in this phone book,
I am going to reach him, at
which point I can call him.
And if Mike Smith is
not in this phone book,
I'm eventually not going
to reach him, but I
am going to hit the end of
the phone book, at which point
I'll presumably stop.
But it doesn't feel terribly
efficient, however correct it may be.
Why is it not efficient?
Well I'm starting at the
beginning of the phone book.
I know it's alphabetical,
and yet I'm turning one page
at a time through the A's, through
the B's, through the C's and beyond,
just waiting and waiting until
I finally reach Mike Smith.
Well how might I do this better?
Well, I learned in grade
school not only how
to count by ones, but
also an account by twos.
So instead of 1 page, 2 page, 3
page, 4, why don't instead do 2,
4, 6, 8, 10, 12-- flying
through the phone book.
It certainly sounds faster, and it
is, but is that approach correct?
Well, I propose that
there could be a problem.
I might just get unlucky, and once I do
reach the S section of the phone book,
what if by bad luck Mike Smith is
sandwiched between two of those pages?
And therefore I hit the T section
and Z section and run out of pages?
I might conclude incorrectly
that Mike Smith is not,
in fact, in this phone book.
But do I have to throw out
the algorithm altogether?
Probably not, right?
I could be a little clever here
and improve this algorithm,
maintaining its efficiency,
but fixing its correctness.
After all, if I do see a page on which
there are last names starting with T,
while I can stop and say, well we
know Mike is not farther than this
because Smith starts with S,
so I can double-back hopefully
just one page or few,
and therefore fix what's
otherwise an incorrect
approach in this algorithm.
In fact, I can be even
smarter than that.
As soon as I see someone's name
that starts with S-N instead of S-M,
I can similarly flip back
one page and therefore ensure
that I can go twice as fast
through most of the phone book,
and only once I hit that section do I
have to double-back one or terribly few
pages.
So I get the overall performance
gains, and I also solve the problem.
But honestly, if you even
use this technology anymore,
odds are you don't start at the
beginning turning one or two
pages at a time.
If you're like me, you
probably more instinctively
open up roughly to say the middle.
You'll look down and you realize,
oh, I haven't found Mike yet
because I'm actually
here in the M section.
But what do you know about
where you've just jumped?
Well Mike Smith is
indeed to the right here,
because he's in the name starting
with S. But if I'm in the M section,
I do know something else.
I know that Mike is not in the
left-hand section of this book--
he is not among the A through M's.
And so I can both literally and
figuratively tear this problem in half,
throw half of it away, thereby
leaving myself with just,
say, 500 pages out of 1,000.
So whereas that first algorithm I
took one byte at a time out of it--
one page at a time, the second algorithm
I took two bytes at a time out of it--
two at a time.
Well with this algorithm, I just
took 500 bytes out of the problem
all at once, thereby getting closer
to the output to this problem
much more quickly.
What do I then do?
Well, I simply am probably going
to apply that same algorithm again
and again and again-- dividing
and conquering this phone book,
if you will.
Jumping roughly to the middle
now, looking down-- dah, darn it!
I ended up in the T section
now, so I've gone too far,
but I know Mike is not in the
right-hand side of this book--
he's not among the T's through Z's.
So I can again tear the problem
in half, throw that half away,
thereby leaving me with just a
quarter of the original problem,
say 250 pages down, down
from 1,000, finding Mike ever
so much more quickly than before.
And if I repeat and I repeat and I
repeat, each time actually looking down
looking for Mike, I should hopefully
find myself ultimately left
with just a single page on which Mike
either is or is not, at which point
I can call him or quit.
So just how much more quickly did I
find Mike Smith via this algorithm
than the first two?
Well in a 1,000-page phone book, I might
get unlucky and Mike's not even there,
and so in the worst case, I might
look in as many as 1,000 pages.
In the second algorithm, I might also
get unlucky and not ever find Mike
because he's just not there, and
therefore look at as many 500 pages.
But in this third algorithm
where I'm iteratively
dividing and conquering again and
again, splitting the problem in half
so I go from a very large input to a
much smaller to an even smaller problem
still again and again, how many times
might I split that phone in half?
Well if I start off with roughly
1,000 pages and I split it in half,
that gives me 500,
250, 125, and so forth.
Turns out that I can split 1,000
pages roughly 10 times in total
before I'm left with
just that final page.
And so consider the
implications of that.
First algorithm-- 1,000 steps, maybe.
Second algorithm-- 500 steps, maybe.
Third algorithm-- 10 steps, maybe,
to find Mike Smith before I can call
or decide he's not there.
And that's a pretty powerful technique.
And if we extrapolate from this
relatively small example or phone book
to maybe a much larger
phone book, say a phone book
that a bit nonsensically has
4 billion pages in it, well,
how many steps might those
three algorithms take?
The first algorithm
might take 4 billion.
The second algorithm
might take 2 billion.
The third algorithm might
take 4 billion divided
by 2 divided by 2 divided by 2--
you can divide 4 billion and half
roughly 32 times only, at which point
you're left with just that one page.
So 4 billion steps for the
first algorithm, 2 billion
steps for the second algorithm, but
32 steps for the third algorithm.
And simply by harnessing this intuition
that we all probably already have
in formalizing it in more methodical
form-- in algorithmic form,
if you will, can we solve problems
using some of the building
blocks that we already
have at our disposal.
And indeed, problem-solving
in computer science
is all about implementing
these algorithms
and applying them to
problems of interest to you.
But how do we think about just
how good this algorithm is?
It sort of stands to reason that
it has to be minimally correct.
But how efficient is it?
I've been talking in
terms of absolute numbers.
1,000, 500, or 10, or 4
billion, 2 billion, and 32.
But it would be nice to compare
algorithms in a more general way
so that I can use the same
searching algorithm, if you will,
to find not Mike Smith,
but someone else.
Or to search not a phone book, but
Google or search engine more generally.
And so computer scientists also
have a more formal technique
via which to describe the
efficiency of algorithms.
And we might consider these not even
with numbers or formulas, per se,
but with simple visual
representations thereof.
Here, for instance, is a simple plot.
On the x or horizontal axis is
going to be the size of the problem.
How many pages are in the phone
book, which we'll generally call n,
where n is a number for
a computer scientist.
Much like a mathematician
might go with x or y
or z, computer scientists might
tend to start counting with n.
On the vertical or
y-axis here, we'll say
this is going to be the
time to solve a problem,
be it a phone book or something else.
And that might be seconds
or minutes or page turns
or some other quantifiable
unit of measure.
So how do I go about describing
that first algorithm where
I turn one page at a time?
Well I propose that it manifests a
linear relationship, or n, or a line,
a straight line on the chart.
Why is it a straight line?
Well, if Verizon or the local
phone company, for instance,
next year adds just one more page to
that phone book, that means for me,
it might take me one more
unit of measure of time
to find someone like Mike Smith, because
we've gone from 1,000 to 1001 pages.
So the slope of this line indeed
can be straight or linear.
That second algorithm now where
I'm flying through the phone book
twice as fast is really fundamentally
the same algorithm or same shape.
It just so happens that for
a given size of the problem,
it's not going to take
this many seconds n,
it's going to take me n
over 2 seconds because I'm
doing twice as much work at a time.
And so this line, too,
is linear or straight,
but we might describe it in terms of n
over 2, where n is the number of pages
but I only have to look at half of them
except for maybe that very last one
if I double-back, and so its
shape is fundamentally the same.
So better, but not fundamentally
better, it would seem.
But that third and final algorithm
has a fundamentally disparate shape,
depicted here with our green curved line
which has a logarithmic slope to it.
So if n is the number of
pages, and the algorithm in use
is division and conquering, it turns
out that has a logarithmic slope.
Now what does that actually mean?
Well it turns out that if Verizon
adds one more page to the phone book
next year, that is a drop in the
bucket for that final algorithm where
I'm just tearing the problem in half.
But more powerfully, if Verizon
doesn't just add one page,
perhaps to neighboring towns merged
together and form one massive phone
book next year, going from
1,000 pages to 2,000 pages
all in one big, fat
book, well how many more
steps does it take that third
algorithm to search for anyone in it?
Just one.
Because with 2,000 pages, you can take
1,000-page byte out of that problem all
in one fell swoop even though
it's twice as big as before.
And so even as the size of the problem
gets bigger and bigger and bigger
and even farther away,
the amount of time
it takes to solve a problem
using that algorithm only
increases ever so slightly.
That's not a one-to-one
or a one-to-two ratio,
it's instead said to be logarithmic.
And this, then, is a
fundamentally different curve--
it's a fundamentally better algorithm in
that the problem can grow exponentially
large, and the amount of time
it takes to solve it itself
grows far less quickly than that.
Now it's one thing to talk
about all these algorithms.
At some point we need
to put them to paper,
or more technically, program
them into a computer.
So how do we go about formalizing these
step-by-step instructions in such a way
that all of us can agree that
they are correct, all of us
can discuss their efficiency,
and most importantly, all of us
can program a device to
execute these steps for us?
Well let me propose
that we first implement
an algorithm like that of searching a
phone book in what's called pseudocode.
Pseudocode is not a formal programming
language, it has no one definition.
It generally means using short,
succinct phrases, often in English,
to describe your algorithm
step-by-step-by-step in such a way that
you yourself understand it, and anyone
reading it can also understand it.
Now along the way do you have
to make certain assumptions,
because you need to decide at what layer
of abstraction to actually operate.
And we'll take a look at
where the opportunities there
are, but let me propose
that we implement
this algorithm for
searching for Mike Smith,
for instance, in the following way.
Now this is an algorithm, or really, a
program, albeit written in pseudocode.
And even though written in
pseudocode or English, it
manifests certain constructs
that we're actually
going to see in any number of today's
popular programming languages,
a programming language being
an English-like language
that humans have decided
can be used in a certain way
to tell a computer what to do.
Now this pseudocode
is just for us humans,
but what are some of the
fundamental constructs
in it that we'll see in
actual programming languages?
Well first highlighted
in yellow here are
all of the verbs or actions that more
technically we'll call functions.
Statements that tell the computer,
or in this case, human what to do.
Pickup, open to, look at, call,
open or open, or quit-- all of them
calls to actions.
In the context of a computer
with these same types of verbs,
we more traditionally call it functions.
What else is laying
inside of this program?
Well these pictured here in yellow.
If, else if, else if,
else, well these represent
what are called conditions or branches.
Forks in the road, so
to speak, via which
you make a decision to go this way
or this way or this way or that.
But how do you decide
down which road to go?
You need what are called
Boolean expressions.
Named after mathematician
Boole, these Boolean expressions
are short phrases that either
have yes or no answers,
or true or false answers, or if you
will, 1 or 0-- on or off answers.
Smith is among names,
Smith is earlier in book.
Smith is later in book-- each of
them could be asked effectively
with a question mark
to which the answer is
yes or no, and based
on that answer can you
go down any one of those four roads.
And then lastly is this--
go back to step 2 or go back to step 2.
Highlighted in yellow here's an
example of a loop, a deliberate
cycle that gets you to do
something again and again
and again until some
condition is no longer true
and you eventually quit
or call Mike instead.
And so looping in a program
allows you to write less code,
but do something again
and again and again just
by referring back to some
work that you've already done.
Now these constructs, though we've
seen them in the context of pseudocode
and searching a phone book, can be
found in any number of languages
and programs, be it
Python or C++ or Java.
And so when it comes to
programming a computer, or more
generally solving a
problem with a computer,
we'll ultimately express
ourselves in fundamentally
the same way, with functions,
conditions, and Boolean expressions,
and loops, and many
other constructs still,
but we'll do it in a way that's
methodical, we'll do it in a way
wherein we've all
agreed how to represent
the inputs to that process
and the outputs thereto,
and ultimately use that representation
and these layers of abstraction
in order to solve our problems.
But sometimes too much abstraction
is not always a good thing.
Sometimes it's both
necessary and quite helpful
to get into the weeds of the lower level
implementation details, so to speak.
And let's see if we
can't see this together.
Take out if you could a pen or
pencil, as well as a sheet of paper,
and in just a moment allow me
to program you, if you will.
I'll use a bit of verbal
pseudocode to program
you to draw a particular
picture on that sheet of paper.
The onus is on me to choose an
appropriate level of abstraction
so that you're on the
same page, if you will,
as I, so that ultimately what I program
is what you implement correctly.
Let's try.
All right.
First, go ahead if you
would and draw a square.
All right.
And next to that square to the right,
go ahead if you could and draw a circle.
To the right of that circle, go
ahead and draw a diagonal line
from bottom left to top right.
All right, and lastly,
go ahead if you would
and draw a triangle to
the right of that line.
All right.
Now hopefully you're quite
proud of what you just drew,
and it's perfectly correct as you
followed my instructions step-by-step.
So surely what you drew looks like this?
Well that's a square, to a circle to
the right, a straight line from bottom
left to top right, to the
right of which is a triangle.
Now with some probability, you
probably didn't draw quite this.
Fortunately there's no one there to
see, but where might you have gone wrong
or maybe where might I have gone wrong?
Well I didn't exactly specify just how
big that square should be, let alone
where it should be on the page.
And you might have quite reasonably
drawn it right in the center--
maybe the top left or the top right.
But if you drew in a place
that I didn't intend,
you might not have had enough
room for that actual square
unless you flipped
perhaps the paper around.
As for the circle, I said
draw it to the right,
but I didn't technically say that it
should be just as wide or just as high
as that square, so there might
have been a disparity there.
As for that straight line, I said
from bottom left to top right.
I knew that I meant from
the bottom of the circle
to the top right of what
would then be the triangle,
but maybe you started from here
and all the way up to here.
I was making an assumption,
and that, perhaps,
was not necessarily precise enough.
And then lastly, the triangle.
Pretty sure that I learned
growing up that there's
all sorts of different triangles.
Some have right angles, some have
acute angles or obtuse angles,
or some triangles or even isosceles.
Now I happen to intend this one
and you might have drawn just that,
but if you did, you
probably got a bit lucky.
And unfortunately when
it comes to computing,
you don't want to just get lucky
when programming your computer,
you want to actually ensure that
what you write is what it does.
And in fact, if you've
ever seen on your Mac or PC
a spinning beach ball
or hourglass indicating
that something is taking quite long,
or the computer freezes or reboots,
odds are that's because some
programmer, now like myself,
might not have been specific
enough to the computer
as to what to do in all circumstances,
and so sometimes unforeseen behavior
happens.
The computer starts thinking forever.
The computer reboots
or crashes or freezes,
which is simply the result of
one or more lines of code--
not in pseudocode, but in Java
or C++ or something else--
not having anticipated some user's
input or some particular direction.
So maybe it would help
to operate not quite
at this high of a level of abstraction.
And indeed, all of these
shapes are abstractions.
A square, a circle, a line, and a
triangle are abstractions in the sense
that we've ascribed words to them
that have some semantic meaning to us,
but what really is a square?
Well again, if we go
back to grade school,
a square is a rectangle, all of
whose sides are of equal length
and whose angles are all right angles.
But very quickly, my own
eyes start to glaze over
when you get into the weeds of that
implementation, and so all of us
have agreed since to
just call it a square.
And same for circle
and line and triangle,
but even then, there are variations.
So let's get a bit more
into the weeds this time
and let me take an approach
with the second of two pictures,
programming you this time at
a much lower level of detail.
Go ahead and take out another
sheet of paper or flip over the one
that you already have.
Toward the top middle of that sheet of
paper, roughly one inch from the top,
go ahead and put down the
tip of your pen or pencil.
Keeping your pen or pencil
applied to the paper,
start to draw from the
top middle of the paper
down toward the left at roughly a
45 degree angle for two or so inches
and then stop.
Keeping the tip of your
pen or pencil on the paper,
go ahead and draw another line
perhaps three inches straight
down toward the bottom of the
paper, stopping two or so inches
shy of the bottom of the paper.
Then, if you would,
hook a right, this time
moving at a 45-degree angle toward
the bottom-middle of the paper,
stopping ultimately one or so
inches from the bottom of the paper.
Then bear right yet again, this
time going up at a 45-degree angle
upward toward the middle of the paper,
this time extending a couple of inches,
and then stopping at the same height
your previous line started at.
Now draw a vertical line
about three inches high,
stopping exactly where
two lines ago started.
And if you're on the same
page, no pun intended, go ahead
and hook a left this time, going back
a 45-degree angle toward the top-middle
of the page, hopefully
closing the loop, if you will,
and connecting your very first
dot to the last place you stopped.
At this point, you hopefully
have a shape on your page
that has six total sides.
What comes next?
Go ahead and from the top-middle of the
page follow the line you already drew
down to the left at a 45-degree angle
and stop where you made a corner
before--
before you went straight
down on the page.
And instead of following
that vertical line,
go ahead and go down 45 degrees
to the right a couple of inches,
stopping once you're directly
below the top-middle of the dot you
initially drew.
Then go ahead and do two things
from the point you're now at.
Go ahead and draw the opposite line up
and to the right at a 45-degree angle,
connecting that line to one of
the corners you previously made.
And then from that same initial
point, draw a vertical line
down to the bottom-middle of the page
where you had another angle still.
Lift up your pen or pencil and take
pride in what you've just drawn,
because it is most surely exactly this.
Was it?
I don't know, because that was quite
painful for me to say in this program
because I was operating
at such a low level
really with no abstractions other
than line and point and angle.
I was instead directing you much like a
paintbrush on a page to go up and down
and left and right,
collectively creating
what will be an abstraction that I
would dare say call a cube, but a cube
that I did not name by name.
Had I said draw a cube, frankly
we learned from last time
that that could have opened
up multiple possibilities.
What should the angles be?
What should the size be?
What should the rotation there of it be?
And so an abstraction alone
might not have been enough
if I wanted you to draw
precisely this cube.
But if I do want you to be ever so
correct and consistent with what
I had in mind on the
screen here, I really
did need to go down to such
a low level, so to speak,
that I was talking in terms of
edges' length and angles therein.
And even then, I might
not have been as clear
as I might have been-- more
precise measurements and angles
and such might have been
ideal so that you ultimately
end up precisely with the same thing.
So if you veered off-track, not a
problem, my fault more so than yours.
But what's the takeaway here?
Well at some point it
would be nice not to have
to write programs at such a low
level again and again and again.
Wouldn't it be nice if just one
of us, the best artist among us
could implement code--
write code-- that
implements a cube, that
implements a square, a circle,
a line, and a triangle?
But when using those drawing
functions, if you will,
that someone else has
implemented, you should probably
get into the habit of asking
me additional questions.
You want a square.
Well how long should each edge
be and where should its position
be on the page or the canvas?
Same goes for circle
or line or triangle--
what are the angles and
lengths and positioning be?
And if you can parametrize
your functions in that way--
in other words, write your functions
in such a way that they themselves
take input so that what a function does
is at the end of the day produce output
subject to those inputs, we
have then solved a problem.
Not necessarily the one
problem at hand, but we've
solved other people's problems.
And indeed, that's what the world
of programming ultimately is--
writing code to solve
problems, and ideally
solving those problems
once and only once.
And whether it's internally
within your firm or your company,
writing code in such a way that
other people-- maybe yourself
and your colleagues can then use it--
or in the case of open source software,
anyone in the world can use it.
So that in an ideal world,
only one of us humans
ever need write code in some language to
draw a circle or cube or square or line
or triangle, and everyone
else in the world
can stand on his or her
shoulders, use that code
to build their tool
or product on top it.
This, then, was computational thinking.
You have inputs, you want outputs,
and in between, you have algorithms.
You just first need to decide how to
represent those inputs and outputs,
and you have to decide for yourself
at what level of abstractions to work.
