[MUSIC PLAYING]
>> DAVID J. MALAN: All right, this is CS50.
And this is week one.
So recall that last time in week zero,
we focused on computational thinking.
And we transitioned from that to
Scratch, a graphical programming
language from our friends
at MIT's Media Lab.
>> And with Scratch, we did explore
ideas like functions, and conditions,
and loops, and variables, and even
events, and threads, and more.
And today, we're going to
continue using those ideas,
and really taking them for
granted, but translate them
to another language known as C. Now,
C is a more traditional language.
It's a lower level
language, if you will.
>> It's purely textual.
And so at first glance, it's
all going to look rather cryptic
if you've never programmed before.
We're going to have
semi-colons, and parentheses,
and curly braces, and more.
But realize that even
though the syntax is
about to look a little unfamiliar
to most of you, see past that.
And try to see the ideas
that are, indeed, familiar,
because here in week one what
we'll begin to do is to compare,
initially, Scratch versus C.
>> So, for instance, recall that when we
implemented the first of our programs
last time, we had a block that looked
a little something like this-- when
green flag clicked, and then we had
one or more puzzle pieces below it,
in this case, say, hello world.
So, indeed, in Scratch,
when I click that green flag
to run my program, so
to speak, these are
the blocks that get executed, or run.
And, specifically, Scratch
said, hello, world.
>> Now, I could have specified
different words here.
But we'll see that, indeed, many
of these blocks-- and indeed,
in C many functions-- can be
parametrized or customized
to do different things.
In fact, in C if we
want to convert, now,
this Scratch program
to this other language,
we're going to write a
little something like this.
>> Granted, there is some unfamiliar
syntax there most likely, int,
and parentheses, and void.
But printf-- even though you would
think it would just be print--
printf means print
formatted, as we'll soon see.
This literally will print
to the screen whatever
is inside of those parentheses, which
of course in this case is, hello world.
>> But you'll notice some other
syntax, some double quotes,
the parentheses at the end,
the semi-colon and the like.
So there's a bit of overhead,
so to speak, both cognitively
and syntactically, that we're going
to have to remember before long.
But realize that with practice,
this will start to jump out at you.
>> In fact, let's focus on that one
function specifically-- in this case,
say hello world.
So say is the function.
Hello world is its parameter,
or argument, its customisation.
>> And the equivalence in C is just
going to be this one line here,
where printf is equivalent to say,
and the double quoted string, hello
world, is equivalent, of course,
to what's in the white box there.
And the backslash n, though a little
strange and absent from Scratch,
simply is going to have the effect we'll
see in a computer, like my Mac or a PC,
of just moving the
cursor to the next line.
It's like hitting
Enter on your keyboard.
>> So we'll see that again before long.
But first, let's take a look at this
other example in the case of loops.
We had this forever loop last time,
which was a series of puzzle pieces
that did something literally
forever-- in this case,
say, hello world, hello world,
hello world, hello world.
So it's an infinite loop by design.
>> In C, if we want to implement this
same idea, we might simply do this.
While true, printf hello world-- now
while, just semantically, kind of
conjures up the idea of doing
something again, and again, and again,
and for how long?
Well, true-- recall that
true is just on or one.
>> And true is, of course, always true.
So it's kind of a meaningless
statement just to say true.
But indeed, this is deliberate,
because if true is just always true,
then while true just implies,
if a little indirectly,
that the following lines of code
in between those curly braces
should just execute again, and again,
and again, and never actually stop.
>> But if you do want your
loop to stop, as we
did last time with something like
this, repeat the following 50 times,
in C we can do the same with what's
called a for loop-- the keyword
not being while, but for.
And then we have some new syntax here,
with int i equals 0, i less than 50,
i++.
And we'll come back to that.
But this is simply how we would
translate the set of Scratch blocks
to a set of C lines of code.
>> Meanwhile, consider variables.
And, in fact, we just
saw one a moment ago.
And in the case of Scratch, if we
wanted to declare a variable called i
for i being integer, just a number,
and we want to set it to some value,
we would use this orange
block here-- set i to 0.
>> And we'll see today and
beyond, just like last week,
programmers do almost always
start counting from zero, really
by convention.
But also because recall from
our discussion of binary,
the smallest number you can
represent with any number of bits
is just going to be 0 itself.
And so we'll generally start
initializing even our variables to 0.
>> And in C to do the same,
we're going to say int
for integer, i just by convention.
I could have called this variable
anything I want, just like in Scratch.
And then equals 0 just assigns
the value 0 from the right
and puts it into the variable, or the
storage container there, on the left.
And the semi-colon as we'll see-- and
we've seen a few of these already--
just means end of thought.
Proceed to do something else
on the lines that follow.
>> Now, what about Boolean expressions?
Recall that in Scratch,
these were expressions
that are either true
or false-- questions,
really, that are either true or false.
So in the case of Scratch, we might
ask a simple question like this,
is i less than 50?
So i, again, is an integer.
Maybe we're using it
in a Scratch program
to keep track of a score
or something like that.
So this syntax here in Scratch
just means, is i less than 50?
Well, thankfully, something is
simple in C. And to translate
this, we would simply say i less
than 50, using the familiar key
on your keyboard.
>> Meanwhile, if you wanted to
say something more general,
like, well, is x less than y where each
of x and y are themselves variables?
We can do the same thing
in C, so long as we've
created these variables already.
And we'll see how to
do that before long.
We would simply say x less than y.
>> So you're starting to
see some similarities.
And those folks who made
Scratch were certainly
inspired by some of these basic ideas.
And you'll see this kind of
syntax in many languages--
not just Scratch, not
just C, but Python,
and JavaScript, and
other languages still.
>> Let's consider another construct
from C, the notion of a condition,
doing something conditionally.
If something is true, do this.
If something else is true, do that.
It's sort of the programming
equivalent of a fork in the road.
Maybe it's a two-way fork,
a three-way fork, or more.
And in Scratch, we might have
seen something like this.
>> So this one's a big one.
But consider the relative
simplicity of the logic.
If x is less than y, then say x is less
than y, else if x is greater than y,
then say x is greater than y.
And then, logically, if
you think back to Scratch
or just your own human intuition,
well, if x is not greater than y, and x
is not less than y, then of course
x is going to be equal to y.
So in this case, by nesting
those Scratch blocks,
we can achieve a three-way
fork in the road.
>> Meanwhile, if we want to
do that in C, it arguably
looks a little simpler-- at least
once you get familiar with the syntax.
If x is less than y,
printf x is less than y.
Else if x is greater than y,
printf x is greater than y.
Else printf x is equal to y-- and,
again, with those backslash n's just
for those new lines so that if you
actually ran this kind of program
it would just move
your cursor ultimately
to the next line of the screen.
>> Now, meanwhile Scratch had other
more sophisticated features, only
some of which we're going to
initially move over to the world of C.
And one of them was
called a list in Scratch.
And this was a special
type of variable that
allowed you to store multiple things
in it back, to back, to back, to back.
>> C doesn't have
lists, per se, but things
that are more generally
called arrays, although we'll
come back later this semester
to looking at something
called a list, or really a linked list.
But for now, the closest
equivalent in C for us
is going to be something
called an array.
And an array is simply a
special type of variable
that allows you to store data
back, to back, to back, to back.
>> And, indeed, in Scratch,
if we wanted to access
the first element of an array or
a list-- and I'm going to call it,
by convention, argv, argument
vector, but more on that before long.
If I want to get at the first element
of argv, in the world of Scratch
you actually do typically
start counting from 1.
>> And so I might get item 1 of argv.
That's just how MIT implemented
the notion of lists.
But in C, I'm going to
more simply just say, argv,
which again is the name of my
list-- or to be clear, an array.
And if I want the first
element, I'm going
to use square brackets, which you
might not often use on a keyboard.
>> But 0 just means, get me the first.
So on occasion and as
time passes, we're going
to start to see these dichotomies
between Scratch and C,
whereby Scratch uses one.
We in C use 0 here.
But you'll quickly see
once you understand
the foundations of each language, that
these things start to get all the more
familiar through practice and practice.
>> So let's actually look now at a program.
Here shall be the first of our C
source code for complete programs.
And the program we're going
to offer for consideration
is the one that's equivalent
to that earlier Scratch piece.
>> So in here, we have what's
arguably the simplest C program
you can write that
actually does something.
Now, we'll look past,
for now, hash include,
standard io.h, and these angle
brackets, and int, and void,
and the curly braces, and the like.
>> And let's just focus on
what, at least intuitively,
might jump out at you already.
In fact, main, I don't
necessarily know what this is,
but much like Scratch had that when
green flag clicked puzzle piece,
so does C as a programming language
have a main piece of code that
gets executed by default. And, indeed,
it's literally going to be called main.
>> So main is a function.
And it's a special function that exists
in C that when you run a program,
it is main that gets run by
default. In the world of Scratch,
it was usually when green flag
clicked that got run by default.
>> Meanwhile, we've seen this before,
printf or print formatted, that's
going to be a function that comes with
C, along with a whole bunch of others,
that we'll use time and time
again, in order to do exactly
as its name suggests, print something.
What do we want to print?
Well, we'll see that
by enclosing characters
like these-- hello world,
backslash n in double quotes,
we can tell printf exactly
what to print on the screen.
>> But in order to do
that, we unfortunately
need to take something that is
already cryptic to us humans,
but at least it's somewhat readable--
sharp include, standard io.h, int,
main, void, printf, all of the magical
incantations we just saw on the screen.
But we actually have to
go more arcane still.
We first need to translate the code
that we write into machine code.
And recall from last week that machines,
at least the ones we know here,
at the end of the day only
understand zeros and ones.
>> And my God, if we had to write these
zeros and ones to actually program,
it would very, very quickly
take the fun out of anything.
But it turns out, per last week,
that these patterns of zeros and ones
just have special meaning.
In certain contexts,
they might mean numbers.
>> In some contexts, they might mean
letters, or colors, or any number
of other abstractions there upon.
But your computer has
a CPU, Central Processing Unit,
or the brains inside of your computer.
It's usually Intel
inside, because that's
one of the biggest companies
that makes CPUs for computers.
>> Well, Intel CPUs and others
simply have decided in advance
that certain patterns of zeros and
ones shall mean specific things.
Certain patterns of zeros and ones
will mean, print this to the screen,
or add these two numbers, or
subtract these two numbers,
or move this piece of data from
my computer's memory over here,
or any number of other very low level,
but ultimately useful, operations.
But, thankfully, we humans are not going
to need to know this level of detail.
Indeed, just like last time, where we
abstracted again, and again, and again,
building from very low level
primitives like zeros and ones
to higher level concepts
like numbers, and letters,
and colors, and more,
so can we as programmers
stand on the shoulders of
others who have come before us
and use software that other
people have written before us--
namely programs called compilers.
>> C is a language that
is usually compiled,
which means converted from
source code to machine code.
In particular, what this means
is that if you've got your source
code that you yourself write, as we soon
will in just a moment on the screen,
and you want to convert it
ultimately to machine code--
those zeros and ones that
only your Mac or your PC
understands-- you've got to first
feed that source code in as
input to a special
program called a compiler,
the output of which we
shall see is machine code.
And, indeed, last time we talked
about, really, at the end of the day,
problem solving.
You've got inputs.
And you've got outputs.
And you've got some kind
of algorithm in the middle.
>> Algorithms can surely be
implemented in software,
as we saw with pseudocode last week
and as we'll see with actual code
this week.
And so a compiler really just
has a set of algorithms inside
of it that know how to
convert the special keywords,
like main, and printf,
and others that we just
saw into the patterns of zeros and
ones that Intel inside and other CPUs
actually understand.
So how do we do this?
Where do we get a compiler?
>> Most of us here have a Mac or a PC.
And you're running Mac OS, or
Windows, or Linux, or Solaris,
or any number of other
operating systems.
And, indeed, we could
go out onto the web
and download a compiler
for your Mac or your PC
for your particular operating system.
But we would all be on
different pages, so to speak.
We'd have slightly
different configurations.
And things wouldn't work all the same.
And, indeed, these days
many of us don't use
software that runs only on our laptops.
Instead, we use something
like a browser that
allows us to access web-based
applications in the cloud.
And later this semester,
we will do exactly that.
We will write applications or
software using code-- not C,
but other languages like Python and
JavaScript-- that run in the cloud.
>> And to do that, we ourselves
during the semester
will actually use a cloud-based
environment known as CS50 IDE.
This is a web-based programming
environment, or integrated development
environment, IDE, that's built atop some
open source software called Cloud9.
And we've made some pedagogical
simplifications to it
so as to hide certain features in
the first weeks that we don't need,
after which you can
reveal them and do most
anything you want with the environment.
>> And it allows us, too, to
pre-install certain software.
Things like a so-called CS50
library, which we'll soon see
provides us in C with some
additional functionality.
So if you go to, ultimately, CS50.io,
you'll be prompted to log in,
and once you do and create
an account for free,
you will be able to access an
environment that looks quite like this.
>> Now, this is in the default mode.
Everything is nice and
bright on the screen.
Many of us have a habit of
working on CS50 psets
quite late into the night.
And so some of you might prefer to
turn it into night mode, so to speak.
>> But, ultimately, what you're
going to see within CS50 IDE
is three distinct areas--
an area on the left where
your files are going to be in the
cloud, an area on the top right
where your code is going to be editable.
You'll be able to open
individual tabs for any program
that you write this semester inside
of that top right hand corner.
And then most arcanely,
and yet powerfully,
is going to be this thing at the
bottom known as a terminal window.
>> This is an old school
Command Line Interface,
or CLI, that allows
you to execute commands
on the computer-- in this case,
the computer in the cloud--
to do things like compile your code
from source code to machine code,
to run your programs, or to start your
web server, or to access your database,
and any number of other techniques
that we'll start to use before long.
But to get there, we're
going to actually have
to go online and start playing.
And to do that, let's first
start tinkering with main,
and write the main part of a program.
And let's use that function
printf, which we used earlier,
simply to say something.
>> So here I am already inside of CS50 IDE.
I've logged in in advance.
And I full screened the window.
And so, ultimately, you
too in coming problem sets
will follow similar steps, for which
we'll provide online documentation.
So you don't need to worry about
absorbing every little technical step
that I do here today.
>> But you'll get a screen like this.
I happen to be in night mode.
And you can brighten everything
up by disabling night mode.
And at the end of the
day, you're going to see
these three main areas-- the file
browser at left, the code tabs up top,
and the terminal window at the bottom.
>> Let me go ahead and
write my first program.
I'm going to preemptively go to File,
Save, and save my file as hello.c.
Indeed, by convention, any program we
write that's written in the C language
should be named something
dot c, by convention.
So I'm going to name it hello.c, because
I just want to say hello to the world.
Now I'm going to zoom
out and click Save.
And all I have here now is a tab
in which I can start writing code.
>> This is not going to compile.
This means nothing.
And so even if I converted
this to zeros and ones,
the CPU is going to have no
idea what's going on.
But if I write lines that do match
up with C's conventions-- C being,
again, this language-- with syntax like
this, printf hello world-- and I've
gotten comfortable with
doing this over time.
So I don't think I made
any typographical errors.
>> But, invariably, the very first
time you do this, you will.
And what I am about to do might very
well not work for you the first time.
And that's perfectly OK,
because right now you
might just see a whole lot of newness,
but over time once you get familiar
with this environment, and
this language, and others,
you'll start to see things that
are either correct or incorrect.
>> And this is what the
teaching fellows and course
assistants get so good at over time, is
spotting mistakes or bugs in your code.
But I claim that there
are no bugs in this code.
So I now want to run this program.
>> Now on my own Mac or PC, I'm in
the habit of double clicking icons
when I want to run some program.
But that's not the model here.
In this environment, which is CS50 IDE,
we are using an operating
system called Linux.
Linux is reminiscent of another
operating system, generally known
as Unix.
And Linux is particularly known for
having a Command Line Environment, CLI.
Now, we're using a specific
flavor of Linux called Ubuntu.
And Ubuntu is simply a
certain version of Linux.
>> But these Linuxes these days do actually
come with graphical user interfaces.
And the one we happen to
be using here is web-based.
So this might look even a
little different from something
you yourself might have
seen or run in the past.
>> So I'm going to go ahead
now and do the following.
I've saved this file as hello.c.
I'm going to go ahead and
type clang hello.c. So Clang
for the C language is a compiler.
It's pre-installed in CS50 IDE.
And you can absolutely download and
install this on your own Mac or PC.
>> But, again, you wouldn't have all of
the pre-configuration done for you.
So for now, I'm just
going to run clang hello.c.
And now notice this syntax
here, you'll eventually
realize, just means that I'm in a
folder or directory called Workspace.
This dollar sign is just convention
for meaning, type your commands here.
>> It's what's called a prompt, and just
by convention it's a dollar sign.
And if I go ahead now and hit
Enter, nothing seems to have happened.
But that's actually a good thing.
The less that happens on
your screen, the more likely
your code is to be correct,
at least syntactically.
>> So if I want to run this
program, what do I do?
Well, it turns out that the
default name by convention
for programs when you don't specify a
name for your program is just a.out.
And this syntax too, you'll
get familiar with before long.
>> Dot slash just means, hey, CS50
IDE, run a program called a.out
that's inside my current directory.
That dot means the current directory.
And we'll see what other such sequences
of characters mean before long.
>> So here we go, Enter, hello world.
And you'll notice, that what happened?
Not only did it print hello world.
It also moved the
cursor to the next line.
>> And why was that?
What was the code that we wrote before
that ensured that the cursor would
go on the next line?
Funny thing about a
computer is it's only going
to do literally what you tell it to do.
>> So if you tell it to printf hello,
comma, space, world, close quote,
it's literally only going
to print those characters.
But I had this special character
at the end, recall, backslash n.
And that's what ensured
that the cursor went
to the next line of the screen.
>> In fact, let me go and do this.
Let me go ahead and delete this.
Now, notice that at the
top of my screen there's
a little red light in
the tab indicating,
hey, you've not saved your file.
So I'm going to go ahead with control
S or command S, save the file.
Now it goes-- went for a moment-- green.
And now it's back to
just being a close icon.
>> If I now run clang hello.c again,
Enter, dot slash, a.out, Enter,
you'll see that it still worked.
But it's arguably a little buggy.
Right now, my prompt-- workspace,
and then that dollar sign,
and then my actual prompt--
is all on the same line.
So this is certainly an aesthetic bug,
even if it's not really a logical bug.
>> So I'm going to undo what I just did.
I'm going to rerun a.out.
Notice I've added the
newline character back.
I've saved the file.
>> So I'm going to rerun a.out, and--
dammit, a bug, a bug meaning mistake.
So the bug is that even though
I added the backslash n there,
re-saved, re-ran the program,
the behavior was the same.
Why would that be?
>> I'm missing a step, right?
That key step earlier was that you have
to-- when you change your source code,
it turns out also run
it through the compiler
again so you get new machine code.
And the machine code,
the zeros and ones,
are going to be almost identical, but
not perfectly so, because we need,
of course, that new line.
>> So to fix this, I'm going to need
to rerun clang hello.c, Enter, dot
slash, a.out.
And now, hello world is back
to where I expect it to be.
So this is all fine and good.
But a.out is a pretty stupid name for a
program, even though it happens to be,
for historical reasons, the
default-- meaning assembler output.
>> But let me go ahead here
and do this differently.
I want my hello world program
to actually be called hello.
So if it were an icon on my
desktop, it wouldn't be a.out.
It would be called hello.
>> So to do this, it turns out
that Clang, like many programs,
supports command line arguments,
or flags, or switches,
which simply influence its behavior.
Specifically, Clang supports a dash o
flag, which then takes a second word.
In this case, I'll arbitrarily,
but reasonably, call it hello.
But I could call it anything
I want, except a.out, which
would be rather beside the point.
>> And then just specify the name
of the file I do want to compile.
So now even though at the beginning
of the command I still have Clang,
at the end of the command
I still have the filename,
I now have these command line
arguments, these flags that are saying,
oh, by the way, output, dash o, a file
called hello, not the default a.out.
>> So if I hit Enter now, nothing
seems to have happened.
And, yet, now I can do dot slash hello.
So it's the same program.
The zeros and ones are
identical at the end of the day.
>> But they're in two
different files-- a.out,
which is the first version
and just foolishly named,
and now hello, which is a much
more compelling name for a program.
But, honestly, I am never
going to remember this again,
and again, and again.
And, actually, as we write
more complicated programs,
the commands you're
going to have to write
are going to get even
more complicated still.
>> And so not to worry.
It turns out that humans before
us have realized they too
had this exact same problem.
They too did not enjoy having to
type fairly long, arcane commands,
let alone remember them.
And so humans before us have made
other programs that make it easier
to compile your software.
>> And, indeed, one such
program is called Make.
So I'm going to go ahead and do this.
I'm going to undo everything I
just did in the following way.
Let me type ls.
And you'll notice three things--
a.out, and a star, hello
and a star, and hello.c.
Hopefully, this should
be a little intuitive,
insofar as earlier there was
nothing in this workspace.
There was nothing that I had
created until we started class.
>> And I created hello.c.
I then compiled it, and called it a.out.
And then I compiled it again slightly
differently and called it hello.
So I have three files in this directory,
in this folder called Workspace.
Now, I can see that as well
if I zoom out actually.
>> If I zoom out here and
look at that top left hand
corner, as promised the left
hand side of your screen
is always going to show you
what's in your account, what's
inside of CS50 IDE.
And there are three files there.
>> So I want to get rid of a.out and hello.
And as you might
imagine intuitively, you
could sort of control click
or right click on this.
And this little menu pops up.
You can download the file, run
it, preview it, refresh, rename,
or what not.
>> And I could just delete,
and it would go away.
But let's do things with a command
line for now, so as to get comfortable
with this, and do the following.
I'm going to go ahead and remove
a.out by typing literally rm a.out.
It turns out, the command for
removing or deleting something,
is not remove or delete.
>> It's more succinctly rm, just to save
you some keystrokes, and hit Enter.
Now we're going to be asked, somewhat
cryptically, remove regular file a.out?
I don't really know what an
irregular file would be yet.
But I do want to remove it.
>> So I'm going to type y for yes.
Or I could type it out, and hit Enter.
And, again, nothing seems to happen.
But that is, generally, a good thing.
>> If I type ls this time,
what should I see?
Hopefully, just hello and hello.c.
Now, as an aside, you'll
notice this star, asterisk,
that's at the end of my programs.
And they're also showing up in green.
That is just CS50 IDE's way
of cluing you into the fact
that that's not source code.
That's an executable, a runnable
program that you can actually run
by doing dot slash, and then its name.
>> Now, let me go ahead and remove
this, rm hello, Enter, remove regular
file hello, yes.
And now if I type ls,
we're back to hello.c.
Try not to delete your
actual source code.
Even though there are features
built into CS50 IDE where
you can go through your revision history
and rewind in time if you accidentally
delete something, do be mindful
as per these prompts yes or no,
of what you actually want to do.
And if I go up to the top
left hand corner here,
all that remains is hello.c.
So there's bunches of
other commands that you
can execute in the world of Linux,
one of which is, again, Make.
And we're going to Make
my program now as follows.
>> Instead of doing clang,
instead of doing clang -o,
I'm going to simply
literally type, make hello.
And now notice, I am
not typing make hello.c.
I am typing make hello.
>> And this program Make that
comes with CS50 IDE, and more
generally with Linux,
is a program that's
going to make a program called Hello.
And it's going to assume, by convention,
that if this program can be made,
it's going to be made from a source
code file ending in dot c, hello.c.
>> So if I hit Enter now, notice that
the command that gets executed
is actually even longer
than before.
And that's because we've
preconfigured CS50 IDE to have
some additional features built in that
we don't need just yet, but soon will.
But the key thing to realize
is now I have a Hello program.
>> If I type ls again, I
have a hello program.
And I can run it with
dot slash a.out, no,
because the whole point of this
exercise was dot slash hello.
And now I have my hello world program.
So moving forward,
we're almost always just
going to compile our programs
using the command Make.
And then we're going to run them by
dot slash, and the program's name.
But realize what Make is doing for
you, is it is itself not a compiler.
It's just a convenience program
that knows how to trigger a compiler
to run so that you yourself can use it.
>> What other commands exist in
Linux, and in turn the CS50 IDE?
We'll soon see that there's a
cd command, change directory.
This allows you within
your command line interface
to move forward, and back,
and open up different folders
without using your mouse.
>> ls we saw, which stands for list
the files in the current directory.
mkdir, you can
probably start to infer
what these mean now-- make directory,
if you want to create a folder.
rm for remove, rmdir for
remove directory-- and these,
again, are the command line
equivalents of what you
could do in CS50 IDE with your mouse.
But you'll soon find
that sometimes it's just
a lot faster to do
things with a keyboard,
and ultimately a lot more powerful.
>> But it's hard to argue that
anything we've been doing so far
is all that powerful, when all
we've been saying is, hello world.
And, in fact, I hardcoded the
words hello world into my program.
There is no dynamism yet.
Scratch was an order of magnitude
more interesting last week.
>> And so let's get there.
Let's take a step toward that by
way of some of these functions.
So although C comes with printf,
and bunches of other functions
some of which we'll see
over time, it doesn't
make it all that easy right out
of the gate in getting user input.
>> In fact, one of the weaknesses
of languages like C,
and even Java and yet
others, is that they don't
make it easy to just get things like
integers from users, or strings, words,
and phrases, let alone things like
floating point values, or real numbers
with decimal points, and really
long numbers, as we'll soon see.
So this list of functions here, these
are like other Scratch puzzle pieces
that we have pre-installed in CS50
IDE that we'll use for a few weeks
as training wheels of sorts, and
eventually take them off, and look
underneath the hood, perhaps, at
how these things are implemented.
>> But to do this, let's
actually write a program.
Let me go ahead now.
And I'm going to create a new
file by clicking this little plus,
and clicking New File.
>> I'm going to save this next
one as, let's say, string.c,
because I want to play with strings.
And string in C is just
a sequence of characters.
So now let's go ahead
and do the following.
>> Include standard IO.h-- and
it turns out standard IO,
IO just means input and output.
So it turns out that
this line here is what
is enabling us to use printf.
Printf, of course, produces output.
So in order to use printf, it turns
out you have to have this line of code
at the top of your file.
>> And we'll come back to what
that really means before long.
It turns out that in
any C program I write,
I've got to start it with
code that looks like this.
And you'll notice CS50 IDE, and
other integrated development
environments like it,
are going to try as best
they can to finish your thought.
In fact, a moment ago if I undo
what I just did, I hit Enter.
>> I then hit open curly
brace, hit Enter again.
And it finished my thought.
It gave me a new line, indented no less
for nice stylistic reasons we'll see.
And then it automatically gave me
that curly brace to finish my thought.
Now, it doesn't always
guess what you want to do.
But in large part, it does
save you some keystrokes.
So a moment ago, we ran this program--
hello, world, and then compiled it,
and then ran it.
But there's no dynamism here.
What if we wanted to
do something different?
Well, what if I wanted to actually
get a string from the user?
I'm going to use a puzzle piece
called exactly that-- get string.
>> Turns out in C that when you don't want
to provide input to a puzzle piece,
or more properly to a function, you
literally just do open parenthesis,
close parenthesis.
So it's as though there's
no white box to type into.
The say block before
had a little white box.
We don't have that white box now.
>> But when I call get string, I
want to put the result somewhere.
So a very common paradigm in C is to
call a function, like get string here,
and then store its return value.
It's the result of its
effort in something.
>> And what is the
construct in programming,
whether in Scratch or now C, that we
can use to actually store something?
We called it a variable, right?
And in Scratch, we didn't really
care what was going in variables.
>> But in this case, we actually do.
I'm going to say string.
And then I could call
this anything I want.
I'm going to call it
name, gets get string.
>> And now even if you're
a little new to this,
notice that I'm lacking some detail.
I'm forgetting a semi-colon.
I need to finish this thought.
So I'm going to move my cursor,
and hit semi-colon there.
And what have I just done?
In this line of code,
number 5 at the moment,
I'm calling get string with no inputs.
So there's no little white
box like the say block has.
>> I'm just saying, hey,
computer, get me a string.
The equal sign is not really
an equal sign, per se.
It's the assignment
operator, which means,
hey, computer, move the value
from the right over to the left.
And in the left, I have the following.
>> Hey, computer, give me a string--
a sequence of characters.
And call that string Name.
And I don't even have to call it Name.
>> I could call it, conventionally,
something like S,
much like we used i to
call the variable i.
But now I need to do something with it.
It would be pretty stupid to
try compiling this code, running
this program, even though
I'm getting a string,
because it's still just
going to say hello world.
>> But what if I do want to change this.
Why don't I do this?
Percent s, comma s.
And this is a little cryptic still.
>> So let me make my variables more clear.
Let me name this variable Name.
And let's see if we can't tease
apart what's happening here.
>> So on line five, I'm getting a string.
And I'm storing that string,
whatever the user has typed in
at his or her keyboard,
in a variable called Name.
And it turns out that
printf doesn't just
take one argument in double
quotes, one input in double quotes.
>> It can take two, or three, or more, such
that the second, or third, or fourth,
are all the names of variables,
or specifically values,
that you want to plug into,
dynamically, that string in quotes.
In other words, what
would be wrong with this?
If I just said hello name, backslash
n, saved my file, compiled my code,
and ran this, what would happen?
>> It's just going to say, hello
name, literally N-A-M-E,
which is kind of stupid because
it's no different from world.
So anything in quotes is
what literally gets printed.
So if I want to have
a placeholder there,
I actually need to use
some special syntax.
And it turns out if you read the
documentation for the printf function,
it will tell you that
if you use percent s,
you can substitute a value as follows.
>> After a comma after that
double quote, you simply
write the name of the
variable that you want
to plug in into that format
code, or format specifier,
percent s for strings.
And now if I've saved my file,
I go back down to my terminal.
And I type Make String,
because, again, the name of this
file that I chose before is string.c.
>> So I'm going to say Make String, enter.
Oh my goodness, look at all of
the mistakes we've made already.
And this is-- what, this is really
like a six, seven line program?
So this is where it can very
quickly get overwhelming.
>> This terminal window has
now just regurgitated
a huge number of error messages.
Surely, I don't have more error
messages than I have lines of code.
So what is going on?
>> Well, the best strategy
to do anytime you
do encounter an overwhelming
list of errors like that,
is scroll back, look for the command
you just ran, which in my case
is make string.
Look at what make did, and that's that
long Clang command, no big deal there.
>> But the red is bad.
Green is trying to be
gentle and helpful.
But it's still bad, in this case.
But where is it bad?
>> String.c, line five, character five.
So this is just common convention.
Something colon something means
line number and character number.
Error, use of undeclared
identifier string.
Did you mean standard in?
>> So, unfortunately, Clang
is trying to be helpful.
But it's wrong, in this case.
No, Clang, I did not mean standard IO.
I meant that on line one, yes.
>> But line five is this one here.
And Clang does not
understand S-T-R-I-N-G.
It's an undeclared identifier, a
word it just has never seen before.
And that's because C, the language
we're writing code in right now,
does not have variables called strings.
>> It doesn't, by default, support
something called a string.
That's a CS50 piece of
jargon, but very conventional.
But I can fix this as follows.
>> If I add one line of code
to the top of this program,
include CS50.h, which is another file
somewhere inside of CS50 IDE, somewhere
on the hard drive, so to speak,
of the Ubuntu operating system
that I'm running, that
is the file that's
going to teach the operating
system what a string is, just
like standard io.h is the file
in the operating system that's
going to teach it what printf is.
>> Indeed, we would have gotten
a very similar message
if I had omitted standard
IO.h and tried to use printf.
So I'm going to go ahead and just
hit Control L to clear my screen.
Or you can type clear and it will
just clear the terminal window.
But you can still scroll back in time.
>> And I'm going to rerun Make String.
Cross my fingers this time, Enter.
Oh my God, it worked.
It shows me a long cryptic command
that is what Make generated via Clang,
but no error messages.
So realize, even though
you might get completely
overwhelmed with the
number of error messages,
it just might be this annoying cascading
effect, where Clang doesn't understand
one thing, which means it then
doesn't understand the next word,
or the next line.
And so it just chokes on your code.
But the fix might be simple.
And so always focus on the
very first line of output.
And if you don't
understand it, just look
for keywords that might be
clues, and the line number,
and the character, where
that mistake might be.
>> Now let me go ahead and type
dot slash, string, enter.
Hm, it's not saying hello anything.
Why?
Well, recall, where is it running?
>> It's probably stuck at the moment
in a loop, if you will, on line six,
because Get String by design,
written by CS50 staff,
is literally meant to just sit
there waiting, and waiting,
and waiting for a string.
All we mean by string is human input.
So you know what?
Let me go ahead.
And just on a whim, let me
type my name, David, enter.
Now I have a more dynamic program.
It said, hello David.
>> If I go ahead and run this again,
let me try, say, Zamila's name, enter.
And now we have a dynamic program.
I haven't hard coded world.
I haven't hard coded
name, or David, or Zamila.
>> Now it's much more like the programs
we know, where if it takes input,
it produces slightly different output.
Now, this is not the best
user experience, or UX.
I run the program.
>> I don't know what I'm supposed
to do, unless I actually look at
or remember the source code.
So let's make the user
experience a little better
with the simplest of things.
Let me go back into this
program, and simply say printf.
>> And let me go ahead and say name, colon,
and a space, and then a semi-colon.
And just for kicks, no backslash n.
And that's deliberate,
because I don't want
the prompt to move to the next line.
>> I want to, instead, do this, make string
to recompile my code into new machine
code dot slash string.
Ah, this is much prettier.
Now I actually know what the computer
wants me to do, give it a name.
>> So I'm going to go ahead and type
in Rob, enter, and hello, Rob.
So, realize, this is still, at the end
of the day, only a nine line program.
But we've taken these baby steps.
>> We wrote one line with which we
were familiar, printf, hello world.
Then we undid a little bit of that.
And we actually used get string.
And we tossed that value in a variable.
And then we went ahead and improved
it further with a third line.
And this iterative process of
writing software is truly key.
In CS50, and in life in general,
you should generally not sit down,
have a program in mind, and try writing
the whole damn thing all at once.
>> It will, inevitably, result in way
more errors than we ourselves saw here.
Even I, to this day, constantly
make other stupid mistakes,
or actually harder mistakes
that are harder to figure out.
But you will make more mistakes the more
lines of code you write all at once.
And so this practice of,
write a little bit of code
that you're comfortable with, compile
it, run it, test it more generally,
then move on-- so just like we kept
layering and layering last week,
building from something very
simple to something more complex,
do the same here.
Don't sit down, and try to
write an entire problem.
Actually take these baby steps.
>> Now, strings aren't all
that useful unto themselves.
We'd actually, ideally, like to
have something else in our toolkit.
So let's actually do exactly that.
>> Let me go ahead now and whip up
a slightly different program.
And we'll call this int.c, for integer.
I'm going to, similarly,
include CS50.h.
I'm going to include standard IO.
And that's going to be pretty common
in these first few days of the class.
>> And I'm going to ready
myself with a main function.
And now instead of getting a string,
let's go ahead and get an int.
Let's call it i, and call it get
int, close parens, semi-colon.
And now let's do
something with it, printf.
>> Let's say something like
hello, percent s, backslash n, comma i.
So I'm pretty much mimicking
what I did just a moment ago.
I have a placeholder here.
I have comma i here, because I want
to plug i into that placeholder.
>> So let's go ahead and try
compiling this program.
The file is called int.c.
So I'm going to say, make int, enter.
Oh my God, but no big deal, right?
There's a mistake.
>> There's a syntactic mistake
here such that the program can't
be compiled inside int.c, line
seven, character 27, error format
specifies type char
star, whatever that is.
But the argument type is int.
>> So here, too, we're not going to--
even though today is a lot of material,
we're not going to overwhelm you with
absolutely every feature of C,
and programming more generally,
in just these first few weeks.
So there's often going to be jargon
with which you're not familiar.
And, in fact, char star is something
we're going to come back to
in a week or two's time.
>> But for now, let's see if we can
parse words that are familiar.
Formats-- so we heard format
specifier, format code before.
That's familiar.
Type-- but the argument has type int.
Wait a minute, i is an int.
>> Maybe percent s actually
has some defined meaning.
And, indeed, it does.
An integer, if you want
printf to substitute it,
you actually have to use a
different format specifier.
And you wouldn't know this
unless someone told you,
or you had done it before.
But percent i is what
can be commonly used
in printf for plugging in an integer.
You can also use percent
d for a decimal integer.
But i is nice and simple here.
So we'll go with that.
>> Now let me go ahead and
rerun make int, Enter.
That's good, no errors.
Dot slash int-- OK, bad user experience,
because I haven't told myself
what to do.
But that's fine.
I'm catching on quickly.
>> And now let me go ahead and
type in David, OK, Zamila, Rob.
OK, so this is a good thing.
This time, I'm using a function,
a puzzle piece, called get int.
And it turns out-- and we'll
see this later in the term--
the CS50 staff has implemented
get string in such a way
that it will only physically
get a string for you.
>> It has implemented get int in
such a way that it will only
get an integer for you.
And if you, the human,
don't cooperate, it's
literally just going to
say retry, retry, retry,
literally sitting there looping, until
you oblige with some magical number,
like 50, and hello 50.
>> Or if we run this again
and type in 42, hello 42.
And so the get int function
inside of that puzzle piece
is enough logic, enough thought,
to figure out, what is a word?
And what is a number?
Only accepting, ultimately, numbers.
>> So it turns out that this
isn't all that expressive so far.
So, yay, last time we
went pretty quickly
into implementing games, and animation,
and artistic works in Scratch.
And here, we are being content
with hello world, and hello 50.
>> It's not all that inspiring.
And, indeed, these first few
examples will take some time
to ramp up in excitement.
But we have so much more
control now, in fact.
And we're going to very
quickly start layering
on top of these basic primitives.
>> But first, let's understand
what the limitations are.
In fact, one of the things
Scratch doesn't easily
let us do is really look
underneath the hood,
and understand what a
computer is, what it can do,
and what its limitations are.
And, indeed, that lack of
understanding, potentially, long-term
can lead to our own mistakes-- writing
bugs, writing insecure software that
gets hacked in some way.
>> So let's take some steps toward
understanding this a little better by
way of, say, the following example.
I'm going to go ahead and implement
real quick a program called Adder.
Like, let's add some numbers together.
And I'm going to cut some corners
here, and just copy and paste
where I was before, just
so we can get going sooner.
So now I've got the basic beginnings
of a program called Adder.
>> And let's go ahead and do this.
I'm going to go ahead and
say, int x gets get int.
And you know what?
Let's make a better user experience.
>> So let's just say x is, and effectively
prompt the user to give us x.
And then let me go ahead and say, printf
how about y is, this time expecting
two values from the user.
And then let's just go ahead and
say, printf, the sum of x and y is.
And now I don't want to do percent s.
I want to do percent i, backslash
n, and then plug in some value.
>> So how can I go about doing this?
You know what?
I know how to use variables.
Let me just declare a new one, int z.
>> And I'm going to take a guess here.
If there are equal signs in this
language, maybe I can just do x plus y,
so long as I end my
thought with a semi-colon?
Now I can go back down here, plug in z,
finish this thought with a semi-colon.
And let's see now, if these
sequences of lines-- x is get int.
Y is get int.
>> Add x and y, store the value in z--
so, again, remember the equal sign
is not equal.
It's assignment from right to left.
And let's print out that the sum
of x and y is not literally z,
but what's inside of z.
So let's make Adder --
nice, no mistakes this time.
Dot slash Adder, enter,
x is going to be 1.
>> Y is going to be 2.
And the sum of x and y is 3.
So that's all fine and good.
>> So you would imagine that math
should work in a program like this.
But you know what?
Is this variable, line
12, even necessary?
You don't need to get in the habit
of just storing things in variables
just because you can.
And, in fact, it's generally
considered bad design
if you are creating a variable, called
z in this case, storing something in it,
and then immediately
using it, but never again.
Why give something a name
like z if you're literally
going to use that
thing only once, and so
proximal to where you created
it in the first place,
so close in terms of lines of code?
So you know what?
It turns out that C is pretty flexible.
If I actually want to
plug-in values here,
I don't need to declare a new variable.
I could just plug-in x plus
y, because C understands
arithmetic, and mathematical operators.
>> So I can simply say, do this math,
x plus y, whatever those values are,
plug the resulting
integer into that string.
So this might be, though
only one line shorter,
a better design, a better program,
because there's less code, therefore
less for me to understand.
And it's also just cleaner,
insofar as we're not
introducing new words,
new symbols, like z,
even though they don't really
serve much of a purpose.
>> Unfortunately, math isn't
all that reliable sometimes.
Let's go ahead and do this.
I'm going to go ahead
now and do the following.
>> Let's do printf, percent i, plus percent
i, shall be percent i, backslash n.
And I'm going to do this-- x, y, x plus y.
So I'm just going to rewrite
this slightly differently here.
Let me just do a quick sanity check.
Again, let's not get ahead of ourselves.
Make adder, dot slash adder.
x is 1, y is 2, 1 plus 2 is 3.
So that's good.
But let's complicate this now
a bit, and create a new file.
>> I'm going to call this one,
say, ints, plural for integers.
Let me start where I was a moment ago.
But now let's do a few other lines.
Let me go ahead and do the following,
printf, percent i, minus percent i,
is percent i, comma x, comma y, x minus y.
So I'm doing slightly
different math there.
Let's do another one.
So percent i times percent
i is percent i, backslash n.
Let's plug-in x, and y, and x times y.
We'll use the asterisk on
your computer for times.
>> You don't use x. x is
a variable name here.
You use the star for multiplication.
Let's do one more.
Printf percent i, divided
by percent i, is percent i,
backslash n. x, y, x divided by y--
so you use the forward slash in C
to do division.
And let's do one other.
Remainder of percent i, divided
by percent i, is percent i.
x, y-- and now remainder
is what's left over.
When you try dividing a
denominator into a numerator,
how much is left over that
you couldn't divide out?
>> So there isn't really,
necessarily, a symbol
we've used in grade school for this.
But there is in C. You can
say x modulo y, where
this percent sign in this context--
confusingly when you're inside
of the double quotes,
inside of printf, percent
is used as the format specifier.
>> When you use percent outside of
that in a mathematical expression,
it's the modulo operator for modular
arithmetic-- for our purposes
here, just means, what is the
remainder of x divided by y?
So x divided by y is x slash y.
What's the remainder of x divided by y?
It's x mod y, as a programmer would say.
>> So if I made no mistakes here, let me
go ahead and make ints, plural, nice,
and dot slash ints.
And let's go ahead and
do, let's say, 1, 10.
All right, 1 plus 10 is 11, check.
1 minus 10 is negative 9, check.
>> 1 times 10 is 10, check.
1 divided by 10 is--
OK, we'll skip that one.
Remainder of 1 divided by 10 is 1.
That's correct.
But there's a bug in here.
>> So the one I put my
hand over, not correct.
I mean, it's close to 0.
1 divided by 10, you know, if we're
cutting some corners, sure, it's zero.
But it should really be 1/10,
0.1, or 0.10, 0.1000, or so forth.
>> It should not really be zero.
Well, it turns out that the computer is
doing literally what we told it to do.
We are doing math like x divided by y.
And both x and y, per the lines
of code earlier, are integers.
>> Moreover, on line 15, we are
telling printf, hey, printf plug-in
an integer, plug-in an integer,
plug-in an integer-- specifically
x, and then y, and then x
divided by y. x and y are ints.
We're good there.
>> But what is x divided by y?
x divided by y should be,
mathematically, 1/10, or 0.1,
which is a real number, a real number
having, potentially, a decimal point.
It's not an integer.
>> But what is the closest
integer to 1/10, or 0.1?
Yeah, it kind of is zero.
0.1 is like this much.
And 1 is this much.
So 1/10 is closer to
0 than it is to one.
>> And so what C is doing for us--
kind of because we told it to--
is truncating that integer.
It's taking the value, which again is
supposed to be something like 0.1000,
0 and so forth.
And it's truncating everything
after the decimal point
so that all of this
stuff, because it doesn't
fit in the notion of an integer, which
is just a number like negative 1, 0, 1,
up and down, it throws away everything
after the decimal point because you
can't fit a decimal point
in an integer by definition.
>> So the answer here is zero.
So how do we fix this?
We need another solution all together.
And we can do this, as follows.
>> Let me go ahead and create a new
file, this one called floats.c.
And save it here in the
same directory, floats.c.
And let me go ahead and copy
some of that code from earlier.
>> But instead of getting
an int, let's do this.
Give me a floating point value
called x. where a floating point
value is just literally
something with a floating point.
It can move to the left, to the right.
It's a real number.
>> And let me call not
get int, but get float,
which also was among the menu
of options in the CS50 library.
Let's change y to a float.
So this becomes get float.
>> And now, we don't want to plug in ints.
It turns out we have to use percent
f for float, percent f for float,
and now save it.
And now, fingers crossed, make
floats, nice, dot slash floats.
x is going to be 1. y
is going to be 10 again.
>> And, nice, OK my addition is correct.
I was hoping for more,
but I forgot to write it.
So let's go and fix this logical error.
>> Let's go ahead and grab the following.
We'll just do a little copy and paste.
And I'm going to say minus.
>> And I'm going to say times.
And I'm going to say divided.
And I'm not going to do modulo,
which is not as germane here,
divided by f, and times plus--
OK, let's do this again.
>> Make floats, dot slash floats,
and 1, 10, and-- nice, no, OK.
So I'm an idiot.
So this is very common
in computer science
to make stupid mistakes like this.
>> For pedagogical purposes,
what I really wanted to do
was change the signs here
to plus, to minus, to times,
and to divide, as you hopefully
noticed during this exercise.
So now let's re-compile this
program, do dot slash floats.
>> And for the third time, let's
see if it meets my expectations.
1, 10, enter, yes, OK, 1.000,
divided by 10.000, is 0.100000.
And it turns out we can control how many
numbers are after those decimal points.
We actually will.
We'll come back to that.
>> But now, in fact, the math is correct.
So, again, what's the takeaway here?
It turns out that in C, there are
not only just strings-- and, in fact,
there aren't really, because we
add those with the CS50 library.
But there aren't just ints.
>> There are also floats.
And it turns out a bunch of other data
types too, that we'll use before long.
Turns out if you want a single
character, not a string of characters,
you can use just a char.
>> Turns out that if you want a bool,
a Boolean value, true or false only,
thanks to the CS50 library, we've
added to C the bool data type as well.
But it's also present in
many other languages as well.
And it turns out that sometimes you
need bigger numbers than come by default
with ints and floats.
>> And, in fact, a double is a number
that uses not 32 bits, but 64 bits.
And a long long is a number that
uses not 32 bits, but 64 bits,
respectively, for floating point
values and integers, respectively.
So let's actually now
see this in action.
>> I'm going to go ahead here
and whip up one other program.
Here, I'm going to go ahead
and do include CS50.h.
And let me go, include standard IO.h.
>> And you'll notice something
funky is happening here.
It's not color coding things in
the same way as it did before.
And it turns out, that's because I
haven't given the thing a file name.
>> I'm going to call this one
sizeof.c, and hit Save.
And notice what happens to my very
white code against that black backdrop.
Now, at least there's
some purple in there.
And it is syntax highlighted.
>> That's because, quite simply, I've
told the IDE what type of file
it is by giving it a name, and
specifically a file extension.
Now, let's go ahead and do this.
I'm going to go ahead and very
simply print out the following-- bool
is percent lu.
>> We'll come back to
that in just a moment.
And then I'm going to
print sizeof bool.
And now, just to save
myself some time, I'm
going to do a whole
bunch of these at once.
And, specifically, I'm going to
change this to a char and char.
This one, I'm going to change
to a double and a double.
>> This one, I'm going to change
to a float and a float.
This one, I'm going to
change to an int and an int.
And this one, I'm going
to change to a long long.
And it's still taking
a long time, long long.
>> And then, lastly, I gave
myself one too many, string.
It turns out that in C, there's
the special operator called
sizeof that's literally
going to, when run,
tell us the size of
each of these data types.
And this is a way, now,
we can connect back
to last week's discussion
of data and representation.
>> Let me go ahead and compile
sizeof, dot slash sizeof.
And let's see.
It turns out that in C,
specifically on CS50 IDE,
specifically on the
operating system Ubuntu,
which is a 64-bit operating
system in this case,
a bool is going to
use one byte of space.
That's how size is measured,
not in bits, but in bytes.
And recall that one byte is eight bits.
So a bool, even though you
technically only need a 0 or 1,
it's a little wasteful
how we've implemented it.
It's actually going to use a whole
byte-- so all zeros, or maybe
all ones, or something like that,
or just one 1 among eight bits.
>> A char, meanwhile, used for a character
like an ASCII character per last week,
is going to be one byte.
And that synchs up with our notion of
it being no more than 256 bits-- rather,
synchs up with it being no
longer than 8 bits, which
gives us as many as 256 values.
A double is going to
be 8 bytes or 64 bits.
>> A float is 4.
An int is 4.
A long long is 8.
And a string is 8.
But don't worry about that.
We're going to peel back that layer.
It turns out, strings can
be longer than 8 bytes.
>> And, indeed, we've written
strings already, hello world,
longer than 8 bytes.
But we'll come back to
that in just a moment.
But the take away here is the following.
>> Any computer only has a finite
amount of memory and space.
You can only store so many
files on your Mac or PC.
You can only store so many programs in
RAM running at once, necessarily, even
with virtual memory, because
you have a finite amount of RAM.
>> And just to picture-- if
you've never opened up a laptop
or ordered extra memory
for a computer, you
might not know that
inside of your computer
is something that looks
a little like this.
So this is just a common company named
Crucial that makes RAM for computers.
And RAM is where programs
live while they're running.
>> So on every Mac or PC, when you double
click a program, and it opens up,
and it opens some Word document
or something like that,
it stores it temporarily in
RAM, because RAM is faster
than your hard disk, or
your solid state disk.
So it's just where programs go
to live when they're running,
or when files are being used.
>> So you have things that look
like this inside of your laptop,
or slightly bigger things
inside of your desktop.
But the key is you only have a
finite number of these things.
And there's only a finite amount of
hardware sitting on this desk right
here.
>> So, surely, we can't store
infinitely long numbers.
And, yet, if you think back to
grade school, how many digits can
you have to the right
of a decimal point?
For that matter, how many digits can
you have to the left of a decimal point?
Really, infinitely many.
>> Now, we humans might only
know how to pronounce million,
and billion, trillion, and
quadrillion, and quintillion.
And I'm pushing the limits of my
understanding-- or my-- I understand
numbers, but my
pronunciation of numbers.
But they can get infinitely large with
infinitely many digits to the left
or to the right of a decimal point.
>> But computers only have a
finite amount of memory,
a finite number of transistors, a
finite number of light bulbs inside.
So what happens when
you run out of space?
In other words, if you
think back to last week
when we talked about numbers
themselves being represented in binary,
suppose that we've got
this 8-bit value here.
>> And we have seven 1's and one 0.
And suppose that we want
to add 1 to this value.
This is a really big number right now.
>> This is 254, if I remember
the math from last week right.
But what if I change
that rightmost 0 to a 1?
The whole number, of
course, becomes eight 1's.
So we're still good.
>> And that probably represents
255, though depending on context
it could actually represent
a negative number.
But more on that another time.
This feels like it's about
as high as I can count.
>> Now, it's only 8 bits.
And my Mac, surely, has way
more than 8 bits of memory.
But it does have finite memory.
So the same argument applies, even if we
have more of these ones on the screen.
>> But what happens if you're
storing this number, 255,
and you want to count 1 bit higher?
You want to go from 255 to 256.
The problem, of course, is that if you
start counting at zero like last week,
you can't count as high
as 256, let alone 257,
let alone 258, because what
happens when you add a 1?
If you do the old grade school
approach, you put a 1 here,
and then 1 plus 1 is 2, but that's
really a zero, you carry the 1,
carry the 1, carry the 1.
All of these things,
these 1's, go to zero.
And you wind up, yes, as someone
pointed out, a 1 on the left hand side.
But everything you can
actually see and fit in memory
is just eight 0's, which is to say
at some point if you, a computer,
tried counting high enough up, you're
going to wrap around, it would seem,
to zero, or maybe even negative
numbers, which are even lower than zero.
>> And we can kind of see this.
Let me go ahead and write
a real quick program here.
Let me go ahead and write
a program called Overflow.
Include CS50.h, include
standard IO.h-- oh,
I really missed my syntax highlighting.
So let's save this as overflow.c.
>> And now int main void--
and before long, we'll
come back to explaining why
we keep writing int main void.
But for now, let's just do
it, taking it for granted.
Let's give myself an int, n,
and initialize it to 0.
>> Let's then do for int i get zero--
actually, let's do an infinite loop
and see what happens.
While true, then let's print out n
is percent i, backslash n, plug-in n.
But, now, let's do n gets n plus 1.
>> So in other words, on each
iteration of this infinite loop,
let's take n's value,
and add 1 to it, and then
store the result back in n on the left.
And, in fact, we've seen syntax
slightly like this, briefly.
A cool trick is instead
of writing all this out,
you can actually say n plus equals 1.
>> Or if you really want to be fancy,
you can say n plus plus semi-colon.
But these latter two are just
what we'd call syntactic sugar
for the first thing.
>> The first thing is more explicit,
totally fine, totally correct.
But this is more common, I'll say.
So we'll do this for just a moment.
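The three spellings of "add 1 to n" can be lined up side by side. This is a small sketch rather than the lecture's overflow.c; `add_three` is a hypothetical name, and it simply applies each form once.

```c
// Apply the three equivalent ways of adding 1 to n, one after another.
int add_three(int n)
{
    n = n + 1; // explicit form
    n += 1;    // shorthand for the line above
    n++;       // shortest form
    return n;
}
```

Starting from 0, the function returns 3, since each line adds exactly 1.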
>> Let's now make overflow, which sounds
rather ominous, dot slash overflow.
Let's see, n's getting pretty big.
But let's think, how big can n get?
>> n is an int.
We saw a moment ago with the sizeof
example that an int is four bytes.
We know from last week, four bytes is
32 bits, because 8 times 4, that's 32.
That's going to be 4 billion.
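The arithmetic behind "4 billion" can be checked directly. A rough sketch, where `patterns` is a hypothetical helper and the result type is assumed to be 64 bits wide:

```c
// Number of distinct bit patterns in the given number of bytes.
// Valid only while bytes * 8 is less than 64, the width of the result type.
unsigned long long patterns(unsigned int bytes)
{
    unsigned int bits = bytes * 8;
    return 1ULL << bits;
}
```

For one byte this gives 256, and for four bytes it gives 4,294,967,296, which is the "roughly 4 billion" in question.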
>> And we are up to 800,000.
This is going to take forever to
count as high as I possibly can.
So I'm going to go ahead,
as you might before long,
and hit Control C-- frankly, Control
C, a lot, where Control C generally
means cancel.
Unfortunately, because this
is running in the cloud,
sometimes the cloud is
spitting out so much stuff,
so much output, it's going to
take a little while for my input
to get to the cloud.
So even though I hit
Control C a few seconds ago,
this is definitely the side
effect of an infinite loop.
>> And so in such cases, we're
going to leave that be.
And we're going to add another
terminal window over here
with the plus, which of course doesn't
like that, since it's still thinking.
And let's go ahead and be
a little more reasonable.
>> I'm going to go ahead and do
this only finitely many times.
Let's use a for loop,
which I alluded to earlier.
Let's do this.
Give me another variable int i gets 0.
i is less than, let's say, 64 i++.
And now let me go ahead and print
out n is percent i, comma n.
And then n-- this is still
going to take forever.
Let's do this.
>> n gets n times 2.
Or we could be fancy
and do times equals 2.
But let's just say n
equals itself, times 2.
In other words, in this
new version of the program,
I don't want to wait forever
from like 800,000 to 4 billion.
Let's just get this over with.
>> Let's actually double n each time.
Which, recall, doubling is the
opposite of halving, of course.
And whereas last week we halved
something again, and again,
and again, super fast,
doubling will surely
get us from 1 to the biggest possible
value that we can count to with an int.
>> So let's do exactly this.
And we'll come back to this before long.
But this, again, is just like
the repeat block in Scratch.
And you'll use this before long.
>> This just means count from zero
up to, but not equal to, 64.
And on each iteration of this
loop, just keep incrementing i.
So i++-- and this general construct
on line 7 is just a super common way
of repeating some lines of
code, some number of times.
Which lines of code?
These curly braces, as you
may have gleaned by now,
mean, do the following.
>> It's like in Scratch, when
it has the yellow blocks
and other colors that kind of
embrace or hug other blocks.
That's what those curly
braces are doing here.
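Here is roughly what that loop looks like in code. It is a sketch, not the exact overflow.c from the lecture: `double_repeatedly` is a hypothetical function, and the number of iterations is a parameter that should stay at 30 or below, because overflowing a signed int is undefined behavior in C and the lecture's 64 iterations deliberately go past that point.

```c
#include <stdio.h>

// Double n the given number of times, printing n before each doubling.
// Keep times at 30 or below so that n stays within a 32-bit int.
int double_repeatedly(int times)
{
    int n = 1;
    for (int i = 0; i < times; i++)
    {
        printf("n is %i\n", n);
        n = n * 2;
    }
    return n;
}
```

With `times` set to 3, it prints 1, 2, and 4, and returns 8. The body between the curly braces is what gets repeated.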
So if I got my syntax right-- you
can see the caret symbol and C means
that's how many times I was
trying to solve this problem.
So let's get rid of that one
altogether, and close that window.
And we'll use the new one.
Make overflow, dot slash
overflow, Enter, all right,
it looks bad at first.
But let's scroll back in time,
because I did this 64 times.
>> And notice the first time, n is 1.
Second time, n is 2,
then 4, then 8, then 16.
And it seems that as soon as
I get to roughly 1 billion,
if I double it again, that
should give me 2 billion.
But it turns out, it's
right on the cusp.
>> And so it actually overflows
an int from 1 billion
to roughly negative 2
billion, because an integer,
unlike the numbers we
were assuming last week,
can be both positive and negative
in reality and in a computer.
And so at least one of those
bits is effectively stolen.
So we really only have 31 bits,
or 2 billion possible values.
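Where exactly the cusp lies can be read from `<limits.h>`. This sketch assumes a 32-bit int, as in the lecture; `fits_in_int` is a hypothetical helper.

```c
#include <limits.h>

// Report whether value is within the range that an int can hold.
int fits_in_int(long long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

On such a system, 1,073,741,824 fits, but its double, 2,147,483,648, is one more than `INT_MAX` and does not.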
>> But for now, the takeaway is quite
simply, whatever these numbers are
and whatever the math is,
something bad happens eventually,
because eventually you are trying to
permute the bits one too many times.
And you effectively go from all
1's to maybe all 0's, or maybe
just some other pattern that
clearly, depending on context,
can be interpreted as a negative number.
And so it would seem the highest I
can count in this particular program
is only roughly 1 billion.
But there's a partial solution here.
You know what?
>> Let me change from an
int to a long long.
And let me go ahead here
and say-- I'm going to have
to change this to an unsigned long.
Or, let's see, I never remember myself.
>> Let's go ahead and make overflow.
No, that's not it, LLD, thank you.
So sometimes Clang can be helpful.
I did not remember what the format
specifier was for a long long.
>> But, indeed, Clang told me.
Green is kind of good, but
still means you made a mistake.
It's guessing that I meant LLD.
>> So let me take its advice, a long
long decimal number, save that.
And let me rerun it, dot
slash overflow, Enter.
And now what's cool is this.
>> If I scroll back in time, we still start
counting at the same place-- 1, 2, 4,
8, 16.
Notice, we get all the
way up to 1 billion.
But then we safely get to 2 billion.
>> Then we get to 4 billion,
then 8 billion, 17 billion.
And we go higher, and
higher, and higher.
Eventually, this, too, breaks.
>> Eventually, with a long long,
which is the 64-bit value, not
a 32-bit value, if you count
too high, you wrap around to 0.
And in this case, we happen to
end up with a negative number.
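The long long version differs in just two places: the type of n and the format code. As before, this is a sketch with a hypothetical function name, and the iteration count should stay at 62 or below so that n never overflows.

```c
#include <stdio.h>

// Same doubling loop, but with a 64-bit long long and the %lld format code.
// Keep times at 62 or below so that n stays within range.
long long double_long(int times)
{
    long long n = 1;
    for (int i = 0; i < times; i++)
    {
        printf("n is %lld\n", n);
        n = n * 2;
    }
    return n;
}
```

After 31 doublings n is 2,147,483,648, the value an int could not hold, and after 34 it is 17,179,869,184, the "17 billion" seen in the output.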
>> So this is a problem.
And it turns out that this
problem is not all that arcane.
Even though I've deliberately
induced it with these mistakes,
it turns out we see it kind of all
around us, or at least some of us do.
>> So in Lego Star Wars, if
you've ever played the game,
it turns out you can go around
breaking things up in LEGO world,
and collecting coins, essentially.
And if you've ever played
this game way too much,
as this unnamed individual
here did, the total number
of coins that you can collect
is, it would seem, 4 billion.
>> Now, it's actually rounded.
So LEGO was trying to
keep things user friendly.
They didn't do it exactly 2 to
the 32nd power, per last week.
But 4 billion is there for a reason.
It seems, based on this information,
that LEGO, and the company that
made this actual software, decided
that the maximum number of coins
the user can accumulate
is, indeed, 4 billion,
because they chose in their code
to use not a long long, apparently,
but just an integer, an unsigned
integer, only a positive integer, whose
max value is roughly that.
Well, here's another funny one.
So in the game Civilization, which
some of you might be familiar with,
it turns out that years ago there
was a bug in this game whereby
if you played the role
of Gandhi in the game,
instead of being very pacifist, he
instead was incredibly, incredibly
aggressive, in some circumstances.
In particular, the way that Civilization
works is that if you, the player,
adopt democracy, your
aggressiveness score gets
decremented by two, so minus
minus, and then minus minus.
>> So you subtract 2 from
your actual rating.
Unfortunately, if your rating is
initially 1, and you subtract 2 from it
after adopting democracy
as Gandhi here might
have done, because he was very passive--
1 on the scale of aggressiveness.
But if he adopts democracy, then
he goes from 1 to negative 1.
>> Unfortunately, they were
using unsigned numbers,
which means they treated even negative
numbers as though they were positive.
And it turns out that the
positive equivalent of negative 1,
in typical computer programs, is 255.
So if Gandhi adopts
democracy, and therefore has
his aggressiveness score decreased,
it actually rolls around to 255
and makes him the most
aggressive character in the game.
So you can Google up on this.
And it was, indeed, an
accidental programming bug,
but that's entered quite
the lore ever since.
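The wraparound in that story can be sketched with an 8-bit unsigned value. This is not the game's actual code: the type `uint8_t` and the name `adopt_democracy` are assumptions made purely for illustration.

```c
#include <stdint.h>

// Subtract 2 from an 8-bit unsigned score. There is no room for negative
// values, so a result below 0 wraps around from the top of the range.
uint8_t adopt_democracy(uint8_t aggressiveness)
{
    return (uint8_t) (aggressiveness - 2);
}
```

A score of 10 becomes 8 as expected, but a score of 1 becomes 255.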
>> That's all fun and cute.
More frightening is when actual
real world devices, and not games,
have these same bugs.
In fact, just a year ago an article came
out about the Boeing 787 Dreamliner.
>> And the article at first
glance reads a little arcane.
But it said this, a software
vulnerability in Boeing's
new 787 Dreamliner jet has
the potential to cause pilots
to lose control of
the aircraft, possibly
in mid-flight, the FAA officials
warned airlines recently.
It was the determination
that a model 787
airplane that has been powered
continuously for 248 days
can lose all alternating current, AC,
electrical power due to the generator
control units, GCUs, simultaneously
going into fail safe mode.
It's kind of losing me.
But the memo stated, OK, now I got that,
the condition was caused by a software
counter internal to
the generator control
units that will overflow after
248 days of continuous power.
We are issuing this
notice to prevent loss
of all AC electrical
power, which could result
in loss of control of the airplane.
>> So, literally, there is some integer,
or some equivalent data type,
being used in software
in an actual airplane
that if you keep your airplane
on long enough, which apparently
can be the case if you're just running
them constantly and never unplugging
your airplane, it seems, or
letting its batteries die,
will eventually count up, and up,
and up, and up, and up, and up.
>> And, by nature, a
finite amount of memory
will overflow, rolling back to
zero or some negative value,
a side effect of which is the
frighteningly real reality
that the plane might need
to be rebooted, effectively,
or might fall, worse, as it flies.
So these kinds of issues
are still with us,
even-- this was a 2015 article,
all the more frightening
when you don't necessarily
understand, appreciate, or anticipate
those kinds of errors.
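The memo as quoted does not say what the counter measures. One back-of-the-envelope guess, and it is only an assumption here, is a signed 32-bit counter that ticks 100 times per second. The sketch below works out how long such a counter would last; `days_until_full` is a hypothetical helper.

```c
// Whole days until a counter that ticks at the given rate reaches max_count.
long long days_until_full(long long max_count, long long ticks_per_second)
{
    long long seconds = max_count / ticks_per_second;
    return seconds / (60 * 60 * 24);
}
```

With a maximum of 2,147,483,647 and 100 ticks per second, the result is 248 whole days, which is at least consistent with the figure in the article.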
>> So it turns out there's one other
bad thing about data representation.
It turns out that even floats are
kind of flawed, because floats, too,
I proposed are 32 bits, or
maybe 64 if you use a double.
But that's still finite.
>> And the catch is that if you can
put an infinite number of numbers
after the decimal point,
there is no way you
can represent all the possible
numbers that we were taught
in grade school can exist in the world.
A computer, essentially, has to
choose a subset of those numbers
to represent accurately.
>> Now, the computer can
round maybe a little bit,
and can allow you to roughly store
any number you might possibly want.
But just intuitively, if you
have a finite number of bits,
you can only permute them
in so many finite ways.
So you can't possibly
use a finite number
of permutations of bits,
patterns of zeros and ones,
to represent an infinite
number of numbers,
which suggests that computers might
very well be lying to us sometimes.
>> In fact, let's do this.
Let me go back into CS50 IDE.
Let me go ahead and
create a little program
called Imprecision, to show that
computers are, indeed, imprecise.
>> And let me go ahead and start with
some of that code from before,
and now just do the following.
Let me go ahead and do printf, percent
f, backslash n, 1 divided by 10.
In other words, let's dive in deeper
to 1/10, like 1 divided by 10.
Surely, a computer can represent 1/10.
>> So let's go ahead and make imprecision.
Let's see.
Format specifies type double.
But the argument has type int.
What's going on?
>> Oh, interesting, so it's a
lesson learned from before.
I'm saying, hey, computer show
me a float with percent f.
But I'm giving it 2 ints.
So it turns out, I can fix
this in a couple of ways.
>> I could just turn 1 into 1.0, and
10 into 10.0, which would, indeed,
have the effect of converting
them into floats-- still hopefully
the same number.
Or it turns out there's something
we'll see again before long.
You could cast the numbers.
>> You can, using this parenthetical
expression, you can say,
hey, computer, take this
10, which I know is an int.
But treat it, please,
as though it's a float.
But this feels unnecessarily complex.
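The difference between dividing two ints and dividing two floats can be sketched with a pair of hypothetical helpers. Neither function is from the lecture; they only isolate the two behaviors being described.

```c
// Integer division: both operands are ints, so the fraction is discarded.
int int_quotient(int a, int b)
{
    return a / b;
}

// Casting the operands to float first makes the division keep its fraction.
float float_quotient(int a, int b)
{
    return (float) a / (float) b;
}
```

Here `int_quotient(1, 10)` is 0, while `float_quotient(1, 10)` is approximately 0.1. Passing the int result to printf with percent f is the type mismatch that Clang complained about.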
>> For our purposes today,
let's just literally
make them floating point values
with a decimal point, like this.
Let me go ahead and rerun, make
imprecision, good, dot slash
imprecision, enter.
OK, we're looking good.
>> 1 divided by 10, according to my
Mac here, is, indeed, 0.100000.
Now, I was taught in grade school there
should be an infinite number of 0's.
So let's at least try
to see some of those.
It turns out that printf is a little
fancier still than we've been using.
It turns out you don't have to specify
just percent f, or just percent i.
You can actually specify
some control options here.
>> Specifically, I'm going
to say, hey, printf,
actually show me 10 decimal places.
So it looks a little weird.
But you say percent,
dot, how many numbers
you want to see after the
decimal point, and then f
for float, just because that's
what the documentation says.
Let me go ahead and save that.
>> And notice too, I'm getting
tired of retyping things.
So I'm just hitting the up and
down arrow on my keys here.
And if I keep hitting up, you
can see all of the commands
that I made, or incorrectly made.
>> And I'm going to go ahead now and
not actually use that, apparently.
Make imprecision, dot
slash imprecision-- so
what I was taught in
grade school checks out.
Even if I print it to 10 decimal
places, it is, indeed, 0.10000.
But you know what?
>> Let's get a little greedy.
Let's say, like, show me 55
points after the decimal.
Let's really take this
program out for a spin.
Let me remake it with make
imprecision, dot slash, imprecision.
>> And here we go.
Your childhood was a lie.
Apparently, 1 divided by 10 is indeed
0.100000000000000005551115123--
>> What is going on?
Well, it turns out, if you kind of
look far enough out in the underlying
representation of this
number, it actually
is not exactly 1/10, or 0.1 and
an infinite number of zeros.
Now, why is that?
>> Well, even though this is a simple
number to us humans, 1 divided by 10,
it's still one of infinitely many
numbers that we could think up.
But a computer can only represent
finitely many such numbers.
And so, effectively, what the
computer is showing us is its closest
approximation to the number
we want to believe is 1/10,
or really 0.10000 ad infinitum.
>> Rather, though, this is
as close as it can get.
And, indeed, if you look
underneath the hood,
as we are here by looking
55 digits after the decimal,
we actually see that reality.
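The precision experiment can be sketched as one function that takes the number of decimal places as a parameter. This is an illustration, not the lecture's imprecision.c: `format_tenth` is a hypothetical helper, it uses the standard `%.*f` form so the precision comes from an argument instead of being typed into the format string, and it writes into a buffer with `snprintf` rather than printing.

```c
#include <stdio.h>

// Write 1.0 / 10.0 into buffer with the requested number of decimal places.
// The * in the format code takes the precision from the places argument.
int format_tenth(char *buffer, size_t size, int places)
{
    return snprintf(buffer, size, "%.*f", places, 1.0 / 10.0);
}
```

With 6 or 10 places the buffer holds a 1 followed only by zeros. With 55 places, on systems whose printf prints the stored value exactly, the later digits are no longer zeros.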
Now as an aside, if you've
ever seen the movie--
most of you probably haven't--
but Superman 3 some years ago,
Richard Pryor essentially leveraged this
reality in his company to steal a lot
of fractions and fractions of pennies,
because the company-- as I recall,
it's been a while-- was essentially
throwing away anything that didn't fit
into the notion of cents.
>> But if you add up all these
tiny, tiny, tiny numbers again,
and again, and again, you can, as in
his case, make a good amount of money.
>> That same idea was ripped off by
a more recent, but still now older
movie, called Office Space,
where the guys in that movie,
did the same thing, screwed it up
completely, ended up with way too much
money in their bank account.
It was all very suspicious.
But at the end of the day,
imprecision is all around us.
>> And that, too, can be
frighteningly the case.
It turns out that Superman 3
and Office Space aside, there
can be some very real
world ramifications
of the realities of imprecise
representation of data
that even we humans to
this day don't necessarily
understand as well as we should,
or remember as often as we should.
And, indeed, the following clip is
from a look at some very real world
ramifications of what happens if you
don't appreciate the imprecision that
can happen in numerical representation.
>> [VIDEO PLAYBACK]
>> -Computers, we've all come to accept
the often frustrating problems that
go with them-- bugs, viruses,
and software glitches,
are small prices to pay
for the convenience.
But in high tech and high speed
military and space program applications,
the smallest problem can
be magnified into disaster.
>> On June 4th, 1996, scientists prepared
to launch an unmanned Ariane 5 rocket.
It was carrying scientific
satellites designed
to establish precisely how the
earth's magnetic field interacts
with solar winds.
The rocket was built for
the European Space Agency,
and lifted off from its facility
on the coast of French Guiana.
>> -At about 37 seconds into
the flight, they first
noticed something was going wrong.
The nozzles were swiveling in
a way they really shouldn't.
Around 40 seconds into the flight,
clearly, the vehicle was in trouble.
>> And that's when they made
a decision to destroy it.
The range safety officer, with
tremendous guts, pressed the button,
blew up the rocket, before it could
become a hazard to the public safety.
>> -This was the maiden
voyage of the Ariane 5.
And its destruction took
place because of a flaw
embedded in the rocket's software.
-The problem on the Ariane was
that there was a number that
required 64 bits to express.
And they wanted to convert
it to a 16-bit number.
They assumed that the
number was never going
to be very big, that most of those
digits in a 64-bit number were zeroes.
They were wrong.
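A conversion like the one described can be guarded with a range check. The sketch below is not the rocket's software: it assumes the 64-bit number is a `double` and the 16-bit number is a signed `int16_t`, and `to_int16` is a hypothetical helper.

```c
#include <stdint.h>

// Try to store a 64-bit floating-point value in a 16-bit signed integer.
// Returns 1 and writes *out on success, or 0 if the value does not fit.
int to_int16(double value, int16_t *out)
{
    if (!(value >= INT16_MIN && value <= INT16_MAX))
    {
        return 0;
    }
    *out = (int16_t) value;
    return 1;
}
```

A value such as 1000.0 converts cleanly, while 40000.0 is rejected because a signed 16-bit integer tops out at 32,767.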
>> -The inability of one
software program to accept
the kind of number generated by
another was at the root of the failure.
Software development had become a
very costly part of new technology.
The Ariane 4 rocket had been very
successful, so much of the software
created for it was also
used in the Ariane 5.
>> -The basic problem was that the Ariane
5 was faster, accelerated faster.
And the software hadn't
accounted for that.
>> -The destruction of the rocket
was a huge financial disaster,
all due to a minute software error.
But this wasn't the first
time data conversion problems
had plagued modern rocket technology.
>> -In 1991, with the start
of the first Gulf War,
the Patriot Missile
experienced a similar kind
of number conversion problem.
And as a result, 28 people,
28 American soldiers,
were killed, and about
100 others wounded,
when the Patriot, which was supposed
to protect against incoming scuds,
failed to fire a missile.
>> -When Iraq invaded Kuwait, and America
launched Desert Storm in early 1991,
Patriot Missile batteries were deployed
to protect Saudi Arabia and Israel
from Iraqi Scud missile attacks.
The Patriot is a US medium-range
surface to air system, manufactured
by the Raytheon company.
>> -The size of the Patriot interceptor
itself is about roughly 20 feet long.
And it weighs about 2,000 pounds.
And it carries a warhead of about,
I think it's roughly 150 pounds.
And the warhead itself is
a high explosive, which
has fragments around it.
The casing of the warhead is
designed to act like buckshot.
>> -The missiles are carried
four per container,
and are transported by a semi trailer.
>> -The Patriot anti-missile system
goes back at least 20 years now.
It was originally designed
as an air defense missile
to shoot down enemy airplanes.
In the first Gulf War,
when that war came along,
the Army wanted to use it to
shoot down scuds, not airplanes.
>> The Iraqi Air Force was
not so much of a problem.
But the Army was worried about scuds.
And so they tried to
upgrade the Patriot.
>> -Intercepting an enemy
missile traveling at mach 5
was going to be challenging enough.
But when the Patriot
was rushed into service,
the Army was not aware of an
Iraqi modification that made
their scuds nearly impossible to hit.
>> -What happened is the scuds that
were coming in were unstable.
They were wobbling.
The reason for this was
the Iraqis, in order
to get 600 kilometers
out of a 300 kilometer
range missile, took weight
out of the front warhead.
They made the warhead lighter.
>> So now the Patriot is
trying to come at the Scud.
And most of the time, the
overwhelming majority of the time,
it would just fly by the Scud.
Once the Patriot system operators
realized the Patriot missed its target,
they detonated the Patriot's warhead
to avoid possible casualties if it
was allowed to fall to the ground.
>> -That was what most people saw,
those big fireballs in the sky,
and misunderstood as
intercepts of Scud warheads.
>> -Although in the night
skies, Patriots appeared
to be successfully
destroying Scuds, at Dhahran,
there could be no mistake
about its performance.
There, the Patriot's radar system
lost track of an incoming Scud,
and never launched due
to a software flaw.
It was the Israelis who first discovered
that the longer the system was on,
the greater the time discrepancy
became, due to a clock embedded
in the system's computer.
>> -About two weeks before
the tragedy in Dhahran,
the Israelis reported to
the Defense Department
that the system was losing time.
After about eight hours of running,
they noticed that the system
was becoming noticeably less accurate.
The Defense Department responded by
telling all of the Patriot batteries
to not leave the systems
on for a long time.
They never said what a long time was--
eight hours, 10 hours, 1,000 hours.
Nobody knew.
>> -The Patriot battery
stationed at the barracks
at Dhahran and its flawed internal
clock had been on over 100 hours
on the night of February 25th.
>> -It tracked time to an accuracy
of about a tenth of a second.
Now, a tenth of a second
is an interesting number,
because it can't be expressed
in binary exactly, which
means it can't be expressed exactly
in any modern digital computer.
It's hard to believe.
>> But use this as an example.
Let's take the number one third.
One third cannot be
expressed in decimal exactly.
One third is 0.333
going on for infinity.
>> There is no way to do that with
absolute accuracy in decimal.
That's exactly the kind of problem
that happened in the Patriot.
The longer the system ran, the
worse the time error became.
>> -After 100 hours of operation, the
error in time was only about one third
of a second.
But in terms of targeting a
missile traveling at mach 5,
it resulted in a tracking
error of over 600 meters.
It would be a fatal error
for the soldiers.
>> -What happened is a Scud launch was
detected by early warning satellites,
and they knew that the Scud was
coming in their general direction.
They didn't know where it was coming.
>> -It was now up to the radar
component of the Patriot system
defending Dhahran to locate and keep
track of the incoming enemy missile.
>> -The radar was very smart.
It would actually track
the position of the Scud,
and then predict where it probably
would be the next time the radar sent
a pulse out.
That was called a range gate.
>> -Then, once the Patriot
decides enough time has
passed to go back and check the next
location for this detected object,
it goes back.
So when it went back to the wrong
place, it then sees no object.
And it decides that there was no
object, it was a false detection,
and drops the track.
>> -The incoming Scud disappeared
from the radar screen.
And seconds later, it
slammed into the barracks.
The Scud killed 28, and was the last
one fired during the first Gulf War.
>> Tragically, the updated software
arrived at Dhahran the following day.
The software flaw had
been fixed, closing
one chapter in the troubled
history of the Patriot missile.
>> [END PLAYBACK]
>> DAVID J. MALAN: So this is all to
say that these issues of overflow
and imprecision are all too real.
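The clock drift described in the clip can be estimated with a few lines of arithmetic. The clip does not say how the tenth of a second was stored, so the sketch assumes, for illustration only, that it was kept with 23 bits after the binary point and the rest chopped off. `clock_drift` is a hypothetical helper.

```c
// Clock drift, in seconds, after the given number of hours, assuming each
// tenth of a second is stored with only 23 bits after the binary point.
double clock_drift(double hours)
{
    double stored_tenth = 838860.0 / 8388608.0; // 0.1 truncated to 23 bits
    double error_per_tick = 0.1 - stored_tenth;
    double ticks = hours * 60 * 60 * 10;        // ten ticks per second
    return error_per_tick * ticks;
}
```

Under that assumption, 100 hours of running yields a drift of a little over 0.34 seconds, in line with the "about one third of a second" quoted in the clip.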
So how did we get here?
We began with just talking about printf.
Again, this function that
prints something to the screen,
and we introduced thereafter
a few other functions
from the so-called CS50 library.
And we'll continue to
see these in due time.
And we, particularly, used get string,
and get int, and now also get float,
and yet others still will we encounter
and use ourselves before long.
>> But on occasion, have
we already seen a need
to store what those functions hand back?
They hand us back a string,
or an int, or a float.
And sometimes we need to put that
string, or int, or float, somewhere.
>> And to store those things, recall just
like in Scratch, we have variables.
But unlike in Scratch,
in C we have actual types
of variables-- data
types, more generally--
among them, a string, an int, a
float, and these others still.
>> And so when we declare variables in C,
we'll have to declare our data types.
This is not something we'll
have to do later in the semester
as we transition to other languages.
But for now, we do need
to a priori in advance,
explain to the computer what type
of variable we want it to give us.
>> Now, meanwhile, to print
those kinds of data types,
we have to tell printf what to expect.
And we saw percent s for strings,
and percent i for integers,
and a few others already.
And those are simply requirements
for the visual presentation
of that information.
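The pairing of data types and format codes can be sketched in one function. It is an illustration only: `format_values` is a hypothetical helper, it writes into a buffer with `snprintf` instead of printing, and it spells the string type as `const char *` because the CS50 library's `string` is a synonym for `char *` and is not available without that library.

```c
#include <stdio.h>

// Format one value of each type, using the format code that matches it:
// %s for a string, %i for an int, %.1f for a float, %c for a char.
int format_values(char *buffer, size_t size,
                  const char *s, int i, float f, char c)
{
    return snprintf(buffer, size, "%s %i %.1f %c", s, i, f, c);
}
```

Given "hello", 42, 3.5, and 'y', the buffer ends up holding `hello 42 3.5 y`.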
>> And each of these can actually be
parametrized or tweaked in some way,
if you want to further control
the type of output that you get.
And, in fact, it turns out that not only
is there backslash n for a new line.
There's something else called backslash
r for a carriage return, which
is more akin to an
old school typewriter,
and also Windows used for many years.
>> There's backslash t for tabs.
Turns out, that if you want to
double quote inside of a string,
recall that we've used
double quote double
quote on the left and the right
ends of our strings thus far.
That would seem to confuse things.
>> If you want to put a double quote in
the middle of a string-- and, indeed,
it is confusing to see.
And so you have to escape, so to
speak, a double quote with something
like, literally, backslash double quote.
And there are a few others still.
And we'll see more of those
in actual use before long.
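A few of those escape sequences can be seen together in one string. A minimal sketch, with `escaped_example` as a hypothetical name:

```c
// A string containing an escaped double quote on each side of a word,
// then a tab, another word, and a trailing newline.
const char *escaped_example(void)
{
    return "\"hello\"\tworld\n";
}
```

Each escape sequence counts as a single character, so this string is 14 characters long even though it takes more keystrokes to type.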
>> So let's now transition from
data, and representation,
and arithmetic operators, all
of which gave us some building
blocks with which to play.
But now let's actually give
us the rest of the vocabulary
that we already had
last week with Scratch
by taking a look at some other
constructs in C-- not all of them.
But the ideas we're
about to see are really just
to emphasize the translation from
one language, Scratch, to another, C.
>> And over time, we'll pick up
more tools for our toolkit,
so to speak, syntactically.
And, indeed, you'll see that the ideas
are now rather familiar from last week.
So let's do this.
>> Let's go ahead and whip up a program
that actually uses some expressions,
a Boolean expression.
Let me go ahead here
and create a new file.
I'll call this conditions.c.
>> Let me go ahead and
include the CS50 library.
And let me go ahead and include
standard IO.h for our functions,
and printf, and more respectively.
Let me give myself that boilerplate of
int main void, whose explanation we'll
come back to in the future.
>> Now let me go ahead and give
myself an int via get int.
Then let me go ahead and do this.
I want to say if i is less-- let's
distinguish between positive, negative,
or zero values.
>> So if i is less than zero, let me
just have this program simply say,
negative, backslash n, else
if i is greater than zero.
Now I'm, of course, going to say
printf positive, backslash n.
And then else if-- I could do this.
>> I could do if i equals 0.
But I'd be making at
least one mistake already.
Recall that the equal sign is
not equal, as we humans know it.
>> But it's the assignment operator.
And we don't want to take 0 on the
right and put it in i on the left.
So to avoid this confusion, or
perhaps misuse of the equals sign,
humans decided some years ago
that in many programming languages
when you want to check for equality
between the left and the right,
you actually use equals equals.
So you hit the equals sign twice.
When you want to assign from right
to left, you use a single equal sign.
So we could do this-- else
if i equals equals zero.
>> I could then go and
open my curly braces,
and say, printf 0, backslash n, done.
But remember how these
forks in the road can work.
And, really, just think about the logic.
i is a number.
It's an integer, specifically.
And that means it's going to be less
than 0, or greater than 0, or 0.
So there is kind of this
implied default case.
>> And so we could, just like
Scratch, dispense with the else if,
and just say else.
Logically, if you the
programmer know there's only
three buckets into which a
scenario can fall-- the first,
the second, or the third
in this case-- don't
bother adding the additional precision
and the additional logic there.
Just go ahead with the
default case here of else.
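The three-way fork can be sketched as follows. The lecture's version reads the number with get int from the CS50 library and prints inside each branch; this sketch instead uses a hypothetical function, `sign_of`, that takes the number as a parameter and returns the word, so it needs only standard C.

```c
// Classify an integer as negative, positive, or zero.
const char *sign_of(int i)
{
    if (i < 0)
    {
        return "negative";
    }
    else if (i > 0)
    {
        return "positive";
    }
    else
    {
        // Only one possibility remains, so no further condition is needed.
        return "zero";
    }
}
```

Calling it with 42, negative 42, and 0 yields "positive", "negative", and "zero" respectively.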
>> Now, let's go ahead
after saving this, make
conditions dot slash conditions--
not a great user interface,
because I'm not prompting the
user, as I mentioned earlier.
But that's fine.
We'll keep it simple.
Let's try the number 42.
And that's positive.
Let's try the number
negative 42, negative.
>> Let's try the value 0.
And, indeed, it works.
Now, you'll see with problems before
long, testing things three times,
probably not sufficient.
You probably want to test some
bigger numbers, some smaller
numbers, some corner cases, as
we'll come to describe them.
>> But for now, this is a
pretty simple program.
And I'm pretty sure, logically,
that it falls into three cases.
And, indeed, even though we just
focused on the potential downsides
of imprecision and overflow, in
reality, for many of CS50's problems,
we are not going to worry
about, all the time,
those issues of overflow and
imprecision, because, in fact, in C,
it's actually not all that
easy to avoid those things.
If you want to count up
bigger, and bigger, and bigger,
it turns out there are techniques you
can use, often involving things called
libraries, collections of code, that
other people wrote that you can use,
and other languages like
Java and others, actually
make it a lot easier
to count even higher.
So some of these dangers really are
a function of the language you use.
And in the coming weeks, we'll
see how dangerous C really
can be if you don't use it properly.
But from there, and with
Python, and JavaScript, will
we layer on some additional protections,
and run fewer of those risks.
>> So let's make a little more
interesting logic in our program.
So let me go ahead and create
a program called Logical
just so I can play with some
actual logic, logical.c.
I'll just copy and paste some
code from earlier so I get back
to this nice starting point.
>> Let me this time do char c. I'm
going to give it a name of c
just because it's conventional,
get a character from the user.
And let's pretend like
I'm implementing part
of that rm program, the remove
program before that prompted the user
to remove a file.
How could we do this?
>> I want to say, if c equals
equals, quote unquote,
y, then I'm going to assume
that the user has chosen yes.
I'm just going to print yes.
If I were actually writing
the removal program,
we could remove the file
with more lines of code.
But we'll keep it simple.
>> Else if c equals equals n--
and now here, I'm going to say,
the user must have meant no.
And then else, you know what?
I don't know what else
the user is going to type.
So I'm just going to say that
that is an error, whatever
he or she actually typed.
>> So what's going on here?
There is a fundamental difference
versus what I've done in the past.
Double quotes, double quotes, double
quotes, and, yet, single quotes,
single quotes.
It turns out in C, that when
you want to write a string,
you do use double quotes, just as we've
been using all this time with printf.
>> But if you want to deal with just a
single character, a so-called char,
then you actually use single quotes.
Those of you who've programmed
before, you might not have
had to worry about this
distinction in certain languages.
In C, it does matter.
And so when I get a char and I want
to compare that char using equals
equals to some letter like y or n, I do,
indeed, need to have the single quotes.
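The single-quote comparison can be sketched like this. As an assumption for illustration, the character is passed in as a parameter instead of being read from the user, and `answer_strict` is a hypothetical name.

```c
// Interpret a single character as an answer, accepting lowercase only.
// Chars are written with single quotes; the returned strings use double quotes.
const char *answer_strict(char c)
{
    if (c == 'y')
    {
        return "yes";
    }
    else if (c == 'n')
    {
        return "no";
    }
    else
    {
        return "error";
    }
}
```

Lowercase 'y' and 'n' are recognized, but a capital 'Y' falls through to "error", since the computer checks for exactly what it was told to check.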
>> Now, let's go ahead and do this.
Let's go ahead and do make
logical dot slash logical.
And now I'm being prompted.
So, presumably, a better user experience
would actually tell me what to do here.
But I'm going to just blindly
say y for yes, OK, nice.
>> Let's run it again, n for no, nice.
Suppose like certain people I know,
my caps lock key is on all too often.
So I do capital Y, enter, error.
OK, it's not exactly what I'm expecting.
Indeed, the computer
is doing literally what
I told it to do-- check for
lowercase y and lowercase n.
This doesn't feel like good
user experience, though.
Let me ask for and accept
either lower case or upper case.
So it turns out, you might want
to say something like in Scratch,
like literally or c equals
equals capital single quoted y.
Turns out, C does not have
this literal keyword or.
>> But it does have two vertical bars.
You have to hold Shift usually,
if you're using a US keyboard,
and hit the vertical bar
key above your return key.
But this vertical bar
vertical bar means or.
>> If, by contrast, we wanted
to say and, like in Scratch,
we could do ampersand ampersand.
That makes no logical sense here,
because a human could not possibly
have typed both lowercase y
and capital Y as the same character.
So or is what we intend here.
>> So if I do this in both places, or c
equals equals capital N, now rerun,
make logical, rerun logical.
Now, I can type y.
And I can do it again with
capital Y, or capital N.
And I could add in additional
combinations still.
>> So this is a logical
program insofar as now
I'm checking logically for
this value or this value.
And I don't have to, necessarily,
come up with two more ifs or else ifs.
I can actually combine some of the
related logic together in this way.
So this would be better
designed than simply
saying, if c equals lower case y,
print yes, else if c equals capital Y,
print yes, else if c equals
lower-- in other words,
you don't have to have
more and more branches.
You can combine some of the equivalent
branches logically, as in this way.
>> So let's take a look at just one
final ingredient, one final construct,
that C allows.
And we'll come back in the
future to others still.
And then we'll conclude by looking
at not the correctness of code--
getting code to work-- but the design
of code, and plant those seeds early on.
>> So let me go ahead and
open up a new file here.
You know what?
I'm going to re-implement
that same program,
but using a different construct.
>> So let me quickly give myself
access to include CS50.h
for the CS50 library,
standard IO.h for printf.
Give me my int main void.
And then over here, let
me go ahead and do this.
>> Char c gets get char, just like before.
And I'm going to use a new construct
now-- switch, on what character?
So switch is kind of like
switching train tracks.
Or, really, it is kind of
an if, else if, else,
but written somewhat differently.
>> A switch looks like this.
You have switch, and then what
character or number you want to look at,
then some curly braces like in
Scratch, just say do this stuff.
And then you have different cases.
>> You don't use if and else.
You literally use the word case.
And you would say something like this.
>> So in the case of a lowercase y,
or in the case of a capital Y,
go ahead and print out yes.
And then break out of the switch.
That's it.
We're done.
>> Else if, so to speak,
lower case n, or capital N,
then go ahead and print
out no, and then break.
Else-- and this kind of is the
default case indeed-- printf error--
and just for good measure, though
logically this break is not necessary
because we're at the end
of the switch anyway,
I'm now breaking out of the switch.
So this looks a little different.
>> But, logically, it's
actually equivalent.
And why would you use
one over the other?
Sometimes, just personal preference,
sometimes the aesthetics,
if I glance at this
now, there's something
to be said for the
readability of this code.
I mean, never mind the fact that this
code is new to many of us in the room.
>> But it just kind of is pretty.
You see lowercase y, capital Y,
lower case n, capital N default,
it just kind of jumps
out at you in a way
that, arguably, maybe
the previous example
with the ifs, and the vertical bars,
and the else ifs, might not have.
So this is really a matter of personal
choice, really, or readability,
of the code.
>> But in terms of functionality, let me
go ahead and make switch, dot slash
switch, and now type in lowercase y,
capital Y, lowercase n, capital N,
David, retry because that's
not a single character.
Let's do x, error, as expected.
And, logically-- and this is something
I would encourage in general-- even
though we're only scratching the
surface of some of these features.
>> And it might not be obvious when you
yourself sit down at the keyboard,
how does this work?
What would this do?
The beautiful thing about having
a laptop, or desktop, or access
to a computer with a compiler,
and with a code editor like this,
is you can almost always answer these
questions for yourself just by trying.
>> For instance, if the rhetorical
question at hand were,
what happens if you forget
your break statements?
Which is actually a
very common thing to do,
because it doesn't look
like you really need them.
They don't really complete your
thought like a parenthesis or a curly
brace does.
Let's go ahead and
recompile the code and see.
So make switch, dot slash switch.
Let's type in lower case
y, the top case, Enter.
So I typed y.
>> The program said yes, no, error,
as though it was changing its mind.
But it kind of was, because what happens
with a switch is the first case that
matches essentially means, hey computer,
execute all of the code beneath it.
And if you don't say break, or
don't say break, or don't say break,
the computer is going to blow
through all of those lines
and execute all of them until
it gets to that curly brace.
So the breaks are, indeed, necessary.
But a takeaway here is, when
in doubt, try something.
Maybe save your code first,
or save it in an extra file
if you're really worried about
messing up and having to recover
the work that you know is working.
>> But try things.
And don't be as afraid, perhaps,
of what the computer might do,
or that you might break something.
You can always revert back
to some earlier version.
>> So let's end by looking
at the design of code.
We have this ability now to write
conditions, and write loops,
and variables, and call functions.
So, frankly, we're kind of back at
where we were a week ago with Scratch,
albeit with a less compelling textual
environment than Scratch allows.
>> But notice how quickly we've acquired
that vocabulary, even if it's
going to take a little while to sink in,
so that we can now use this vocabulary
to write more interesting programs.
And let's take a baby step
toward that, as follows.
Let me go ahead and
create a new file here.
>> I'm going to call this
prototype.c, and introduce
for the first time, the ability
to make your own functions.
Some of you might have
done this with Scratch,
whereby you can create your
own custom blocks in Scratch,
and then drag them into place
wherever you'd like. In C,
and in most programming
languages, you can do exactly
that-- make your own functions,
if they don't already exist.
>> So, for instance, let me go ahead
and include CS50.h, and include
standard IO.h, int main void.
And now we have a
placeholder ready to go.
I keep printing things
like people's names today.
And that feels like--
wouldn't it be nice if there
were a function called print name?
I don't have to use printf.
I don't have to remember
all the format codes.
Why don't I, or why
didn't someone before me,
create a function called print
name, that given some name,
simply prints it out?
>> In other words, if I say, hey,
computer, give me a string
by asking the user for such,
via CS50's get string function.
Hey, computer, put that string in
the variable in the left hand side,
and call it s.
And then, hey computer, go ahead
and print that person's name, done.
>> Now, it would be nice, because
this program, aptly named,
tells me what it's supposed to do
by way of those functions' names.
Let me go and make prototype, Enter.
And, unfortunately,
this isn't going to fly.
>> Prototype.c, line 7, character
5, error, implicit declaration
of function print name
is invalid in C99, C99
meaning a version of C
that came out in 1999.
That's all.
>> So I don't know what
all of this means yet.
But I do recognize error in red.
That's pretty obvious.
>> And it seems that with
the green character here,
the issue is with print name, open
paren s, close paren, semi-colon.
But implicit declaration of
function we did see briefly earlier.
This means, simply, that Clang
does not know what I mean.
>> I've used a vocabulary word that it's
never seen or been taught before.
And so I need to teach it
what this function means.
So I'm going to go ahead and do that.
>> I'm going to go ahead and implement
my own function called Print Name.
And I'm going to say, as follows, that
it does this, printf, hello, percent
s, backslash n, name, semi-colon.
So what did I just do?
>> So it turns out, to
implement your own function,
we kind of borrow some of
the same structure as main
that we've just been
taking for granted, and, I
know, just copying and
pasting pretty much what
I've been writing in the past.
But notice the pattern here.
Int, Main, Void, we'll tease apart
before long what that actually means.
>> But for today, just
notice the parallelism.
Void, print name,
string name, so there's
a purple keyword, which
we're going to start
calling a return type, the name of
the function, and then the input.
So, actually, we can distill
this kind of like last week
as, this is the name or the
algorithm of the code we're
going to write-- the
algorithm underlying
the code we're going to write.
>> This is its input.
This is its output.
This function, print name, is
designed to take a string called name,
or whatever, as input, and then void.
It doesn't return anything,
like get string or get int does.
So it's not going to hand me something back.
It's just going to have a
side effect, so to speak,
of printing a person's name.
So notice, line 7, I
can call print name.
Line 10, I can define
or implement print name.
But, unfortunately, that's not enough.
>> Let me go ahead and
recompile this after saving.
Whoa, now, I've made it
worse, it would seem.
So implicit declaration of
function print name is invalid.
And, again, there's more errors.
But as I cautioned earlier, even
if you get overwhelmed with,
or a little sad to see so many
errors, focus only on the first
initially, because it might just
have had a cascading effect.
So C, or Clang more specifically,
still does not recognize print name.
>> And that's because Clang,
by design, is kind of dumb.
It only does what you tell it to do.
And it only does so in the order
in which you tell it to do.
>> So I have defined main on line four,
like we've been doing pretty often.
I've defined print name on line 10.
But I'm trying to use
print name on line seven.
>> It's too soon, doesn't exist yet.
So I could be clever, and be like,
OK, so let's just play along,
and move print name up
here, and re-compile.
Oh my God.
It worked.
It was as simple as that.
>> But the logic is exactly that.
You have to teach Clang what it
is by defining the function first.
Then you can use it.
But, frankly, this feels
like a slippery slope.
>> So every time I run
into a problem, I'm just
going to highlight and copy the code
I wrote, cut it and paste it up here.
And, surely, we could
contrive some scenarios
where one function might
need to call another.
And you just can't put every
function above every other.
>> So it turns out there's
a better solution.
We can leave this be.
And, frankly, it's generally nice,
and convenient, and good design
to put main first, because, again,
main just like when green flag clicked,
that is the function that
gets executed by default.
So you might as well put
it at the top of the file
so that when you or any
other human looks at the file
you know what's going on
just by reading main first.
So it turns out, we can tell Clang
proactively, hey, Clang, on line four,
I promise to implement
a function called Print
Name that takes a string called name
as input, and returns nothing, void.
And I'll get around to
implementing it later.
>> Here comes Main.
Main now on line 9 can use
Print Name because Clang
is trusting that, eventually,
it will encounter the definition,
or the implementation, of Print Name.
So after saving my file, let
me go ahead and make prototype,
looks good this time.
Dot slash, prototype, let me
go ahead and type in a name.
David, hello David, Zamila, hello
Zamila, and, indeed, now it works.
>> So the ingredient here is that we've
made a custom function, like a custom
Scratch block we're calling it.
But unlike Scratch where you can
just create it and start using it,
now we have to be a
little more pedantic,
and actually train Clang
to use, or to expect it.
Now, as an aside, why all this time have
we been just blindly on faith including
CS50.h, and including standard IO.h?
>> Well, it turns out,
among a few other things,
all that's in those dot h
files, which happen to be files.
They're header files, so to speak.
They're still written in C. But
they're a different type of file.
>> For now, you can pretty much assume
that all that is inside of CS50.h
is some one-liners like this, not
for functions called Print Name,
but for Get String, Get
Float, and a few others.
And there are similar prototypes,
one liners, inside of standard IO.h
for printf, which is now in
my own Print Name function.
So in other words, this whole time we've
just been blindly copying and pasting
include this, include
that, what's going on?
Those are just kind of clues
to Clang as to what functions
are, indeed, implemented, just
elsewhere in different files
elsewhere on the system.
>> So we've implemented print name.
It does have this side effect of
printing something on the screen.
But it doesn't actually
hand me something back.
How do we go about
implementing a program that
does hand me something back?
>> Well, let's try this.
Let me go ahead and implement
a file called return.c
so we can demonstrate how something
like Get String, or Get Int,
is actually returning
something back to the user.
Let's go ahead and define int main void.
>> And, again, in the future, we'll
explain what that int and that void
is actually doing.
But for today, we'll
take it for granted.
I'm going to go ahead and printf,
for a good user experience, x is.
And then I'm going to wait for the
user to give me x with get int.
>> And then I'm going to go ahead
and print out x to the square.
So when you only have a
keyboard, people commonly
use the little caret
symbol on the keyboard
to represent to the power
of, or the exponent of.
So x squared is percent i.
>> And now I'm going to do this.
I could just do-- what's x
squared? x squared is x times x.
>> And we did this some
time ago already today.
This doesn't feel like
all that much progress.
You know what?
Let's leverage some of that idea
from last time of abstraction.
>> Wouldn't it be nice if
there's a function called
square that does exactly that?
It still, at the end of the
day, does the same math.
But let's abstract
away the idea of taking
one number multiplied by
another, and just give it a name,
like square this value.
>> And, in other words, in
C, let's create a function
called square that does exactly that.
It's going to be called square.
It's going to take an int.
And we'll just
call it n, by default.
>> But we could call it anything we want.
And all that it's going to
do, literally, is return
the result of n times n.
But because it is
returning something, which
is the keyword in purple we've
never seen before, I, on line 11,
can't just say void this time.
>> Void, in the example we just saw
of print name, just means,
do something.
But don't hand me something back.
In this case, I do want
to return n times n,
or whatever that is, that number.
>> So I can't say, hey, computer,
I return nothing, void.
It's going to return, by nature, an int.
And so that's all that's going on here.
>> The input to square
is going to be an int.
And so that we can use it, it has to
have a name, N. It's going to output
an int that doesn't need a name.
We can leave it to main, or whoever is
using me to remember this value if we
want with its own variable.
>> And, again, the only new
keyword here is Return.
And I'm just doing some math.
If I really wanted to be unnecessary,
I could say int product gets n times n.
>> And then I could say, return product.
But, again, to my point earlier of
this just not being good design--
like, why introduce a name,
a symbol, like product,
just to immediately return it?
It's a little cleaner,
a little tighter, so
to speak, just to say return n times
n, get rid of this line altogether.
>> And it's just less code to read,
less opportunity for mistakes.
And let's see if this
actually now works.
Now, I'm going to go
ahead and make return.
>> Uh-oh, implicit declaration of function.
I made this mistake before, no big deal.
Let me just type, or highlight and
copy, the exact same function prototype,
or signature, of the function up here.
Or I could move the whole function.
>> But that's a little lazy.
So we won't do that.
Now, let me make return
again, dot slash return.
>> x is 2. x squared is 4.
x is 3. x squared is 9.
And the function seems
now to be working.
So what's the difference here?
I have a function that's called square,
in this case, which I put in an input.
And I get back an output.
And yet, previously, if
I open the other example
from earlier, which
was called prototype.c,
I had print name, which
returned void, so to speak,
Or it returned nothing, and
simply had a side effect.
>> So what's going on here?
Well, consider the function
get string for just a moment.
We've been using the function
get string in the following way.
>> We've had a function get
string, like include CS50.h,
include standard IO.h, int, main, void.
And then every time I've
called get string thus far,
I've said something like, string s
gets get string, because get string--
let's call this get.c-- get string
itself returns a string that I can then
use, and say, hello, comma,
percent s, backslash n, s.
>> So this is the same example,
really, that we had earlier.
So get string returns a value.
But a moment ago, print name
does not return a value.
It simply has a side effect.
So this is a fundamental difference.
We've seen different
types of functions now,
some of which have returned
values, some of which don't.
So maybe it's string, or int, or float.
Or maybe it's just void.
>> And the difference is
that these functions that
get data and return a value are actually
bringing something back to the table,
so to speak.
So let's go ahead and
look at one final set
of examples that gives a sense, now, of
how we might, indeed, abstract better,
and better, and better, or more,
and more, and more, in order
to write, ultimately, better code.
Let's go ahead, and in the spirit
of Scratch, do the following.
>> Let me go ahead and include
CS50.h and standard IO.h.
Let me go ahead and give
myself an int, main, void.
And let me go ahead, call this cough.c.
>> And let me go ahead and just
like Scratch, print out cough backslash n.
And I want to do this three times.
So I'm, of course, just going
to copy and paste three times.
I'm now going to make
cough dot slash cough.
Let's give myself a little more room
here, Enter, cough, cough, cough.
>> There's, obviously, already an
opportunity for improvement.
I've copied and pasted
a few times today.
But that was only so I didn't
have to type as many characters.
I still changed what
those lines of code are.
>> These three lines are identical,
which feels lazy and indeed is,
and is probably not the right approach.
So with what ingredient
could we improve this code?
We don't have to copy and paste code.
>> And, indeed, any time you feel
yourself copying and pasting,
and not even changing code,
odds are there's a better way.
And, indeed, there is.
Let me go ahead and do a for loop,
even though the syntax might not
come naturally yet.
>> Do this three times, simply
by doing the following--
and I happen to know this from practice.
But we have a number of examples now.
And you'll see online
more references still.
>> This is the syntax on line 6, that
much like Scratch's repeat
block, repeats the following three times.
It's a little magical for now.
But this will get more,
and more familiar.
>> And it's going to repeat
line eight three times,
so that if I re-compile make cough,
dot slash cough, cough, cough, cough.
It still works the same way.
So that's all fine and good.
But that's not very abstracted.
>> It's perfectly correct.
But it feels like there
could be an opportunity,
as in the world of
Scratch, to kind of start
to add some semantics here so that
I don't just have some for loop,
and a function that says
cough, or does cough.
You know what?
Let me try to be a
little cooler than that,
and actually write a function that
has some side effects, call it cough.
>> And it takes no input, and
returns no value as output.
But you know what it does?
It does this-- printf,
quote unquote, cough.
>> And now up here, I'm going
to go ahead and for int,
i gets zero, i less than 3, i plus plus.
I'm going to not do printf, which is
arguably a low level implementation
detail.
I don't care how to cough.
I just want to use the cough function.
And I'm just going to call cough.
>> Now, notice the dichotomy.
When you call a function, if you don't
want to give it inputs, totally fine.
Just do open paren, close
paren, and you're done.
>> When you define a function, or
declare a function's prototype,
if you know in advance it's not
going to take any arguments,
say void in those parentheses there.
And that makes certain that you
won't accidentally misuse it.
Let me go ahead and make cough.
And, of course, I've made a mistake.
>> Dammit, there's that
implicit declaration.
But that's fine.
It's an easy fix.
I just need the prototype higher up
in my file than I'm actually using it.
>> So now let me make cough again, nice.
Now, it works.
Make cough, cough, cough, cough.
So you might think that we're really
just over engineering this problem.
And, indeed, we are.
This is not a good
candidate of a program
at the moment for
refactoring, and doing what's
called hierarchical decomposition,
where you take some code, and then
you kind of factor things out, so as
to ascribe more semantics to them,
and reuse it ultimately longer term.
But it's a building block toward
more sophisticated programs
that we will start
writing before long that
allows us to have the vocabulary
with which to write better code.
And, indeed, let's see if we
can't generalize this further.
>> It seems a little lame that I, main,
need to worry about this darn for loop,
and calling cough again and again.
Why can't I just tell cough,
please cough three times?
In other words, why can't I just
give input to cough and do this?
>> Why can't I just say, in
main cough three times.
And now, this is kind of magical.
It's very iterative here.
And it's, indeed, a baby step.
>> But just the ability to say on
line eight, cough three times,
it's just so much more readable.
And, plus, I don't have to know
or care how cough is implemented.
And, indeed, later in the
term and for final projects,
if you tackle a project with
a classmate or two classmates,
you'll realize that you're going to
have to, or want to, divide the work.
>> And you're going to want to decide
in advance, who's going to do what,
and in which pieces?
And wouldn't it be nice
if you, for instance,
take charge of writing main, done.
And your roommate, or your
partner more generally,
takes care of implementing cough.
>> And this division, these
walls of abstraction,
or layers of abstraction if
you will, are super powerful,
because especially for larger,
more complex programs and systems,
it allows multiple people to build
things together, and ultimately
stitch their work together in this way.
But, of course, we
need to now fix cough.
We need to tell cough
that, hey, you know what?
You're going to need to take an
input-- so not void, but int n now.
Let's go ahead and put into
cough the int. i gets zero.
>> i is less than how many times.
I said three before.
But that's not what I want.
I want cough to be generalized to
support any number of iterations.
>> So, indeed, it's n that I want,
whatever the user tells me.
Now, I can go ahead and say print cough.
And no matter what number
the user passes in,
I will iterate that many times.
>> So at the end of the day,
program is identical.
But notice all of this stuff
could even be in another file.
Indeed, I don't know at the
moment how printf is implemented.
>> I don't know at the moment how get
string, or get int, or get float
are implemented.
And I don't want to
see them on my screen.
As it is, I'm starting to focus on
my program, not those functions.
>> And so, indeed, as soon as you
start factoring code like this out,
could we even move cough
to a separate file?
Someone else could implement it.
And you and your program become the
very beautiful, and very readable,
arguably, really four
line program right there.
>> So let's go ahead now
and make one more change.
Notice that my prototype
has to change up top.
So let me fix that so
I don't get yelled at.
>> Make cough, let me run cough once
more, still doing the same thing.
But now, notice we have an
ingredient for one final version.
You know what?
I don't want to just cough, necessarily.
I want to have something more general.
So you know what?
I want to do this.
I want to have, much like Scratch
does, a say block, but not just
say something some number of times.
I want it to say a very specific string.
And, therefore, I don't
want it to just say cough.
I want it to say whatever
string is passed in.
>> So notice, I've generalized
this so that now
say feels like a good name
for this, like Scratch,
takes two arguments, unlike Scratch.
One is a string.
One is an int.
>> And I could switch them.
I just kind of like the idea of
say the string first, and then
how many times later.
Void means it still
doesn't return anything.
These are just visual side
effects, like with [? Jordan, ?]
a verbal side effect of yelling.
It still does something n times,
0 up to, but not equal to n.
This means n total times.
And then just print out
whatever that string is.
So I've really generalized
this line of code.
So now, how do I implement
the cough function?
>> I can do void cough.
And I can still take in how
many times you want to cough.
But you know what?
I can now punt to say.
>> I can call say with the
word cough, passing in n.
And if I want to also implement,
just for fun, a sneeze function,
I can sneeze some number of times.
And I can keep reusing n, because
notice that n in this context or scope
only exists within this function.
>> And n in this context only
exists within this function here.
So we'll come back to
these issues of scope.
And here, I'm just going to say,
achoo, and then n times, semi-colon.
>> And now, I just need to borrow
these function signatures up here.
So cough is correct.
Void sneeze is correct now.
>> And I still just need say.
So I'm going to say, say
string s, int n, semi-colon.
So I've over-engineered the
heck out of this program.
>> And this doesn't
necessarily mean this is
what you should do when writing
even the simplest of programs.
Take something that's obviously
really simple, really short,
and re-implement it
using way too much code.
But you'll actually see, and in
time look back on these examples,
and realize, oh, those are the steps
we took to actually generalize,
to factor something out,
until at the end of the day
my code is actually pretty reasonable.
Because if I want to cough three
times then sneeze three times,
I'm simply going to rerun this
program, make cough, and run cough.
And I have three coughs
and three sneezes.
>> And so this is a basic
paradigm, if you will,
for how we might go about
actually implementing a program.
But let's just see now what it is
we've been doing all of this time,
and what some of the final pieces
are behind this simple command.
At the end of the day, we've
been using Clang as our compiler.
We've been writing source
code, converting it
via Clang into machine code.
>> And we've been using Make just
to facilitate our keystrokes so
that we don't have to remember
those incantations of Clang itself.
But what is Make actually doing?
And, in turn, what is
Clang actually doing?
>> It turns out, though we have simplified
today's discussion by saying,
you take source code, pass it as
input to a compiler, which gives you
output of machine
code, turns out there's
a few different steps inside there.
And compiling happens to be the umbrella
term for a whole bunch of steps.
But let's just tease
this out really quickly.
>> It turns out that we've been doing
more things every time I run a program,
or every time I compile a program today.
So preprocessing refers to
this-- anything in a C program,
as we'll see again and again,
that starts with this hash symbol,
or the hashtag symbol here, means
it's a preprocessor directive.
That means, in this case, hey
computer, do something with this file
before you actually compile my own code.
>> In this case, hash include is,
essentially, C's way of saying,
hey computer, go get the contents
of CS50.h and paste them here.
Hey computer, go get the
contents of standard IO.h,
wherever that is on the
hard drive, paste it here.
So those things happen
first during preprocessing.
>> And Clang does all of this for us.
And it does it so darn
fast, you don't even
see four distinct things happening.
But that's the first such step.
>> What actually happens next?
Well, the next official
step is compiling.
And it turns out that
compiling a program
technically means going from
source code, the stuff we've
been writing today, to something
called assembly code, something
that looks a little different.
>> And, in fact, we can see this real fast.
Let me actually go into my IDE.
Let me go ahead and open hello.c, which
is the very first program with which we
began today.
And let me go ahead and run Clang a
little differently, clang -S hello.c,
which is actually going to
give me another file hello.s.
>> And we will probably never
again see this kind of code.
If you take a lower level
systems class like CS61,
you will see a lot more
of this kind of code.
But this is assembly language.
This is X86 assembly language
that the CPU that is underlying
CS50 IDE actually understands.
>> And cryptic as it does
look, it is something
the computer understands pretty well.
Sub q, this is a subtract.
There's movements.
>> There's calling of functions here,
x oring, a movement, an add, a pop,
a return.
So there's some very
low level instructions
that CPUs understand that
I alluded to earlier.
That is what's Intel Inside.
>> There are patterns of
zeros and ones that
map to these arcanely worded, but
somewhat well-named, instructions,
so to speak.
That is what happens when
you compile your code.
You get assembly
language out of it, which
means the third step is to assemble
that assembly code into, ultimately,
machine code-- zeros and ones, not the
text that we just saw a moment ago.
>> So pre-processing does that find
and replace, and a few other things.
Compiling takes your source
code from C, source code
that we wrote, to assembly
code that we just glanced at.
Assembling takes that assembly
code to zeroes and ones
that the CPU really will
understand at the end of the day.
And linking is the last step
that happens for us-- again,
so fast we don't even
notice-- that says,
hey computer, take all of
the zeros and ones that
resulted from compiling David's code,
and his main function in this case.
>> And hey computer, go get
all of the zeros and ones
that the CS50 staff wrote
inside the CS50 library.
Mix those in with David's.
And hey computer, go get all the zeros
and ones that someone else wrote years
ago for printf.
And add those into the
whole thing, so that we've
got my zeros and ones, the
CS50 staff's zeros and ones,
the printf zeros and ones,
and anything else we're using.
>> They all get combined together into one
program called, in this case, hello.
So henceforth, we will just
use the word compiling.
And we will take for granted that when
we say, compile your program, it means,
hey, do the pre-processing, compiling,
assembling, and linking.
But there's actually some juicy stuff
going on there underneath the hood.
And especially if you
get curious some time,
you can start poking
around at this lower level.
But for now, realize that
among the takeaways for today
are quite simply the
beginning of a process,
of getting comfortable with
something like hello world.
Indeed, most of what we did today
certainly won't sink in super fast.
And it will take some
time, and some practice.
And odds are, you will sort
of want to hit your keyboard
or yell at the screen.
And all of that's OK.
Though, perhaps try not to
do it in the library so much.
>> And ultimately, you'll
be able, though, to start
seeing patterns, both in good code
that you've written and in mistakes
that you've made.
And much like the process of
becoming a TF or a CA,
you'll start to get better and
better at seeing those patterns,
and just solving your
own problems ultimately.
In the meantime, there will be plenty
of us to lend you support, and get you
through this.
And in the write-ups
for all of the problems
you will be guided through
all of the commands
that I certainly know from
a lot of practice by now,
but might have flown
over one's head for now.
And that's totally fine.
>> But, ultimately, you're going
to start to see patterns emerge.
And once you get past all of the
stupid details, like parentheses,
and curly braces, and semi-colons,
the stuff, frankly,
that is not at all
intellectually interesting
and is not the objective of
taking any introductory class,
it's the ideas that are going to matter.
>> It's the loops, and the
conditions, and the functions,
and more powerfully the abstraction,
and the factoring of code,
and the good design, and the good
style, and ultimately the correctness
of your code, that's ultimately
going to matter the most.
So next week, we will take these
ideas that we first saw in Scratch
and have now translated
to C. And we'll start
to introduce the first of the
course's real world domains.
>> We'll focus on the world of security,
and more specifically cryptography,
the art of scrambling information.
And among the first
problems you yourself
will get to write beyond
playing with some of the syntax
and solving some logical
problems, ultimately before long,
is to actually scramble, or encrypt,
and ultimately decrypt information.
And everything we've done
today, while fairly low
level, is just going to allow
us to take one, and one,
and one more step above toward
writing the most interesting code yet.
>> So more on that next week.
>> [VIDEO PLAYBACK]
>> -What can you tell me about
the last time you saw him?
-What can I say, really?
I mean, it was like any other
pre-production rehearsal,
except there was something he said
at the very end that stuck with me.
>> -This was CS50.
>> -That's a cut everyone,
great job on rehearsal.
>> -That's lunch?
>> -Yeah, you and I can
grab a sandwich in a bit.
Let me just debrief with
David really quickly.
David?
David?
>> [END PLAYBACK]
