>> [MUSIC PLAYING]
>> DAVID J. MALAN: All right.
This is CS50 and this
is the start of Week 2.
And you'll recall that over
the past couple of weeks,
we've been introducing computer
science and, in turn, programming.
>> And we started the story by way of
Scratch, that graphical language
from MIT's Media Lab.
And then most recently,
last week, did we
introduce a higher-- a
lower-level language known
as C, something that's purely textual.
And, indeed, last time we
explored within that context
a number of concepts.
>> This, recall, was the very
first program we looked at.
And this program, quite simply,
prints out, "hello, world."
But there's so much
seeming magic going on.
There's this #include <stdio.h>
with these angle brackets.
There's int.
There's (void).
There's parentheses, curly braces,
semi-colons, and so much more.
>> And so, recall that
we introduced Scratch
so that we could, ideally, see past
that syntax, the stuff that's really not
all that intellectually
interesting but early on
is, absolutely, a bit tricky
to wrap your mind around.
And, indeed, one of the most common
things early on in a programming class,
especially for those less
comfortable, is to get frustrated by
and tripped up by certain syntactic
errors, not to mention logical errors.
And so among our goals
today, actually, will
be to equip you with some
problem-solving techniques for how
to better solve problems themselves
in the form of debugging.
And you'll recall, too, that the
environment that we introduced
last time was called CS50 IDE.
This is web-based software that
allows you to program in the cloud,
so to speak, while keeping all of your
files together, as we again will today.
And recall that we
revisited these topics here,
among them functions, and loops, and
variables, and Boolean expressions,
and conditions.
And actually a few more that we
translated from the world of Scratch
to the world of C.
>> But the fundamental building
blocks, so to speak,
were really still the same last week.
In fact, we really just had a
different puzzle piece, if you will.
Instead of that purple
say block, we instead
had printf, which is
this function in C that
allows you to print something
and format it on the screen.
We introduced the CS50
Library, where you
have now at your disposal get_char,
and get_int, and get_string,
and a few other functions as
well, via which you can get input
from the user's own keyboard.
And we also took a look at things
like these-- bool, and char,
and double, float,
int, long long, string.
And there's even other data types in C.
>> In other words, when you declare
a variable to store some value,
or when you implement a function
that returns some value,
you can specify what
type of value that is.
Is it a string, like a
sequence of characters?
Is it a number, like an integer?
Is it a floating point
value, or the like?
So in C, unlike Scratch, we actually
began to specify what kind of data
we were returning or using.
>> But, of course, we also ran into
some fundamental limits of computing.
And in particular,
this language C, recall
that we took a look at
integer overflow, the reality
that if you only have a
finite amount of memory
or, specifically, a finite number
of bits, you can only count so high.
And so we looked at this example here
whereby a counter in an airplane,
actually, if running long enough, would
overflow and result in a software error,
and potentially an actual physical one.
>> We also looked at floating
point imprecision, the reality
that with only a finite number
of bits, whether it's 32 or 64,
you can only specify so many numbers
after a decimal point, after which you
begin to get imprecise.
So for instance, one-third in the
world here, in our human world,
we know is just an infinite number
of 3s after the decimal point.
But a computer can't necessarily
represent an infinite number of numbers
if you only allow it some
finite amount of information.
>> So not only did we equip you
with greater power in terms
of how you might express yourself at
a keyboard in terms of programming,
we also limited what
you can actually do.
And indeed, bugs and mistakes can
arise from those kinds of issues.
And indeed, among the topics today
are going to be topics like debugging
and actually looking underneath the hood
at how things that were introduced last week
are actually implemented
so that you better
understand both the capabilities of and
the limitations of a language like C.
>> And in fact, we'll peel back the layers
of the simplest of data structures,
something called an array, which
Scratch happens to call a "list."
It's a little bit
different in that context.
And then we'll also introduce one of the
first of our domain-specific problems
in CS50, the world of
cryptography, the art of scrambling
or enciphering information so
that you can send secret messages
and decode secret messages
between two persons, A and B.
>> So before we transition
to that new world,
let's try to equip you with some
techniques with which you can eliminate
or reduce at least some
of the frustrations
that you have probably encountered
over the past week alone.
In fact, ahead of you are such-- some of
your first problems in C. And odds are,
if you're like me, the first time
you try to type out a program,
even if you think logically
the program is pretty simple,
you might very well hit a wall, and
the compiler is not going to cooperate.
Make or Clang is not going
to actually do your bidding.
>> And why might that be?
Well, let's take a look at,
perhaps, a simple program.
I'm going to go ahead and save this in
a file deliberately called buggy0.c,
because I know it to
be flawed in advance.
But I might not realize that if this
is the first or second or third program
that I'm actually making myself.
So I'm going to go ahead and
type out, int main(void).
And then inside of my curly braces,
a very familiar printf("hello, world--
backslash, n")-- and a semi-colon.
>> I've saved the file.
Now I'm going to go down
to my terminal window
and type make buggy0, because, again,
the name of the file today is buggy0.c.
So I type make buggy0, Enter.
>> And, oh, gosh, recall from last time
that no error messages is a good thing.
So no output is a good thing.
But here I have clearly
some number of mistakes.
>> So the first line of output
after typing make buggy0, recall,
is Clang's fairly verbose output.
Underneath the hood,
CS50 IDE is configured
to use a whole bunch of
options with this compiler
so that you don't have
to think about them.
And that's all that first line
means that starts with Clang.
>> But after that, the problems
begin to make their appearance.
Buggy0.c on line 3, character
5, there is a big, red error.
What is that?
Implicitly declaring library function
printf with type int (const char *,
...) [-Werror].
I mean, it very quickly
gets very arcane.
And certainly, at first
glance, we wouldn't
expect you to understand the
entirety of that message.
And so one of the lessons
for today is going
to be to try to notice
patterns, or similar things,
to errors you might have
encountered in the past.
So let's tease apart only
those words that look familiar.
The big, red error is clearly
symbolic of something being wrong.
>> Implicitly declaring
library function printf.
So even if I don't quite understand what
implicitly declaring library function
means, the problem surely
relates to printf somehow.
And the source of that issue
has to do with declaring it.
>> Declaring a function is
mentioning it for the first time.
And we used the terminology last week
of declaring a function's prototype,
either with one line at the top of your
own file or in a so-called header file.
And in what file did we say
last week that printf is quote,
unquote, declared?
In what file is its prototype?
>> So if you recall, the very first thing I
typed, almost every program last time--
and accidentally a moment ago started
typing myself-- was this one here--
hash-- #include <stdio-- for standard
input/output-- dot h>. And indeed,
if I now save this file, I'm going
to go ahead and clear my screen,
which you can do by typing
Clear, or you can hold Control L,
just to clear your terminal window
just to eliminate some clutter.
>> I'm going to go ahead and
re-type make buggy0, Enter.
And voila, I still see that
long command from Clang,
but there's no error message this time.
And indeed, if I do ./buggy0,
just like last time,
where dot means this
directory, Slash just means,
here comes the name of the program and
that name of the program is buggy0,
Enter, "hello, world."
>> Now, how might you have
gleaned this solution
without necessarily
recognizing as many words
as I did, certainly, having
done this for so many years?
Well, realize per the first problem
set, we introduce you to a command
that CS50's own staff
wrote called help50.
And indeed, see the specification for
the problem set as to how to use this.
>> But help50 is essentially
a program that CS50's staff
wrote that allows you to run
a command or run a program,
and if you don't understand its
output, to pass its output to help50,
at which point the software
that the course's staff wrote
will look at your program's output
line by line, character by character.
And if we, the staff, recognize the
error message that you're experiencing,
we will try to provoke you with some
rhetorical questions, with some advice,
much like a TF or a CA or myself
would do in person at office hours.
>> So look to help50 if you don't
necessarily recognize a problem.
But don't rely on it
too much as a crutch.
Certainly try to understand its
output and then learn from it
so that only once or twice do you
ever run help50 for a particular error
message.
After that, you should be
better equipped yourself
to figure out what it actually is.
>> Let's do one other here.
Let me go ahead, and in another
file we'll call this buggy1.c.
And in this file I'm
going to deliberately--
but pretend that I don't
understand what mistake I've made.
>> I'm going to go ahead and do this--
#include <stdio.h>, since I've
learned my lesson from a moment ago.
Int main(void), as before.
And then in here I'm going
to do string s = get_string.
And recall from last time that
this means, hey, computer,
give me a variable, call it s, and
make the type of that variable a string
so I can store one or more words in it.
>> And then on the right-hand
side of the equal sign
is get_string, which is a
function in the CS50 Library
that does exactly that.
It gets a string and then
hands it from right to left.
So this equal sign doesn't mean
"equals" as we might think in math.
It means assignment from right to left.
So this means, take the string from
the user and store it inside of s.
>> Now let's use it.
Let me go ahead now and as a second
line, let me go ahead and say "hello"--
not "world," but "hello, %s"--
which is our placeholder, comma s,
which is our variable,
and then a semi-colon.
So if I didn't screw up too much
here, this looks like correct code.
>> And my instincts now are to compile it.
The file is called buggy1.c.
So I'm going to do make buggy1, Enter.
And darn-it, if there isn't
even more errors than before.
I mean, there's more
error messages it would
seem than actual lines in this program.
>> But the takeaway here is,
even if you're overwhelmed
with two or three or
four more error messages,
focus always on the very
first of those messages.
Looking at the top-most one,
scrolling back up as need be.
So here I typed make buggy1.
Here's that Clang output as expected.
>> And here's the first red error.
Use of undeclared identifier
string, did I mean standard in?
So standard in is
actually something else.
It refers to the user's
keyboard, essentially.
>> But that's not what I meant.
I meant string, and I meant get_string.
So what is it that I
forgot to do this time?
What's missing this time?
I have my #include <stdio.h>,
so I have access to printf.
>> But what do I not have
access to just yet?
Well, just like last time,
I need to tell the compiler
Clang what these functions are.
Get_string does not come
with C. And in particular, it
doesn't come in the <stdio.h>
header file.
It instead comes in
something the staff wrote,
which is a different file
name but aptly named <cs50.h>.
>> So simply by adding that one line
of code-- recall from last time
that when Clang runs, it's going
to look at my code top to bottom,
left to right.
It's going to notice,
oh, you want <cs50.h>.
Let me go and find that,
wherever it is on the server,
copy and paste it, essentially,
into the top of your own file
so that at this point in the story,
line 1, the rest of the program
can, indeed, use any of the functions
therein, among them get_string.
So I'm going to ignore
the rest of those errors,
because I, indeed, suspect that only
the first one actually mattered.
And I'm going to go ahead and rerun,
after saving my file make buggy1.
And voila, it did work.
And if I do ./buggy1 and type in, for
instance, Zamyla, I now will get hello,
Zamyla, instead of hello, world.
>> All right.
So the takeaways here then are to,
one, try to glean as much as you can
from the error messages alone, looking
at some of the recognizable words.
Barring that, use help50 per
the problem set specification.
But barring that, too, always look
at the top error only, at least
initially, to see what information
it might actually yield.
But it turns out there's
even more functionality built
into the CS50 Library to help
you early on in the semester
and early on in programming
figure out what's going wrong.
So let's do another example here.
I'm going to call this buggy2, which,
again, is going to be flawed out
of the gate, by design.
>> And I'm going to go ahead
and do #include <stdio.h>.
And then I'm going to do int main(void).
And then I'm going to do a for loop.
For (int i = 0.
i is less than or equal to 10.
i++, and then in curly braces, I'm going
to print out just a hashtag symbol here
and a new line character.
>> So my intent with this
program is quite simply
to iterate 10 times
and on each iteration
of that loop each time
through the cycle,
print out a hashtag,
a hashtag, a hashtag.
One per line because I
have the new line there.
And recall that the for
loop, per last week--
and you'll get more
familiar with the syntax
by using it with practice
before long-- this gives me
a variable called i and sets it to 0.
>> This increments i on
every iteration by 1.
So i goes to 1 to 2 to 3.
And then this condition in the
middle between the semi-colons
gets checked on every iteration to make
sure that we are still within range.
So I want to iterate 10 times, so I
have sort of very intuitively just
put 10 as my upper bound there.
>> And yet, when I run this, after
compiling it with make buggy2--
and it does compile OK.
So I don't have a
syntax error this time.
Let me go ahead now
and run buggy2, Enter.
And now scroll up.
And let me increase
the size of the window.
>> I seem to have 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11.
So there's 11 hashtags, even though
I clearly put 10 inside of this loop.
Now, some of you might see immediately
what the error is because, indeed, this
isn't a very hard error to make.
But it's very commonly
made very early on.
>> What I want to point out, though,
is, how might I figure this out?
Well, it turns out that
the CS50 Library comes
with not only get_string and get_int
and get_float and other functions.
It also comes with a special function
called eprintf, or, error printf.
And it exists solely to make
it a little bit easier for you
when debugging your code to just
print an error message on the screen
and know where it came from.
>> So for instance, one thing I might
do here with this function is this--
eprintf, and then I'm going to go ahead
and say i is now %i, backslash, n.
And I'm going to plug in the value of i.
And up top, because this
is in the CS50 Library,
I'm going to go ahead
and include <cs50.h>
so I have access to this function.
But let's consider what line
9 is supposed to be doing.
I'm going to delete this eventually.
This has nothing to do
with my overarching goal.
But eprintf, error printf, is just meant
to give me some diagnostic information.
When I run my program, I want to
see this on the screen temporarily
as well just to understand
what's going on.
>> And, indeed, on each
iteration here of line 9
I want to see, what is the value of i?
What is the value of i?
What is the value of i?
And, hopefully, I should only
see that message, also, 10 times.
>> So let me go ahead and
recompile my program,
as I have to do any time
I make a change. ./buggy2.
And now-- OK.
There's a lot more going on.
So let me scroll up in
an even bigger window.
>> And you'll see that each of
the hashtags is still printing.
But in between each of them is now this
diagnostic output formatted as follows.
The name of my program here is buggy2.
The name of the file is buggy2.c.
The line number from which
this was printed is line 9.
And then to the right of that is the
error message that I'm expecting.
>> And what's nice about this is that
now I don't have to necessarily count
in my head what my program is doing.
I can see that on the
first iteration i is 0,
then 1, then 2, then 3, then 4, then
5, then 6, then 7, then 8, then 9, then
10.
So wait a minute.
What's going on here?
I still seem to be counting
as intended up to 10.
>> But where did I start?
0, 1, 2, 3, 4, 5, 6, 7, 8, 9 10.
So 0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10-- the 11th finger
is indicative of the problem.
I seem to have counted
incorrectly in my loop.
Rather than go 10 iterations,
I'm starting at 0,
I'm ending at and through 10.
But because, like a computer,
I'm starting counting at 0,
I should be counting up
to, but not through, 10.
>> And so the fix, I eventually
realized here, is one of two things.
I could very simply say
count up to less than 10.
So 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, which is, indeed, correct,
even though it sounds a little wrong.
Or I could do less than or equal
to 9, so long as I start at 0.
Or if you really don't like that, you
can count up through 10 but start at 1.
But again, this just isn't that common.
In programming-- albeit
not so much in Scratch--
but in programming in
C and other languages,
like JavaScript and
Python and others, it's
just very common, per
our discussion of binary,
to just start counting at the
lowest number you can, which is 0.
All right.
So that's eprintf.
And again, now that I've figured out my
problem, and I'm going to go back to 0
through less than 10, I'm going
to go in and delete eprintf.
>> It should not be there when I
ship my code or submit my code
or show it to anyone else.
It's really just meant
to be used temporarily.
But now I've fixed this
particular problem as well.
>> Well, let's do one more example here
that I'm going to whip up as follows.
I'm going to go ahead and
#include <cs50.h>.
And I'm going to go ahead
and #include <stdio.h>.
>> And I'm going to save
this file as buggy3.c.
And I'm going to go ahead
and declare int main(void).
And then inside of there
I'm going to do int i = --
I want to implement a program
with a get_negative_int.
This is not a function that exists yet.
So we're going to implement
it in just a moment.
But we're going to see why
it's buggy at first pass.
And once I've gotten
an int from the user,
I'm just going to print %i is a negative
integer, backslash, n, comma, i.
In other words, all I
want this program to do
is get a negative int from
the user and then print out
that such and such is a negative int.
>> Now I need to implement this function.
So later in my file, I'm going to go
ahead and declare a function called
get_negative_int(void)-- and we'll
come back to what that line means again
in a moment-- int n; do-- do
the following-- printf n is:.
And then I'm going to do n = get_int,
and do this while n is greater than 0.
And then return n;.
>> So there's a lot going on in
this but none of which we didn't
look at last week, at least briefly.
So on line 10 here I've declared a
function called get_negative_int,
and I've put (void), in
parentheses, the reason being this
does not take an input.
I'm not passing anything
to this function.
I'm just getting something back from it.
>> And what I'm hoping to
get back is an integer.
There is no data type in
C called negative_int.
It's just int, so it's going
to be on us to make sure
that the value that's actually
returned is not only an int
but is also negative.
>> On line 12 I'm declaring a variable
called n and making it of type int.
And then in line 13 through 18 I'm
doing something while something is true.
I'm going ahead and printing
n is, colon, and then a space,
like a prompt for the user.
>> I'm then calling get_int and
storing its so-called return value
in that variable n.
But I'm going to keep doing
this while n is greater than 0.
In other words, if the user gives me an
int and that number is greater than 0,
ergo, positive, I'm going to
just keep reprompting the user,
keep reprompting, by forcing them to
cooperate and give me a negative int.
>> And once n is actually negative--
suppose the user finally types -50,
then this while loop is no longer true
because -50 is not greater than 0.
So we break out of that
loop logically and return n.
>> But there's one other
thing I have to do.
And I can simply do this
by copying and pasting
one line of code at the top of the file.
I need to teach Clang,
or promise to Clang,
explicitly that I will,
indeed, go and implement
this function get_negative_int.
It might just be lower in the file.
Again, recall that Clang
reads things top to bottom,
left to right, so you can't
call a function if Clang
doesn't know it's going to exist.
>> Now, unfortunately, this program,
as some of you might have noticed,
is already buggy.
Let me go ahead and make buggy3.
It compiles, so my problem now is not
a syntax error, like a textual error,
it's actually going to be a logical
error that I've deliberately
made as an opportunity to
step through what's going on.
>> I'm going to go ahead
now and run buggy3.
And I'm going to go
ahead and not cooperate.
I'm going to give it the number 1.
It didn't like it, so
it's prompting me again.
>> How about 2?
3?
50?
None of those are working.
How about -50?
And the program seems to work.
>> Let me try it once more.
Let me try -1, seems to work.
Let me try -2, seems to work.
Let me try 0.
Huh, that's incorrect.
Now, we're being a little pedantic here.
But it's, indeed, the case that 0
is neither positive nor negative.
And so the fact that my program is
saying that 0 is a negative integer,
that's not technically correct.
>> Now, why is it doing this?
Well, it might be obvious.
And, indeed, the program is
meant to be fairly simple
so we have something to explore.
>> But let's introduce a third debugging
technique here called debug50.
So this is a program
that we've just created
this year called debug50
that will allow you
to use what's called a built-in
graphical debugger in CS50 IDE.
And a debugger is just a program that
generally lets you run your program
but step by step by step, line
by line by line, pausing, poking
around, looking at variables so that
the program doesn't just blow past you
and quickly print something
or not print something.
It gives you an opportunity, at
human speed, to interact with it.
>> And to do this, you
simply do the following.
After compiling your code,
which I already did, buggy3,
you go ahead and run debug50 ./buggy3.
So much like help50 has you run
help50 and then the command,
debug50 has you run debug50 and
then the name of the command.
>> Now watch what happens on my screen,
on the right-hand side in particular.
When I hit Run, all of the
sudden this right-hand panel
opens up on the screen.
And there's a lot going
on at first glance.
But there's not too
much to worry about yet.
>> This is showing me everything
that's going on inside of my program
right now and via these
buttons up top is then
allowing me to step through my code
ultimately step by step by step.
But not just yet.
Notice what happens.
At my terminal window
I'm being prompted for n.
And I'm going to go ahead and
cooperate this time and type in -1.
And albeit a little cryptically, -1
is a negative integer, as expected.
>> And then child exited with
status 0 GDBserver exiting.
GDB, GNU Debugger, is the name
of the underlying software
that implements this debugger.
But all this really means, the debugger
went away because my program quit
and all was well.
If I want to truly debug my program,
I have to preemptively tell debug50,
where do I want to start
stepping through my code?
>> And perhaps the simplest way
to do that is as follows.
If I hover over the
gutter of my editor here,
so really just in the sidebar here,
to the left of the line number,
notice that if I just click
once, I put a little red dot.
And that little red dot,
like a stop sign, means, hey,
debug50, pause execution of my code
right there when I run this program.
>> So let's do that.
Let me go ahead and run my program
again with debug50 ./buggy3, Enter.
And now, notice, something
different has happened.
I'm not being prompted
yet in my terminal window
for anything, because I haven't
gotten there yet in my program.
Notice that on line 8
which is now highlighted,
and there's a little arrow at
left saying, you are paused here.
This line of code, line
8, has not yet executed.
>> And what's curious, if I look
over here on the right-hand side,
notice that i is a local
variable, local in the sense
that it's inside the current function.
And its value, apparently by default,
and sort of conveniently, is 0.
But I didn't type 0.
That just happens to be its
default value at the moment.
>> So let me go ahead and do this now.
Let me go ahead and on
the top right here, I'm
going to go ahead and
click this first icon which
means step over which means don't skip
it but step over this line of code,
executing it along the way.
>> And now, notice, my
prompt has just changed.
Why is that?
I've told debug50,
run this line of code.
What does this line of code do?
Prompts me for an int.
OK.
Let me cooperate.
Let me go ahead now and type -1, Enter.
And now notice what has changed.
On the right-hand side,
my local variable i
is indicated as being -1 now.
And it's still of type int.
>> And notice, too, my so-called
call stack, where did I pause?
We'll talk more about
this in the future.
But the call stack just refers to what
functions are currently in motion.
Right now it's just main.
And right now the only local
variable is i with a value of -1.
>> And when I finally step over this line
here, with that same icon at top right,
-1 is a negative integer.
Now it's pausing over that curly brace.
Let's let it do its thing.
I step over that line, and voila.
>> So not all that terribly
enlightening yet,
but it did let me pause
and think through logically
what this program is doing.
But that wasn't the erroneous case.
Let's do this again as follows.
>> I'm going to leave that breakpoint
on line 8 with the red dot.
I'm going to rerun debug50.
It's automatically paused here.
But this time, instead of
stepping over this line,
let me actually go inside of
get_negative_int and figure out,
why is it accepting 0 as a valid answer?
>> So instead of clicking Step Over.
I'm going to go ahead
and click Step Into.
And notice that the line 8 that's
now highlighted now suddenly
becomes line 17.
>> Now, it's not that the debugger
has skipped lines 14 and 15 and 16.
It's just there's nothing
to show you there.
Those are just declaring variables,
and then there's the word Do
and then an open curly brace.
The only functional line that's
juicy really is this one here, 17.
And that's where we've
paused automatically.
>> So printf("n is: ");, so
that hasn't happened yet.
So let's go ahead and click Step Over.
Now my prompt, indeed,
changed to ("n is: ").
Now get_int, I'm not going
to bother stepping into,
because that function was
made by CS50 in the Library.
It's presumably correct.
>> So I'm going to go ahead and
sort of cooperate by giving it
an int, but not a negative int.
So let me go ahead and hit 0.
And now what happens here
when I get down to line 21?
I've not iterated again.
I don't seem to be stuck in that loop.
In other words, this yellow
bar did not keep going around,
and around, and around.
>> Now, why is that?
Well, n, what is n right now?
I can look at the local
variables in the debugger.
n is 0.
All right, what was my condition?
>> 20-- line 20 is, well,
0 is greater than 0.
That is not true.
0 is not greater than 0.
And so I broke out of this.
>> And so that's why on line
21, if I actually continue,
I'm going to return 0, even
though I should have rejected 0
as not actually being negative.
So now, I don't really even
care about the debugger.
Got it, I don't need to
know what more is going on.
>> So I'm going to go ahead and
just click the Play button,
and let this finish up.
Now, I've realized that my
bug is apparently on line 20.
That's my logical error.
>> And so what do I want
to do to change this?
If the problem is that I'm not
catching 0, it's just a logical error.
And I can say while n is
greater than or equal to 0,
keep prompting the user again and again.
>> So, again, simple mistake, perhaps
even obvious when you saw me
write it just a few minutes ago.
But the takeaway here
is that with debug50,
and with debugging
software more generally,
you have this newfound power to
walk through your own code and see,
via that right-hand panel, what
your variables' values are.
So you don't necessarily
have to use something
like eprintf to print those values.
You can actually see them
visually on the screen.
>> Now, beyond this, it's worth noting
that there's another technique that's
actually super common.
And you might wonder why this little
guy here has been sitting on the stage.
So there's this technique, generally
known as rubber duck debugging,
which really is just a
testament to the fact
that often when programmers
are writing code,
they're not necessarily
collaborating with others,
or working in a shared environment.
>> They're sort of at home.
Maybe it's late at night.
They're trying to figure
out some bug in their code.
And they're just not seeing it.
>> And there's no roommate.
There is no TF.
There is no CA around.
All they have on their shelf
is this little rubber ducky.
>> And so rubber duck debugging
is just this invitation
to think of something as silly
as this as a real creature,
and actually walk through your code
verbally to this inanimate object.
So, for instance, if
this is my example here--
and recall that earlier
the problem was this,
if I delete this first line of code,
and I go ahead and make buggy0 again,
recall that I had these
error messages here.
So the idea here, ridiculous though I
feel at the moment doing this publicly,
is that error.
>> OK, so my problem is that I've
implicitly declared a library function.
And that library function is printf.
Declare-- OK, declare
reminds me of prototypes.
>> That means I need to actually
tell the compiler in advance what
the function looks like.
Wait a minute.
I didn't have stdio.h.
Thank you very much.
>> So just this process of-- you
don't need to actually have a duck.
But this idea of walking
yourself through your own code
so that you even hear
yourself, so that you
realize omissions in your own
remarks, is generally the idea.
>> And, perhaps more logically, not so
much with that one but the more involved
example we just did in buggy3.c,
you might walk yourself through it
as follows.
So all right, rubber
ducky, DDB, if you will.
Here we have in my main function,
I'm calling get_negative_int.
>> And I am getting the return value.
I'm storing it on the left hand side
on line 8 in a variable called i.
OK, but wait, how did
that get that value?
Let me look at the function in line 12.
>> In line 12, we have get_negative_int.
Doesn't take any inputs,
does return an int, OK.
I declare on line 14 a variable n.
It's going to store an integer.
That's what I want.
>> So do the following while n is-- let
me undo the fix I already made.
So while n is greater than
0, print out n is, OK.
And then call get_int, store it in n.
And then check if n is 0,
n is not-- there it is.
So, again, you don't
need the actual duck.
But just walking yourself through
your code as an intellectual exercise
will often help you
realize what's going on,
as opposed to just doing something
like this, staring at the screen,
and not talking yourself through
it, which honestly is not
nearly as effective a technique.
So there you have it, a
number of different techniques
for actually debugging your code
and finding fault, all of which
should be tools in your toolkit
so that you're not, late at night
especially, in the dining
halls, or at office hours,
banging your head against the
wall, trying to solve some problem.
Realize that there are software tools.
There are rubber duck tools.
And there's a whole staff of
support waiting to lend a hand.
>> So now, a word on the problem
sets, and on what we're hoping you
get out of them, and how
we go about evaluating.
Per the course's syllabus,
CS50's problem sets
are evaluated on four primary axes, so
to speak-- scope, correctness, design,
and style.
And scope just refers to how much
of the problem set have you bitten off?
How much of a problem have you tried?
What level of effort
have you manifested?
>> Correctness is, does the program work as
it's supposed to per CS50's specification
when you provide certain inputs
and get certain outputs back?
Design is the most subjective of them.
And it's the one that will
take the longest to learn
and the longest to teach, in
so far as it boils down to,
how well written is your code?
>> It's one thing to just print the correct
outputs or return the right values.
But are you doing it as
efficiently as possible?
Are you doing it with divide
and conquer, or binary
search, as we did two weeks ago
with the phone book and as we'll soon see again?
Are there better ways to solve the
problem than you currently have here?
That's an opportunity for better design.
>> And then style-- how
pretty is your code?
You'll notice that I'm pretty
particular about indenting my code,
and making sure my variables
are reasonably named. n,
while short, is a good name for a
number, i for a counting integer,
s for a string.
And we can have longer
variable names too.
Style is just how good
does your code look?
And how readable is it?
>> And over time, what your TAs
and TFs will do in the course
is provide you with that
kind of qualitative feedback
so that you get better
at those various aspects.
And in terms of how we
evaluate each of these axes,
it's typically with very few
buckets so that you, generally,
get a sense of how well you're doing.
And, indeed, if you receive a score on
any of those axes-- correctness, design
and style especially-- that number
will generally be between 1 and 5.
And, literally, if you're getting
3's at the start of the semester,
this is a very good thing.
It means there's still
room for improvement,
which you would hope for in
taking a class for the first time.
There's hopefully some bit of ceiling
to which you're aspiring.
And so getting 3's on
the earliest problem sets,
if not some 2's and 4's,
is, indeed, a good thing.
It's well within range,
well within expectations.
>> And if your mind is racing, wait
a minute, three out of five.
That's really a 6 out of 10.
That's 60%.
My God, that's an F.
>> It's not.
It's not, in fact, that.
Rather, it's an opportunity to improve
over the course of the semester.
And if you're getting some
poors, these are an opportunity
to take advantage of office hours,
certainly sections and other resources.
>> Best is an opportunity, really,
to be proud of just how far you've
come over the course of the semester.
So do realize, if nothing
else, three is good.
And it allows room for growth over time.
>> As to how those axes are
weighted, realistically you're
going to spend most of your time getting
things to work, let alone correctly.
And so correctness tends to
be weighted the most, as with
this multiplicative factor of three.
Design is also important, but
something that you don't necessarily
spend all of those hours on
trying to get things just to work.
>> And so it's weighted
a little more lightly.
And then style is weighted the least.
Even though it's no less
important fundamentally,
it's just, perhaps, the
easiest thing to do right,
mimicking the examples we
do in lecture and section,
with things nicely
indented, and commented,
and so forth is among the easiest
things to do and get right.
So as such, realize
that those are points
that are relatively easy to grasp.
>> And now a word on
this-- academic honesty.
So per the course's
syllabus, you will see
that the course has quite a
bit of language around this.
And the course takes the issue of
academic honesty quite seriously.
>> We have the distinction,
for better or for worse,
of having sent each year more
students for disciplinary action
than most any other
course, that I am aware of.
This is not necessarily
indicative of the fact
that CS students, or CS50 students, are
any less honest than your classmates.
But the reality is that in this
world, electronically, we just
have technological
means of detecting this.
>> It is important to us for
fairness across the class
that we do detect this, and raise
the issue when we see things.
And just to paint a picture, and really
to help something like this sink in,
these are the numbers of
students over the past 10 years
that have been involved in some
such issues of academic honesty,
with some 32 students
from fall 2015, which
is to say that we do take
the matter very seriously.
And, ultimately, these numbers compose,
most recently, about 3%, 4% or so
of the class.
>> So for the super majority of students
it seems that the lines are clear.
But do keep this in
mind, particularly late
at night when struggling with
some solution to a problem set,
that there are mechanisms
for getting yourself better
support than you might
think, even at that hour.
Realize that when we receive
student submissions, we cross
compare every submission this year
against every submission last year,
against every submission from 2007,
and since, looking at, as well,
code repositories online,
discussion forums, job sites.
And we mention this,
really, all for the sake
of full disclosure, that if
someone else can find it online,
certainly, so can we the course.
But, really, the spirit
of the course boils down
to this clause in the syllabus.
It really is just, be reasonable.
>> And if we had to elaborate on that
with just a bit more language,
realize that the essence of all
work that you submit to this course
must be your own.
But within that, there are certainly
opportunities, and encouragement,
and pedagogical value in turning to
others-- myself, the TFs, the CAs,
the TAs, and others in the class,
for support, let alone friends
and roommates who have studied
CS and programming before.
And so there is an allowance for that.
And the general rule of thumb
is this-- when asking for help,
you may show your code to others,
but you may not view theirs.
So even if you're at office hours,
or in the D hall, or somewhere else
working on some problem set,
working alongside a friend, which
is totally fine, at the
end of the day your work
should ultimately belong to each
of you respectively, and not
be some collaborative effort,
except for the final project where
it's allowed and encouraged.
>> Realize that if you are
struggling with something
and your friend just happens
to be better at this than you,
or better at that problem than you,
or a little farther ahead than you,
it's totally reasonable to turn
to your friend and say, hey,
do you mind looking at my code here,
helping me spot what my issue is?
And, hopefully, in the
interest of pedagogical value
that friend doesn't just
say, oh, do this, but rather,
what are you missing on line
6, or something like that?
But the solution is not
for the friend next to you
to say, oh, well, here, let me pull
this up, and show my solution to you.
So that is the line.
You show your code to
others, but you may not
view theirs, subject to the other
constraints in the course's syllabus.
>> So do keep in mind this
so-called regret clause
in the course's syllabus as well,
that if you commit some act that
is not reasonable, but bring it to
the attention of the course's heads
within 72 hours, the course
may impose local sanctions that
may include an unsatisfactory or
failing grade for the work submitted.
But the course will not refer the
matter for further disciplinary action,
except in cases of repeated acts.
In other words, if you do make some
stupid, especially late night, decision
that the next morning or two days
later, you wake up and realize,
what was I thinking?
You do in CS50 have an outlet
for fixing that problem
and owning up to it, so that we
will meet you halfway and deal
with it in a manner that is both
educational and valuable for you,
but still punitive in some way.
And now, to take the edge off, this.
>> [VIDEO PLAYBACK]
>> [MUSIC PLAYING]
>> [END PLAYBACK]
DAVID J. MALAN: All right, we are back.
And now we look at one of the
first of our real world domains
in CS50, the art of cryptography,
the art of sending and receiving
secret messages, encrypted
messages if you will,
that can only be deciphered if you have
some key ingredient that the sender has
as well.
So to motivate this we'll take
a look at this thing here,
which is an example of a
secret decoder ring that
can be used in order to figure out
what a secret message actually is.
In fact, back in the
day in grade school,
if you ever sent secret messages to
some friend or some crush in class,
you might have thought
you were being clever
by on your piece of paper changing,
like, A to B, and B to C, and C to D,
and so forth.
But you were actually encrypting
your information, even
if it was a little trivial and it wasn't
that hard for the teacher to realize,
well, if you just change
B to A and C to B,
you actually figure out
what the message was.
Still, you were enciphering information.
>> You were just doing it
simply, much like Ralphie here
in a famous movie that plays
pretty much ad nauseam each winter.
[VIDEO PLAYBACK]
-Be it known to all that
Ralph Parker is hereby
appointed a member of the Little
Orphan Annie Secret Circle
and is entitled to all the honors
and benefits occurring thereto.
>> -Signed, Little Orphan Annie,
counter-signed Pierre Andre, in ink.
Honors and benefits,
already at the age of nine.
>> [SHOUTING]
-Come on.
Let's get on with it.
I don't need all that jazz
about smugglers and pirates.
>> -Listen tomorrow night for
the concluding adventure
of the black pirate ship.
Now, it's time for
Annie's secret message
for you members of the Secret Circle.
Remember, kids, only members
of Annie's Secret Circle
can decode Annie's secret message.
>> Remember, Annie is depending on you.
Set your pins to B2.
Here is the message.
12, 11--
>> -I am in, my first secret meeting.
>> -14, 11, 18, 16.
>> -Pierre was in great voice tonight.
I could tell that tonight's
message was really important.
>> -3, 25, that's a message
from Annie herself.
Remember, don't tell anyone.
>> -90 seconds later, I'm in the only
room in the house where a boy of nine
could sit in privacy and decode.
Aha, B!
I went to the next, E.
>> The first word is be.
S, it was coming easier now, U, 25--
>> -Oh, come on, Ralphie, I gotta go!
>> -I'll be right down, Ma!
Gee whiz!
>> -T, O, be sure to-- be sure to what?
What was Little Orphan
Annie trying to say?
Be sure to what?
>> -Ralphie, Randy has got to
go, will you please come out?
>> -All right, Ma!
I'll be right out!
>> -I was getting closer now.
The tension was terrible.
What was it?
The fate of the planet
may hang in the balance.
>> -Ralphie!
Randy's gotta go!
>> -I'll be right out, for crying out loud!
>> -Almost there, my fingers flew, my mind
was a steel trap, every pore vibrated.
It was almost clear, yes, yes, yes.
>> -Be sure to drink your Ovaltine.
Ovaltine?
A crummy commercial?
Son of a bitch.
[END PLAYBACK]
DAVID J. MALAN: OK, so
that was a very long way
of introducing cryptography,
and also Ovaltine.
In fact, from this old advert
here, why is Ovaltine so good?
It is a concentrated extraction of ripe
barley malt, pure creamy cow's milk,
and specially prepared cocoa, together
with natural phosphatides and vitamins.
It is further fortified with
additional vitamins B and D, yum.
And you can still get it, apparently,
on Amazon, as we did here.
>> But the motivation here was to
introduce cryptography, specifically
a type of cryptography known
as secret key cryptography.
And as the name suggests, the whole
security of a secret key crypto system,
if you will, a methodology
for just scrambling
information between two people, is that
only the sender and only the recipient
know a secret key-- some value, some
secret phrase, some secret number, that
allows them to both encrypt
and decrypt information.
And cryptography, really,
is just this from week 0.
>> It's a problem where there's inputs,
like the actual message in English
or whatever language that you
want to send to someone in class,
or across the internet.
There is some output, which is going
to be the scrambled message that you
want the recipient to receive.
And even if someone in the
middle receives it too,
you don't want them to
necessarily be able to decrypt it,
because inside of this
black box, or algorithm,
is some mechanism, some step by step
instructions, for taking that input
and converting it into the
output, in hopefully a secure way.
>> And, in fact, there is some
vocabulary in this world as follows.
Plaintext is the word a
computer scientist would
use to describe the input
message, like the English
or whatever language you actually
want to send to some other human.
And then the ciphertext is the scrambled,
the enciphered, or encrypted,
version thereof.
>> But there's one other ingredient here.
There's one other input to
secret key cryptography.
And that is the key itself,
which is, generally,
as we'll see, a number, or
letter, or word, whatever
the algorithm actually expects.
>> And how do you decrypt information?
How do you unscramble it?
Well, you just reverse the
outputs and the inputs.
>> In other words, once someone
receives your encrypted message,
he or she simply has
to know that same key.
They have received the ciphertext.
And by plugging those two
inputs into the crypto system,
the algorithm, this black box, out
should come the original plaintext.
And so that's the very high level
view of what cryptography is actually
all about.
>> So let's get there.
Let's now look underneath
the hood of something
we've been taking for granted for
the past week, and for this session
here-- the string.
A string at the end of the day
is just a sequence of characters.
>> It might be hello world, or
hello Zamyla, or whatever.
But what does that mean to
be a sequence of characters?
In fact, the CS50 library gives
us a data type called string.
>> But there is actually no
such thing as a string in C.
It really is just a sequence of
character, character, character,
character, back, to back, to
back, to back, to back inside
of your computer's memory, or RAM.
And we'll look deeper into that in the
future when we look at memory itself,
and the utilization, and the
threats that are involved.
>> But let's consider the string Zamyla.
So just the name of
the human here, Zamyla,
that is a sequence of
characters, Z-A-M-Y-L-A.
And now let's suppose that Zamyla's name
is being stored inside of a computer
program.
>> Well, it stands to reason that we should
be able to look at those characters
individually.
So I'm just going to draw a little
box around Zamyla's name here.
And it is the case in C that when you
have a string, like Zamyla-- and maybe
that string has come back from
a function like get string,
you can actually manipulate
it character by character.
>> Now, this is germane for the
conversation at hand, because
in cryptography if you want to change
A to B, and B to C, and C to D,
and so forth, you need to be able
to look at the individual characters
in a string.
You need to be able to change
the Z to something else, the A
to something else, the M to
something else, and so on.
And so we need a way,
programmatically, so
to speak, in C to be able to change
and look at individual letters.
And we can do this as follows.
>> Let me go ahead back into CS50 IDE.
And let me go ahead
and create a new file
that I'll call this time string0,
as our first such example, dot c.
And I'm going to go ahead
and whip it up as follows.
>> So include cs50.h, and
then include standard io.h,
which I'm almost always going to
be using in my programs, at least
initially.
int main void, and then in here I'm
going to do string s gets get string.
And then I'm going to
go ahead and do this.
I want to go ahead
and, as a sanity check,
just say, hello, percent s,
semi-colon. Make string0.
Uh oh, what did I do here?
Oh, I didn't plug it in.
So lesson learned, that
was not intentional.
>> So error, more percent
conversions than data arguments.
And this is where, in
line 7-- OK, so I have,
quote unquote, that's
my string to printf.
I've got a percent sign.
But I'm missing the second argument.
>> I'm missing the comma s, which
I did have in previous examples.
So a good opportunity to fix
one more mistake, accidentally.
And now let me run
string0, type in Zamyla.
OK, hello Zamyla.
>> So we've run this kind of program
a few different times now.
But let's do something a
little different this time.
Instead of just printing Zamyla's
whole name out with printf,
let's do it character by character.
>> I'm going to use a for loop.
And I'm going to give myself
a counting variable, called i.
And I'm going to keep iterating, so
long as i is less than the length of s.
>> It turns out, we didn't
do this last time,
that C comes with a
function called strlen.
Back in the day, and in general
still when implementing functions,
humans will often choose very
succinct names that kind of sound
like what you want, even though they're
missing a few vowels or letters.
So strlen is the
name of a function that
takes an argument between
parentheses that should be a string.
And it just returns an integer,
the length of that string.
>> So this for loop on line 7 is going
to start counting at i equals 0.
It's going to increment
i on each iteration
by 1, as we've been doing a few times.
But it's going to only do
this up until the point
when i is the length
of the string itself.
>> So this is a way of, ultimately,
iterating over the characters
in the string as follows.
I'm going to print out not a
whole string, but percent c,
a single character
followed by a new line.
And then I'm going to
go ahead, and I need
to say I want to print
the ith character of s.
>> So if i is the variable that indicates
the index of the string, where
you are in it, I need to be able to
say, give me the ith character of s.
And C has a way of doing
this with square brackets.
You simply say the name of the
string, which in this case is s.
Then you use square brackets, which are
usually just above your Return or Enter
key on the keyboard.
And then you put the index of the
character that you want to print.
So the index is going to be a
number-- 0, or 1, or 2, or 3, or dot,
dot, dot, some other number.
>> And we ensure that it's going to
be the right number, because I
start counting at 0.
And by convention, the first character
in a string is at bracket 0.
And the second character is bracket 1.
And the third character is bracket 2.
And you don't want to go too
far, but we won't because we're
going to only increment i until it
equals the length of the string.
And at which point,
this for loop will stop.
>> So let me go ahead and save this
program, and run make string0.
But I screwed up.
Implicitly declaring library function
strlen with type such and such-- now,
this sounds familiar.
But it's not printf.
And it's not get string.
>> I didn't screw up in
the same way this time.
But notice down here, a little
further down, include the header string.h
or explicitly provide a
declaration for strlen.
So there is actually a clue in there.
>> And indeed it turns out
there's another header file
that we've not used
in class yet, but it's
among those available
to you, called string.h.
And in that file, string.h,
is strlen declared.
So let me go ahead and
save this, make
string0-- nice, no error messages this time.
>> ./string0 Zamyla, and
I'm about to hit Enter,
at which point get string is going
to return the string, put it in s.
Then that for loop is going to iterate
over s's characters one at a time,
and print them one per line, because
I had that backslash n at the end.
So I could omit that backslash
n, and then just print Zamyla all
in the same line,
effectively reimplementing
printf, which isn't all that useful.
But in this case, I've not done that.
I've actually printed one
character at a time, one per line,
so that we actually see the effect.
>> But I should note one thing here.
And we'll come back to
this in a future week.
It turns out that this
code is potentially buggy.
>> It turns out that get string
and some other functions in life
don't necessarily always
return what you're expecting.
We know from class last
time that get
string is supposed to return a string.
But what if the user types out such
a long word, or paragraph, or essay
that there's just not enough
memory in the computer to fit it.
>> Like, what if something goes
wrong underneath the hood?
It might not happen often,
but it could happen once
in a while, very infrequently.
And so it turns out that get string
and functions like it don't necessarily
always return strings.
They might return some error value,
some sentinel value so to speak,
that indicates that
something has gone wrong.
And you would only know this from
having learned it in class now,
or having read some more documentation.
It turns out that get string
can return a value called null.
Null is a special value that we'll
come back to in a future week.
But for now, just know that if I want
to be really proper in moving forward
using get string, I
shouldn't just call it,
and blindly use its return value,
trusting that it's a string.
>> I should first say,
hey, wait a minute, only
proceed if s does not equal
null, where null, again,
is just some special value.
And it's the only special value you
need to worry about for get string.
Get string is either going
to return a string or null.
>> As for this exclamation point and equals sign,
you might know from math class
that you can draw an equal sign with
a line through it to indicate not equal.
That's not generally a character
you can type on your keyboard.
And so in most programming languages,
when you want to say not equal,
you use an exclamation point,
otherwise known as bang.
So you say bang equals, which
means not equals, logically.
It's just like there's not a greater
than or equal to, or less than
or equal to, key on your keyboard
that does it all in one symbol.
So that's why, in past examples,
you did an angle bracket, and then
an equal sign, in order to do
greater than or equal to or, say, less than or equal to.
>> So what's the takeaway here?
This is simply a way now of
introducing this syntax, this feature,
iterating over individual
characters in a string.
And just like those square
brackets allow you to get at them,
consider those square brackets as
kind of hinting at this underlying
design, whereby every
character inside of a string
is kind of boxed in somewhere underneath
the hood in your computer's memory.
>> But let's make a variant of this.
It turns out that this
program is correct.
So per CS50's axes for evaluating
code, this is correct now.
Especially now that I'm checking for
null, this program should never crash.
And I just know that from experience.
There's nothing else that
can really go wrong here.
But it's not very well-designed,
because let's go back to basics.
>> First principles--
what does a for loop do?
A for loop does three things.
It initializes some
value, if you ask it to.
It checks a condition.
And then after each
iteration, after each cycle,
it increments some
value, or values, here.
>> So what does that mean?
We initialize i to 0.
We check and make sure i is less than
the length of s, which is Z-A-M-Y-L-A,
so which is less than 6.
And, indeed, 0 is less than 6.
>> We print out Z from Zamyla's name.
Then we increment i from 0 to 1.
We then check, is 1 less
than the length of s?
The length of s is 6.
Yes, it is.
>> So we print a in Zamyla's name, ZA.
We increment i from 0, to 1, to 2.
We then check, is 2 less than
the length of Zamyla's name?
It's 6, so 2 is less than 6.
Yes, let's print out now M in
Zamyla's name, the third character.
>> The key here is that on each
iteration of the story, I'm checking,
is i less than the length of Zamyla?
But the catch is that
strlen is not a property.
Those of you who have programmed
before in Java or other languages
might know the length of a string is
a property, just some read only value.
>> In C, in this case, it is
a function that is literally
counting the number of
characters in Zamyla every time
we call that function.
Every time you ask the computer to use
strlen, it's taking a look at Zamyla,
and saying Z-A-M-Y-L-A, 6.
And it returns 6.
The next time you call
it inside that for loop,
it's going to look at Zamyla
again, say Z-A-M-Y-L-A, 6.
And it's going to return 6.
So what's stupid about this design?
>> Why is my code not a 5 out of 5
for design right now, so to speak?
Well, I'm asking a
question unnecessarily.
I'm doing more work than I need to.
>> So even though the
answer is correct, I am
asking the computer, what is
the length of Zamyla again,
and again, and again, and again?
And that answer is
never going to change.
It's always going to be 6.
>> So a better solution than this
would be this next version.
Let me go ahead and put it in a
separate file called string1.c,
just to keep it separate.
And it turns out in a for
loop, you can actually
declare multiple variables at once.
>> So I'm going to keep i and set it to 0.
But I'm also going to
add a comma, and say,
give me a variable called n, whose
value equals the string length of s.
And now, please make my condition
so long as i is less than n.
>> So in this way, the logic is
identical at the end of the day.
But I am remembering the
value 6, in this case.
What is the length of Zamyla's name?
And I'm putting it in n.
>> And I'm still checking
the condition every time.
Is 0 less than 6?
Is 1 less than 6?
Is 2 less than 6, and so forth?
>> But I'm not asking the computer
again, and again, what's
the length of Zamyla's name?
What's the length of Zamyla's name?
What's the length of Zamyla's name?
I'm literally remembering that first and
only answer in this second variable n.
So this now would be not only
correct, but also well-designed.
>> Now, what about style?
I've named my variables
pretty well, I would say.
They're super succinct right now.
And that's totally fine.
>> If you only have one
string in a program,
you might as well call it s for string.
If you only have one variable
for counting in a program,
you might as well call it i.
If you have a length, n
is super common as well.
But I haven't commented any of my code.
>> I've not informed the reader--
whether that's my TF, or TA,
or just colleague-- what is supposed
to be going on in this program.
And so to get good style,
what I would want to do
is this-- something
like ask user for input.
And I could rewrite
this any number of ways.
>> Make sure s-- make sure get
string returned a string.
And then in here-- and this is perhaps
the most important comment-- iterate
over the characters in s one at a time.
And I could use any
choice of English language
here to describe each
of these chunks of code.
>> Notice that I haven't put a
comment on every line of code,
really just on the interesting
ones, the ones that
have some meaning that I might
want to make super clear to someone
reading my code.
And why describe the call to get
string as ask user for input?
Even that one is not necessarily
all that descriptive.
But it helps tell a story, because the
second line in the story is, make sure
get string returned a string.
>> And the third line in the story is,
iterate over the characters in s one
at a time.
And now just for good measure,
I'm going to go ahead and add
one more comment that just
says print i-th character in s.
Now, what have I done
at the end of the day?
>> I have added some English
words in the form of comments.
The slash slash symbol means, hey,
computer this is for the human,
not for you, the computer.
So they're ignored logically.
They're just there.
>> And, indeed, CS50 IDE shows them as
gray, as being useful, but not key
to the program.
Notice what you can now do.
Whether you know C
programming or not, you
can just stand back at this
program, and skim the comments.
Ask user for input, make sure
get string returned a string,
iterate over the characters in s
one at a time, print the
i-th character in s-- you don't
even have to look at the code
to understand what this program does.
And, better yet, if you yourself look
at this program in a week or two,
or a month, or a year,
you too don't have
to stare at the code,
trying to remember,
what was I trying to do with this code?
>> You've told yourself.
You've described it for yourself,
or some colleague, or TA, or TF.
And so this would now be
correct, and good design,
and ultimately good style as well.
So do keep that in mind.
>> So there's one other
thing I'm going to do here
that can now reveal exactly what's
going on underneath the hood.
So there's this feature
in C, and other languages,
called typecasting
that either implicitly
or explicitly allows you to convert
from one data type to another.
We've been dealing so
far today with strings.
>> And strings are characters.
But recall from week
0, what are characters?
Characters are just an abstraction
on top of numbers-- decimal numbers,
and decimal numbers are really just an
abstraction on top of binary numbers,
as we defined it.
>> So characters are numbers.
And numbers are characters,
just depending on the context.
And it turns out that inside
of a computer program,
you can specify how you want to look
at the bits inside of that program.
>> Recall from week 0 that we had
ASCII, which is just this code
mapping letters to numbers.
And we said, capital A is 65.
Capital B is 66, and so forth.
>> And notice, we essentially have chars on
the top row here, as C would call them,
characters, and then
ints on the second row.
And it turns out you can convert
seamlessly between the two, typically.
And if we want to do
this deliberately, we
might want to tackle
something like this.
>> We might want to convert
upper case to lower
case, or lower case to upper case.
And it turns out there's
actually a pattern here
we can embrace in just a moment.
But let's look first at an
example of doing this explicitly.
>> I'm going to go back into CS50 IDE.
I'm going to create a
file called ascii0.c.
And I'm going to go ahead and add my
standard io.h at the top, int main void
at the top of my function.
And then I'm just going to do the
following-- a for loop from i equals,
let's say, 65.
>> And then i is going to be less than
65, plus 26 letters in the alphabet.
So I'll let the computer
do the math for me there.
And then inside this loop,
what am I going to print?
>> %c is %i backslash n.
And now I want to plug in two values.
I've temporarily put question
marks there to invite the question.
>> I want to iterate from 65 onward
for 26 letters of the alphabet,
printing out on each iteration that
character's integral equivalent.
In other words, I want to
iterate over 26 numbers printing
what the ASCII character is, the letter,
and what the corresponding number is--
really just recreating
the chart from that slide.
So what should these question marks be?
>> Well, it turns out that the second
one should just be the variable i.
I want to see that as a number.
And the middle argument
here, I can tell the computer
to treat that integer
i as a character, so as
to substitute it here for %c.
>> In other words, I, the
human programmer, know
these are just numbers
at the end of the day.
And I know that 65 should
map to some character.
With this explicit cast,
with a parenthesis,
the name of the data type you want to
convert to, and a closed parenthesis,
you can tell the
computer, hey, computer,
convert this integer to a char.
>> So when I run this
program after compiling,
let's see what I get-- make ascii0.
Darn it, what did I do wrong here?
Use of undeclared identifier,
all right, not intentional,
but let's see if we can't
reason through this.
>> So line five-- so I didn't get
very far before screwing up.
That's OK.
So line 5 for i equals 65-- I see.
So remember that in C, unlike some
languages if you have prior programming
experience, you have
to tell the computer,
unlike Scratch, what
type of variable it is.
>> And I forgot a key phrase here.
In line five, I've started using i.
But I haven't told C
what data type it is.
So I'm going to go in here and
say, ah, make it an integer.
>> Now I'm going to go ahead and recompile.
That fixed that.
./ascii0 Enter, that's kind of cool.
Not only is it super fast to
ask the computer this question,
rather than looking it up on a slide,
it printed out one per line, A is 65,
B is 66, all the way down-- since I
did this 26 times-- to the letter Z,
which is 90.
And, in fact, slightly
more intelligent would
have been for me not to rely
on the computer to add 26.
I could have just done
90 as well, so long
as I don't make the same mistake twice.
I want to go up through
Z, not just up through Y.
>> So that's an explicit cast.
It turns out that this
isn't even necessary.
Let me go ahead and rerun the
compiler, and rerun ascii0.
It turns out that C is pretty smart.
>> And printf, in particular,
is pretty smart.
If you just pass an i twice
for both placeholders, printf
will realize, oh, well I know you
gave me an integer-- some number,
like 65, or 90, or whatever.
But I see that you want me to
format that number like a character.
And so printf can implicitly cast
the int to a char for you as well.
So that's not a problem at all.
>> But notice, because of this equivalence
we can actually do this as well.
Let me go ahead and make one
other version of this-- ascii1.c.
And instead of iterating over
integers, I can really blow your mind
by iterating over characters.
For char c gets capital A, I
want to go ahead and do this,
so long as c is less than or equal
to capital Z. And on each iteration
I want to increment c. I can
now in my printf line here
say, %c is
%i again, comma c.
>> And now, I can go the other direction,
casting the character explicitly
to an integer.
So, again, why would you do this?
It's a little weird to sort of
count in terms of characters.
>> But if you understand what's
going on underneath the hood,
there's really no magic.
You're just saying, hey, computer give
me a variable called c of type char.
Initialize it to capital A. And
notice single quotes matter.
>> For characters in C, recall from
last week, you use single quotes.
For strings, for words,
phrases, you use double quotes.
OK, computer, keep doing this, so
long as the character is less than
or equal to capital Z.
And I know from my ASCII table that all
of these ASCII codes are contiguous.
>> There's no gaps.
So it's just A through Z,
separated by one number each.
And then I can increment
a char, if I really want.
At the end of the day,
it's just a number.
I know this.
So I can just presume to add 1 to it.
>> And then this time, I print c,
and then the integral equivalent.
And I don't even need the explicit cast.
I can let printf and the
computer figure things out,
so that now if I run
make ascii1, ./ascii1,
I get the exact same thing as well.
>> Useless program, though-- no one
is going to actually write software
in order to figure out, what was the
number that maps to A, or B, or Z?
You're just going to Google it, or
look it up online, or look it up
on a slide, or the like.
So where does this actually get useful?
>> Well, speaking of that
slide, notice there's
an actual pattern here between uppercase
and lowercase that was not accidental.
Notice that capital A is 65.
Lowercase a is 97.
And how far away is lower case a?
>> So 65 is how many steps away from 97?
So 97 minus 65 is 32.
So capital A is 65.
If you add 32 to that,
you get lowercase a.
And, equivalently, if you subtract 32,
you get back to capital A-- same with B
to little b, big C to little c.
>> All of these gaps are 32 apart.
Now, this would seem to allow us to
do something like Microsoft Word,
or Google Docs feature, where you
can select everything and then say,
change all to lowercase, or
change all to upper case,
or change only the first word
of a sentence to upper case.
We can actually do something
like that ourselves.
>> Let me go ahead and save a file
here called capitalize0.c.
And let's go ahead and whip up a program
that does exactly that as follows.
So include the CS50 library.
And include standard I/O.
>> And I know this is coming soon.
So I'm going to put it in
there already, string.h,
so I have access to
things like strlen,
and then int main(void), as usual.
And then I'm going to go ahead
and do string s gets get string,
just to get a string from the user.
And then I'm going to
do my sanity check.
If s does not equal NULL,
then it's safe to proceed.
And what do I want to do?
I'm going to iterate from i equals 0,
and n equals the string length of s.
>> And I'm going to do this so long as
i is less than n, and i plus plus.
So far, I'm really just
borrowing ideas from before.
And now I'm going to introduce a branch.
>> So think back to Scratch, where
we had those forks in the road,
and last week in C. I'm going to
say this, if the i-th character in s
is greater than or
equal to lower case a,
and-- in Scratch you would literally
say and, but in C you say ampersand,
ampersand-- and the i-th character in s
is less than or equal to lower case z,
let's do something interesting.
Let's actually print out a
character with no newline
that is the character in the string,
the i-th character in the string.
>> But let's go ahead and
subtract 32 from it.
Else if the character in the
string that we're looking at
is not between little a
and little z, go ahead
and just print it out unchanged.
So we've introduced
this bracketed notation
for our strings to get at the
i-th character in the string.
>> I've added some conditional logic, like
in Scratch and in last week's Week 1, where
I'm just using my fundamental
understanding of what's
going on underneath the hood.
Is the i-th character of s
greater than or equal to a?
Like, is it 97, or 98,
or 99, and so forth?
>> But is it also less than or equal
to the value of lowercase z?
And if so, what does this line mean?
Line 14, this is sort of the
germ of the whole idea,
capitalize the letter by
simply subtracting 32 from it,
in this case, because I know, per that
chart, how my numbers are represented.
So let's go ahead and run this,
after compiling capitalize0.c,
and run capitalize0.
>> Let's type in something like
zamyla in all lowercase, Enter.
And now we have ZAMYLA in all uppercase.
Let's type in Rob in all lowercase.
Let's try Jason in all lowercase.
And we keep getting the
forced capitalization.
There's a minor bug that I
kind of didn't anticipate.
Notice my new prompt is ending up
on the same line as their names,
which feels a little messy.
>> So I'm going to go here, and
actually at the end of this program
print out a newline character.
That's all.
With printf, you don't need to
pass in variables or format codes.
You can literally just print
something like a newline.
>> So let's go ahead and make
capitalize0 again, rerun it, Zamyla.
And now it's a little prettier.
Now, my prompt is on its own new line.
So that's all fine and good.
So that's a good example.
But I don't even necessarily
need to hard code the 32.
You know what?
I could say-- I don't ever
remember what the difference is.
>> But I know that if I
have a lower case letter,
I essentially want to subtract off
whatever the distance is between little
a and big A, because if I assume that
all of the other letters are the same,
that should get the job done.
But rather than do that, you know what?
There's another way still.
>> If that's capitalize1.c-- if I were
to put that into a separate file--
let's do capitalize2.c as follows.
I'm going to really clean this up here.
And instead of even having to
know or care about those low level
implementation details, I'm instead
just going to print a character,
quote unquote, %c, and
then call another function that
exists that takes an argument,
which is a character, like this.
>> It turns out in C, there's
another function called
toupper, which as its name
suggests takes a character
and converts it to its uppercase
equivalent, and then returns it
so that printf can plug it in there.
And so to do this, though, I
need to introduce one other file.
It turns out there's another file
that you would only know from class,
or a textbook, or an online
reference, called ctype.h.
>> So if I add that among my header
files, and now re-compile this program,
capitalize2, ./capitalize2 Enter.
Let's type in Zamyla in all
lowercase, still works the same.
But you know what?
It turns out that toupper
has some other functionality.
>> And let me introduce this
command here, sort of awkwardly
named, but man for manual.
It turns out that most Linux computers,
as we are using here-- Linux operating
system-- have a command
called man, which says,
hey, computer, give me
the computer's manual.
What do you want to
look up in that manual?
>> I want to look up the function
called toupper, Enter.
And it's a little cryptic
to read sometimes.
But notice we're in the
Linux programmer's manual.
And it's all text.
And notice that there's the
name of the function up here.
It turns out it has a cousin called
tolower, which does the opposite.
And notice under synopsis, to use this
function the man page, so to speak,
is telling me that I
need to include ctype.h.
And I knew that from practice.
>> Here, it's showing me the two
prototypes for the function,
so that if I ever want to use this
I know what they take as input,
and what they return as output.
And then if I read
the description, I see
in more detail what the function does.
But more importantly, if
I look under return value,
it says the value returned is
that of the converted letter,
or c, the original input, if
the conversion was not possible.
>> In other words, toupper will try
to convert a letter to upper case.
And if it can, it's going to return it.
But if it can't for some reason--
maybe it's already upper case,
maybe it's an exclamation point
or some other punctuation--
it's just going to
return the original c,
which means I can make my code
better designed as follows.
>> I don't need all of
these darn lines of code.
All of the lines I've
just highlighted can
be collapsed into just one simple
line, which is this-- printf, %c,
toupper of s bracket i.
And this would be an
example of better design.
>> Why implement in 7 or 8 lines
of code, whatever it was I just
deleted, when you can instead collapse
all of that logic and decision making
into one single line, 13 now, that
relies on a library function--
a function that comes with C, and that
does exactly what you want it to do?
And, frankly, even if
it didn't come with C,
you could implement it yourself, as
we've seen, with get negative int
and get positive int last week as well.
>> This code now is much more readable.
And, indeed, if we scroll up,
look how much more compact
this version of my program is.
It's a little top heavy now,
with all these includes.
But that's OK, because now I'm standing
on the shoulders of programmers
before me.
And whoever it was who
implemented toupper really
did me a favor, much like whoever
implemented strlen really
did me a favor some time ago.
And so now we have a
better-designed program
that implements the exact same logic.
>> Speaking of strlen, let
me go ahead and do this.
Let me go ahead and save
this file as strlen.c.
And it turns out, we can peel back
one other layer pretty simply now.
I'm going to go ahead and whip
up another program in main
here that simply re-implements
string length as follows.
So here's a line of code that
gets me a string from the user.
We keep using this again and again.
Let me give myself a variable called
n of type int that stores a number.
>> And let me go ahead and
do the following logic.
While the n-th character in s does
not equal backslash 0, go ahead
and increment n.
And then print it out: printf, %i, n.
I claim that this program here,
without calling string length,
figures out the length of a string.
>> And the magic is entirely
encapsulated in line 8
here with what looks like new syntax,
this backslash 0 in single quotes.
But why is that?
Well, consider what's been
going on all this time.
>> And as an aside before I forget, realize
too, that in addition to the man pages
that come with a typical
Linux system like CS50 IDE,
realize that we, the
course's staff, have also
made a website version
of this same idea called
reference.cs50.net, which has
all of those same man pages,
all of that same
documentation, as well as
a little box at the top that allows
you to convert all of the fairly
arcane language into less comfortable
mode, where we, the teaching staff,
have gone through and tried to simplify
some of the language to keep things
focused on the ideas, and not
some of the technicalities.
So keep in mind, reference.cs50.net
as another resource as well.
>> But why does string length work in
the way I proposed a moment ago?
Here's Zamyla's name again.
And here's Zamyla's name
boxed in, as I keep doing,
to paint a picture of it being,
really, just a sequence of characters.
But Zamyla does not exist
in isolation in a program.
>> When you write and run a program,
you're using your Mac's or your PC's
memory, or RAM so to speak.
And you can think of
your computer as having
lots of gigabytes of memory these days.
And giga means billion,
so billions of bytes.
>> But let's rewind in time.
And suppose that we're using
a really old computer that
only has 32 bytes of memory.
I could, on my computer screen,
simply draw this out as follows.
>> I could simply say that my
computer has all of this memory.
And this is like a stick of memory, if
you recall our picture from last time.
And if I just divide
this up enough times,
I claim that I have 32 bytes
of memory on the screen.
>> Now, in reality, I can only
draw so far on this screen here.
So I'm going to go ahead,
and just by convention,
draw my computer's memory as a
grid, not just as one straight line.
Specifically, I claim now that
this grid, this 8 by 4 grid,
just represents all 32 bytes
of memory available in my Mac,
or available in my PC.
And they're wrapping
on to two lines, just
because it fits more on the screen.
But this is the first byte.
This is the second byte.
This is the third byte.
>> And this is the 32nd byte.
Or, if we think like a computer
scientist, this is byte 0, 1, 2, 3, 31.
So you have 0 to 31, if
you start counting at 0.
>> So if we use a program
that calls get string,
and we get a string from the human
like I did called Zamyla, Z-A-M-Y-L-A,
how in the world does the
computer keep track of which byte,
which chunk of memory,
belongs to which string?
In other words, if we proceed to
type another name into the computer,
like Andi, calling
get string a second time,
A-N-D-I has to end up in the
computer's memory as well.
But how?
>> Well, it turns out that underneath the
hood, what C does when storing strings
that the human types in, or that
come from some other source, is it
delineates the end of them with
a special character-- backslash
0, which is just a special way
of saying eight 0 bits in a row.
>> So a-- this is the number 97, recall.
So some pattern of 8 bits
represents decimal number 97.
This backslash 0 is literally the number
0, a.k.a. nul, N-U-L, unlike earlier,
N-U-L-L, which we talked about.
But for now, just know that this
backslash 0 is just eight 0 bits in a row.
>> And it's just this line in the
sand that says anything to the left
belongs to one string, or one data type.
And anything to the right
belongs to something else.
Andi's name, meanwhile,
which just visually
happens to wrap on to the other line,
but that's just an aesthetic detail,
similarly is nul terminated.
>> It is a string of the characters A-N-D-I,
plus a fifth secret character,
all 0 bits, that just demarcates
the end of Andi's name as well.
And if we call get string a third time
in the computer to get a string like
Maria, M-A-R-I-A, similarly is Maria's
name nul terminated with backslash 0.
>> This is fundamentally different
from how a computer would typically
store an integer, or a float, or other
data types still, because recall,
an integer is usually 32 bits, or
4 bytes, or maybe even 64 bits,
or eight bytes.
But many primitives
in a programming language
have a fixed number of
bytes underneath the hood--
maybe 1, maybe 2, maybe 4, maybe 8.
>> But strings, by design, have a
dynamic number of characters.
You don't know in advance, until
the human types in Z-A-M-Y-L-A,
or M-A-R-I-A, or A-N-D-I. You don't know
how many times the user is going to hit
the keyboard.
Therefore, you don't know how
many characters in advance
you're going to need.
>> And so C just kind of leaves like a
secret breadcrumb underneath the hood
at the end of the string.
After storing Z-A-M-Y-L-A in memory,
it also just puts the equivalent
of a period
at the end of a sentence.
It puts eight 0 bits, so as
to remember where
Zamyla begins and ends.
>> So what's the connection,
then, to this program?
This program here, strlen,
is simply a mechanism
for getting a string
from the user, line 6.
Line 7, I declare a variable
called n and set it equal to 0.
>> And then in line 8, I simply ask the
question, while the n-th character does
not equal all 0 bits--
in other words, does not
equal this special
character, backslash 0, which
was just that special nul character--
go ahead and just increment n.
>> And keep doing it, and keep
doing it, and keep doing it.
And so even though in
the past we've used i,
it's perfectly fine
semantically to use n,
if you're just trying to
count this time deliberately,
and just want to call it n.
So this just keeps asking the question,
is the n-th character of s all 0s?
If not, look to the next,
look to the next, look to the next,
look to the next.
>> But as soon as you see backslash 0,
this loop-- line 9 through 11-- stops.
You break out of the while loop,
leaving inside of that variable n
a total count of all of the
characters in the string you saw,
thereby printing it out.
So let's try this.
>> Let me go ahead and, without
using the strlen function,
but just using my own homegrown version
here called strlen, let me go ahead
and run strlen, type in something
like Zamyla, which I know in advance
is six characters.
Let's see if it works.
Indeed, it's six.
Let's try with Rob, three characters,
three characters as well, and so forth.
So that's all that's going
on underneath the hood.
And notice the connections,
then, with the first week
of class, where we talked about
something like abstraction,
which is just this layering of ideas, or
complexity, on top of basic principles.
Here, we're sort of looking
underneath the hood of strlen,
so to speak, to figure out,
how would it be implemented?
>> And we could re-implement it ourselves.
But we're never again going
to re-implement strlen.
We're just going to
use strlen in order
to actually get some string's length.
>> But there's no magic
underneath the hood.
If you know that underneath
the hood, a string
is just a sequence of characters,
and that sequence of characters
all can be numerically addressed
with bracket 0, bracket
1, bracket 2, and you
know that at the end of a string is a
special character, you can figure out
how to do most anything in a
program, because all it boils down to
is reading and writing memory.
That is, changing and looking
at memory, or moving things
around in memory, printing things
on the screen, and so forth.
>> So let's now use this newfound
understanding of what strings actually
are underneath the hood, and
peel back one other layer
that up until now we've
been ignoring altogether.
In particular, any time
we've implemented a program,
we've had this line of code
near the top declaring main.
And we've specified int main void.
>> And that void inside the parentheses
has been saying all this time that main
itself does not take any arguments.
Any input that main is
going to get from the user
has to come from some other
mechanism, like get int,
or get float, or get string,
or some other function.
But it turns out that
when you write a program,
you can actually specify
that this program shall
take inputs from the human
at the command line itself.
>> In other words, even though we thus far
have been running just ./hello
or similar programs, all of the
other programs that we've been using,
that we ourselves didn't write,
have been taking, it seems,
command line arguments--
things like make.
You say something like make,
and then a second word.
Or clang, you say clang, and then
a second word, the name of a file.
>> Or even rm or cp, as you might
have seen or used already
to remove or copy files.
All of those take so-called
command line arguments--
additional words at the terminal prompt.
But up until now, we
ourselves have not had
this luxury of taking input from the
user when he or she actually runs
the program itself at the command line.
>> But we can do that by re-declaring
main moving forward, not as having
void in parentheses,
but these two arguments
instead-- the first an integer,
and the second something
new, something that we're going to call
an array, something similar in spirit
to what we saw in Scratch as a list, but
an array of strings, as we'll soon see.
But let's see this by
way of example, before we
distinguish exactly what that means.
>> So if I go into CS50 IDE
here, I've gone ahead
and declared in a file called
argv0.c the following template.
And notice the only thing
that's different so far
is that I've changed void to int
argc string argv open bracket, close
bracket.
And notice for now, there's
nothing inside of those brackets.
>> There's no number.
And there's no i, or
n, or any other letter.
I'm just using the
square brackets for now,
for reasons we'll come
back to in just a moment.
>> And now what I'm going to do is this.
If argc equals equals 2--
and recall that equals equals
is the equality operator comparing
the left and right for equality.
It's not the assignment
operator, which is
the single equal sign, which means copy
from the right to the left some value.
>> If argc equals equals 2, I want to
say, printf, hello, percent s, new line,
and then plug in-- and here's the new
trick-- argv bracket 1, for reasons
that we'll come back to in a moment.
Else if argc does not
equal 2, you know what?
Let's just go ahead and, as usual, print
out hello world with no substitution.
>> So it would seem that if argc, which
stands for argument count, equals 2,
I'm going to print out
hello something or other.
Otherwise, by default, I'm
going to print hello world.
So what does this mean?
>> Well, let me go ahead and save
this file, and then do make argv0,
and then ./argv0, Enter.
And it says hello world.
Now, why is that?
>> Well, it turns out anytime you
run a program at the command line,
you are filling in what we'll
generally call an argument vector.
In other words, automatically the
computer, the operating system,
is going to hand to your program
itself a list of all of the words
that the human typed at
the prompt, in case you
the programmer want to do
something with that information.
And in this case, the only word
I've typed at the prompt is ./argv0.
>> And so the number of arguments that is
being passed to my program is just one.
In other words, the argument
count, otherwise known as argc
here as an integer, is just one.
One, of course, does not equal two.
And so this is what prints, hello world.
>> But let me take this somewhere.
Let me say, ./argv0.
And then how about Maria?
And then hit Enter.
>> And notice what magically happens here.
Now, instead of hello world, I have
changed the behavior of this program
by taking the input not from get
string or some other function,
but from, apparently, my command
itself, what I originally typed in.
And I can play this game again by
changing it to Stelios, for instance.
>> And now I see another name still.
And here, I might say Andi.
And I might say Zamyla.
And we can play this game all day long,
just plugging in different values.
So long as I provide exactly
two words at the prompt,
such that argc, my argument count, is 2,
I see that name plugged into
printf, per this condition here.
So we seem to have now
the expressive capability
of taking input from another mechanism,
from the so-called command line,
rather than having to wait
until the user runs the program,
and then prompt him or her
using something like get string.
>> So what is this?
Argc, again, is just an integer,
the number of words-- arguments--
that the user provided at the
prompt, at the terminal window,
including the program's name.
So our ./argv0 is, effectively,
the program's name,
or how I run the program.
>> That counts as a word.
So argc would be 1.
But when I write Stelios, or
Andi, or Zamyla, or Maria,
that means the argument count is two.
And so now there's two words passed in.
>> And notice, we can continue this logic.
If I actually say
something like Zamyla Chan,
a full name, thereby passing
three arguments in total,
now it says the default again,
because, of course, 3 does not equal 2.
>> And so in this way, I have
access, via argv, to this new argument,
which we could technically
call anything we want.
But by convention, it's
argv and argc, respectively.
Argv, argument vector, is kind
of a synonym for a programming
feature in C called an array.
>> An array is a list of similar values
back, to back, to back, to back.
In other words, if one is right here in
RAM, the next one is right next to it,
and right next to it.
They're not all over the place.
And that latter scenario, where things
are all over the place in memory,
can actually be a powerful feature.
But we'll come back to that when we
talk about fancier data structures.
For now, an array is just a
chunk of contiguous memory,
each of whose elements are
back, to back, to back, to back,
and generally the same type.
>> So if you think about, from a
moment ago, what is a string?
Well, a string, like Zamyla,
Z-A-M-Y-L-A, is, technically,
just an array.
It's an array of characters.
>> And so if we really draw this, as I
did earlier, as a chunk of memory,
it turns out that each of these
characters takes up a byte.
And then there's that special
sentinel character, the backslash 0,
or all eight 0 bits, that
demarcates the end of that string.
So a string, it turns
out, quote unquote string,
is just an array of characters--
char being an actual data type.
>> And now argv, meanwhile--
let's go back to the program.
Argv, even though we see the word
string here, is not a string itself.
Argv, argument vector,
is an array of strings.
>> So just as you can have an array of
characters, you can have higher level,
an array of strings-- so, for instance,
when I typed a moment ago ./argv0,
space, Z-A-M-Y-L-A, I claimed that
argv had two strings in it-- ./argv0,
and Z-A-M-Y-L-A. In
other words, argc was 2.
Why is that?
>> Well, effectively, what's going
on is that each of these strings
is, of course, an array of characters
as before, each of whose characters
takes up one byte.
And don't confuse the actual 0
in the program's name with the 0,
which means all eight 0 bits.
And Zamyla, meanwhile, is still
also an array of characters.
>> So at the end of the day, it really
looks like this underneath the hood.
But argv, by nature of how main
works, allows me to wrap all of this
up into, if you will, a bigger array
that, if we slightly over simplify
what the picture looks like and don't
quite draw it to scale up there,
this array is only of size 2, the first
element of which contains a string,
the second element of
which contains a string.
And, in turn, if you
kind of zoom in on each
of those strings, what you
see underneath the hood
is that each string is just
an array of characters.
>> Now, just as with strings,
we were able to get access
to the i-th character in a string
using that square bracket notation.
Similarly, with arrays
in general, we can
use square bracket notation to get
at any of the strings in an array.
For instance, let me
go ahead and do this.
>> Let me go ahead and create argv1.c,
which is a little different this time.
Instead of checking whether argc
equals 2, I'm going to instead do this.
For int i gets 0, i is less
than argc, i plus plus,
and then print out inside of this,
percent s, new line, and then
argv bracket i.
>> So in other words, I'm not dealing with
individual characters at the moment.
Argv, as implied by these empty square
brackets to the right of the name argv,
means argv is an array of strings.
And argc is just an int.
>> This line here, 6, is
saying set i equal to 0.
Count all the way up to,
but not including, argc.
And then on each iteration,
print out a string.
What string?
>> The i-th string in argv.
So whereas before I was
using the square bracket
notation to get at the i-th
character in a string, now
I'm using the square bracket notation
to get at the i-th string in an array.
So it's kind of one layer
above, conceptually.
>> And so what's neat about this
program now, if I compile argv1,
and then do ./argv1, and then type
in something like foo bar baz,
which are the three default words that a
computer scientist reaches for any time
he or she needs some placeholder words,
and hit Enter, each of those words,
including the program's name, which
is in argv at the first location,
ends up being printed one at a time.
And if I change this, and I say
something like ./argv1 Zamyla Chan,
we get all three of those
words, which are argv bracket 0,
argv bracket 1, argv bracket 2, because in this
case argc, the count, is 3.
>> But what's neat is if you understand
that argv is just an array of strings,
and you understand that a string
is an array of characters,
we can actually kind of use this
square bracket notation multiple times
to choose a string, and then choose
a character within the string,
diving in deeper as follows.
In this example, let me go
ahead and call this argv2.c.
And in this example, let me go ahead
and do the following-- for int i gets 0,
i is less than argc, i plus
plus, just like before.
So in other words-- and now this
is getting complicated enough.
Then I'm going to say
iterate over strings in argv,
as a comment to myself.
And then I'm going to have a
nested for loop, which you probably
have done, or considered
doing, in Scratch, where
I'm going to say int-- I'm
not going to use i again,
because I don't want to shadow, or
sort of overwrite the existing i.
>> I'm going to, instead, say j, because
that's my go to variable after i,
when I'm just trying to
count simple numbers.
For j gets 0-- and also, n is going to
get the strlen of argv bracket i--
so long as j is less than n,
j plus plus, do the following.
And here's the interesting part.
>> Print out a character and a new line,
plugging in argv bracket i, bracket j.
OK, so let me add some comments here.
Iterate over characters
in current string,
print j-th character in i-th string.
So now, let's consider
what these comments mean.
>> Iterate over the strings
in argv-- how many
strings are in argv, which is an array?
Argc many, so I'm iterating
from i equal 0 up to argc.
Meanwhile, how many characters
are in the i-th string in argv?
>> Well, to get that answer,
I just call string length
on the current string I care
about, which is argv bracket i.
And I'm going to temporarily store that
value in n, just for caching purposes,
to remember it for efficiency.
And then I'm going to initialize j to 0,
keep going so long as j is less than n,
and on each iteration increment j.
>> And then in here, per
my comment on line 12,
print out a character,
followed by a new line,
specifically argv bracket
i gives me the i-th string
in argv-- so the first word, the
second word, the third word, whatever.
And then j dives in deeper, and gets
me the j-th character of that word.
And so, in effect, you can treat
argv as a multi-dimensional,
as a two-dimensional, array,
whereby every word kind of looks
like this in your mind's
eye, and every character
is kind of composed in
a column, if that helps.
>> In reality, when we tease
this apart in future weeks,
it's going to be a little
more sophisticated than that.
But you can really
think of that, for now,
as just this two-dimensional
array, whereby one level of it
is all of the strings.
And then if you dive in deeper, you
can get at the individual characters
therein by using this notation here.
>> So what is the net effect?
Let me go ahead and
make argv2-- darn it.
I made a mistake here.
Implicitly declaring the
library function strlen.
So all this time, it's
perhaps appropriate
that we're sort of finishing
exactly where we started.
>> I screwed up, implicitly declaring
library function strlen.
OK, wait a minute.
I remember that, especially
since it's right here.
I need to include string.h in
this version of the program.
>> Let me go ahead and include
string.h, save that, go ahead
and recompile argv2.
And now, here we go, make argv2, Enter.
And though it's a little
cryptic at first glance,
notice that, indeed, what
is printed is dot slash argv2,
one character per line.
>> But if I type some words after the
prompt, like ./argv2 Zamyla Chan,
Enter, also a little
cryptic at first glance.
But if we scroll back up,
./argv2 Z-A-M-Y-L-A C-H-A-N.
So we've iterated over every word.
And, in turn, we've iterated over
every character within a word.
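As a rough sketch of argv2.c as described
above, the nested loops might look like the
following. A few things here are assumptions
rather than the lecture's exact code: the
loops live in a helper named print_chars (a
hypothetical name) that takes the same
arguments as main so it can be called
directly, the typedef stands in for the string
type that the CS50 library normally supplies,
and the returned count is there only for
illustration.

```c
#include <stdio.h>
#include <string.h>

// Assumption: stand-in for the string type that the CS50 library provides
typedef char *string;

// Hypothetical helper with the same parameters as main: prints every
// character of every argument on its own line and returns how many
// characters it printed
int print_chars(int argc, string argv[])
{
    int printed = 0;

    // iterate over strings in argv
    for (int i = 0; i < argc; i++)
    {
        // iterate over characters in current string,
        // remembering the length in n so strlen is called once per string
        for (int j = 0, n = strlen(argv[i]); j < n; j++)
        {
            // print j-th character in i-th string
            printf("%c\n", argv[i][j]);
            printed++;
        }
    }
    return printed;
}
```

Given the three strings ./argv2, Zamyla, and
Chan, this prints 17 lines, one per character.
A main function with the same parameters could
simply call print_chars(argc, argv) and then
return 0.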
>> Now, after all of this,
realize that there's
one other detail we've been kind
of ignoring this whole time.
We just teased apart what
main's inputs can be.
What about main's output?
>> All of this time, we've been
just copying and pasting
the word int in front of main,
though you may see online,
sometimes incorrectly in older versions
of C and compilers, that they say void,
or nothing at all.
But, indeed, for the version
of C that we're using,
C11, from 2011, realize
that it should be int.
And it should either be
void or argc and argv here.
>> But why int main?
What is it actually returning?
Well, it turns out all of this time,
any time you've written a program main
is always returning something.
But it's been doing so secretly.
>> That something is an
int, as line 5 suggests.
But what int?
Well, there's this
convention in programming,
whereby if nothing has
gone wrong and all is well,
programs and functions generally
return-- somewhat counterintuitively--
0.
0 generally signifies all is well.
So even though you think of
it as false in many contexts,
it actually generally means a good thing.
>> Meanwhile, if a program returns 1,
or negative 1, or 5, or negative 42,
or any non-0 value,
that generally signifies
that something has gone wrong.
In fact, on your own Mac or PC,
you might have actually seen
an error message, whereby it
says something or other, error
code negative 42, or error code
23, or something like that.
That number is generally just a hint
to the programmer, or the company
that made the software,
what went wrong and why,
so that they can look through
their documentation or code,
and figure out what the
error actually means.
It's generally not
useful to us end users.
>> But when main returns 0, all is well.
And if you don't specify
what main should return,
it will just automatically
return 0 for you.
But returning something
else is actually useful.
>> In this final program, let me
go ahead and call this exit.c,
and introduce the last of today's
topics, known as an error code.
Let me go ahead and include our
familiar files up top, do int main.
And this time, let's do int argc,
string argv, with square brackets
to imply that it's an array.
And then let me just do a sanity check.
This time, if argc does not
equal 2, then you know what?
Forget it.
I am going to say that, hey, user,
you are missing command line argument
backslash n.
>> And then that's it.
I want to exit.
I am going to preemptively,
and prematurely really, return
something other than 0, namely the number 1.
The go to value for the first
error that can happen is 1.
If you have some other erroneous
situation that might occur,
you might say return 2 or return 3, or
maybe even negative 1 or negative 2.
>> These are just exit codes
that are, generally,
only useful to the programmer, or the
company that's shipping the software.
But the fact that it's
not 0 is what's important.
So if in this program, I want to
guarantee that this program only
works if the user provides me
with an argument count of two,
the name of the program, and some other
word, I can enforce as much as follows,
yell at the user with printf saying,
missing command line argument,
return 1.
That will just immediately
quit the program.
>> Only if argc equals 2 will we get down
here, at which point I'm going to say,
hello percent s, backslash n, argv bracket 1.
In other words, I'm
not going after argv bracket 0,
which is just the name of the program.
I want to print out hello, comma,
the second word that the human typed.
And in this case on
line 13, all is well.
>> I know that argc is 2
logically from this program.
I'm going to go ahead and return 0.
As an aside, keep in mind that
this is true in Scratch as well.
>> Logically, I could do this
and encapsulate these lines
of code in this else clause here.
But that's sort of
unnecessarily indenting my code.
And I want to make super
clear that no matter what,
by default, hello
something will get printed,
so long as the user cooperates.
>> So it's very common to use
a condition, just an if,
to catch some erroneous
situation, and then exit.
And then, so long as all is
well, not have an else,
but just have the code
outside that if, because it's
equivalent in this
particular case, logically.
So I'm returning 0, just to
explicitly signify all is well.
>> If I omitted the return 0, it would
be automatically assumed for me.
But now that I'm returning
1 in at least this case,
I'm going to, for good measure and
clarity, return 0 in this case.
So now let me go ahead and make exit,
which is a perfect segue to just leave.
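A minimal sketch of the logic in exit.c, under
a few assumptions: the check lives in a helper
named greet (a hypothetical name) that takes
the same arguments as main and returns the
value main would return, the typedef stands in
for the CS50 library's string type, and the
exact wording of the two messages is a guess
at what was typed.

```c
#include <stdio.h>

// Assumption: stand-in for the string type that the CS50 library provides
typedef char *string;

// Hypothetical helper with the same parameters as main: returns the
// exit code that main would return
int greet(int argc, string argv[])
{
    // catch the erroneous situation first and leave early
    if (argc != 2)
    {
        printf("missing command-line argument\n");
        return 1;
    }

    // no else needed: we only get here if argc is 2
    printf("hello, %s\n", argv[1]);

    // explicitly signify that all is well
    return 0;
}
```

A main function with the same parameters would
then just return greet(argc, argv), so that the
helper's 1 or 0 becomes the program's exit
code.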
>> But make exit, and let me go
ahead and do ./exit, Enter.
And the program yelled at me,
missing command line argument.
OK, let me cooperate.
>> Let me instead do ./exit David, Enter.
And now it says, hello David.
And you wouldn't normally see this.
>> But it turns out that there's a
special way in Linux to actually see
with what exit code a program exited.
Sometimes in a graphical
world like Mac OS or Windows,
you only see these numbers when an
error message pops up on the screen
and the programmer
shows you that number.
But if we want to see what the exit
code is, we can do it here--
so ./exit, Enter, prints
missing command line argument.
>> If I now do echo $?, which is
ridiculously cryptic looking.
But $?
is the magical incantation
that says, hey, computer,
tell me what the previous
program's exit code was.
And I hit Enter.
I see 1, because that's what I
told my main function to return.
>> Meanwhile, if I do ./exit David,
and hit Enter, I see, hello David.
And if I now do echo $?, I see 0.
And so this will actually
be valuable information
in the context of the debugger, not so
much that you, the human, would care.
But the debugger and other
programs we'll use this semester
will often look at that number,
even though it's sort of hidden away
unless you look for it, to
determine whether or not a program's
execution was correct or incorrect.
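To make the idea of one program looking at
another's exit code concrete, here is a small
sketch. It goes beyond what the lecture shows
and rests on assumptions: a POSIX system such
as Linux, the standard system function, and
the WIFEXITED and WEXITSTATUS macros from
sys/wait.h. The helper name exit_code_of is
hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

// Hypothetical helper: runs a shell command and returns its exit code,
// or -1 if the command could not be run or did not exit normally
int exit_code_of(const char *command)
{
    // system hands the command to the shell and waits for it to finish
    int status = system(command);
    if (status == -1 || !WIFEXITED(status))
    {
        return -1;
    }

    // extract the small integer that the program returned from main
    return WEXITSTATUS(status);
}
```

Under those assumptions, passing the command
./exit should yield 1 and passing ./exit David
should yield 0, the same numbers that echo $?
reveals.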
>> And so that brings us to
this, at the end of the day.
We started today by looking at
debugging, and in turn at the course
itself, and then more interestingly,
technically underneath the hood
at what strings are, which last
week we just took for granted,
and certainly took them
for granted in Scratch.
>> We then looked at how we can access
individual characters in a string,
and then again took a higher level
look at things, asking-- well,
if we want to get at individual
elements in a list-like structure,
can't we do that with multiple strings?
And we can with command line arguments.
But this picture here of just boxes
is demonstrative of this general idea
of an array, or a list, or a vector.
And depending on the
context, all of these words
mean slightly different things.
So in C, we're only going
to talk about an array.
And an array is a chunk
of memory, all of whose
elements are contiguous, back,
to back, to back, to back.
>> And those elements are, generally,
of the same data type, character,
character, character, character, or
string, string, string, string, or int,
int, int, whatever it is
we're trying to store.
But at the end of the day, this is
what it looks like conceptually.
You're taking your
computer's memory or RAM.
And you're carving it out into
identically sized boxes, all of which
are back, to back, to
back, to back in this way.
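One way to convince yourself of the back to
back picture is to measure the gap between
neighboring elements. The sketch below is
illustrative only: the helper name
is_contiguous is hypothetical, and it peeks at
memory addresses, a topic the course only gets
to in later weeks.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: reports whether each element of an int array
// starts exactly sizeof(int) bytes after the one before it
bool is_contiguous(int boxes[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        // distance in bytes between one element and the next
        ptrdiff_t gap = (char *) &boxes[i + 1] - (char *) &boxes[i];
        if (gap != (ptrdiff_t) sizeof(int))
        {
            return false;
        }
    }
    return true;
}
```

For any C array this returns true, which is
just another way of saying that the boxes are
identically sized and sit side by side with
nothing in between.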
>> And what's nice about
this idea, and the fact
that we can express values in this way
with the first of our data structures
in the class, means we can start
to solve problems with code
that came so intuitively in week 0.
You'll recall the phone
book example, where
we used a divide and conquer,
or a binary search algorithm,
to sift through a whole
bunch of names and numbers.
But we assumed, recall, that that
phone book was already sorted,
that someone else had already
figured out-- given a list of names
and numbers-- how to alphabetize them.
And now that in C we,
too, have the ability
to lay things out, not
physically in a phone book
but virtually in a computer's
memory, will we be able next week
to introduce again this-- the first
of our data structures in an array--
but more importantly, actual computer
science algorithms implemented
in code, with which we can store
data in structures like this,
and then start to manipulate it, and
to actually solve problems with it,
and to build on top of that,
ultimately, programs in C,
in Python, in JavaScript,
querying databases with SQL.
>> And we'll see that all of these
different ideas interlock.
But for now, recall that the
domain that we introduced today
was this thing here, and
the world of cryptography.
And among the next problems you yourself
will solve is the art of cryptography,
scrambling and de-scrambling
information, and ciphering
and deciphering text,
and assuming ultimately
that you now know what
is underneath the hood
so that when you see or receive
a message like this, you
yourself can decipher it.
All this, and more next time.
>> [VIDEO PLAYBACK]
>> -Mover just arrived.
I'm going to go visit
his college professor.
Yep.
Hi.
It's you.
Wait!
David.
I'm just trying to figure
out what happened to you.
Please, anything could help.
You were his college
roommate, weren't you?
You were there with him when
he finished the CS50 project?
>> [MUSIC PLAYING]
>> -That was CS50.
>> I love this place.
>> -Eat up.
We're going out of business.
>> [END PLAYBACK]
