[MUSIC PLAYING]
DAVID MALAN: All right.
This is CS50, and this is the
day before our test, of course.
But this is lecture 8,
in which we're actually
going to finally transition from
C, this lower-level language
that we've been spending
quite some time on, to Python.
And the goal today isn't so
much to focus on Python per se,
but honestly to do what we hope will
be one of the most empowering aspects
of the class, which is to emphasize that
this has not been a semester learning
C. This has been a semester
learning programming,
a certain type of programming called
procedural or imperative programming.
But more on that in another
higher-level class perhaps.
But really, that this class
is about ultimately teaching
yourself to learn new languages.
And indeed, what you'll find is that
as we explore some of the features
and the syntax of
Python, odds are today,
it might look as cryptic as
C did just a few weeks ago.
But you'll find that once you
start recognizing patterns,
as you have with C, it will be all the
more accessible and all the more useful
when solving some problems.
So unrelatedly, just earlier this
week, I happened to be in Mountain View
with some of the team.
And you might recall from
last lecture at Harvard
we offered this glimpse of one
of the earliest racks of servers
that Google itself had.
Well, turns out they changed buildings.
But we happened to stumble
upon the actual display.
So pictured here is a
photo from my own phone,
which was actually really cool to see.
So inside of this, you'll see all
of the old hard drives they've used.
We actually looked at
some of the labels.
And indeed, hard drives
manufactured in 1999,
which is when Google started
getting some of its momentum.
You can see the green
circuit boards here,
on which would be CPUs and
other things, potentially.
So if you'd like a
stroll down memory lane,
feel free to read up on this on
Wikipedia or even on the excerpts here.
And then strangely enough, at
the conference some of us were at
did we discover this-- perhaps
the biggest duck debugger made up
of smaller duck debuggers,
one of whom was our own.
So that, too, was how
we spent this past week.
All right.
So how are we going to spend
this week and the weeks to come?
So you'll recall that when we
transitioned from Scratch to C,
we drew a couple of comparisons
between syntax and features.
And I thought it'd be useful to
take that same approach here,
really to emphasize that most of the
ideas we're going to explore today
are themselves not new.
It's just how you
express them and how you
write the syntax in the
language known as Python that's
indeed going to be different
from Scratch, from C,
and now here we are with Python.
So back in the day, in week 0, when
you wanted to say something in Scratch,
you would literally use this blue
purple puzzle piece, say hello.
And we called that a
function or a statement.
It was some kind of verb action.
And in C, of course, it looked
a little something like this.
Henceforth, starting today in
Python, it's going to look like this.
So before, after.
Before, after.
So it's pretty easy to
visually diff these two things.
But what are just a
couple of the differences
that jump out at you immediately?
C, Python.
So there's no more backslash n,
it would seem, in this context.
So that's kind of a nice relief
to not have to type anymore.
What else seems to be different?
No semicolon, thank god.
Right?
Perhaps the stupidest
source of frustration
that you might have experienced
by just omitting one of those.
And someone over here?
Yeah, so printf is now just print,
which is pretty reasonable unto itself.
So these are terribly minor differences.
But it's sort of testament to
the kinds of mental adjustments
you're going to have to start to make.
Fortunately, thus far we've seen that
you can start leaving things off,
which is actually a guiding principle of
Python in that one of its goals is it's
meant to be easier to write than some
of its predecessors, among them C.
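The slides being compared here are not reproduced in the transcript. As a minimal sketch, the Python side is presumably a single line; the C line appears only as a comment for reference, and the variable name message is an assumption added for illustration.

```python
# C version, for comparison:
#     printf("hello, world\n");

# Python version: print instead of printf, no semicolon, and no \n,
# because print adds the newline on its own.
message = "hello, world"
print(message)
```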
So in C we might have implemented
this hello, world program that
actually ran when you clicked the green
flag using code like that at the right.
And this was, for those of you who
had no programming experience coming
in to CS50, what probably looked
like the proverbial Greek to you
just a few weeks ago.
And we teased apart what
those various lines meant.
But in Python, guess what?
If you want to write a program whose
purpose in life is to say, hello, well,
just write def main.
Print hello, world.
So it's a little similarly structured.
And in fact, it does not lack for
some of the more arcane syntax here.
But we'll see soon what
this actually means.
But it's a little simpler
than the one before.
And let's tease this apart.
So def here simply means
define me, a function.
So whereas in C we've
historically seen that you specify
the type that the
function should return,
we're not going to do
that in Python anymore.
Python still has data
types, but we're not
going to explicitly mention
what data types we're using.
Meanwhile, here is the
name of the function.
And main would be a
convention, but it's not
built into the language in the same
way as it is in C, as we shall see.
Meanwhile, this silly
incantation is just
a way of ensuring that
the default function
to be executed in a Python program
is indeed going to be called main.
But more on that when we
actually start creating.
But this is perhaps the most
subtle but most important
difference, at least early on.
And it's even hard to see at this scale.
But notice the colons both here and here
that I've highlighted now in yellow,
and these dots, which
are not to be typed,
but are just meant to draw
your attention to the fact
that I hit the space bar four
times in those locations.
So if you have ever sort of gotten
some feedback from your TA or TF
that your style could be better,
closer to 5 out of 5, because of lack
of indentation or pretty
formatting, Python's
actually gonna help us out with this.
So Python code will not run if you
have not indented things properly.
So gone are the curly braces that
encapsulate related lines of code
within some block of functionality.
And instead, they're replaced
generally with this general structure.
You have a colon, and then
below that and indented
are all of the lines that are somehow
related to that earlier line of code.
And the indentation must be consistent.
So even though your
own eye might not quite
distinguish four spaces from
three, the Python environment will.
And so this will
actually help implicitly
enforce better style,
perhaps, than might
have been easily done from the get-go.
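Since the slide is not shown in the transcript, here is a sketch of the structure being described, assuming it matches the usual form of this program: def with no return type, a colon, and four spaces of indentation in place of curly braces.

```python
def main():
    # The four-space indent is what makes this line part of main.
    print("hello, world")


# The "incantation": call main only when this file is run directly.
if __name__ == "__main__":
    main()
```

Removing the indentation before print, or mixing three spaces with four inside the same block, makes Python refuse to run the file with an IndentationError.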
So then, of course, in Scratch,
we had a forever block,
which says, hello, world
forever, much like in C, we
could implement it like this.
Now there's actually a pretty
clean mapping in Python.
We already know we can
get rid of the semicolon.
We already know we can get
rid of the curly braces.
We're going to have to add in a colon.
But it turns out we can get
rid of a little more, too.
So what more is absent from this
translation of hello, world to Python?
This one's more subtle.
So we definitely got
rid of the curly braces,
relying now just on indentation.
OK, so there's no
parentheses around while.
And so this, too, is actually
meant to be a feature of Python.
If you don't logically need parentheses
to enforce order of operations,
like in arithmetic or the
like, then don't use them
because they're just a distraction.
They're just more to type.
And the code now is just visually
cleaner and easier to read.
There's a minor difference, too--
True and False are going to
be capitalized in Python.
But that's a fairly incidental detail.
But notice this kind of captures
already the spirit of Python.
It's not a huge leap to
go from one to the other.
But we've just kind of
started to get rid of some
of the clutter and the stuff that
never really intellectually added much,
and if anything was annoying
to have to remember early on.
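A sketch of the loop under discussion, assuming the slide shows a plain while True. The counter and the break are not part of the lecture's version; they are added here only so the example terminates.

```python
count = 0  # added for this sketch only

while True:  # no parentheses needed, and True is capitalized
    print("hello, world")
    count += 1
    # The lecture's loop runs forever; this exit is an addition.
    if count == 3:
        break
```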
So True here is our Boolean.
And now we have a finite
number of iterations.
We might want to say hello,
world exactly 50 times.
In C, this was a crazy mess
if you wanted to do this.
You'd have to initialize a
variable with which to count up to,
but not including 50, plus plussing
along the way and so forth.
In Python, it's going
to be a little cleaner.
And we'll come back to
what this means exactly.
But if you kind of read it from left to
right, it kind of says what you mean.
Right?
For i in the range of 50.
So i is probably going to be a variable.
And notice we're not
mentioning its type.
It's going to be implied by whatever
the context is, which in this case
has to do, apparently,
with numbers, per the 50.
Range is actually going to
be a data type unto itself.
It's a little funky in that sense.
It's called a class.
But this essentially is a special
feature of Python that, unlike in C,
where if you want to iterate over an
array of values or 50 such values,
you would literally have
an array of 50 values.
Range is kind of cool in
that it kind of stands there.
And every time you iterate through a
loop, it hands you the next number,
but just one at a time, thereby
using maybe as little as one
50th the amount of memory because it
only has to keep one number around
at a time.
And there's a bit more
overhead than that.
It's not a perfect savings, quite so.
But this just says for i
in range 50, and that's
going to implicitly count
from 0 up through 49.
And meanwhile, what's below it is
what's going to get printed this time.
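As a rough sketch of the loop just described (the variable names numbers, first, and last are assumptions for illustration):

```python
# Prints hello, world exactly 50 times; i takes the values 0 through 49.
for i in range(50):
    print("hello, world")

# range(50) is an object of its own type. It produces values on demand
# rather than storing all 50 of them in memory.
numbers = range(50)
first = numbers[0]    # 0
last = numbers[-1]    # 49
```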
So meanwhile, here was one of our
bigger Scratch blocks early on.
And it translates pretty
literally to code in C.
And you can perhaps
guess, if you've never
seen Python before today, what the
Python code might now look like.
If this here on the
right is the C code, what
are some of the features syntactically
that we're about to throw away?
Yeah.
AUDIENCE: You can throw away the
curly braces and the parentheses.
DAVID MALAN: Curly braces and
parentheses are going to go away.
What else might go away?
The semicolons are going to go away.
The backslash n inside
of the print statements.
Great.
One more thing, I think.
The if.
So we don't strictly
need the parentheses
because it's not like I'm
combining things logically
like this or that or this and that.
So it should suffice to
get rid of those two.
And there's a couple of other tweaks
we're going to have to make here.
But indeed, the code's going to
be a lot tighter, so to speak.
Now you're just going to
say what you mean here.
And there is one weird thing.
And it's not a typo.
What apparently are we going
to have to start knowing now?
Elif whatever.
So elif is not a typo.
It's indeed how you express
the notion of else-if.
But otherwise, everything
is exactly the same.
And notice the colons.
Frankly, ironically,
whereas previously it
might have been annoying to
occasionally forget a semicolon,
now the colons may take on that role.
But at least everything below
them is meant to be indented.
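The slide's code is not in the transcript, so the following is a sketch of the if, elif, else shape being described. Wrapping it in a function named compare, and returning the text instead of printing it, are assumptions made here so that all three branches can be exercised.

```python
def compare(x, y):
    # No parentheses around the conditions; each branch ends in a colon.
    if x < y:
        return "x is less than y"
    elif x > y:  # elif, not "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```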
So here's a fundamental difference
beyond the sort of silly syntactic
differences of this and,
say, other languages--
the flow of work that we've been using
thus far has been essentially this
in C. You write source code in
a file generally ending in .c.
You run a compiler, which, as
a quick check, is called clang.
So it's not technically make.
Make is just this helpful
build utility that
automates the process of calling clang.
So clang is, strictly
speaking, the compiler.
And clang outputs zeros and ones,
otherwise known as machine code.
And your computer-- Mac, PC, whatever--
has a CPU, Central Processing
Unit inside, made by Intel
or some other company.
And that CPU is
hardwired to understand
certain patterns of bits, zeros and
ones, otherwise known as machine code.
So that's been our world in C.
With Python-- so the code that
you might have compiled in C,
for instance, might
have been this, which
we said we run clang on like this.
And if you don't specify a
default file name as output,
you'll instead just get in your
file all of the zeros and ones,
which can then be executed
by way of ./a.out,
the default name for the
assembler's output here.
So in Python, though, the world gets
here, too, a little simpler, as well.
So we just now have source
code and an interpreter.
So there's no machine
code, it would seem.
There's no compiler, it would seem.
And frankly, there's
one fewer arrow, which
suggests to me that the process
of running Python code itself
is actually going to be a little easier.
Running C code has
typically been two steps.
You run clang, or
via make you run clang.
Then you run the program.
And it's fine.
It's not all that hard.
But it's two steps.
Why not reduce to one step what
you've been doing in two?
And we'll see exactly what this means.
Now, technically, that's a
bit of an oversimplification.
Technically, underneath
the hood, if you wanted
to run a program like this that
simply prints out hello, world,
you would simply run python hello.py.
And the result of that would be
to see hello, world on the screen,
as we'll soon see.
But technically, underneath the hood,
there is some other stuff going on.
So there actually kind of is a compiler.
But there's not something
called machine code, per se.
It's called bytecode.
There's even something called
a Python virtual machine.
But all of this is
abstracted away for us,
certainly for the sake
of today's conversation,
but also in the real
world, more generally.
Humans have gotten
better over the decades
at writing software and writing tools
via which we can write software.
And so a lot of the
more manual processes
and a lot of the lower-level
details that we've
been focusing on, if not struggling
on, in C, start to go away.
Because much like in week 0, where we
started layering on idea after idea--
zeros and ones, ASCII,
colors, and whatnot--
similarly with our actual tools
we're going to start to do the same.
So whereas in actuality what's
going on underneath the hood is
this process here, we can
start to think about it,
really, as something quite simpler.
Now, if you're curious, and if
you take some higher-level class
like CS61 or another,
you'll actually talk
about things like bytecode and
assembly code and the like.
And we saw a glimpse of
the latter a bit ago.
This happens to be an
intermediate language
that Python source code is converted
into before it's run by the computer.
But again, we're going to turn a blind
eye to those lower-level details.
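For the curious, the hidden compile step can be observed from within Python itself. This is a sketch that assumes the standard CPython interpreter; the exact bytecode instructions differ from one Python version to the next, so no particular listing is shown.

```python
import dis

source = 'print("hello, world")'

# compile() turns source text into a code object that holds bytecode.
# The file name "hello.py" is only a label used in error messages.
code = compile(source, "hello.py", "exec")

# The raw bytecode is a sequence of bytes...
print(code.co_code)

# ...and dis prints a human-readable listing of the same instructions.
dis.dis(code)

# The interpreter then executes that bytecode.
exec(code)
```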
So here are some of the
tools now in our toolkit.
In Python, there are data
types, though as of now we've
not seen any examples whereby
I specify what types of values
are going to be in my variables
or what types of values
a function's going to return.
But they are there.
Everything is sort of
loosely typed in that
whatever you want a variable to be,
it will just take on that data type,
whether it's an int
or string or the like.
It's not going to be
the full word string.
In Python, it's literally called str.
But there are some familiar types here--
bool and float and int and others.
And, in fact, among the others, as
we'll soon see, are features like range.
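A small sketch of what "it will just take on that data type" means in practice. The variable names are assumptions for illustration.

```python
value = 50
kind_before = type(value).__name__   # "int"

# The same variable can later hold a value of a different type.
value = "fifty"
kind_after = type(value).__name__    # "str", not "string"

print(kind_before, kind_after)
```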
But before that, note too that we'll
provide for at least our first foray
into Python a few familiar functions.
So Python has different mechanisms
than C for getting input from the user.
We've abstracted some of those details
away in a new CS50 library for Python
that you'll really just
use one or few times before
we transition away from even that, but
will give you functions like get_char,
get_float, get_int, get_string that
handle all the requisite error checking
so that at least for
your first few programs,
you can just start to
get some real work done
without diving
underneath the hood there.
And then lastly, here are some
other tools in our toolkit.
And we'll just scratch the
surface of some of these today.
But what's nice about Python and what's
nice about higher-level languages more
generally-- like more modern languages
that learned lessons from older
languages like C--
is that you get so much more for
free, so much more out of the box.
There's so much more of a kitchen sink.
There's so many metaphors we can use
here, all of which speak to the fact
that Python has more features
than C, much like Java,
if you took AP CS or
something else, had more than C.
So does Python have a whole toolkit
for representing complex numbers,
for representing dictionaries,
otherwise implemented as hash tables,
as you now know; lists, which is
kind of synonymous with an array.
But a list is an array that can sort
of automatically grow and shrink.
We don't have to jump through hoops as
we did in C. Range we've seen briefly,
which just hands you back one number
after another in some range, ideally
for iteration.
Set is the notion from
mathematics, where
if you want to put bunches of
things into a data structure
and you want to make sure you
have only one of each such thing
without duplicates, you can use a set.
And a tuple is also a
mathematical notion,
typically where you can combine related
things without complicating things
with actual structs.
Like, x, y is a common paradigm
in lots of programs-- graphics,
or videos, or certainly
math and graphing itself.
You don't really need a whole
full-fledged data structure.
You might just want to say, x, y.
And so Python gives us that
kind of expressiveness.
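Each of the types just listed can be tried in a line or two. The following is a sketch with made-up sample values, not code from the lecture.

```python
z = complex(1, 2)              # complex number 1 + 2j

counts = {"apple": 3}          # dict: key/value pairs, backed by a hash table
counts["pear"] = 5

numbers = [1, 2]               # list: like an array that can grow and shrink
numbers.append(3)

r = range(3)                   # range: hands back 0, 1, 2 one at a time

unique = {1, 1, 2}             # set: duplicates are dropped

point = (3, 4)                 # tuple: a simple x, y pair, no struct needed
x, y = point

print(z, counts, numbers, list(r), unique, point)
```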
So let's actually now dive
in with that quick mapping
from one world to the other and focus
on what you can actually do with Python.
So here I am in the familiar CS50 IDE.
Much like we have
pre-installed for you clang
and make and other tools, we've
also installed for you a program.
That program is called Python, which
is a little confusing at first glance
because Python is apparently
the name of the language.
But it's also the name of the program.
And here's where Python is different.
Whereas C is again compiled,
and you use something
like clang to convert
it to machine code,
Python is both the name of the language
and the name of the program you
use to interpret the language.
So pre-installed in CS50
IDE, and frankly, these days,
probably on your own Macs or
PCs, even if you don't know it,
is a program called Python that
if fed Python source code as input
will do what that code says.
So let's go ahead and try
something just like that.
Let me go ahead and save a
file preemptively as hello.py.
So .py will be the
convention now instead of .c.
And I'm going to go ahead and
actually keep this pretty simple.
I'm just going to
print the first thing--
muscle memory.
So it's not printf anymore.
It's just hello, world.
Save, done.
That's going to be my
first program in Python.
Why?
It's one line of code.
It's consistent with the
features I've claimed Python has.
So how do I run it?
Well, in C, we would have
done, like, make hello.
But make knows nothing about this
because make is typically used with C,
at least in this context here.
So maybe it's, like, ./hello.py.
No.
It seems I don't have permission there.
But there's a step that I teased us
with earlier on just the slide alone.
How do I go about running
a program, did I say?
AUDIENCE: Python hello.py.
DAVID MALAN: Yeah.
I have to be a little more explicit.
So python, which is the name of the
interpreter that understands Python.
And now I need to feed it some input.
And we know from our
time in C that programs
can take command-line arguments.
And indeed, this program
itself does, Python.
You just give it the
name of a program to run.
And there it is, our very first program.
So that's all fine and good.
But what if I wanted to do
something a little more interesting,
like getting a string from the user?
Well, turns out in Python,
in CS50 IDE especially,
I can do something like
this. s gets get_string.
And I can ask someone, for
instance, for their name, like this.
Now, CS50 IDE is already yelling at me--
undefined variable get_string.
And let's actually see
if maybe it's just buggy.
No.
So this is a little
more arcane than usual.
But traceback, most recent call last.
File "hello.py," line 2, in
module-- whatever that is.
So I see a line of code from line 2.
NameError-- name
get_string is not defined.
This is not the same
language we've seen before,
but what does this feel reminiscent of?
Yeah, like in the past, when
you've forgotten cs50.h,
you've gotten something about an
undeclared identifier, something
like that.
It just didn't understand something
related to the CS50 library.
So in C, we would have
done include cs50.h.
That's no longer germane
because now we're in Python.
But it's somewhat similar in spirit.
Now I'm going to say instead from cs50
import get_string, and now save that.
And hopefully momentarily, the errors
will go away as the IDE realizes,
oh, you've now imported
the CS50 library,
specifically a method or function,
rather, inside of it called get_string.
So there, too, it's different
syntax, but it kind of
says what it means-- from cs50, which
is apparently the name of the library,
import a function called get_string.
Now if I go ahead and
rerun python hello.py,
I can go ahead and type
in, say, Maria's name
and ignore her altogether because
I need to make a fix here.
What's the obvious bug--
obvious now, to me-- in the program?
AUDIENCE: You need to
include the variable for s.
DAVID MALAN: Yeah.
So I need to include s,
which I got on line 3,
but didn't thereafter use in any way.
So this is going to be wrong, of
course, because that's going to say,
literally, hello s.
This is kind of how we used to do it.
And then we would put in s.
But this is not printf.
This is print.
So the world is a little different.
And it turns out we can do this
in a couple of different ways.
Perhaps the easiest,
if least obvious, would
be something like this, where
I could simply say hello,
open curly brace, close curly brace.
And then inside of there, simply
specify the name of the variable
that I want to plug in.
And that's not quite all the way there.
Let me go ahead and run this once more.
Now if I type in Maria's name, oh.
Still not quite right.
I need to actually tell Python that
this is a special type of string.
It's a formatted string, similar
in spirit to what printf expected.
And the way you do this, even though
it's a little different from C,
is you just say f.
This is an f string.
So literally before the
quotes, you write the letter f.
And then if I now run
this program here, I'm
going to actually see
Maria's name as hello, Maria.
And I'll take care of that red X later.
So that's a format string.
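The difference the letter f makes can be sketched as follows. To keep the example runnable without the CS50 library or a keyboard, s is assigned directly here; in the lecture it comes from get_string. The names plain and formatted are assumptions for illustration. Note that f strings require Python 3.6 or later.

```python
s = "Maria"  # stand-in for the value returned by get_string

# Without the f, the braces are just ordinary characters.
plain = "hello, {s}"

# With the f, Python substitutes the value of s into the string.
formatted = f"hello, {s}"

print(plain)
print(formatted)
```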
And there's one other way.
And this is not very
obvious, I would say.
You might also see in online
documentation something like this.
And let's just tease this
apart for just a second.
It turns out in Python that
what I've highlighted in green
here is known as a string,
otherwise known as a str.
str is the name of this data type.
Well, unlike in C, where string was
kind of a white lie, where it was just
a pointer at the end
of the day, a string
is actually a first-class
object in Python, which means
it's not just a sequence of characters.
It has built-in functionality,
built-in features.
So much like a struct in C had
multiple things inside of it,
so does a string in Python have
multiple things inside of it,
not just the sequence of characters, but
functions that can actually do things.
And it turns out you access those
functions by way of the same dot
operator as in C. And
then you would only
know from the documentation or examples
in class what functions are inside
of the string object.
But one of them is format.
And that's just a function
that takes an argument-- what
do you want to plug into the
string to the left of the dot?
And so simply by
specifying, hey, Python,
here's a string with a placeholder.
Inside of this string is a
built-in function-- otherwise known
as a method, when a function is
inside some object or structure--
pass in the value s.
So if I now go ahead and rerun
this after saving my changes,
I should now see that Maria's
name is still plugged in.
So that's it.
But a simple idea that
now even strings have
things inside of them
besides the characters alone,
and you can access that via the dots.
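A sketch of the format approach described above, again with s assigned directly as a stand-in for get_string. The call to upper is not from the lecture; it is included only as a second example of a method that lives inside every str.

```python
s = "Maria"  # stand-in for the value returned by get_string

# format is a method inside the string to the left of the dot;
# its argument is plugged into the {} placeholder.
greeting = "hello, {}".format(s)

# Another built-in str method, shown for comparison.
shout = greeting.upper()

print(greeting)
print(shout)
```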
So let's go ahead now and ramp things up
to a more familiar example from a while
back.
Let me go ahead and open
up two side-by-side windows
and see if we can't
translate one to the other.
I'm going to go ahead and open up, for
instance, int.c from some time ago.
So you might recall from int.c, we had
this program, whose purpose in life
was to get an integer from the user
and actually now plug it into printf,
and then print it out.
So what's going to be
different now in Python?
Well in Python, if I go ahead and
implement this as, say, int.py,
I'm going to go ahead
and do the following.
Let me scroll down to kind
of line things up roughly.
I can go ahead and say def main,
as I saw in the slides before.
And then over here, I can say i gets
get_int, quote, unquote, integer.
And then down here, I'm going to say
not printf but print, quote, unquote,
"hello," and then the placeholder.
What's the simplest way to do
this now, per our past example?
Curly brace i.
And then I just need to be super clear
this is a special f string or format
string, into which you
can plug in values.
And now I'm going to
go ahead and save that.
And I've got most of
the pieces together,
ignoring, for now, the red X.
So what more remains to be done?
I've made one same mistake as before.
Yeah, so the get_int. So up here,
really the equivalent of line 3
would be from cs50
import get_int this time.
Saving that.
And now if in my terminal window I
go ahead and run python of int.py--
hmm.
That seems strange.
It's not an error, in terms
of, like, erroneous output.
Just nothing happened.
So why might this be?
How might you go about troubleshooting
this, even with very little Python
under your belt?
Was that a hand, or no?
No?
OK.
Yeah?
AUDIENCE: Is there a line break?
DAVID MALAN: Is there a line break?
That's OK.
I was just doing that to kind
of make everything line up.
But it's no big deal.
Everything's indented properly,
which is the important aesthetic.
Yeah.
AUDIENCE: We didn't call the function.
DAVID MALAN: We didn't
call the function.
And this is where Python's a little
different from C. In C, recall,
main just gets called
automatically for you.
Humans years ago decided that shall
be the default name of a function.
In Python, line 6 here, calling
something main is just a convention.
I could have called it foo
or bar or any other word.
It has no special meaning.
And so in Python, if you
want to actually call main,
you need to do something,
frankly, that's,
I think, one of the stupider
distractions early on.
But you have to literally say this--
if the name of this file happens
to equal something that's
specially called main, then call main.
So long story short, when you run
the Python interpreter on a file,
as we've been doing with python,
space, int.py or hello.py,
there is a special global variable
that your program has access to called
__name__.
And if that default name
happens to be __main__,
then you know that you have the
ability to call any function you want
by default.
So for now, much like
we did in week one,
where we glossed over certain details
that just weren't all that interesting,
lines 11 and 12, for now, let's
consider not all that interesting.
But it's how we're going to
kick-start these programs.
Because now if I run python, space,
int.py, type in a great number--
hello, 42.
That's the meaning of life,
the universe, and everything.
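To underline that the name main has no special meaning, here is a sketch in which the function is called foo instead. The value 42 stands in for what get_int would return, and module_name is a hypothetical variable added to show what __name__ holds.

```python
def foo():   # "main" is only a convention; any name works
    i = 42   # stand-in for the value returned by get_int
    return f"hello, {i}"


# __name__ is an ordinary global variable. It holds "__main__" when
# the file is run directly, as in: python int.py
module_name = __name__

# Without these two lines, foo would be defined but never called,
# and the program would print nothing.
if __name__ == "__main__":
    print(foo())
```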
So let's now actually do
something more powerful
than just getting a
single int from the user.
Let me go ahead and close off
this one and close off this one
and open up, say, ints.c after splitting
my window again into two windows here.
And let's open ints.c.
So this one was a little different
in that we did some arithmetic.
And so here is going to be
another difference in Python.
Here's what we did in C. And
what was curious or worth
noting about math in C?
Which of these did not quite behave
as you might expect in the real world?
Division?
Yeah, why?
What did division do?
Yeah, it chopped off or rounded down.
It floored the value by throwing away
everything after the decimal point.
So this line here, 18, where it's
such-and-such divided by such-and-such
is such-and-such.
And we literally just
said x divided by y.
If you divided, for instance, 1 divided
by 2 in grade school, hopefully,
you would get the value 1/2 or 0.5.
But in C, what did we get instead?
AUDIENCE: Zero.
DAVID MALAN: Zero.
So it gets truncated to
an int, the closest int
without a decimal point being
0 because the .5 gets thrown away.
And thus we had that effect.
So in Python, things are
going to be similar in spirit.
But this is kind of a feature that
was fixed or a bug that was fixed.
In Python-- let me go ahead
here and open up an example
I wrote in advance called
ints.py, which is actually
now going to look like this.
So the Python equivalent now,
which I've roughly lined up,
looks a little different.
And there's a few distractions
because we have all these f strings
now in the way.
But notice I'm just
plugging in x's and y's.
But what's a new feature, apparently,
in Python, arithmetically?
So floor division.
So this was the more proper term for
what C has been doing all this time.
In C, when you use the slash and
you divide one number by another,
it divides, and then floors
it to the nearest int.
In Python, if you want that
same old-school feature,
you're going to now use slash slash,
not to be confused with the C comment.
And if you want division to work the way
you always knew it did in grade school,
you continue using just the slash.
So a minor point, but one of
the differences to keep in mind.
So if we actually run this here in
Python, if I go into source 8 today
and our week's directory for
week 1, and I run python ints.py,
here now we're going to see 1 and 2.
And there's all of the values
that we would expect to see.
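The two division operators can be sketched as follows, using the values 1 and 2 from the lecture. The last line is an addition, not from the lecture: for negative operands the two languages differ, because C's integer division truncates toward zero (so -7 / 2 is -3 in C), while Python's // always rounds down.

```python
x, y = 1, 2

true_division = x / y     # 0.5, as in grade school
floor_division = x // y   # 0, the old C behavior for these values

print(f"{x} / {y} = {true_division}")
print(f"{x} // {y} = {floor_division}")

# Not from the lecture: // rounds toward negative infinity.
negative_case = -7 // 2   # -4 in Python
```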
All right.
So without dwelling too much on
this, let's fast forward to something
more powerful like conditions.
So in Python, if we want to do
something only conditionally,
laying out my browser like this, let
me go ahead and open up, let's say,
conditions.py.
Sorry, conditions.c, which once
upon a time looked like this.
So in this example here,
notice that we have
a program that gets
two ints from the user,
and then just compares x and
y and prints out
whether they're greater than, less
than, or equal to, ultimately.
So let's actually do this one from
scratch over here on the right.
So let me go ahead and
save this as conditions.py.
And then at the top,
what's the very first thing
I'm going to apparently now need?
Yeah, so the CS50 library.
So from cs50 import-- it looks like
get_int is the one we want this time.
Now, how do I go about getting an int?
Or what's the translation
of line 9 on the left
to the right-hand side of the screen?
x equals get_int of the same prompt.
OK, what comes next, if I
line it up roughly here?
y gets get_int of quote, unquote, y.
And what's down here?
The condition.
So if x less than y?
No parentheses are necessary.
It's not wrong to put
them, but it's unnecessary.
And now enters a word
into our terminology--
it's not Pythonic, so to speak.
If you don't need them, don't put them.
So if x is indeed less than
y, what do we want to do?
We want to print x is less than y, yes?
No.
OK.
All right, good.
So else if x--
OK, good.
Right.
So, kind of goofily, elif, then go
ahead and print out x is greater than y.
And as an aside, I actually
did that accidentally.
But it turns out in Python, too, you
can use double quotes or single quotes.
Either is fine, whereas in C, single
quotes had a very specific meaning,
which meant what?
AUDIENCE: Char.
DAVID MALAN: Char.
So single characters.
And double quotes
meant strings, sequence
of characters, which meant zero or more
characters, followed by backslash 0.
In Python, all of that is gone.
Single quotes and double
quotes are equivalent.
I'll almost always use double quotes,
just for consistency, as should you,
for consistency, within your own files.
But sometimes it's useful to drop
into one or the other if you nest,
for instance, quote marks, as you
might have once in a while in C.
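A short sketch of the quoting rules just described, in plain Python with no CS50 library. The variable names are illustrative only and do not come from the lecture's files.

```python
# Single and double quotes produce the same kind of value: a str.
single = 'hello'
double = "hello"
print(single == double)   # True

# Switching quote styles avoids escaping when a string contains quotes.
nested = 'She said "hi"'
apostrophe = "It's fine"
print(nested)
print(apostrophe)
```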
OK.
So finally, else print
out x is equal to y.
So it's cleaner.
And frankly, I don't
need all this whitespace.
So let's go ahead and just make
this a little tighter still.
You can see that in 11 lines, we've
now done what took 27 or so last time.
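A minimal sketch of the comparison logic in conditions.py. So that it runs without the CS50 library, the values are passed to a hypothetical helper named compare rather than read with get_int; that helper is an assumption for illustration, not part of the lecture's file.

```python
def compare(x, y):
    # No parentheses around the condition; the colon and the
    # indentation mark the body of each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


# In the lecture's version, x and y come from get_int instead.
print(compare(1, 2))   # x is less than y
```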
But I have omitted
something, to be fair.
What did I omit?
Yeah, I didn't do that whole
calling of function thing.
There's no mention of main.
And it actually turns out that's
not strictly necessary in Python.
If you're going to be interpreting
a file that contains Python code,
and it's a simple enough program that
you don't really need to factor code
out and organize it into
separate functions, then don't.
This is what would now be
called a command-line script,
a program that just has lines of
code that you can execute, literally,
at the prompt.
So if I go into this directory and
run python of conditions.py, Enter.
x will be 1. y will be 2.
x is indeed less than y.
And that's it.
I don't need to bother doing all
of this, as I proposed earlier.
def main, and then I could go in here.
And if you've never known
this, and now it's useful,
especially, for Python, you
can highlight lines and just
tab them all at once.
I could do this, but then I would
need this thing, which I'd probably
have to go look up how to remember it,
if you're doing it for the first time.
There's just no value in
this case to doing that.
But at least it can be there as needed.
So let me go ahead and undo that.
And we're back to a porting
of one to the other.
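For reference, the optional main wrapper mentioned above might look like the sketch below. The fixed values 1 and 2 stand in for user input and are an assumption made so the example runs on its own.

```python
def main():
    x = 1
    y = 2
    if x < y:
        print("x is less than y")


# Call main only when this file is run directly (for example,
# python conditions.py), not when it is imported by another file.
if __name__ == "__main__":
    main()
```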
All right.
So that might then be conditions.
And let's see if we can't--
noswitch there.
Let's take a look at this one.
Let me open up, rather than
comparing all of them side-by-side,
let me just open up this
one now called noswitch.py,
which is reminiscent of a program we
ran some time ago called noswitch.c.
And you can perhaps infer what
this does from the comments alone.
What does this program do in English?
Because logical operators is
not all that explicit at top.
What's that?
Yeah.
So if you've ever interacted with a
program that asked you for a prompt,
yes or no, here's some code with
which you might implement it.
And we could do this in C. We're
just comparing characters here.
But there's a few differences
if you kind of now
think back to how you
might implement this in C,
even if you don't recall
this specific program.
I'm importing my library right up here.
I'm then calling
get_char this time, which
is also in CS50's library for Python.
And then notice there's
just a couple of things
different down here syntactically.
Besides the colons and the indentation
and such, what else is noteworthy?
Yeah.
Yeah.
Thank god, you can just say
more of what you mean now.
If you want to do something or
something, you literally say or.
And if we were instead--
albeit nonsensically here-- trying to
do the conjunction of two things, this
and that, you could literally say and.
So instead of the two vertical
bars or the two ampersands,
here's another slight
difference in Python.
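A sketch of the yes/no check using the or keyword. The function name interpret is hypothetical, and the character is passed in as an argument here rather than read with CS50's get_char.

```python
def interpret(c):
    """Map a one-character answer to "yes", "no", or None."""
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    return None


print(interpret("y"))   # yes
print(interpret("N"))   # no

# The keyword and works the same way: both sides must be true.
print(1 < 2 and 2 < 3)  # True
```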
Let's now take a look at another
example reminiscent of ones past,
this one called return.py.
So here's an example
where it's actually more
compelling to have a main
function because now I'm
going to start organizing my code
into different functions still.
So up here, we are importing the
get_int function from CS50 library.
Here I have my main function
just saying x gets get_int.
And then print out the square of x.
So how do you go about defining
your own custom function
in Python that's not just main?
Well, here on line 11 is how I would
define a function called square--
that takes, apparently,
an argument called n,
though I could call this anything I
want-- colon, return, n, star star, 2.
So a few new features here.
But again, it's no big deal once you
just kind of look these features up
in a manual or in a class.
What is star star probably doing?
AUDIENCE: Square root.
DAVID MALAN: Not square root.
The power of, yeah.
So n star star 2 is just n
raised to the power of 2.
That was not a feature we had in
C. So now we get this in Python.
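The exponentiation operator can be tried in isolation. This sketch defines square as described and calls it with a fixed value instead of one obtained from get_int.

```python
def square(n):
    return n ** 2   # ** raises n to the power of 2


print(square(3))    # 9
print(2 ** 10)      # 1024
```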
And what's this line 12 in green
with the weird use of double quotes?
Yeah, it's a comment.
And it's a different type
of comment than we've
seen before because in my previous
example, I did have a few comments.
Recall that just a moment
ago, in conditions.py,
we had a whole bunch of comments.
Prompt the user for x.
Prompt the user for y.
Compare x and y.
So whereas in C we
were using slash slash,
Python, unfortunately, uses that
for floor division, so to speak.
So we instead just use the
hashtag or the pound sign
to specify a line that should
be thought of as a comment.
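A brief sketch of both points: # begins a comment, while // is an arithmetic operator. The numbers are arbitrary examples.

```python
# A line starting with # is a comment.
print(7 // 2)    # 3    (floor division discards the remainder)
print(7 / 2)     # 3.5  (ordinary division)
print(-7 // 2)   # -4   (the result is rounded down, not toward zero)
```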
But here is something
a little different.
And we won't dwell too
much on this for now.
But Python has different types of
comments, one of which is this.
This is technically called a
docstring or documentation string.
And what's nice about Python, as well
as languages like Java and others still,
is that you can put
comments in your code
that special programs can read, and
then generate documentation for you.
So if you ever took AP CS
and you ever saw a Javadoc,
this was a way of
commenting your methods
and your code in Java using
funky @ signs and other syntax
so that if you ran a special command, it
could generate a user's manual for all
of your functions and tell you or
colleagues or friends or teachers
exactly what all your functions
are, what their arguments are,
what their return values
are, and all of that.
Similarly in Python can you use
these funky quote quote quote
docstrings to document your function.
So whereas in C our style has been
to put comments above the functions,
in Python it's going to be to
put them as the first line inside
and indented within the function.
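A small sketch of a docstring in place. The wording of the docstring is an assumption, since the lecture's exact text is not shown here.

```python
def square(n):
    """Return the square of n."""
    return n ** 2


# The docstring is stored on the function itself, and it is what
# help(square) displays.
print(square.__doc__)   # Return the square of n.
```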
All right.
So now let's actually try to
port a program from code again,
thinking back on week one in C
when we had this program here.
So there's quite a bit going--
oops, spoiler.
Don't look at that.
Hopefully, that didn't sink in just yet.
So in week one, we had this
program in C, get_positive_int.
And its purpose in life
was to write a program that
gets a positive integer from
the user, in and of itself
not all that interesting.
But it was an opportunity
to introduce a few things.
One, we introduced this
line 6 several weeks ago,
which is known as a prototype.
And what was the purpose of having
that function prototype up there?
Yeah, you declare the function, but why?
Because it's already implemented
down here on line 15.
AUDIENCE: The way the program runs,
it needs to be in order or something
like that.
DAVID MALAN: Yeah.
Because of the way the
program's run, and frankly
because of how sort of naive or
dumb that clang is by design,
it does not know that a function
exists until it actually sees it.
So the problem is that if in C,
you have main, inside of which
is a call to function
like get_positive_int,
but it's not implemented
until a few lines later,
clang is going to be dumb and
just not know that it even exists.
And it's not going to compile your code.
So this prototype, as we called it,
is kind of a teaser, a hint, that
doesn't implement the whole function.
It just shows the compiler
its return type and its types
and order of parameters so
that that's enough information
to then just trust that if
I just blindly compile main,
eventually I'm going to see the
actual implementation of the function.
So I can compile its bits, as well.
So in here, in C, we
called get_positive_int,
and then we passed in a prompt.
We stored it in a variable called
i, and then printed it out.
And then to implement this, we
used kind of a familiar construct
that you've used in other programs.
Pretty much anytime you want
to prompt the user for input,
and you want to keep
pestering him or her
until they cooperate with
whatever your conditions are,
you would use the
so-called do-while loop.
And because the do-while loop, recall,
is distinct from the while loop how?
AUDIENCE: It runs at least once.
DAVID MALAN: It runs
at least once, which
just kind of makes
intuitive sense if you
want to prompt the user for something.
And then if he or she
doesn't cooperate, only then
do you want to prompt them again.
By contrast with a while loop, it's
going to happen again and again
and again no matter
what, from the get-go.
So let's see if we can't
now convert this or port
this, as people would say, to Python.
So here I'm going to go ahead and
save a new file called positive.py.
And I'm going to go ahead and do
everything here in main, as before.
So I'm going to go ahead and do, let's
say, from cs50 import get_int,
because I do need that.
And then I'm going to go ahead
and have my main method here.
And then inside of main, just
like on the left-hand side,
I'm going to do i gets
get_positive_int--
positive integer, please.
It's going to wrap a little bit now.
That's fine.
And then I'm going to go ahead
and print this, which, recall,
is just print an f string
where the placeholder is i,
although, frankly,
this is kind of stupid,
to just create a string that has nothing
other than the value we want to print.
Nicely enough in Python,
just print what you want.
And so that simplifies that argument.
So now it remains to
implement get_positive_int,
which is going to take some
kind of prompt as its input.
And notice I'm not specifying the
data type of prompt, which is string.
I'm not specifying the
return type of this function.
But both actually do
exist underneath the hood.
So in the past, to get a
variable, I would do something
like this, semicolon.
But I know I don't need the semicolon.
I know I don't need the data type.
And this just looks stupid to just
put a variable there to need it.
You don't need to do this in Python.
If you want to use a
variable, just start using it.
And unfortunately, whereas almost every
other feature we've seen in Python
thus far kind of maps directly
back to a feature in C,
Python does not have a do-while.
So it has the for-in, and it has while.
And maybe it has other things
we haven't told you about.
But it doesn't have do-while.
So knowing that, and knowing only
what we've presented thus far,
how do we still go about
getting an int from the user
and ensuring it's positive and
reprompting him or her if and only
if it's not?
Put another way, how would you do
this in C if we took away from you
the do-while construct?
Exclamation points?
OK.
So we could invert something,
maybe, using that logically.
AUDIENCE: You can just do a while loop.
DAVID MALAN: We could
just use a while loop.
How?
AUDIENCE: So while
prompt is less than 1.
DAVID MALAN: So while prompt is--
OK, so the prompt is the string
we're going to display to the user.
So it's not prompt, I think.
So maybe i or n, to be
consistent with the other side.
So you know what?
Why don't I-- what about this?
What if I just do-- you know what?
I know I need a loop.
This is by far the easiest
way to just get a loop, right?
It's infinite, which is not good.
But I can break out of loops, recall.
So what if I do something like this?
What if I do n gets get_int,
passing in the same prompt?
And then what do I want to do next?
I'm inside of an infinite loop.
So this is going to keep happening,
keep happening, keep happening until--
is positive?
So python's not quite
that user-friendly.
We can't just say that.
But we can say what?
AUDIENCE: Greater than 1.
DAVID MALAN: Greater than--
close.
AUDIENCE: Equal to.
DAVID MALAN: OK, that's fine.
Greater than or equal to one.
Then what do we want to do?
Break.
So it's not quite as cool
as, like, a do-while loop,
which kind of gives us all these
features, though frankly, this
was never that pretty, right?
Especially the fact that you had
to deal with the issue of scope
by putting the variable outside.
So in Python, the right way to do
this would be something like this.
Just induce an infinite loop, but make
sure you break out of it logically
when it's appropriate to do so.
And so now if I go ahead and add in
that last thing that I keep needing--
so if name equals main, and it's always
fine to copy-paste something like that,
call main.
Let me go ahead now and in my terminal
window run python of positive.py.
And let me go ahead
and give it negative 5.
How about negative 1?
How about 0?
Whoops.
How about that?
How about 0?
1?
Hmm.
I screwed up.
None is interesting.
It's kind of our new null, so to speak.
But whereas in C, null can,
potentially, if used in the wrong way,
crash your program, Python
might just print it, apparently.
Where did I screw up?
Yeah, so I didn't
return an actual value.
And whereas clang might have
noticed something like this, Python,
the interpreter's not going to be as
vigilant when it comes to figuring out
if your code is missing something.
Because after all, we never said
we were going to return anything.
And so we don't strictly need to.
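The behavior can be reproduced in a few lines. The function name below is hypothetical; the point is only that a function which ends without a return statement hands back None.

```python
def get_value_without_return(n):
    # Finishes without ever executing a return statement.
    if n >= 1:
        pass


result = get_value_without_return(1)
print(result)           # None
print(result is None)   # True
```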
So what could I instead
do here instead of break?
I could just return n here.
Or I could equivalently do this, and
then just make sure I return n here.
And another difference in Python,
too, is that the issue of scope
isn't quite as difficult as it was in C.
As soon as I've declared n to exist up
here, it now exists down below.
So even though it was declared
inside of this indentation,
it is not scoped to
that while loop alone.
So either way could we
actually make this work.
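A sketch of the corrected function, using the break-then-return variant. To keep it runnable without the CS50 library or a keyboard, the input function is passed in as a second parameter; that parameter and the fake_get_int stand-in are assumptions for illustration and are not how the lecture's positive.py is written.

```python
def get_positive_int(prompt, get_int):
    """Keep asking until get_int returns a value of at least 1."""
    while True:
        n = get_int(prompt)
        if n >= 1:
            break
    # n was first assigned inside the loop but is still visible here.
    return n


# Hypothetical stand-in for CS50's get_int: it replays canned answers
# instead of reading from the keyboard.
answers = iter([-1, 0, 1])


def fake_get_int(prompt):
    return next(answers)


print(get_positive_int("Positive integer, please: ", fake_get_int))  # 1
```

Returning n directly from inside the loop, in place of break, behaves the same way.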
OK, so now let's try to run this again.
Positive integer.
Negative 1.
0.
1.
And now we're actually
seeing the number 1.
All right.
Let me pause here for just a moment
and see if there's any questions.
No?
Yes.
AUDIENCE: Do you have to call things
from the CS50 library individually,
or can you just import
the entire library?
DAVID MALAN: Ah, good question.
Do you have to call things inside
of the CS50 library individually,
or can you import the whole thing?
You can technically import
the whole thing as follows.
If you want access to
everything in the CS50 library,
you can literally say star.
And a star in programming--
well, in many computer contexts,
star generally is a wildcard character.
And it means anything that
matches this string here.
This is generally considered
bad practice, though.
Because if CS50 staff happens to give
you functionality or variables that you
don't want, you have now just
imported into your namespace,
so to speak, all of those functions.
So for instance, if the CS50
library had public inside of it
a variable called x
and y and z in addition
to functions like get_string
and get_int and get_char,
your program is now seeing
variables x and y and z.
And if you have your own
variables called x and y and z,
you're going to shadow
those variables inside ours.
And it just gets messy quickly.
So generally, you want to
be a little more nitpicky
and just import what you want.
Or, another convention in Python
is to not specify it like this,
but instead to do import cs50.
This does not have the
same effect of importing
all of those keywords like
get_int and get_string
into your program's namespace,
like the list of symbols
you can actually type in.
But what you then have to do is this--
you have to now prefix any usages
of the functions in that library
with the now familiar or
more familiar dot operator.
So this is just a
stylistic decision now.
I have consciously
chosen the other approach
so that initially, you can just call
get_int, get_string, just like we
did in C. But technically and
probably more conventionally would
people do this to make super clear
this isn't my get_int method.
It's CS50's get_int function.
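The two import styles can be compared side by side. This sketch uses the standard library's math module as a stand-in, on the assumption that the CS50 library may not be installed; the same pattern applies to from cs50 import get_int versus import cs50.

```python
# Style 1: import one name into this file's namespace.
from math import sqrt
print(sqrt(16))        # 4.0

# Style 2: import the module and prefix each use with the module name.
import math
print(math.sqrt(16))   # 4.0
```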
OK.
Other questions?
Yeah.
AUDIENCE: Is it good coding practice
to do the if __name__ or just--
because you can run hello,
world without defining main.
Do you really need to do--
DAVID MALAN: Oh, it's a good question.
Short answer, no.
So I'm showing you
this way because you'll
see this in various examples
online and in programs
that you might look at
that are open source.
Strictly speaking,
this is not necessary.
If you end up making your own library,
this tends to be a useful feature.
But otherwise, I could equivalently do
this, which is perfectly fine as well.
I can still define get_positive_int.
I can get rid of main altogether.
And I can just now do this.
So this program is equivalent
and just as fine for now.
OK.
So with that said, let's do
a couple of more examples
here to kind of paint a
picture of some of the things
that are similar and different.
And let's go ahead and open up, for
instance, overflow.c from some weeks
ago, splitting our windows again.
And then on the right-hand side, let me
open up something called overflow.py,
which I put together in advance.
So here we have on the left an
example of integer overflow, whereby
if I start counting at 1, and
then don't even have a condition,
and I just keep multiplying
i by 2, by 2, by 2,
doubling it, doubling it,
doubling it, doubling it,
we know from C that bad things
happen if you just kind of keep
incrementing something
without any boundary in sight.
So this program is just going to
print out each of those values,
and it's going to sleep
one second in between.
Same program in Python
looks pretty similar.
But notice I'm initializing i to
1, doing the following forever--
printing out i, multiplying i by 2,
and then sleeping for one second.
But sleep is also not built into
Python in the way that print is.
Notice what I had to include up here.
And I wasn't sure what that was.
And so honestly, just a
few days ago, I googled,
like, "sleep one second Python," saw
that there's this time library, inside
of which is a sleep function.
And that's how I knew which
library to actually include.
And so just as there
are man pages for C,
there's a whole documentation
website for Python
that has all of this
information, as well.
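A bounded sketch of the doubling loop. The lecture's version runs forever and pauses a full second each time; the five iterations and the shortened delay here are assumptions made so the example finishes quickly.

```python
from time import sleep

i = 1
for _ in range(5):   # bounded here; the lecture's loop never ends
    print(i)
    i *= 2
    sleep(0.01)      # the lecture pauses for one second instead
```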
So let me go ahead and do this.
And let me actually try to
create two windows here.
What's the best way for me to do this?
Split one to two.
OK.
So let's do this, just so I
can run this in the same place.
So if I go into my source--
[POPPING NOISE]
Jeez.
My source 8 directory, and I go into
weeks and one, and I make overflow--
nope, sorry.
Week one.
OK.
So if I go into source one,
and I do make overflow,
which is kind of cute
semantically, I'm now
going to be able to run a
program called overflow.
Meanwhile, over here, let me go
ahead and split this window, too.
Dammit, not there.
Let's put this over here.
Oh, no!
OK.
One second.
Sorry.
Overflow.py.
OK.
So now we're-- oh, now
I lost the other window.
Oh, that's cool.
OK.
So let's do this.
OK.
Now I know how to use the IDE.
All right.
So on the left-hand side,
I'm about to run overflow.
And then lastly, without generating that
beep again, I'm going to go in here.
And I'm about to run
python of overflow.py.
All right.
And so the left will run the C version.
The right will run the Python version.
And we'll start to see--
no pun intended-- what
happens with these programs.
Oh, damn it.
I got to scroll.
OK, so I'll just keep scrolling for us.
This is fun.
OK.
OK.
Next time, Google how to sleep
for half a second instead.
OK.
So there we go.
Something bad has happened here.
And now C is just completely choking.
Things are in a funky state.
So what happened on the left,
before the answer scrolls away?
Integer overflow, right?
We had so many bits becoming
ones, that eventually, it
was mistaken for a negative
number temporarily.
And then the whole thing
just kind of got confused
and became permanently zeros.
Whereas on the right-hand
side, like, yeah, Python.
Look at you go.
Still counting higher
and higher and higher.
And even though we haven't talked
about the underlying representation
of these types in
Python, what can we infer
from the apparent better correctness
of the version on the right in Python?
It's not an eight-bit representation.
And even C, to be fair,
uses 32 bits for its ints.
And that's why we got as high as
2 billion or 4 billion in total.
But same idea.
How many bits must Python be using?
AUDIENCE: 64?
DAVID MALAN: Yeah, maybe 64.
I don't know exactly.
But I know it's not 32 because it
keeps counting up and up and up.
And so this is another
feature of Python.
Whereas int in C has typically
been for us 32 bits--
although that is technically
machine-specific--
Python integers are going to be
bigger (in Python 3 they grow as needed rather than stopping at 64 bits), which
just means we can do
much bigger math, which
is great for various data-science
applications and stats and whatnot,
where you actually might have
some large data sets to deal with.
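In current Python 3, integers are not fixed-width, so the doubling can continue well past 64 bits. A quick check, with 100 doublings chosen arbitrarily:

```python
i = 1
for _ in range(100):
    i *= 2

print(i == 2 ** 100)    # True
print(i.bit_length())   # 101, well past 64 bits
print(i > 2 ** 64)      # True
```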
Unfortunately, we still have
some issues of imprecision.
Let me go ahead and close a
whole bunch of these windows
and go ahead and open up, for
instance, just this one here.
OK.
No, I'm going to skip
this and do something
slightly more fun, which is this.
So in Python here, let's
do a quick warm-up.
This is going to print for me what?
AUDIENCE: Four question marks.
DAVID MALAN: 4 question marks, right?
And this is reminiscent-- this is like
a really cheap version of "Super Mario
Bros."
And if you think back to week
one, where we explored this,
there was a screenshot I had of
"Super Mario Bros," one of the worlds
where we just had four question marks
which Mario could hit his head against
to actually generate a coin.
So we stepped up from there
in C to do this instead.
And this is going to
give us another feature.
But let's see if we can't start to infer
from context what these programs do.
Here's another one, mario1.
What's this do?
It's using a loop, for sure.
And it's using how many
iterations, apparently?
Four.
So from 0 to 1 to 2 to 3, total.
Each time, it's going to print
out, apparently, a question mark.
But now, just infer from this--
I haven't answered
this question already--
what else is going on in line 4 and why?
AUDIENCE: It's not going to a new line.
DAVID MALAN: Not going
to a new line, right?
So there's always this trade-off in
programming and CS more generally.
Like, yay, we took away the backslash
n, which was annoying to type.
But now if it's always there,
how do you turn it off?
So this is one way to
do that, and it also
reveals another fundamental
feature of Python.
Notice that print apparently takes,
in this case, more than one argument.
The first is a string-- literally
quote, unquote, and a question mark.
The second is a little funkier.
It's like a word, end.
It's then an equal sign,
and then it's a quote mark.
So what is this here?
So it turns out Python supports
what are called named parameters.
So in C, any parameters
you pass to a function
are defined, ultimately,
by way of their order.
Because even if a function
takes arguments that have names,
like x and y or a and b or whatever,
when you call the function,
you do not mention those names.
You know they exist, and that's how you
think about them in the documentation
or in the original code.
But you don't name the arguments as
you pass them in and call a function.
You instead pass them in in the
appropriate order per the man page
or per the documentation.
In Python, you can actually
be a little more flexible.
If a function takes multiple
arguments, all of which have names,
you can actually mention
the names explicitly,
thereby freeing you from
the minor inconvenience
of having to remember and always get
right the actual order of arguments.
So in this case, print apparently
takes at least two arguments
in this case, one of
which is called end.
And if you want to use that one, which
is clearly optional because I haven't
used it yet, you can literally
mention it by name, set an equal sign,
and then specify the value
that you want to pass in.
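A sketch of named parameters, first with print's end and then with a user-defined function. The power function is a hypothetical example, not something from the lecture.

```python
# end replaces the newline that print normally appends.
for i in range(4):
    print("?", end="")
print()   # finish the line


# Named arguments also work with functions you define yourself.
def power(base, exponent):
    return base ** exponent


print(power(2, 3))                 # 8, passed by position
print(power(exponent=3, base=2))   # 8, passed by name in any order
```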
So if I actually now go into this
and go into weeks and 1 and do python
of mario1.py, I'll still get--
in week two.
If I get mario1.py, I still
get four question marks.
But that's the result of printing this
with a line ending of quote, unquote.
If I do this, meanwhile,
it's a little stupid
because I'm going to get that for
free if I just omit it altogether.
But now I get four question marks here.
And if you really want to be
funky, you can do something
like this, which is just
going to be taken literally
to give you that instead.
Unclear utility of taking this approach.
But that's all--
[POPPING NOISE]
Sorry-- that's going on.
Let's take a look at mario2.
This one works a little
differently, as well.
And how would you describe the feature
offered by this version of mario?
Prints any number of
question marks perfectly.
So it's parameterized by first
getting an int from the user using
CS50's get_int function.
And now I'm iterating from i to
the range of n, whatever that is,
and then actually printing
out the question marks.
Meanwhile, in mario3.py,
a little fancier still.
But what am I doing a little better now?
AUDIENCE: You're making
sure that the n is positive.
DAVID MALAN: Yeah, I'm just making
sure that the n is positive.
So I didn't bother implementing
a whole function called, like,
get_positive_int.
I don't need that.
This is a super-short program.
I'm just using the same logic up here--
inducing, deliberately,
an infinite loop,
breaking out of it only when I've
gotten back a positive integer,
and then printing out that many
hashtags, reminiscent of the bricks
in Mario.
And then lastly, we have this
slightly more sophisticated version
that actually prints out a
different shape altogether.
You can infer from the
comments, but focus more on why.
So this first line 12
iterates i from 0 up to n,
whatever n is that the user typed in.
Meanwhile, line 15, indented, iterates
j from 0 up to n, as well.
So this is kind of like
our canonical for int i
gets 0, dot, dot, dot, for int
j gets 0, dot, dot, dot, where
we've had nested loops in the past.
So notice now that we have this building
block, which is a line of code or kind
of conceptually just a Scratch piece.
We can embed one inside of the other.
Here I can print out
a hashtag, making sure
not to put a new line after
every single hashtag I print out,
only printing out a new line on line
17, on each iteration of the outer loop.
And now notice whereas in C we
would have done this historically--
and that's fine-- in
Python, we don't need the f.
And we also don't need the backslash n.
So ergo, you can simply do print,
and you'll get, if nothing else,
a backslash n automatically.
So that now when I run
this version of Mario,
we now get something more interesting.
And I'll increase the
size of my terminal window
for this so that I can enter a
positive number like this and print 10.
And now we've got a whole block.
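A sketch of the nested loops that produce the block. The function name print_block and its optional file parameter are assumptions added for illustration; the lecture's version reads n from the user and prints straight to the terminal.

```python
def print_block(n, file=None):
    """Print an n-by-n block of hashes; file=None means the terminal."""
    for i in range(n):
        for j in range(n):
            print("#", end="", file=file)   # stay on the same line
        print(file=file)                    # end the row


print_block(3)
```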
So that was a lot.
Let's go ahead and take
our five-minute break here.
And when we come back, we'll look at
some more sophisticated examples still.
All right.
So let's begin to start to transition
to actually solving problems with Python
after introducing just a
couple of additional features
that aren't so much syntactic but
actual features of the language.
So here on the left was an old program
we wrote in week three called argv0.c.
And its purpose in life
was simply to allow
you to run a command-line
argument for the very first time.
And that was a nice tool
to have in our toolkit.
So how might we go ahead and map this?
Well, we actually need
to know how Python works
a little bit differently as follows.
If I go ahead and open
a new file called--
let's call it argv0.py.
I'm going to go ahead and translate
this, just as we did earlier.
So I'm going to go ahead and
want to use the following.
So if argc-- so there is no argc.
And actually, so def main--
there was also no argc or argv.
And it's not actually correct to do
this and this, as you might assume.
It turns out that the feature
of command-line arguments
is provided by a Python
package, so to speak,
or a library, much like the
CS50 library is a package
that you can import in Python speak.
So to do this, I actually need to
do this-- import sys, which gives me
access to a whole bunch
of system-related stuff,
like what the user has
typed at the command prompt.
And if I want to check if the
number of words that the human typed
at the prompt is two,
I actually am going
to do this-- if the length
of sys.argv equals 2,
then I'm going to go ahead and
print out quote, unquote, "hello,"
and then a placeholder here.
I know for placeholders, I need to
turn this into a formatted string, so
an f string there.
And now inside of the curly braces,
it turns out I can do sys.argv[1].
So it's a little different from before.
But notice I'm borrowing almost
all the same ideas as earlier,
including how we're
printing out strings.
And even though this is
a little more verbose,
what is between these two curly braces?
Well it's the result of looking
in the system package, which
has a variable called
argv, for argument vector,
just like in C. It is itself
an array, AKA a list in Python.
And here we have the result of
indexing into element one of that list.
And the way that I have
access to this is because I've
imported that whole package.
So if on the right-hand side here
I go ahead, after saving that file,
and I do python of
argv0.py, I see nothing.
But if I actually say, like, my
name here, I see, "hello, David."
So a very similar program, but
implemented a little differently.
And you'll notice, too,
that the length of an array,
henceforth known as a list, is
not something that you yourself
have to remember or keep around.
You can just ask a list how
long it is by calling the len--
or L-E-N for length--
function, passing it in as an argument.
So that's one of the takeaways there.
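A sketch of the same check wrapped in a hypothetical greeting function, so that it can be tried with a hand-written list as well as with the real sys.argv.

```python
import sys


def greeting(argv):
    """Return a greeting if exactly one argument follows the program name."""
    if len(argv) == 2:
        return f"hello, {argv[1]}"
    return None


# With the real command line:
message = greeting(sys.argv)
if message is not None:
    print(message)

# With a hand-written list standing in for the command line:
print(greeting(["argv0.py", "David"]))   # hello, David
```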
And if we actually want to do
something a little more clever,
like print out all of the strings in
argv, well, back in the day in C,
you might recall this example--
argv1.c, wherein I had this for loop.
And I iterated from zero on up
to argc, the argument count,
printing out each of the
arguments in that vector.
Python actually makes even
something like this even simpler.
Let me go ahead and
create a new file here.
And I'll call this, say, argv1.py.
And it turns out in Python, I
can similarly just import sys,
and then do, honestly, for
s in sys.argv, print s.
Done.
So again, kind of just
says what it means.
So I've imported the system library.
sys.argv I know to be a list,
apparently, of command-line arguments.
For something in something is a
new syntax we have for for loops.
So for some variable s inside of
this list, go ahead and print it.
And so it's a much cleaner,
much more succinct way
of honestly getting rid of
all of the complexity of this
by just saying instead what we mean.
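The loop as described, followed by the same loop over a hand-written list. The list contents are made-up stand-ins for command-line arguments.

```python
import sys

for s in sys.argv:   # sys.argv[0] is the program's own name
    print(s)

# The same syntax works on any list.
for s in ["argv1.py", "foo", "bar"]:
    print(s)
```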
Meanwhile, if I wanted to
print out every character,
I can take this one step further.
So back in the day in C
if I wanted to print out
every command line argument and every
character therein I could do this.
I just need a couple of nested
loops, wherein via the outer loop,
I iterate over all of
the arguments passed in.
And on the inner loop, I iterate
over the current string length
of whatever argument I'm printing.
And this had the effect of printing
out all of the command-line arguments'
letters one at a time.
I can do this in Python,
honestly, so much easier.
So let me go over here.
Let me create a new
file called argv2.py.
Let me import sys, as I did.
So import sys.
And then for s in sys.argv,
for c in s, print c.
Done.
So what is this doing?
Gone is all of that overhead of for
int i and for int j and so forth.
For s in sys.argv iterates over
all of the elements of that list,
one string at a time.
For c in s is a little different
because s is technically
a string or a str object, as
we're going to start calling it.
But at the end of the day, a string
is just a sequence of characters.
And it turns out Python supports, out of
the box, the ability to use a for loop
even to iterate over all of
the characters in a string.
And so c-- I just mean char.
So for c in s, that gives
me each of the characters.
So now at the end here, if I go
ahead and run python of argv2.py
with nothing, I get
just the program's name
because that's, of course, the very
first thing in argv, as in C. And if I
write, say, a word like "Maria"
here, I get argv2.py, Maria,
all in one long column because of the
additional prints that are happening
and the implicit new lines.
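A sketch of what argv2.py plausibly contains, reconstructed from the spoken description rather than copied from the lecture's file:

```python
import sys

for s in sys.argv:      # outer loop: one argument (a str) at a time
    for c in s:         # inner loop: one character of that argument at a time
        print(c)        # print adds a newline after each character
```

Because print ends each call with a newline, every character lands on its own line, which is the single long column described above.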
So any questions before we
proceed on this use of a package
called sys using these
functions therein?
All right.
So let me skip ahead, then, to
something slightly familiar, too.
Let me go ahead-- and you might
recall initials.c from some time ago,
wherein we accepted as input
to get_string a user's name,
and then we printed out their initials.
So let's go ahead and do that.
So from CS50, let me go
ahead and import get_string.
Then let me go ahead and
say, get me a string.
And I want the user to be prompted
for their name, as we might do here.
Then let me go ahead and say,
all right, their initials--
I don't know what they are yet.
So let me just initialize
an empty string.
But then do this--
for c in s, which is for each
character in the person's name, if--
and I don't know how to say this yet.
If c is an uppercase letter, then
go ahead and append c to initials.
And then down here, print initials.
So I've left a couple of blanks.
That's just pseudocode for the moment.
But this line 5, just to be
clear, is doing what for me?
What is being iterated over?
The string.
So for each character in the string, for
c in s, I'm going to ask two questions.
So in C, we did this in a
couple of different ways.
We can actually do it
with inequality checks
and actually considering what
the underlying ASCII values are.
The ctype library had that isupper
function and islower that we use.
Well, it turns out because
c is itself not a char,
there is no such thing,
technically, as a char in Python.
You have only strings of length 1.
And this is why single quotes no
longer have any special meaning.
It turns out c is technically
just a one-character string.
Strings are what we've
started calling objects,
which is a fancier name for struct.
So inside of an object like
a string is functionality.
And we saw one piece of functionality
earlier, which was what?
Not length, though that is another one.
It was format.
We saw it briefly.
But when I did the
string.format, I proposed
that there's actually
built-in functionality
to a string called format.
Well, you know what?
It turns out there is a method
or function inside of the string
class also called isupper.
And I can ask the very string
I'm looking at that question
by saying if c.isupper is true, then
go ahead and append c to initials.
So in C, if initials were
technically a string,
how could you go about appending
another character to a string in C?
AUDIENCE: c.append?
DAVID MALAN: Not in C. In
C. So in C, the language.
OK, so what's a string in C?
A string in C is a
sequence of characters,
the last one of which is backslash 0.
All right.
So it's an array of characters,
last of which is backslash 0.
So if I, for instance, typed
in my first name, "David,"
and now I want to append "Malan" to
the end of it, how do I do that in C?
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Exactly.
It's like an utter pain in the neck.
You have to create a new array that's
bigger, that can fit both words,
copy the "David" into the new array,
then copy the last name in, then put
the null terminator at
the new array, then free,
probably, the original memory.
I mean, it's a ridiculous
number of hoops to jump through.
And you've done this on occasion,
especially for things like, perhaps,
problem set five.
But my god, we're kind of past that.
Just go ahead and append to the
array the character you care about.
So in this case, not
an array, but a list.
Sorry, not an array but a string
object that's initially blank.
It turns out that Python
supports this syntax--
plus equals typically means arithmetic
and add one number to another.
But it also means append.
So you can simply append to
initials by doing plus equals c, one
additional character.
So even though the string starts
like this, and this big in memory,
it's then going to
grow for one character,
grow, grow, grow, grow, until it
has all of the user's initials.
And as for where that memory
is coming from, who cares?
This is the point that we're now past.
You leave it to the language.
You leave it to the computer to
start to manage those details.
And yes, if it needs
to call malloc, fine.
Do it.
Don't bother me with that detail.
We can now start thinking
and writing code sort
of conceptually at this level,
instead of at this level.
So again, we're sort of abstracting
away what a string even is
and leaving it to the language itself.
So if I now go ahead and run
python of initials.py and type
in, for instance, "Maria Zlatkova"
here, with a capital M and a capital Z,
I then see her initials because I've
plucked out the capital letters.
And if we do something else, like "David
J. Malan," even with a period in there,
it infers from the capitalization
what my initials should actually be.
So again, a much tighter
way of doing things.
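The logic of initials.py can be sketched as follows. To keep the example runnable without typing anything in, the names are passed to a helper function, get_initials, which is a hypothetical name introduced here; the lecture's version prompts with get_string from the CS50 library instead.

```python
def get_initials(name):
    """Return the uppercase letters of name, in order."""
    initials = ""
    for c in name:            # each c is a one-character str
        if c.isupper():       # str method: True for an uppercase letter
            initials += c     # += on a str yields a new, longer string
    return initials


print(get_initials("Maria Zlatkova"))   # MZ
print(get_initials("David J. Malan"))   # DJM
```

The period and the spaces are skipped simply because isupper returns False for them.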
Let me go ahead and now
open up another example
we didn't see a few
weeks ago, though it was
included in some of our distribution
code, if you wanted to look.
Some weeks ago, we had this program
among the distribution code,
where I declared an array
of strings called book.
And I proposed that there were
these several names in the phone
book, so to speak-- all of
the past instructors of CS50
sorted alphabetically.
And then down below in this C program,
I used that global variable called book
to implement, it seems, linear search.
And to implement linear search
in C, I'm going to need,
of course, a loop to iterate
over all of the strings.
This line 26 does exactly that.
I then in C, recall,
had to use strcmp
because remember we
tripped over this issue
early on where you can't
just compare two strings in C
because you'd be
comparing, accidentally,
their addresses, their
pointers, not the actual value.
So we used strcmp.
And I could pass in the name that
I'm looking for in the i'th book one
at a time, checking for equals zero.
And then I can call
Mike or David or whoever
I'm trying to call, or just
quit if the user isn't found.
So what did this program actually do?
If I go into this example,
which, again, was from week 3,
and I do make linear--
nope, not that.
Wrong directory again.
If I go into source3 and
make linear, this program
is supposed to behave as follows.
So if I go ahead and run ./linear,
look for our old friend Smith,
it found Smith.
If I go ahead and search for,
say, Jones, who did not previously
teach CS50, it says "quitting."
All right.
So meanwhile, in Python, bless its
heart, we can get rid of all of that.
And in our source8 directory
here and our subdirectory 3,
let me go ahead and open this instead.
In Python, I can declare an
array, otherwise known as a list,
almost in the same way.
But what's different,
just to be super clear?
AUDIENCE: Brackets?
DAVID MALAN: So the brackets
are now square brackets
instead of curly braces.
And frankly, unless you statically
initialized an array in C--
like hardcoded the values
for your array in C--
you might not have even known
you could use curly braces.
So that's not a huge deal here.
But in Python, square brackets here
and here represent a list of elements,
literally.
And what else is different?
Didn't declare the size of the array.
And I technically don't
have to do that in C,
either, if you're hardcoding
all of the values all at once.
But there is something
missing on line 7.
AUDIENCE: Type.
DAVID MALAN: Sorry?
AUDIENCE: The type?
DAVID MALAN: The type.
I didn't specify string.
But otherwise, this is pretty
similar to what we've done in C.
But what's beautiful here--
and let me go ahead and
hide that for just a second.
Let me go ahead and prompt
the user for his or her name.
So let's ask for the name here.
And then if I want to search the
book, which is just a list of names,
how do I implement linear search?
Well I could just do if name
in book, print "Calling name."
And let's make this an f-string.
And then down here, that's it.
So that's how you implement
linear search in Python.
You don't need a loop.
You can just ask the question yourself.
So if book is a list, and name is
the string that you're looking for,
just ask the language to
figure this out for you.
If name in book is the syntax you can
use to ask literally that question.
And then Python will use, probably,
linear search over that list
because it doesn't necessarily
know it's sorted, even
though it happens to be alphabetically.
But it will find it for
you, thereby saving us
a lot of the complexity and time of
having had to implement that ourselves.
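Here is a small sketch of that membership test. The contents of book are placeholders chosen for illustration, not the lecture's actual list, and the helper name lookup and the "Quitting" branch are assumptions that mirror the behavior of the C program shown earlier.

```python
# Placeholder phone book; the real list in the lecture is longer.
book = ["Malan", "Smith"]


def lookup(name):
    """Report whether name is in the book, using Python's in operator."""
    if name in book:                  # Python scans the list for us
        return f"Calling {name}"      # f-string plugs in the value of name
    return "Quitting"


print(lookup("Smith"))
print(lookup("Jones"))
```

For a list, the in operator checks elements one after another, so the cost is still linear in the length of the list; only the loop has disappeared from our own code.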
Meanwhile, if I want
to compare two strings,
let me propose this-- let me write
a quick program here, compare1.py.
And let me go ahead and from CS50
import get_string, as before.
And now let me go ahead and get
one string that I'll call s.
And let me get another
string that I shall call t,
just as we did a few weeks ago.
And now in C, this was buggy, right?
If s equals equals t, I print same, else I print different.
So in C, just to be super clear, why
was this incorrect, this general idea
of using equals equals?
Yeah, they're comparing
addresses, right?
This was like the day
before we peeled back
the layer of what a string actually is.
And it turns out that s and t in C
were char stars or addresses, which
means certainly if you
get two different strings,
even if you've typed
the same characters,
you're going to be comparing
two different addresses.
They're not going to be the same.
Now, you can perhaps infer
from the theme of today--
what is Python going to do if
asked if s and t are equal?
It's gonna answer that question
as you would expect as the human.
Equals equals now, in Python,
is going to compare s and t,
look at their actual values
because they are strings,
and return same if you
literally typed the same words.
So in here, if I go in here and I do
python of compare1.py, and I type in,
for instance, "Maria," and then I type
in "Maria," they're indeed the same.
If I type in "Maria" and,
say, "Stelios," they're
different because it's actually
now comparing the strings,
as we would have hoped some time ago.
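The comparison in compare1.py can be sketched like this. The function name compare is hypothetical, and fixed strings stand in for what the user would type at the get_string prompts.

```python
def compare(s, t):
    """Compare two strings by value, the way == works on str in Python."""
    if s == t:
        return "same"
    else:
        return "different"


print(compare("Maria", "Maria"))     # same
print(compare("Maria", "Stelios"))   # different
```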
So let's take a look
at another that kind of
led to some interesting quandaries.
You might recall in week four,
we had this example in C--
noswap, so named because
this just did not work.
It was logically seemingly correct.
But swap did not actually swap x
and y, but it did swap a and b.
Why?
AUDIENCE: The memory locations?
DAVID MALAN: The memory
locations were different, right?
So x and y, recall, are variables in C
that exist in a certain slice of memory
that we called a frame on the
stack, main's frame on the stack.
Meanwhile, a and b are from a
slightly different location in memory.
We sort of kept drawing
it slightly above,
like a tray at the dining
hall on the so-called stack.
a and b had the same values as x and y--
1 and 2-- but their own copies of them.
So even though we logically,
as with Kate, I think,
with the Gatorade,
swapped the two values,
we ultimately swapped
the wrong two values
without actually permanently
mutating the original x and y.
So unfortunately-- well, fortunately
and unfortunately in Python,
there is no such thing as a pointer.
So those are now gone.
So we no longer have the
expressiveness with which
to solve this problem that way.
But let me propose that we do it
in another, oh-so-clever way.
Here let me go ahead and
declare x is 1, y is 2.
Let me go ahead and print out as much.
So with a format string, I'm going
to go ahead and say x is x, y is y,
plugging in their respective values.
I'm going to do that twice.
But in between, I'm going
to try to perform this swap.
And if your mind's
ready to be blown, you
can do that in Python, do
the old switcheroo in Python.
And this actually does swap the
two values as you would expect.
Now this is not a very common case.
And to be fair, this is an
incredibly contrived example
because if you needed
them swapped, well,
maybe you should have just
done this in the first place.
But it does speak to one
of the features of Python,
where you can actually
do something like that.
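The swap itself is not spelled out in the transcript; the standard Python idiom for it, which is presumably what appeared on screen, is simultaneous assignment:

```python
x = 1
y = 2

print(f"x is {x}, y is {y}")   # x is 1, y is 2

# The right-hand side is evaluated first, then both names are rebound,
# so no temporary variable is needed.
x, y = y, x

print(f"x is {x}, y is {y}")   # x is 2, y is 1
```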
Let me introduce now one additional
feature that we only recently
acquired in C. And that's
the notion of a struct.
And let me go ahead and do
this in code from scratch.
So let me go ahead and save this
file proactively as struct0.py,
reminiscent of one of
our older programs.
And let me go ahead and do this.
From CS50 import get_string.
And then let me give
myself an empty list.
So that would be a conventional
way of giving yourself
an empty list in Python.
And much like in C, you
can declare an empty array.
But in C, you have to
know the size of it
or, if not, you have to use a pointer.
And then you have to malloc.
No.
All of that is gone.
Now in Python, you want a list?
Just say you need a list.
And it will grow and shrink as you need.
Now I'm going to go ahead and
just three times, arbitrarily,
for i in the range of 3,
let me go ahead and ask
the user for a name using get_string.
And I'll ask him or her for their name.
Dorm will use get_string, as well.
Dorm here.
And then I want to append
to the array this student.
So I could do something like
this-- students.append name.
And it turns out-- and
we've not said this yet.
But there is inside of the
list data type a method--
that is function-- built into it
called append that literally does that.
So if you've got an
otherwise empty list,
and you call that
list's name dot append,
you'll add something
to the end of the list.
And if there's not enough
memory for it, no big deal.
Python will find you the memory,
allocate it, move everything in it,
and you move on your way without
having to worry about that.
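A short sketch of a list growing through append. The three names are hardcoded stand-ins for what get_string would return inside the loop.

```python
students = []   # an empty list; no size is declared up front

# Hardcoded names stand in for three calls to get_string.
for name in ["Maria", "David", "Rob"]:
    students.append(name)   # adds name to the end of the list

print(students)        # ['Maria', 'David', 'Rob']
print(len(students))   # 3
```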
But I don't want to store just the name.
I want to store the name and the dorm.
So I could do this.
I could do-- well, maybe
this isn't really students.
Maybe this is now, like, dorms.
And then here I could
do dorms.append dorm.
But why is this devolving
now into bad design
if my goal was to associate a
student with his or her dorm,
and then keep those values together?
Why is this not the best approach in
Python or, back in the day, even in C,
to have two separate arrays?
AUDIENCE: Like struct?
DAVID MALAN: What's that?
AUDIENCE: Struct?
DAVID MALAN: Well, it's twice as
many things to maintain, for sure.
And what else?
AUDIENCE: You can't
map them to each other.
DAVID MALAN: You can't
map one to the other.
It's just-- it's very arbitrary.
It's sort of this social
contract that I will just
assume that student 0 lives in dorm 0.
And student 1 lives in dorm 1.
And that's fine.
And that's true.
But one of the features of
programming and computer science
is this idea of encapsulation,
like, associate related memory
with each other.
And so what did we do in C instead?
We did not have two arrays.
AUDIENCE: We had a struct.
DAVID MALAN: Yeah, we had a struct.
And so Python doesn't
have structs per se.
It instead has what are called classes.
And it has a few other things
like tuples and namedtuples,
but more on those some other time.
So it turns out I could actually
implement my own notion of a student.
And I could import it like this.
The convention in Python is
if you create your own struct,
henceforth called a class,
you capitalize the name of it
by convention.
So a little different
from C conventions.
So what is a student going to look like?
This is perhaps the most complex
syntax that we'll have today,
but it just has a few lines.
If you want to implement the notion
of a student, how might you do this?
Well, in Python, you
literally say class Student,
where class is similar in
spirit to-- just to be clear--
struct or typedef struct.
But in Python, we're just saying class.
And then this is the funky part.
You can declare a function that
by convention must be called init
for initialize that takes as its
first argument a parameter called
self, and then any number of
other arguments like this.
And then, for reasons that will
hopefully be clear momentarily,
I can write some code
inside of this method.
So long story short, what am I doing?
I have declared a new type of
data structure called Student.
And implicitly inside
of this data structure,
there are two things inside of itself--
something called name and
something called dorm.
And this is how you would
in a C struct typically do
things with the data types and
semicolons inside of the curly braces.
Meanwhile, there's this method here.
And it's a method insofar
as it is inside of a class.
Otherwise it's a function,
just by a different name.
This method init takes whatever
self is-- more on that another time.
But it then takes zero or more custom
arguments that you can provide.
And I called it name and dorm.
So it turns out this
special method init is
a function that's going to be called
automatically for you any time you
create a student object.
So what does that actually mean?
That means in your code, what
you can actually do is this.
I can create a student in memory by
saying s gets capital Student, passing
in name and dorm.
And we don't have this feature in C.
On the right-hand side,
what I've highlighted
is the name of the class
and its two arguments--
name and dorm, which are just
what the user has typed in.
What this class does for me
now is it allocates memory
underneath the hood for a Student.
It's got to be big enough for their
name and big enough for their dorm.
So it's, like, yay big
in memory, so to speak.
It then puts in the name and the
dorm strings into that object,
and then returns the whole object.
So you can kind of think of this as
a much fancier version of malloc.
So this is allocating
all the memory you need.
But it's also installing inside of
that memory the name and the dorm.
And it's bundling it up inside of not
just an arbitrary chunk of memory,
but something you can
call a Student object.
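A minimal sketch of such a class follows. Two details are worth noting when the spoken description is written down: the method's name carries two underscores on each side, __init__, and self is a naming convention for the first parameter rather than a reserved word. The sample values are illustrative.

```python
class Student:
    def __init__(self, name, dorm):
        # Called automatically whenever Student(...) is evaluated.
        # self refers to the object being created.
        self.name = name
        self.dorm = dorm


s = Student("Maria", "Cabot")   # allocates the object and runs __init__
print(s.name)   # Maria
print(s.dorm)   # Cabot
```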
And all that means that
now for our students,
we can just go ahead and append
that student to the list.
So now if later I want to iterate
over for student in students,
I can go ahead and print out,
for instance, that student.name
lives in student.dorm, close quote.
And if now over here--
whoops, close that.
Now over here, if I go ahead
and run python on struct0.py--
oh, no!
Oh, thank you.
That goes there.
So now-- dammit.
Missing curly-- oh, thank you.
OK.
So now if I want to go ahead and
type "Maria" and "Cabot" and "David"
and "Mather" and "Rob" and, say,
"Kirkland," now we get all three
of those names.
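Putting the pieces together, struct0.py might look roughly like the sketch below. It is a reconstruction under assumptions: the class is defined inline rather than imported, and the three name and dorm pairs from the demo are hardcoded where the lecture's version calls get_string.

```python
class Student:
    def __init__(self, name, dorm):
        self.name = name
        self.dorm = dorm


students = []

# Stand-ins for the values typed at the two prompts on each iteration.
answers = [("Maria", "Cabot"), ("David", "Mather"), ("Rob", "Kirkland")]

for name, dorm in answers:
    s = Student(name, dorm)
    students.append(s)      # one object keeps name and dorm together

for student in students:
    print(f"{student.name} lives in {student.dorm}")
```

Each list element now carries both values, so there is no second list whose indices must be kept in step with the first.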
And there's other
ways, too, if we wanted
to actually store this thing on disk.
But I'll defer that
to an example online.
Let's look at one final
example that will hopefully
either make you regret
the past several weeks
or embrace the next several instead.
So you'll recall that--
though the former, I suppose,
could be true even without my help.
So if I go into now today's distribution
code, you will see this program.
And we won't walk
through all of its lines.
But this is a program written
in Python called speller.
And what I did was literally sit down
with speller.c from problem set 5.
And I just converted it from
left to right, from C to Python,
implementing it in Python in as
close to an identical way as I could,
just using features of Python.
So just skimming this, you'll see that
apparently my implementation of speller
in Python has a class called Dictionary
which is very similar in spirit
to dictionary.h in C. Notice that
I still have a constant here.
Or it's not technically a constant,
but a variable called length equals 45.
I hardcoded in dictionary/large,
as speller.c did, too.
I'm using command-line
arguments, as we saw earlier,
but this time in Python instead of C.
Notice you can do
funky things like this,
which is reminiscent of our swap
trick just a little bit ago.
If you want to declare multiple
variables all on the same line
and initialize them, you can just
enumerate them all with commas.
Then on the other side
of the equal sign,
enumerate with commas the values that
you want to assign to those variables.
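A one-line illustration of that syntax. The variable names are assumptions chosen to resemble counters a spell checker would keep; the transcript does not read out the actual line.

```python
# Three names on the left, three values on the right, matched by position.
index, misspellings, words = 0, 0, 0

print(index, misspellings, words)   # 0 0 0
```

The number of names must match the number of values, otherwise Python raises a ValueError.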
And then down here, if
I keep scrolling, you'll
see code that we won't get into the
weeds of, but some familiar phrases.
So this is the program that
actually runs a student's dictionary
on some input, and then prints out
per all of this stuff at the bottom
all of the familiar phrases that you
might recall from problem set five.
So this took a lot of work, most likely,
to implement in C. And understandably,
you might have used a
linked list initially,
or ultimately you might have
used a hash table or a trie
or struggled with something
in between those two.
And that is a function
of C. C is difficult.
C is challenging because you
have to do everything yourself.
An upside, though, of it is that you end
up getting a lot of great performance,
theoretically.
Once you have implemented the code,
you're kind of as close to the hardware
as possible.
And so your code runs pretty
darn well and is dependent
only then on your algorithms,
not on your choice of language.
So here let me go ahead and implement
a file called dictionary.py.
And let me propose that the words--
the equivalent, sorry, of
dictionary.h would be this file here.
And it's going to have
a function called check,
which takes in an argument called word.
It's going to have a
function called load, which
takes in an argument called dictionary.
It's going to have a method
called size, which takes
in no arguments other than itself.
And then it's going to have
a method called unload,
which also takes no
arguments other than itself.
So if we were instead to have
assigned problem set five in Python,
we essentially would have given
you a file called dictionary.py
with these placeholders for you
because recall in pset five,
those were all to-dos.
Strictly speaking, there
would be one other here.
We would probably have a def init
because every class in Python,
we'll see, will typically
have this init method,
where we just are able to do something
to initialize the data structure.
So let me go ahead and do this.
We don't know that much Python yet.
And we're taking for granted
that speller, in fact, works.
But let me go ahead and load
some words in a dictionary.
So here is my method called load.
Dictionary is going to be the
name of the dictionary to load.
So you guys implemented this yourself
by loading those files from disk.
In Python, I'm going
to do this as follows.
Give me a file and open it in read mode.
Iterate over each line in the file.
Then go ahead and add
to my set called words
the result of that line by stripping
off the end of it backslash n.
Then go ahead and close the
file, and then return true
because I'm done implementing load.
So that is the load method in Python.
Happy, yes.
OK.
So check.
Check was a struggle, too, right?
Because once you had your hash
table, or once you had your trie, now
you had to actually navigate
that structure in memory,
maybe recursively, maybe
iteratively, following lots
of pointers and the like,
following a linked list.
How about I just do--
let's just say if word lowercase
in self.words, return true.
Else return false.
Done.
So that one's done.
Size-- we actually can kind
of infer how to do this.
Return the length of the words.
That's done.
Unload-- don't have to worry about
memory in Python, so that's done.
And there you have a problem set five.
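Assembled from the description above, dictionary.py might look like the following sketch. It is a reconstruction, not the distributed file: the use of a with block to close the file and the True returned by unload are assumptions, the latter mirroring the C version's return value.

```python
class Dictionary:
    def __init__(self):
        # A set gives fast membership tests and ignores duplicates.
        self.words = set()

    def check(self, word):
        """Return True if word, lowercased, is in the dictionary."""
        return word.lower() in self.words

    def load(self, dictionary):
        """Read one word per line from the file named dictionary."""
        with open(dictionary, "r") as file:   # closed automatically
            for line in file:
                self.words.add(line.rstrip("\n"))   # drop trailing newline
        return True

    def size(self):
        """Return the number of words loaded."""
        return len(self.words)

    def unload(self):
        """Nothing to free by hand; Python reclaims the memory."""
        return True
```

The set does the work that the hash table or trie did in C, which is why check collapses to a single membership test.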
[APPLAUSE]
Thank you.
So what then are the takeaways?
Either great elation that you now
have this power or great sadness
that you had to implement this
first in C. But this was really
ultimately meant to be thematic.
Hopefully moving forward, even if you
struggled with any number of these
topics-- linked lists and hash
tables and pointers and the like--
hopefully you have a
general understanding
of some of these fundamentals
and what computers
are doing underneath the hood.
And now with languages like Python
and soon with JavaScript and SQL,
with a little bit of HTML and CSS
mixed in for our user interfaces,
you have the ability
to now solve problems,
taking for granted both your
understanding of those topics
and the reality that someone else has
now implemented those concepts for you
so that when it comes to solving
problem sets six and seven and eight,
and then leaving CS50 and solving
problems in your own domain,
you have so many more
tools in your toolkit.
And the goal really for
you is going to be to pick
whichever one is most appropriate.
So let's adjourn here.
I'll stick around for questions.
And we'll see you next time.
Best of luck on the test.
