COLTON OGDEN: Hello, world.
This is CS50 on Twitch.
My name is Colton Ogden.
And I'm joined today by--
DAVID MALAN: I'm David Malan.
Nice to see everyone
again, for real this time.
It's not just a pop-in.
I'm here the whole time.
COLTON OGDEN: The whole time, yeah.
This is a very special appearance today.
What are we talking about today?
DAVID MALAN: So we thought
we'd introduce render50,
which is a command line tool
that CS50 uses internally,
and some of our high school teachers use
as well, to turn source code into PDFs.
COLTON OGDEN: And what are some of
the ways that you use this yourself?
DAVID MALAN: So over the
years for CS50, especially
for our undergrads and our extension
school students here and off campus,
we have typically given
feedback, of course,
on students' code when
they submit it for grades
and also for qualitative feedback.
And years ago, back in my day,
so this is like late '90s,
the technology we used was paper.
And so literally, part of the process
of submitting your homework, your CS50
problem sets, was
literally print it out,
in addition to submitting the files,
and we would have used big ASCII.
You print out a big sheet
of paper and like Malan
would be printed in big M's and big
A's and big L's all over the paper
to make clear to your
teaching fellow whose it is.
And then he or she would write
comments physically on the paper.
COLTON OGDEN: Is this also the
era of the paper with holes
on the sides of it?
DAVID MALAN: Yes, dot matrix.
Actually, these were laser
printers, but I think
those were still, for sure, around.
So that's obviously behind us now.
And so over the years, we
transitioned to other techniques
where we would somehow generate
PDFs out of students' code
and then use things like Adobe Acrobat,
or Apple Preview, or Bluebeam PDF Revu
was another tool we used for some time.
So you could actually type the comments.
And actually, we went through a phase,
thanks to some friends at Microsoft
a few years ago, where they kindly
let all of our teaching fellows
here on campus use tab--
they weren't called tablets yet--
what were they even called?
COLTON OGDEN: The like
PalmPilots, like the--
DAVID MALAN: Touchscreens.
Well, they were the first
laptops with touchscreens.
But I don't think they
were called tablets yet.
I'm totally blanking
now on the technology.
COLTON OGDEN: I don't remember either.
DAVID MALAN: So you
could draw on the screen.
And this was marginally
better, because then you
could circle things and be like good,
or better, could be this, and so forth.
And just handwrite it as well.
And then most recently did we just write
our own tool for generating the PDFs,
because then we can get them just right.
We can syntax highlight them.
We can do it automatically,
programmatically.
And so thus was born
render50, a tool that
just makes it super easy
to render PDFs out of code.
And then also, unfortunately,
to render files side
by side in cases of academic dishonesty.
If we suspect that a student has
unduly copied someone else's work,
it's often helpful for
folks on campus to go
to see their code in the GitHub
repo or something else side by side.
So we use this same tool to do that.
So you can just very easily,
especially if you're less technical,
look at the code on a PDF
or a printout as opposed
to like a diff or GitHub
or something like that.
COLTON OGDEN: Nice.
And another use case that,
for example, to illustrate,
we're using it today here--
DAVID MALAN: Ironically, yes.
COLTON OGDEN: --for notes.
DAVID MALAN: This is very meta.
COLTON OGDEN: You probably won't be
able to see it too well in the stream.
DAVID MALAN: This is my
cheat sheet for today.
COLTON OGDEN: I've used
it in prior streams
as well for code bases where
I just want to have something
to look at as a reference,
or in case I forget
how I wrote some function or something.
And you use it for lecture notes too.
DAVID MALAN: All the time.
Like literally every
lecture and any stream where
I need some code to
reference, just really as
a cheat sheet to remind myself so I'm
not futzing with the computer too much.
I just run code through render50,
get a nice pretty printed,
syntax highlighted, colorful PDF.
And then we just send
it to a color printer.
And then do it old school like this.
COLTON OGDEN: Is this related
at all to the old Annotate 50
that we were working on back in the day?
DAVID MALAN: This is
like a simplistic version
of that where you rely on actual human
hands for the annotations or existing
PDF tools.
We did have a tool in the
past that was more digital,
but we've since deprecated that.
COLTON OGDEN: OK, cool.
Let's make sure that we're
keeping up with the chat.
We have a lot of new
names today that I saw.
DAVID MALAN: Sure, let's say hello.
[INTERPOSING VOICES]
COLTON OGDEN: So BELLA
[INAUDIBLE] is a regular.
Hello, Bella, good to have you.
DROP4, VDHUG, again.
HASSAN-- the Twitch font on this
sometimes is absurdly difficult to read
against the--
DAVID MALAN: It's very bright green.
COLTON OGDEN: I have it on light mode
for some reason, but HASSAN [INAUDIBLE]
says hello.
DAVID MALAN: Hello.
COLTON OGDEN: Good to have you, Hassan.
I think that's the first time
I've seen your name in the chat.
So Brenda, shout out to Brenda.
DAVID MALAN: Brenda, nice to see you.
Brenda tuning in, of
course, from New Zealand,
one of our furthest away team members.
COLTON OGDEN: Good to have you, Brenda.
SHAMXR says good evening.
MATTHEWTHEGOODMAN.
DAVID MALAN: Montreal, nice.
COLTON OGDEN: In from Montreal.
So very good global representation.
DAVID MALAN: And Pakistan,
the other direction.
Wow.
COLTON OGDEN: Yeah.
DAVID MALAN: Good
night, nice to see you.
Thanks for tuning in before bed.
COLTON OGDEN: Yeah, awesome.
Some of the folks here are watching
us like 4 or 5 o'clock in the morning.
DAVID MALAN: Yeah, well, it must be.
COLTON OGDEN: Which is awesome.
DAVID MALAN: Well, I'm
up at that hour, too,
so we might as well just
do the streams then too.
COLTON OGDEN: VDHUG, Acre, in the middle
of the Amazon forest, interesting.
DAVID MALAN: Wow, all right.
VDHUG, you're going to have to elaborate
on that one and how that's working.
COLTON OGDEN: TWINTOWERPOWER,
generic third quarter of the day
greeting from a country on one
of the continents of earth.
Nice.
DAVID MALAN: OK, this
is a puzzle, isn't it?
COLTON OGDEN: That's almost like a
very political statement, like a very--
DAVID MALAN: Very generic, yes.
Hello, earthling.
COLTON OGDEN: Waves at folks from
around the globe, says Brenda.
[INAUDIBLE] hello, everybody.
Hello, [INAUDIBLE], good to have you.
GARETHBUTLER2, hello.
That's a new name as well.
So many new names today.
POULTON1987, hello.
Hello, hello.
DAVID MALAN: Hello, everyone.
Yeah, the green really
doesn't work very well.
COLTON OGDEN: Yeah, it's a little bit--
DAVID MALAN: [INAUDIBLE], we've
been chatting online too, often.
Nice to see you.
COLTON OGDEN: Yeah, yeah.
OK.
And VERONI, VERONI was
here in yesterday's chat.
Hello, VERONI.
DAVID MALAN: Nice.
COLTON OGDEN: LITTLEJR,
hello from Brazil.
[INAUDIBLE], hello.
Unspecified location from TJ.
And--
DAVID MALAN: This is great.
Siberia, Russia.
That's terrific.
Also far away, and pretty cold though.
Frankly, it's been pretty
darn cold here lately.
COLTON OGDEN: Thankfully,
today and yesterday not as bad.
We had that blizzard last weekend.
That was terrible.
That was horrible.
DAVID MALAN: Though I don't know
if we'd compare it to Siberia.
COLTON OGDEN: Yeah,
Siberia's pretty rough.
That's like a summer.
Or to them, that's like a summer.
But you know.
And then MADKINGVALLA says hoy.
DAVID MALAN: Nice.
Nice haircut.
Nice haircut comment.
COLTON OGDEN: I haven't gotten it yet.
They're joking.
I told them I was going
to get my hair cut.
I think today I told them I was going
to get my hair cut, but [INAUDIBLE].
DAVID MALAN: All right, so I
guess it's being sarcastic.
So not actually very nice.
COLTON OGDEN: No, it's not.
But it's OK.
It's very soon, I promise.
Actual promise.
OK, I think we're all caught up
on the comments, so why don't we--
I'm going to transition to your laptop
so you can start taking [INAUDIBLE].
DAVID MALAN: OK, here we go.
COLTON OGDEN: Hopefully--
oh, is it the right size?
Everything good?
DAVID MALAN: Yeah, we
can zoom in a little.
Folks want to see a little better.
COLTON OGDEN: Trying to make
sure that the actual editor--
this wasn't cutting off your screen.
DAVID MALAN: Nice.
I hope everyone enjoyed.
Does anyone have any questions
about how the technology works,
any of the code that we've looked
at, or where you can go from here?
COLTON OGDEN: This will be
amazing when I re-splice it.
Sorry about that.
Facebook booted us.
It booted our server.
So we're cross-streaming
right now to Twitch.
We were cross-streaming to
Facebook and to YouTube.
Right now we're just to
YouTube and to Twitch,
but it looks like Facebook
kicked us for some reason.
So if you're watching, or were
watching on Facebook, sorry about that.
DAVID MALAN: All right.
But hopefully, we're now
back and we can dive in.
COLTON OGDEN: Yeah, let me
just make sure that we--
people, if they had any
questions before we started.
DAVID MALAN: Sure, yeah.
We were.
COLTON OGDEN: Looks like, no.
I guess we'll do one
question from HASSAN.
Which language do you prefer
between C and C++, before we begin?
DAVID MALAN: I'd say C. C++ certainly
gets useful when you want to do
something object oriented,
obviously, by definition.
I find it a little messy, though
it's one of the earlier OO languages,
at least, that really was
popularized for software development.
But it's also just a
little messy syntactically.
It's kind of a pain, I think.
So nah, I never really loved it.
COLTON OGDEN: OK.
Cool, cool.
And then everybody in
the stream, definitely
let us know whether or
not you can still hear us,
whether or not everything looks good.
I'm going to refresh.
Looks like everything's live on our
end, but let us know if you can hear us,
just as a sanity check for us too.
DAVID MALAN: All right, wonderful.
COLTON OGDEN: 100% now,
so I think we're OK.
DAVID MALAN: All right, so if you'd
like to play along in any way,
let me suggest you go to
any of a couple of URLs.
We can paste them in here.
If you want to go to
cs50.readthedocs.io/render50,
we have some documentation there
as to how to use the tool itself.
I mean, if you'd like to look
at the source code, as well,
I have that opened up in another
tab here, github.com/cs50/render50.
And what we'll do, I
think, is a mix of kind
of making some things
from scratch, maybe
look at the actual source
code toward the end
to point out the various
features and how we did things.
But what I think is cool
about render50, even
though to be fair I wrote most
of it, is that it's really
a mesh of various
technologies and techniques.
So the goal at hand,
quite simply, was we've
got some source code, one or more files
in C, Python, JavaScript, or whatever,
and the goal is to turn it into a PDF.
How do we go about doing that?
Well, let me propose this.
Let me actually go ahead
and just download some code
that we've already written.
So github.com/cs50.
How about libcs50?
This is the so-called CS50 library.
And it's got a few files
in here, but I'm just
going to go to cs50.c, which you
might not have actually ever seen,
but is, in fact, the code that
implements the CS50 library, like get_string
and get_int and get_float and so forth.
And I'm just going to go ahead on
GitHub and save this, just so that I
have a big file locally
that I might want
to print out so that my TF,
for instance, could comment on,
give me feedback.
Sort of an in-person code
review in written form.
So let me go ahead and
save this on my Mac
here as cs50.c in my Downloads folder.
And let me do-- let me try printing this
as a PDF in the simplest way possible.
I can go ahead and open up this file
on a Mac with something like TextEdit.
Super simple text program.
It's the simplest of text editors.
And you'll notice it doesn't
even support syntax highlighting.
So all right, fine, I'll sacrifice that.
So let me go to File, Print.
COLTON OGDEN: Fun fact--
I think that was one of the
first text editors I ever used.
DAVID MALAN: TextEdit?
Ooh, I'm sorry.
COLTON OGDEN: Just in plain text
mode, not in rich text mode.
DAVID MALAN: Oh, yeah.
Rich text would really mess you up.
COLTON OGDEN: That would be pretty bad.
DAVID MALAN: And so you'll see
you can kind of see a preview here
in macOS of what this thing
is going to look like.
It's not going to look good.
But on macOS, you can actually
PDF things pretty easily.
In Windows, you might have to
jump through a couple more steps.
But I'm going to go to PDF, Save as PDF.
I'm going to call this cs50.pdf.
Author is John Harvard, sure.
And then I'm going to go ahead and
open this PDF in my Downloads folder.
And you'll see that, wow, this
is really just a mess here.
I've opened up Acrobat, and
the comments are wrapping.
It's obviously just black and white.
Acrobat is doing its stupid thing here.
Now the spinning beach ball.
So we're just filled with problems here.
And it just kind of goes on and on.
And this isn't bad.
You could certainly
write comments on this.
You could certainly type comments on
this, but it's not all that pretty.
COLTON OGDEN: I really
love these comments
down here with the line breaks.
DAVID MALAN: Yeah, well, I wasn't
designing for 8.5 inch width.
So we could, of course,
do this a little better.
I could go in here, for instance,
and say-- well, let's not do that.
Let's actually print it again.
Let's go to Show Details.
And let's do one step
better-- landscape mode.
COLTON OGDEN: Hey, that
looks a lot better.
DAVID MALAN: Game changer.
So let's go ahead and do this.
And I'll just open the
PDF directly this time.
And OK, so we're actually
in better shape, because it
looks like 11 inches for US
letter paper, so to speak,
is actually pretty good, at
least for my commenting style.
And so it seems to be better.
But what do you think?
What more could we do here?
COLTON OGDEN: Well, it looks like
some comments, for example right here,
still do break, so you're
not immune to the problem.
I can imagine a large Boolean
expression or the like
also taking up more than 11
inches worth of characters.
But I mean, it just looks
ugly, like it's just
black text on a white background.
As a programmer now, I'm more used to
the syntax highlighted IDE, some kind
of VS Code, Atom, Sublime, or CS50 IDE.
And frankly, I'm not
a huge fan also of--
I guess render50 normally does
white background and colored text.
But it would be kind of impractical
to do black background for printed--
DAVID MALAN: Yeah, that would
probably be pretty expensive ink-wise.
So you mentioned Atom, and there's VS
Code, and there's Sublime and whatnot.
So let's try that.
Let me go ahead and
open up my source file
in a more powerful
program like Atom, which
is going to give me syntax
highlighting, among other features.
And so this is nice here, because
if I start scrolling down,
OK, it looks beautiful.
And you'd think, well,
let's just print this,
hopefully without the black background.
So we can go up to File--
no, we can't.
So you can't even print from Atom.
And this is probably true
of many text editors,
because why would you print your code?
So to be fair, the use case in
question is somewhat narrow.
It's pretty academic.
We want to print code.
But honestly, I use render50
multiple times per week.
And I do think teachers more
generally, especially in high school,
do find this methodology
still pretty compelling.
COLTON OGDEN: Yeah, I would imagine.
I am pretty surprised, because I
could definitely see it, especially
now that computer science is
pretty popular in most high schools
and colleges, like the fact that text
editors don't include a print feature
seems a little bit weird to me.
DAVID MALAN: Yeah, even
for print to PDF, right?
Because printing to
paper is pretty silly.
And I only do this so
that the computer doesn't
fail on me when looking something up.
But you could imagine wanting to
print to PDF, because then you
can email the PDF.
Anyone can open it.
You can type in it.
You can write on it.
It's pretty versatile.
So it's a pretty compelling final format.
OK, so this isn't working, so we
could maybe screenshot the code.
Maybe a screenful at a
time, and start printing,
and start printing these
screenshots as PDFs,
but that's, of course, kind of insane.
COLTON OGDEN: Yeah, yeah, that is.
DAVID MALAN: Those are just images then.
It's going to be a huge amount of black.
You can't highlight it if it's
a PNG and not an actual PDF.
So we quickly realized, like we're
going to have to write a bit of code
to actually solve this problem.
COLTON OGDEN: Cool, yeah.
DAVID MALAN: All right,
so how do we do that?
So we kind of adopted internally
within CS50, Python as our language
the past few years.
COLTON OGDEN: Little
better, thank goodness.
DAVID MALAN: Yeah, so we
figured, all right, let's just
start with that constraint.
Doesn't have to be written in Python.
We could use JavaScript
or PHP or whatever.
COLTON OGDEN: I mean, the original
version of render50 was PHP, wasn't it?
DAVID MALAN: It was, actually, yeah.
COLTON OGDEN: We had the printer
issue with the library we were using.
DAVID MALAN: Yeah, eventually
broke, because yeah,
the library we were using
that converts from text to PDF
was running into some weird errors.
COLTON OGDEN: It had an
unconventional PDF format
issue, like it had some bytes in the
header or something were messed up.
And printers were expecting that, and
so it wasn't printing [INAUDIBLE].
DAVID MALAN: Yeah, literally
we could make the PDF,
but we would go to print it
and can't print it, which
was literally the point of the program.
So that was kind of hurting us.
So yeah, we rewrote it
in Python a bit ago.
And so the first question was, well,
how do you create PDFs in Python?
And you, frankly, I think one of us
probably just did PDFs in-- whoops--
PDFs in Python.
And kind of started here.
COLTON OGDEN: This is
modern programming.
DAVID MALAN: Yeah.
So--
COLTON OGDEN: [INAUDIBLE]
unfamiliar with the process.
DAVID MALAN: There we go, automate
the boring stuff with Python.
I don't know if that
existed at the time.
So there's a lot of search
results, a lot about reading PDFs.
And long story short, we found
our way to a few libraries.
And one of them we settled on.
So I'm going to go ahead and
Google that one directly.
It's called WeasyPrint.
It's a free open source library that's
been continually getting improved.
In fact, the author has been receptive
to pull requests, even, for bug fixes
and performance improvements.
COLTON OGDEN: I'm a big
fan of their web page.
It's very-- if you want to refresh
the animation, like the-- that thing.
DAVID MALAN: Yeah, it's pretty cool.
This has nothing to do
with PDFs, but it does
have to do with CSS and
JavaScript, probably.
COLTON OGDEN: And the top
little bit there, that's cute.
DAVID MALAN: Yeah.
COLTON OGDEN: That's cool.
DAVID MALAN: So what's nice is
this is freely available on GitHub.
You can pip install it, pip being
the Python package manager.
And so this just lets you
take specifically HTML
to, per this little demo, PDFs.
And so this is what was a new,
interesting experience for me.
It turns out you can,
and some people do, use
CSS, not just for printing to
screens, but also printing to paper.
Books can be laid out with HTML
and CSS in a very resizable way.
Like here in the US, we use 8
and 1/2 by 11 inches for paper.
Folks abroad might use
other measurements as well--
A4 or something like that.
And so there is actually, built
into CSS, support for printing
that we actually leverage for render50.
So let me go ahead and do this.
In advance, what I've done is this.
I've got a terminal window here.
I happen to be using cli50,
which is CS50's command line
interface based on Docker.
For more on that, check out
our stream from the other week.
But I'm using that just so
that I have all of my libraries
and stuff pre-installed.
But you can do this on a Mac or PC.
You're just going to
want to install Python 3.
And that's about it,
initially, and pip if it
doesn't come with your distribution.
And what I'm going to go
ahead here and do is this.
If I type ls, I've got my cs50.c
file in my terminal environment.
And I'm actually going to go ahead
and run WeasyPrint of cs50.c.
And then we'll call it cs50.pdf.
COLTON OGDEN: This WeasyPrint,
this is a Python library
that's been exposed or been--
DAVID MALAN: Indeed.
COLTON OGDEN: --aliased as
a sort of a Linux program.
DAVID MALAN: It comes as both.
So it's both a library and it's a
program that you can actually use.
So if you just run WeasyPrint,
then the name of the input file,
then the name of the output file,
it's going to take that input
and turn it into a PDF using
the WeasyPrint library.
So let me go ahead and hit Enter.
And we'll wait for a second.
And there we go.
We're back at the prompt.
If I type ls, we'll see that we
now have cs50.c and cs50.pdf.
Let me go into my Downloads
folder there and choose cs50.pdf.
And we'll see here,
once this opens, lovely.
COLTON OGDEN: No, it looks beautiful.
DAVID MALAN: Oh, my goodness.
It's actually worse--
COLTON OGDEN: I'm so
glad we downloaded that.
DAVID MALAN: --than before.
So what do you think is going on here?
COLTON OGDEN: That's a good question.
Is it rendering it as just plain text?
DAVID MALAN: Yeah, pretty much.
I mean, what's obviously missing?
COLTON OGDEN: I mean, it's not--
there's no new lines at all.
DAVID MALAN: No new lines, yeah.
COLTON OGDEN: No syntax highlighting.
Does it need to be in a
specific format, or does
it need to have a style sheet
that gets referenced, or--
DAVID MALAN: That's exactly right.
So WeasyPrint is all
about converting, not text
to PDF per se, but HTML and CSS to PDF.
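The command just run, the program name followed by an input file and then an output file, can be sketched from Python as below. This is only an illustration: the helper names `weasyprint_command` and `render` are hypothetical, and the sketch assumes a `weasyprint` executable is on the PATH, as it would be after a pip install. As noted above, the tool expects HTML and CSS, so handing it a plain C file runs but produces poor output.

```python
import shutil
import subprocess


def weasyprint_command(source, target):
    """Build the argument list: program name, input file, output file."""
    return ["weasyprint", source, target]


def render(source, target):
    """Run the command, but only if the weasyprint program is installed."""
    if shutil.which("weasyprint") is None:
        return False  # nothing to run on this machine
    subprocess.run(weasyprint_command(source, target), check=True)
    return True


if __name__ == "__main__":
    # Mirrors the invocation from the stream; it is printed here, not executed.
    print(weasyprint_command("cs50.c", "cs50.pdf"))
```

Calling `render("cs50.c", "cs50.pdf")` would actually launch the program when it is available and return `False` otherwise.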
COLTON OGDEN: Oh, that's right.
That's right.
Yeah.
DAVID MALAN: Right.
Well, so this is kind of cool, because
even from CS50 or any prior background
you all might have with web
design, it's pretty simple
to make monospaced font in HTML.
It's pretty easy to start
colorizing things if you want.
So we just have to
figure out now like how
to stitch these ideas together,
like write HTML, add some CSS,
colorize the code, just like
Atom is for us, and then go ahead
and turn it into a PDF.
COLTON OGDEN: So essentially,
the first step, I guess,
would be take someone's source code
and make an HTML representation
of that source code.
DAVID MALAN: Yeah, exactly.
So honestly, one of the simplest
ways we could do this is this.
Let me go ahead and copy cs50.c and just
literally call it cs50.html, same file.
And now let me go ahead and
open this up in my text editor.
I'll use vim at the command
line here, cs50.html.
And you'll see, of course, just text.
This is not HTML.
COLTON OGDEN: Right.
DAVID MALAN: But it could be.
Let me just go ahead here
and add an HTML doctype.
And then say, hey browser,
here comes my HTML.
And then, hey browser, here
comes the head of my page.
And then, hey browser,
here comes the title.
And we'll just say CS50
library, for instance.
Close my title.
Close my head.
Hey browser, here comes
the body of my page.
And then I could do something
like, hey browser, here comes
some preformatted text, which typically
is displayed in monospace font.
I'm going to go ahead, then, and
fast forward to the end of the file.
And the syntax highlighting
you're seeing now is from Vim.
It's not HTML or anything,
or CSS or anything like that.
Down here, I'm going to go ahead and
say, OK, well that's it for my pre tag.
That's it for my body tag.
And this is it for my HTML tag.
COLTON OGDEN: OK, so just wrap
everything in pre, make it monospaced,
and then that's it.
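The manual edit described here, putting a doctype, head, title, and body around the file and placing the code inside a pre element, might look like this if done in Python. It is a rough sketch, not how render50 does it: the function name `wrap_in_pre` is hypothetical, and the source text is inserted verbatim with no escaping at all, exactly as in the hand-edited file.

```python
def wrap_in_pre(source, title="CS50 library"):
    """Return an HTML page with the source placed verbatim inside <pre>."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"<title>{title}</title>\n"
        "</head>\n"
        "<body>\n"
        f"<pre>{source}</pre>\n"
        "</body>\n"
        "</html>\n"
    )


# A one-line stand-in for the contents of a real source file.
page = wrap_in_pre('printf("hello, world\\n");')
print(page)
```

Because nothing is escaped, any character in the source that HTML treats specially is passed straight through to the page.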
DAVID MALAN: Yeah, so let's
see what happens here.
Let me go ahead and save this.
And now let me go ahead
and run WeasyPrint,
which again is just a program.
And as an aside, what I did earlier
was pip install WeasyPrint in order
to install that on my Mac or your
Linux box or your Windows machine,
so long as you have pip
and Python installed.
OK, but here I want to go ahead
and run WeasyPrint of cs50.html
is the input now, not dot c.
And then output again cs50.pdf,
so we'll blow over the other one.
All right, it's taking a little longer
this time, but there's more thinking,
there's more structure there.
COLTON OGDEN: Parsing the HTML,
converting that into the PDF elements.
DAVID MALAN: Exactly, so kind of
behaving like a browser would.
Let's go back into my PDF
and click this and open it
with Preview, which gets rid of some
of the menus that Acrobat earlier had.
And OK, like we're in better shape.
It's not beautiful.
And we still have some line wrap issues.
COLTON OGDEN: Yeah, it
looks like it does--
I guess it would be overflow.
What would that-- what CSS attribute
would that be where it just
keeps going, wrap [INAUDIBLE]?
DAVID MALAN: No, this
would be overflow hidden.
So you just chop it off, yeah.
That's what's happening by default here.
But of course, there are some bugs.
Come back to that in a second.
COLTON OGDEN: Oh, 'cause yeah, it's
thinking that those are HTML elements.
Because in C, when you do an include,
you need to use the angle brackets.
DAVID MALAN: Exactly.
COLTON OGDEN: So it's thinking that's
a tag, but it's an invalid tag.
And so it doesn't know how to parse it.
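To see why an include line gets mangled, it can help to ask an HTML parser what it thinks it is reading. The sketch below uses the parser from Python's standard library as a stand-in. WeasyPrint's own parser may behave differently in detail, and the class name `TagCollector` is made up for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag the parser believes it has found."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed("#include <stdio.h>\n")
collector.close()

# The header name between the angle brackets is reported as a tag, not as text.
print(collector.tags)
```

The parser treats the header name as an unknown element, which is why that text disappears from the rendered page.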
DAVID MALAN: So this, of
course, would be a nightmare
if like we, the programmers or the
teaching fellows, needed to go in
and change their code.
So let's try to fix this.
Let me go in here and
let me go into cs50.html.
And I could, for instance, when we get
to these includes, do something like,
well, let's go ahead and
change all instances of hash
include space open bracket
to the same thing, but do--
let's see, that's a less than.
So ampersand lt semicolon.
And then do this globally.
Enter.
OK, so now I actually screwed
that up, because in Vim, I
think ampersand has special meaning.
So let's undo that.
Let's try again.
Let's go ahead and replace that with
ampersand lt semicolon globally.
And I think I probably
have to escape this.
I'm not sure offhand, so we'll try.
Yeah, OK.
COLTON OGDEN: Nice.
You can do the same thing for the
greater than in this case too,
with that escape character.
DAVID MALAN: Yeah, so let's do that.
So let's save that.
Oops, sorry, let's substitute.
Now say dot h.
So literal dot instead of a
wildcard, h, angle bracket.
And then change this now to a dot h, and
then ampersand gt semicolon
globally.
But let's escape that again.
Cross your fingers,
because I rarely do this.
OK, I think that's better.
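The same narrow substitution, touching only the include lines, could be done with a regular expression in Python. This is a sketch under assumptions: the function name `escape_includes` is hypothetical and the pattern only handles headers ending in .h. One difference from Vim is worth noting: in a Vim substitute, an unescaped ampersand in the replacement stands for the matched text, which is why it had to be escaped above, whereas in Python's `re.sub` an ampersand in the replacement is just an ampersand.

```python
import re


def escape_includes(source):
    """Rewrite only text of the form '#include <name.h>' using entities."""
    return re.sub(r"#include <(.+?)\.h>", r"#include &lt;\1.h&gt;", source)


# The comparison on the second line is deliberately left alone by this pattern.
print(escape_includes("#include <stdio.h>\nint x = a < b;"))
```

Like the manual edit, this leaves every other angle bracket in the file untouched.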
COLTON OGDEN: For more, see the
regular expression stream that you did.
DAVID MALAN: Yes, also.
Look at this.
We're coming full
circle with everything.
So many reasons to tune in.
COLTON OGDEN: Want to shout out
also, MATTHEWTHEGOODMAN, MASHABANA--
DAVID MALAN: Nice, some new followers.
COLTON OGDEN: And [INAUDIBLE].
DAVID MALAN: Nice to see you all.
COLTON OGDEN: [INAUDIBLE], yeah, OK bro.
[INAUDIBLE] and NABSLACK,
thank you very much.
DAVID MALAN: Nice.
Nice, and we'll take a couple of
questions in just a moment too.
Let's go ahead and
save this, just to demo.
So saving the file.
Let's go now back to my prompt where I'm
going to run WeasyPrint of cs50.html.
Output cs50.pdf, Enter.
It's taking a second too.
All right, now I'm going to go ahead
and reopen that in Preview on macOS,
but you could use Acrobat
or whatever on Windows.
And scrolling down,
crossing my fingers, OK.
So better.
COLTON OGDEN: Nice, OK.
DAVID MALAN: But this,
of course, is going
to be super annoying to the TF
staff to manually change their code.
But honestly, we can probably do
that with code, like with Python.
COLTON OGDEN: Yeah, I can see it being
troublesome for Boolean expressions
where you're trying to compare
less than or greater than.
I mean, if you were doing it in
the context of a Boolean expression
that did a less than and then a
greater than check [INAUDIBLE].
DAVID MALAN: Oh, absolutely.
We haven't caught those.
I was very specifically
just fixing a few of these.
COLTON OGDEN: And that would be hard
to, I think, maybe necessarily fix--
I guess you could--
I guess you could.
Well, no, it'd be kind
of tough, because people
with different spacing conventions
in the context of different letters,
different numbers.
I think that could be very difficult.
DAVID MALAN: No, you
pretty much want to do
a global replace of all characters
that might otherwise be confused
as actual HTML with HTML entities.
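A global replacement of that kind does not have to be written by hand. As a minimal sketch, Python's standard library has `html.escape`, which replaces the ampersand and both angle brackets wherever they occur. Whether render50 itself uses this function is not stated here, and the wrapper name `to_pre_block` is hypothetical.

```python
import html


def to_pre_block(source):
    """Escape &, <, and > so the source survives inside a <pre> element."""
    # quote=False leaves quotation marks alone; they are harmless in element text.
    return "<pre>" + html.escape(source, quote=False) + "</pre>"


print(to_pre_block("#include <stdio.h>"))
print(to_pre_block("if (a < b && b > c)"))
```

This covers the Boolean expression case as well, since every occurrence is replaced regardless of spacing or context.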
Should we take a couple questions?
COLTON OGDEN: Yeah, we had a couple.
Actually, very few questions,
but a few comments.
So SAMCODE says hi, David.
DAVID MALAN: Hi, SAMCODE.
COLTON OGDEN: The classic [INAUDIBLE].
DAVID MALAN: Hello from DAVIDCODE.
COLTON OGDEN: They're saying that--
VERONI was saying that this is, I
think, spaghetti code in a nutshell
when you were writing up the HTML
page with the HTML or [INAUDIBLE].
DAVID MALAN: Yeah, pretty much.
And actually, there was one up above.
Did we give the URL of
the CS50 library here?
Down here.
COLTON OGDEN: CS50 library [INAUDIBLE].
DAVID MALAN: So HASSAN asked for that.
COLTON OGDEN: Oh, did we--
DAVID MALAN: The CS50 library.
So that's github.com/cs50--
can't type-- /libcs50.
That's the C library that we're
just using for demonstration sake.
That has nothing to do
with render50, per se.
COLTON OGDEN: What's the topic?
I am on limbo, says RABIN.
So RABIN, we're talking
about render50 today.
So the Python application
we use to basically print
source code to PDF, which is
a surprisingly difficult thing
to get with most modern text editors.
DAVID MALAN: Yep.
And BRENDA and [INAUDIBLE]
kindly elaborated on that too.
So I think we're in a good place now.
All right.
All right, so we could
do better than this,
but we at least now
have a tool, WeasyPrint,
that can clearly take code, or more
specifically HTML, to the final output.
So what comes next?
So why don't we start to clean up the
formatting here and the page size,
and get just a basic PDF right.
And then we'll come to
syntax highlighting.
COLTON OGDEN: So would that be--
OK, so I guess maybe
one of the first things,
would that be like a CSS
style sheet, like you said.
DAVID MALAN: Yeah.
COLTON OGDEN: This
HTML needs the styling.
Would we fix these comments with
something like overflow wrap?
DAVID MALAN: We could, indeed.
Let me actually pull out my cheat sheet
so I don't forget some of the details.
COLTON OGDEN: Sure.
It's a surprisingly large application.
DAVID MALAN: Yeah, render50, in
the end, is currently 515 lines.
But we're not going to go
through all of those today.
Let me just actually find
the lines I care about here.
COLTON OGDEN: That'd be a super stream.
That'd be like a whole weekend's
worth of line by line analysis.
DAVID MALAN: Indeed.
So what I'm going to do here
is show a very simplified
version of some of the
CSS we use in render50
now to start solving these problems.
And to be clear, the
problems I claim are,
this is still messy,
even though it's better.
I'm chopping off some of
the code on the right.
It's portrait mode, which
means there's like no room
anywhere for any comments, like
you might have with landscape mode,
whether letter or A4.
So let's see if we can clean that up.
Let me go ahead and go back
to my HTML page for now.
Again, doing things manually just
to demonstrate how to do this.
Then we'll consider how to
automate all of this with code.
And I'm going to go up
here, and of course,
say well, in my head of my web page,
I can also have some style tags.
Vim's being a little uncooperative,
so let me go ahead and re-indent that.
COLTON OGDEN: So that's cool.
It'll actually look at
the style within the HTML.
You don't need an external
style sheet to get
it to work just like a web browser.
DAVID MALAN: You can just
do it all self-contained.
And this is nice, because it's
really easy for us in code,
ultimately, to just generate a temporary
HTML file with CSS and code in it,
render it, and then throw it away.
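The generate, render, and discard idea can be sketched as follows. Everything here is an assumption for illustration rather than render50's actual code: the function names, the file name `source.html`, and the choice to pass the renderer in as a parameter so the flow can be tried without WeasyPrint installed. The default renderer relies on WeasyPrint's `HTML(filename=...)` class and its `write_pdf` method.

```python
import html
import os
import tempfile


def weasyprint_renderer(html_path, pdf_path):
    """Default renderer; requires the third-party weasyprint package."""
    from weasyprint import HTML  # imported lazily so the sketch loads without it
    HTML(filename=html_path).write_pdf(pdf_path)


def render_source(source, pdf_path, css="", renderer=weasyprint_renderer):
    """Write a throwaway HTML file with inline CSS, render it, then delete it."""
    page = (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<style>\n{css}\n</style>\n"
        "</head>\n<body>\n"
        f"<pre>{html.escape(source, quote=False)}</pre>\n"
        "</body>\n</html>\n"
    )
    with tempfile.TemporaryDirectory() as directory:
        html_path = os.path.join(directory, "source.html")
        with open(html_path, "w", encoding="utf-8") as file:
            file.write(page)
        renderer(html_path, pdf_path)
    # The temporary directory, and the HTML file in it, no longer exist here.
    return page
```

With WeasyPrint installed, `render_source(open("cs50.c").read(), "cs50.pdf")` would leave only the PDF behind.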
COLTON OGDEN: This is cool, yeah.
I like it.
DAVID MALAN: So all right, so what
do we want to go about doing here?
Well, let's go ahead and fix--
how about the size of the paper?
So it turns out that there's
some special directives in CSS
that apply not to web pages,
but literally to printouts.
And you can literally start
your CSS by saying page, @page.
And in here are all
of the properties that
are somehow going to relate to
the whole page when printed out.
So I'm going to go ahead and
do a couple of things here.
Let me go ahead and say
a margin, like in the US,
it's pretty common to have
0.5 inches for margins.
And even though in web
design you typically
wouldn't use inches or points,
that's generally for the print world.
Well, that's our goal,
PDFs and printing,
so I'm going to use inches here.
But you in your country could use
whatever is conventional there as well.
Let me just go ahead
and add some paper size.
So it turns out for paper
size, you can literally
say something like letter, landscape.
And these are well-defined
values in CSS.
In fact, let me go ahead and open
a browser and Google CSS page size.
See if we get some helpful results.
I like to go to Mozilla's MDN,
Mozilla Developer Network.
And you'll see here that this defines
the size and orientation of the box
which is used to represent a page.
And so you can, more literally,
this size corresponds
to the target size of the
printed page, if applicable.
And if we look down here on this
site, you can see some sample syntax.
You can just say auto, which
is what I was getting earlier.
Just default to letter in
the US and portrait mode.
But you can also specify landscape mode.
You can specify the size of the page.
So if we had smaller paper
that we wanted to print to,
or want smaller or bigger
PDFs, you can specify.
And then it understands A4.
So if you're abroad, you could
say A4 landscape as well.
Whatever suits your paper best.
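
As a rough sketch of the @page rule described here, written in Python since that is where the CSS ends up being generated later on. The helper name page_rule and its defaults are assumptions for illustration, not something taken from render50:

```python
def page_rule(size="letter", orientation="landscape", margin="0.5in"):
    """Return a CSS @page rule for printed output.

    page_rule is a hypothetical helper, not part of render50.
    """
    # Doubled braces are literal braces inside an f-string.
    return f"@page {{ margin: {margin}; size: {size} {orientation}; }}"


print(page_rule())            # US letter, landscape, half-inch margins
print(page_rule(size="A4"))   # A4, landscape
```

The string it returns can be dropped straight into a style tag in the head of the page.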
COLTON OGDEN: Nice.
Something that you don't often see
in the world of web programming.
You dive into this rabbit hole and--
DAVID MALAN: I don't even know
if I knew this beforehand.
And so we actually had a
problem to solve with it.
COLTON OGDEN: Very flexible, but
also GOAL1 says, will use it.
Very helpful.
Thanks for sharing.
DAVID MALAN: Nice, nice.
Cool.
COLTON OGDEN: And MDN is the
best, says RABIN as well.
DAVID MALAN: All right, so let's see.
So I've specified a margin, which
just gives me a white border,
just to keep things
away from the printer.
Because a lot of physical printers
can't print full bleed, so to speak,
where the ink bleeds to
the edge of the page.
So we'll give it at
least a half an inch.
And then letter is,
for those unfamiliar,
8.5 inches wide and 11 inches tall.
But landscape means
rotate that 90 degrees.
COLTON OGDEN: Short those, yeah.
DAVID MALAN: OK, so let's see if
we've gotten a little farther.
Let me save this.
And let me go ahead and rerun
weasyprint cs50.html cs50.pdf.
Give it a second or so.
Might have to do a little more thinking.
All right, now let me
go back to preview.
And notice, it's automatically
reloaded and it's already
in better pedagogical shape.
Like I now have some actual
margins on the right-hand side
here, where I could write
comments with a pen or type
them in Acrobat or something.
And what do you think so far?
COLTON OGDEN: It looks good.
It looks a lot better.
I mean, it still needs a lot of
colorization, I think, huge part of it.
DAVID MALAN: And what do--
you're not going to like this.
COLTON OGDEN: We do still
have the wrapping issue.
And earlier we didn't see
it, but we can see it here
for this particularly long line.
DAVID MALAN: Yeah, I'm not liking that.
COLTON OGDEN: That's still a problem.
DAVID MALAN: So let's go in there.
And let me go in here and say
now, on cs50.html, let's say--
not inside the page, because
that's paper specific.
Now this is more of an HTML thing,
because I have certain tags in there,
like my pre tag that
I want to wrap around.
So I'm going to go ahead and
say pre, specifically, which
is just the standard CSS selector.
And then inside here--
COLTON OGDEN: And pre, in this case,
this is our entire source code.
DAVID MALAN: Whole source code.
Whole source code.
I'm already getting for
free the monospace font,
but I could change it here using this.
Font size, let me go ahead and be
specific, say something like font size.
I think I like 10 point for printouts,
we've decided, just by trial and error.
Looks pretty good and
is pretty readable.
Let me go ahead, then, and say--
and this is to your point earlier--
overflow-wrap.
I'm going to say break-word.
COLTON OGDEN: OK.
DAVID MALAN: So this is
going to help break the code,
especially if you have a
long string that's not really
English or any other language,
you need to help the computer know
where it's allowed to displace it.
COLTON OGDEN: So this is different
than overflow colon wrap?
DAVID MALAN: This-- overflow.
Yes.
So this is a little different,
specifically for this kind of problem.
And then I'm going to
specify white-space, pre-wrap.
So by default, here's how
you should handle whitespace.
Should be unnecessary because we're
already in a pre, but in render50,
we actually use divs and some other tags
too, so this helps with that problem
as well.
All right, so let's go ahead and
save this, and cross my fingers
and hope that this actually works.
Let me go ahead and rerun this.
And I'm going to stop
typing the same commands.
If you're unfamiliar, in Linux or Mac
OS, or the Bash subsystem in Windows,
you can do bang, or exclamation point,
and then like the last few letters
of a command you typed, even just !w,
and that's going to rerun the previous
command that matches
if you just hit Enter.
COLTON OGDEN: That's cool.
This time it looked like it was faster,
just a little bit, for some reason.
DAVID MALAN: Yeah, it might--
I think that's probably
a coincidence, to be--
or my Mac had a little
[INAUDIBLE] cycles there.
COLTON OGDEN: Maybe.
DAVID MALAN: All right, so
let's go back into there.
You'll see it's a little smaller because
I changed the font size to be 10 point.
Let's scroll up.
And now more fits on the screen.
So actually, I probably shouldn't have
tricked you by both shrinking the font
and then claiming to solve the problem.
So let's actually undo that.
Let me go back in here and
not change the font size
and see if we're actually
solving a problem.
WeasyPrint-- oops.
So !wa, command not found,
because I mistyped that.
!we for WeasyPrint does match.
COLTON OGDEN: This time it looks like
it was slower, so maybe I'm not sure.
DAVID MALAN: So let's see here.
And now, let's scroll a little,
see if we can find that line.
OK.
COLTON OGDEN: Oh good.
OK, nice.
DAVID MALAN: So now try-- it's ugly,
but if you're printing to paper,
you've got to accept either
cut it off or wrap it forcibly.
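
Written out, the pre rule from this part of the stream might look like the sketch below. It is kept as a Python string because the CSS is later generated from code. PRE_RULE and style_block are hypothetical names, and the font size is the optional 10 point setting that was tried and then undone:

```python
# CSS for the <pre> element that holds the source code.
PRE_RULE = """
pre {
    font-size: 10pt;            /* optional; chosen by trial and error */
    overflow-wrap: break-word;  /* allow breaks inside long unbroken strings */
    white-space: pre-wrap;      /* keep indentation, but wrap long lines */
}
"""


def style_block(*rules):
    """Join CSS rules into one <style> element for the page's head."""
    return "<style>\n" + "\n".join(rule.strip() for rule in rules) + "\n</style>"


print(style_block(PRE_RULE))
```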
COLTON OGDEN: Yeah, certain drawbacks.
DAVID MALAN: Yeah.
Any questions come in?
COLTON OGDEN: [INAUDIBLE]
is just making fun of you
for not using Fish Shell,
which is just another shell.
DAVID MALAN: Oh, we use Bash
for everything for CS50,
but you can use whatever you'd like.
COLTON OGDEN: Yeah,
there's a lot of-- so these
are private comments when
they do this, which means they
don't want to show up on the stream.
You'll see dot--
DAVID MALAN: OK, so we
won't read those verbally.
COLTON OGDEN: Yeah, we
don't read those verbally.
Otherwise, I think we're all caught up.
DAVID MALAN: All right, cool.
So I feel like what we're lacking--
so we now have landscape mode.
We can shrink the font, which
just makes it a little more
pleasant for the teaching Fellows.
But we're lacking in syntax
highlighting, which means
it's just a little annoying to look at.
And it's worse than, of
course, a text editor.
So let's bring that back.
How do we do that?
COLTON OGDEN: We probably
don't do it manually,
because I feel like
that would be terrible.
DAVID MALAN: But--
COLTON OGDEN: We have to write
a grammar to do that, I imagine,
and it would be a bit complicated.
But I'm sure there's probably
a library that lets us do that.
DAVID MALAN: We could.
Let me try this.
Let me do this.
Just for the sake of demonstration,
let me go into the file,
scroll down to CS50 library.
And what if I were to do this--
span, style, color, red, close quote,
close bracket.
And then close the span here, because
recall that this is just HTML now,
even though it still looks like
code because it's in my pre tag.
Let me save that.
And let me go ahead and
rerun WeasyPrint, which
we can do with our bang trick now.
Give it a second to do that.
And so here's like kind
of a germ of an idea.
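
The manual span trick can be sketched in a few lines of Python. The function name color_span is made up for illustration, and the call to html.escape is an addition that was not shown on stream; it is there so that characters such as < and > in C code are not mistaken for tags:

```python
import html


def color_span(text, color="red"):
    """Wrap text in a <span> with an inline color, escaping HTML characters first."""
    return f'<span style="color: {color}">{html.escape(text)}</span>'


line = "#include <stdio.h>"
print(color_span(line))
```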
COLTON OGDEN: Yeah, now I
would imagine that would be--
how ultimately it would have to happen.
But us doing that ourselves would
be nightmarish, I feel like.
DAVID MALAN: That's fair to say, yeah.
And so there's a bunch
of ways to do this.
And you allude to the most robust,
like any programming language
has what's called a grammar, which is
a very formal specification for what--
not even what the functions
are or what the arguments are,
but rather what the structure
of any expression has to be.
If you take a CS theory course, you
can actually see cool techniques
for defining these things.
But in a nutshell, every
language has a grammar.
And one of the tools you can use to
parse languages according to a grammar
is this free library called
ANTLR, which we've actually
used a little bit in the past.
COLTON OGDEN: We used this for some of
our more recent projects [INAUDIBLE].
I'm not too familiar with them,
but some of the teaching staff
have definitely mentioned it.
DAVID MALAN: Yeah, it's very robust.
It's another tool for
language recognition.
And it's got support for
lots of different languages.
In fact, I'll try to pull
them up in just a moment.
The catch is, it's very
computationally expensive, because--
COLTON OGDEN: That was another
thing that one of our teaching staff
was mentioning.
DAVID MALAN: In fact, I
think we started using it
and then dropped it in
favor of another technique
that render50, coincidentally, uses--
well, not coincidentally-- uses as well.
Because this does it 100%
correctly and robustly.
The problem is when your code
gets bigger and bigger and bigger,
it just gets harder and harder
and more time consuming to parse.
COLTON OGDEN: What's the technique
that we use instead of ANTLR?
DAVID MALAN: So we use regular
expressions, which are not as robust.
COLTON OGDEN: Oh, for
check50, the checks.
DAVID MALAN: For render50.
COLTON OGDEN: Oh, for render50, OK.
DAVID MALAN: But it gets the
job done in almost every case.
And so it's-- the performance boost
we get on the order of seconds,
if not minutes, is
certainly super compelling.
But just for folks that are curious,
because this is a handy library in case
you want to do something
formal with grammar sometime,
let me go to the GitHub link
here, browse source tree.
And then if I recall here, let's see
if the grammars are built in here.
I'm trying to remember where they are.
Scripts, runtime, let's look in here.
Yeah, maybe CPP.
No, these are not them.
Let me see if I can
find them, the grammars.
Grammar, additional grammars.
Here we go.
COLTON OGDEN: [INAUDIBLE]
DAVID MALAN: There we go, that's funny.
This repository.
OK, so here are all of the languages
that people have written grammars for.
Let's go to C right here.
And I think it's in this G4 file.
Is it?
And that's just comments.
Yeah, so this is the syntax that they
adopted for defining a grammar for C.
Won't get into the weeds of
this, because it's crazy long.
But that's why it's expensive
computationally to actually parse this,
but you'll see, like a
block comment is something
that starts with forward
slash star, ends with star
forward slash, and then has zero
or more characters in between.
Or you can just skip
this block altogether.
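
For comparison, the regular-expression approach mentioned earlier can express the same block comment rule in one line. This is only a sketch: the pattern and the name highlight_comments are assumptions, not what render50 or Pygments actually use, and a /* inside a string literal would fool it, which is the kind of robustness a real grammar buys you.

```python
import html
import re

# Slash star, then as few characters as possible (newlines included), then star slash.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def highlight_comments(source, color="green"):
    """Escape the source for HTML, then wrap each block comment in a colored span."""
    escaped = html.escape(source)
    return BLOCK_COMMENT.sub(
        lambda match: f'<span style="color: {color}">{match.group(0)}</span>',
        escaped,
    )


print(highlight_comments("int x; /* a\n b */ int y;"))
```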
COLTON OGDEN: [INAUDIBLE]
compilation of simple regex pieces.
DAVID MALAN: Sort of, yeah.
Exactly.
But that's ANTLR.
We don't use it, but it's one way.
You could think about,
well gee, how do I
go about figuring out what should
be red, what should be blue?
I mean, how does Atom do
it, how does Sublime do it?
They all do something like this.
COLTON OGDEN: So RABIN was asking, is
ANTLR used for compiler design as well?
DAVID MALAN: I think
so, because you can use
it to parse any language into
an AST, an abstract syntax tree,
like literally a tree
structure that represents it.
And all a compiler really does is
transform a tree in one language
to a tree in another
using different symbols.
COLTON OGDEN: Should have a
stream on that at some point.
DAVID MALAN: Yeah, you've got to get
someone more qualified for that one.
So what we actually use,
spoiler, is Pygments.
Pygments is a very popular Python
library for syntax highlighting.
It's at pygments.org, if
you'd like to play along.
And you'll see that it, like
WeasyPrint, can be installed
very easily with pip install Pygments.
Honestly, it's a little
overwhelming, that documentation,
like it's such a rich library that
there's a lot to wrap your mind around.
It took me some googling and
some reading to figure it out.
But we can at least try it out first.
Let's do this.
So--
COLTON OGDEN: That's cool.
They have a little built-in.
DAVID MALAN: Yeah, let
me go ahead and just
let me go ahead and open this file.
So cat is a handy tool if you just
want to dump the contents of a file,
concatenate them onto the screen.
So I'm going to go ahead and
highlight all of the code
from libcs50 onto my clipboard.
COLTON OGDEN: Bella kindly
plugged pygments.org [INAUDIBLE].
DAVID MALAN: Nice, thank you Bella.
Let's go ahead and paste here.
Code description, cs50 library I'll
call it, just because they're asking me.
And then the language, you can see
here, there's some mention of ANTLR,
so they apparently might use this
underneath the hood, or the syntax
maybe.
I've not looked closely.
But I know it's C, so I'll
choose C. But you'll see,
oh, there's some NSFW
languages on the list, folks.
But we'll just move on.
I'll choose the language as C.
COLTON OGDEN: So [INAUDIBLE].
DAVID MALAN: We'll leave it to the
reader at home to zoom in on there.
And let's go ahead and click highlight.
And you'll see now, if
we scroll down, that we
have some minimalist highlighting.
And it's honestly
really hard to see here.
Let's zoom in on this.
COLTON OGDEN: Yeah, I'm not
a huge fan of that theme.
DAVID MALAN: Yeah, this is awful.
This is like dark green.
This is black.
This is dark red.
This is light blue.
So actually, let me see
where it is in the UI.
Yeah, here we go.
Use this style.
So friendly, ironically,
is pretty unfriendly.
COLTON OGDEN: Not friendly, yeah.
DAVID MALAN: And I don't remember
which one we use offhand.
So--
COLTON OGDEN: Monokai-- I mean,
Monokai is a good one for text--
a lot of text editors will use that one.
DAVID MALAN: OK, so let's use that one.
COLTON OGDEN: But we
probably-- for printing,
we probably don't use [INAUDIBLE].
DAVID MALAN: But this is awful.
This will cost us a fortune in ink.
But it's a lot more readable.
So if you ever wondered where
night mode or dark mode comes from,
it's just another CSS
theme, especially in Atom
which is based on a web
browser, based on HTML.
So that looks nice.
Let's see if we can find one
that is appropriate for printing.
Maybe Xcode for Mac OS.
They have--
COLTON OGDEN: Oh yeah, that's true.
They have a pretty good light theme.
DAVID MALAN: All right, so a little bit.
It's a little subtle.
Not loving it.
But long story short, you
can play with the themes.
You can come up with your custom theme.
And I think what we
actually do is we downloaded
our own theme that mimics
GitHub, just because it's
popular and familiar to look at.
Let's look at one more.
You want to pick one here?
COLTON OGDEN: Let's do
Fruity, very bottom.
DAVID MALAN: Fruity at the bottom.
Hopefully that's nice and colorful.
It is, but going to cost us a fortune.
That is an awful red.
COLTON OGDEN: It's hard on the eyes.
DAVID MALAN: And pink.
You know, it's funny though, I've
done this before for render50,
going through all the themes.
Like they're all awful.
COLTON OGDEN: Yeah, it's hard to
find a good-- like VS Code's is nice.
I have it on--
I mean, I could pull it up on mine, but
the stream wouldn't be able to see it.
But a lot of the main text
editors, like Atom, VS, Sublime,
they all have pretty good native--
DAVID MALAN: Yeah, I know.
And I don't know why the
libraries that are out there just
have awful code, awful syntax.
COLTON OGDEN: I don't know, yeah.
I can't imagine coding
something like this.
I feel like I would lose my vision.
DAVID MALAN: Yeah, so-- but long
story short, we can do this,
but we can do this programmatically
now with some code.
So should we try to
weave this in somehow?
COLTON OGDEN: Yeah, let's do that.
Let's integrate Pygments.
DAVID MALAN: All right, so let--
COLTON OGDEN: Is it
"pig-ments" or "pygments"?
DAVID MALAN: Pig--
It's probably pigments, even
though it's Python, so yeah.
Since "pigments" is an
actual word for color.
COLTON OGDEN: Well,
[INAUDIBLE] was saying
Vibrant Ink is a good
one, or [INAUDIBLE].
DAVID MALAN: Oh, yeah?
You want to try that?
Vibrant Ink, Vibrant Ink.
COLTON OGDEN: It might not be--
DAVID MALAN: Oh, but it's not
supported by this one at least.
COLTON OGDEN: Deep ocean, is that one?
DAVID MALAN: Deep Ocean.
COLTON OGDEN: And why
are they not sorted?
I mean, honestly.
[INAUDIBLE] random sort, yeah.
DAVID MALAN: OK, so in any case,
don't let this discourage you.
Pygments can-- really, the
powerful part of Pygments
is that it can do the
syntax highlighting.
The colors is just an
aesthetic CSS detail.
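
What the web demo does can also be done from Python. A minimal sketch, assuming Pygments is installed; highlight_c is a hypothetical wrapper, while highlight, CLexer, HtmlFormatter, and get_style_defs are part of Pygments itself:

```python
def highlight_c(source, style="default"):
    """Return (markup, css) for C source highlighted by Pygments."""
    # Imported here so this sketch can be loaded even where Pygments is missing.
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import CLexer

    formatter = HtmlFormatter(style=style)
    markup = highlight(source, CLexer(), formatter)  # spans with short class names
    css = formatter.get_style_defs(".highlight")     # the colors for those classes
    return markup, css
```

Calling highlight_c("int main(void) { return 0; }", style="monokai") would give back the markup and the matching CSS separately, which is why the choice of theme really is just an aesthetic detail: the markup stays the same and only the CSS changes.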
COLTON OGDEN: [INAUDIBLE] asking,
what about formatting the code,
adding prettier format on save?
And I think that's what
we're essentially getting
to is the syntax highlighting aspect.
DAVID MALAN: But only syntax
highlighting. render50
is not going to fix your style.
It's not going to do like style50
and tell you what's wrong.
It's not going to change indentation.
It's just going to print your
code, but syntax highlighting.
COLTON OGDEN: Do we have a tool that
actually does manually change the code
or does render50 only--
or style50 only prescribe
the changes that you need to make?
DAVID MALAN: Just the latter,
and that's deliberate,
because we didn't want
students to get into the habit
lazily of just, you know, fix my code
for me, fix my code for me.
The whole point is to get muscle
memory for doing it yourself.
COLTON OGDEN: Makes sense.
DAVID MALAN: But there is
a tool underneath the hood
of style50 called astyle.
It's an older program
that's great for C--
we use different tools
for different languages--
that actually could do
that for you if you wanted.
COLTON OGDEN: OK, that's pretty cool.
DAVID MALAN: All right, so
let's see if we can't start
to stylize things a little bit better.
So I'm going to go in now.
It's time to, I think, write a Python
program here so that we're actually
doing this programmatically
and stop doing this manually.
So let's go ahead and do this.
Let me go ahead and
save that last version.
Let's go ahead and write a program
called render50.py, for instance.
And let's go ahead and
start to do this as follows.
Import argv, or no, rather sys,
so that I have access to argv.
And then let's just do
something simple like this.
Let's call it input is going
to be sys.argv bracket 1.
And then my output file is going to
be sys.argv bracket 2, for instance.
And then just as a sanity
check, let's go ahead
and say up here, if length of
sys.argv does not equal 3, I believe,
let's go ahead and
sys.exit, saying usage
is Python render50.py of your input
and your output file, like this.
So what I do when programming,
honestly, even something like this,
because frankly it's been a few
months since I wrote Python,
I'm going to go ahead and save the file.
And I'm just going to run it.
Let's see if I screwed up already.
And indeed, I did, it would seem,
because it's detecting this.
Let's say my input's going to be foo--
OK, still not cooperating, but
that's to be expected-- bar.
So I think I'm OK.
It hasn't done anything,
but the next sanity
check I would probably
do for myself is this.
Once I've gotten these variables,
I might say something like,
let's just print out
input is, and then I
can go ahead and put a
placeholder here and say input.
And then I can do another
one that says output is.
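
Typed out, this first version of render50.py might look roughly like the sketch below. The function parse_args is an assumption added so the check is easy to reuse, and the variables are called input_path and output_path rather than input so that Python's built-in input function is not shadowed:

```python
import sys


def parse_args(argv):
    """Return (input_path, output_path), or exit with a usage message."""
    if len(argv) != 3:
        sys.exit("Usage: python render50.py INPUT OUTPUT")
    return argv[1], argv[2]


# In the real script this would be parse_args(sys.argv).
input_path, output_path = parse_args(["render50.py", "foo", "bar"])
print(f"input is {input_path}")
print(f"output is {output_path}")
```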
COLTON OGDEN: We do something
very similar to this
in a lot of the game streams too.
DAVID MALAN: Yeah, like
baby steps, if you will.
COLTON OGDEN: Sanity
checking variables, yeah.
DAVID MALAN: And let's format those
strings so that Python 3.6 knows
what to do, or higher.
Let's go ahead and rerun this.
Python of render50, foo bar.
OK, so like correct.
Like five out of five for
correctness, even though it doesn't do
anything useful.
But honestly, to this day, this
is how I start writing code.
Because if you just start
blindly writing lots of code,
you then waste more of your
time debugging than coding.
COLTON OGDEN: That scaffolding is kind
of like test-driven development too.
DAVID MALAN: Yeah, no.
I mean, absolutely.
I could actually write tests
now that pass in foo and bar
and make sure that foo and
bar are printed, for instance.
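
A test along those lines could be sketched as follows. The names report and test_report are hypothetical; the idea is just to capture what gets printed and compare it with what is expected:

```python
import contextlib
import io


def report(input_path, output_path):
    """Print the two command-line arguments, as in the sanity check."""
    print(f"input is {input_path}")
    print(f"output is {output_path}")


def test_report():
    """Pass in foo and bar and make sure foo and bar are printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        report("foo", "bar")
    assert buffer.getvalue() == "input is foo\noutput is bar\n"


test_report()
```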
COLTON OGDEN: We have a couple
people make [INAUDIBLE] also.
REMOTE says hello.
DAVID MALAN: Hello REMOTE.
COLTON OGDEN: Hello, R3MOTE.
DAVID MALAN: R3MOTE.
COLTON OGDEN: And then we have,
NOTSUREOFCOURSE says, I see Vim,
I follow.
So you've got to follow [INAUDIBLE].
DAVID MALAN: OK, we'll just start--
this is CS50's Vim stream, all
the things you can do with Vim.
COLTON OGDEN: Yeah,
NOTSUREOFCOURSE, and R3MOTE,
thank you very much for following.
Appreciate it.
DAVID MALAN: OK, cool.
All right, so let's see if we
can't now automate or encode
what we were doing with WeasyPrint,
then add some syntax highlighting,
and then we're pretty
close to having render50.
COLTON OGDEN: To the
actual application, yeah.
DAVID MALAN: Even though
the actual one took
500 lines and a lot of weeks or days.
OK, so let's go back into my code.
Oops, that's my other file.
Let's do Vim of cs50.--
nope-- of render50.py.
Get rid of these test lines.
And now let's go ahead and actually
do something interesting here.
So I want to actually use WeasyPrint
to take as-- input that file,
and I want to go ahead and
output that file as well.
So we're going to have
to read in the file.
So how, in Python, would you
read in the text of a file?
COLTON OGDEN: With open.
We're using a context manager,
so [INAUDIBLE] name of the file.
DAVID MALAN: With open,
and open input, and read.
COLTON OGDEN: As f, or
file, and then that way
it'll close automatically
when you're done with it.
You don't have to
manually close the file.
And whatever work you need to do on
this, if you're going to read the lines
or read it as text, you could
say file.read, or file.readlines.
DAVID MALAN: OK, so do that.
Lines gets file.read, and
just read the whole thing in.
And then honestly, just
as a sanity check here,
let's just print out the lines.
So if I run this program
now, hopefully I'll
see the entire contents of my
input, if we did this right.
COLTON OGDEN: As one
string in this case, yeah.
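
Here is a small sketch of that read step. The helper read_source and the temporary hello.c file are assumptions made so the example runs on its own, and the explicit utf-8 encoding is an addition not mentioned on stream:

```python
import os
import tempfile


def read_source(path):
    """Read a whole text file into one string; the file closes automatically."""
    with open(path, encoding="utf-8") as file:
        return file.read()


# Write a tiny C file to a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "hello.c")
    with open(path, "w", encoding="utf-8") as file:
        file.write("int main(void)\n{\n}\n")
    text = read_source(path)

print(text, end="")
```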
DAVID MALAN: So let's do this.
And Python of render50.py.
That's my usage.
My input is going to be cs50.c.
So I'm not using that
intermediate HTML file anymore.
And eventually, I want my
output to be cs50.pdf, even
though I'm not going to get there yet.
Enter.
OK.
COLTON OGDEN: Well, it seems to work.
DAVID MALAN: So we're
baby steps, and we're
one step closer to having the program.
So that's good.
So now we somehow need to tell
WeasyPrint to actually turn
this string into a PDF.
But we also want to
wrap it with some HTML.
So there's a bunch of
ways we could do this.
But just for the sake of simplicity,
let's go ahead and just start this.
So let's go ahead and say
the HTML that I actually want
is going to be literally a doc type.
And then let me go ahead
and concatenate onto that.
For instance-- actually,
let's give myself a new line.
COLTON OGDEN: You can also
use triple quoted string to--
DAVID MALAN: Yeah, we could do that.
COLTON OGDEN: If you wanted to do that.
DAVID MALAN: Yeah, let's
just show it this way.
And then we can--
eventually, we're going
to clean it up, 'cause
you're not going to want all
this concatenation in your file.
But it's perhaps the simplest
way to show it also line by line.
COLTON OGDEN: OK.
DAVID MALAN: So let's, hey
browser, here comes my HTML tag.
Hey browser, here comes my head tag,
and new line, just for good measure,
even though strictly speaking, we
don't need these last two new lines.
HTML plus equals, now let me give me--
honestly, I don't need the
title because I'm making a PDF,
so I'm going to leave that alone.
I am going to give
myself a style tag here.
And I'm going to go ahead
and close that tag, even
though we won't put anything in it yet.
And I'm going to go ahead
and close the head tag now,
even though there's nothing in it.
And I'm going to go ahead
and start now a body tag.
And now I'm going to go ahead
and concatenate on the lines.
And then I'm going to
go ahead and concatenate
on close body, and a new line.
And then-- oops, I'm using C. Then--
though incorrectly.
Then let's go ahead and
concatenate one last thing--
HTML.
And just to be nit picky, a new line.
COLTON OGDEN: So this is
unenclosed with a pre, too,
so this isn't even
going to be monospaced.
This is just literally
the text of your--
DAVID MALAN: Indeed.
This is like version 1 where we
started without the pre tag, I think.
Or now, even without a tag.
So let's just do a sanity check now.
Let's actually print out the HTML
that I've just dynamically constructed.
Save that.
Python of render50.py,
passing in cs50.c.
Outputting cs50.pdf.
Let me clear my screen.
So here we go.
Run.
All right, so it's almost the same,
but sure enough, there are those tags.
I didn't bother indenting them,
but it doesn't matter for HTML.
And if we scroll up, up, up, up, up,
keep going, this is all my C code,
C code, C code, C code,
thick, C code comments.
There we go.
There's all that starter.
COLTON OGDEN: Very poorly
formatted, by the way.
DAVID MALAN: Yeah, but honestly, it
really doesn't matter in this context,
because a human, and a browser
even, are never going to see.
It's all going through the library.
COLTON OGDEN: [INAUDIBLE] this HTML.
DAVID MALAN: So let's go back
here and let's add the pre tag,
like you alluded, so that
we're ready for that.
So plus equals-- oh, quote
it, pre, backslash n.
And then let's close that over here.
And as Colton said, we don't need all of
these lines of plus equals plus equals.
Could use a triple-quoted
string and just do it manually.
But honestly, once we
really get clean here,
we're probably going to want to
use a templating library maybe.
Or honestly, in render50, what I do is
I just have long single lines of HTML,
because it really is just throw away
stuff that I'm generating quickly.
And I didn't want to overengineer it by
having a whole template for, honestly,
what's a pretty small file.
OK, so let's go ahead now and run this.
Oh, right.
Let's still print the HTML.
Let's run this one last time.
And I'm just going to copy
my command from before.
And now we have a dynamically
generated HTML file,
but using code, not using Vim
and manually editing the C file.
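
Put together, the concatenation built up line by line might look something like this sketch. The function name build_html is hypothetical, and the html.escape call is an addition that was not part of the stream; without it, something like #include <stdio.h> would be read as an unknown tag:

```python
import html


def build_html(source, css=""):
    """Wrap source code in a minimal HTML page, one concatenation per line."""
    page = "<!DOCTYPE html>\n"
    page += "<html>\n"
    page += "<head>\n"
    page += f"<style>{css}</style>\n"
    page += "</head>\n"
    page += "<body>\n"
    page += f"<pre>{html.escape(source)}</pre>\n"
    page += "</body>\n"
    page += "</html>\n"
    return page


print(build_html("#include <stdio.h>\n"))
```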
COLTON OGDEN: Yeah.
DAVID MALAN: So we've
already made progress, even
though we're reinventing a wheel here.
COLTON OGDEN: What comes after CS50,
and can foreigners participate?
DAVID MALAN: Really good question.
We can paste a few URLs here,
actually, and field this.
CS50 has a few follow-on courses
now, one taught by Colton.
So if you want to go to
edx.org/cs50, you'll see--
COLTON OGDEN: [INAUDIBLE] in there.
And then-- oh, [INAUDIBLE].
DAVID MALAN: That has
all of them, actually.
Oh, actually, Brenda posted it there.
So let's see what Colton just posted.
COLTON OGDEN: Oh, nice.
I posted the [INAUDIBLE].
DAVID MALAN: OK, we'll
paste all of them.
So we have /web for the web course, and
/mobile for Jordan [INAUDIBLE] React
native course as well.
So those are three
good follow-on courses.
COLTON OGDEN: They love VS code.
Just type exclamation
point and that's it.
Oh yeah, because there
are certain text editors--
I don't know if maybe Vim
has something like this.
You can type, basically, in VS
code you can just type HTML tab,
and it'll just give you an
entire template for free.
DAVID MALAN: Oh, yeah.
COLTON OGDEN: Vim probably
has something like that too.
DAVID MALAN: I'm sure.
I just don't have fancy Vim plugins.
I do it old school.
COLTON OGDEN: Yeah.
And aside from that, I think we're--
oh, and JMC516 saying,
yay, David is here.
DAVID MALAN: Nice to see you too, JMC.
COLTON OGDEN: Hope your week was good.
DAVID MALAN: And these courses,
are they equally as short--
oh, OK.
So that's being asked--
we'll answer that publicly anyway.
Are these three other courses
equally as short as CS50?
Well, I'm not sure I would call CS50
short, but they're no longer than CS50.
And they're probably a little
less work, because there
are fewer projects over multiple
weeks, whereas CS50 is week after week
after week of work, at least if
you're doing it in real time.
COLTON OGDEN: It prepares
you for anything, any course.
DAVID MALAN: Perhaps, indeed.
Yeah.
COLTON OGDEN: Why are there
sometimes-- why is it sometimes
that I have to specifically read a file
in bytes with rb instead of just r?
DAVID MALAN: Yeah, so rb specifies
binary file instead of ASCII.
And this has to do with
encoding issues, because if you
have actual text in a
file and you want Python
to be able to distinguish ASCII
characters from longer Unicode
characters and different encodings,
you have to help the computer do that.
If you just tell it binary, it doesn't
know if this byte or these bytes
is a character or binary data.
And so long story short, when it is
truly just binary data, zeros and ones,
you tell that to Python
so that it doesn't
try interpreting the characters as
letters or symbols on a keyboard.
COLTON OGDEN: If you read
like an executable file--
DAVID MALAN: An executable file,
a forensic file, image file,
a video file, anything that's not text.
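
The difference is easy to see by reading the same file both ways. A small sketch, with a made-up sample.txt written to a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "sample.txt")
    with open(path, "wb") as file:
        file.write("café\n".encode("utf-8"))

    # Text mode decodes the bytes into characters using the given encoding.
    with open(path, "r", encoding="utf-8") as file:
        text = file.read()

    # Binary mode hands back the raw bytes, with no decoding at all.
    with open(path, "rb") as file:
        data = file.read()

print(type(text).__name__, len(text))
print(type(data).__name__, len(data))
```

The two lengths differ because the accented letter is one character but two bytes in UTF-8.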
COLTON OGDEN: Anything that's not
encoded, I guess, as text, yeah.
And there is a binding in VS code
for the templatization, et cetera.
And the key bindings
autoformat in Vim, actually.
DAVID MALAN: Yeah, Vim keystrokes can be
used in lots of different editors here.
All right, so shall
we go ahead and start
to use WeasyPrint programmatically?
Then we'll figure out how
to get Pygments in there.
And then I think we can kind of look
at it altogether in render50 itself.
COLTON OGDEN: Sounds good.
DAVID MALAN: All right,
so with WeasyPrint,
let's see if, actually, we can
give folks a good starting point.
Because honestly, it's
been so long, I don't even
remember without looking at my code.
I'm going to go back to
WeasyPrint.org, which
is the website for this
free library that we use.
It is both a command in
Linux and Mac OS and Windows
that you can use to generate
PDFs, but it's also a library.
And that's the part I
want to remember now.
So I'm going to go to
Documentation up at the top here.
And the author is really
into their animations here.
COLTON OGDEN: It's a nice
looking website, though.
I do like it.
DAVID MALAN: It's very pretty.
I just really want to
get to the good stuff.
Beautiful examples, source files.
So here's how you install it.
And I think we mentioned this earlier--
pip install WeasyPrint,
pretty straightforward.
Oh, and as an aside, you can
use it to print web pages, too.
COLTON OGDEN: Oh, that's cool.
It will actually use like wget, or
underneath, curl under the hood.
Or I guess [INAUDIBLE]
wget, probably, right?
DAVID MALAN: Well, the equivalent.
It will grab all the source code
and then render it for you, which
is pretty cool.
If you wanted to like make sort
of a PDF screenshot of a web page
at a command line.
So let's go to, let's
see, Installation Guide.
The one catch with WeasyPrint is that
it does have a bunch of dependencies.
And so let me defer to the
documentation as to what those are.
Frankly, this is one
of the reasons why I
tend to run it in CLI50, because
our own command line environment has
all this stuff packaged up for you.
However, render50 does pull these
things in as needed as dependencies,
but here's where you might run into
some hiccups, especially on Windows,
I think especially,
and maybe even Mac OS,
where there's just not as rich of an
ecosystem for some of these libraries.
COLTON OGDEN: WEBSTREAM23,
Nate says, I'm here.
Sorry I was late.
Hey, David.
What is-- I'm sorry,
but what is render50?
In a nutshell--
DAVID MALAN: OK, Brenda's
pasted the URL there.
Just a command line tool for
creating PDFs out of source code,
in a very pretty and useful way.
COLTON OGDEN: Yeah, definitely
check out the early part of the VOD
too, if you want to see our
scaffolding from nothing to here.
DAVID MALAN: Indeed.
All right, so let's see if
we can go to the API here.
And here's where I spent a bunch of
my time trying to figure out what
functions are available in WeasyPrint.
And so if you read through the
API page on their documentation,
you'll see how to use
the command line API.
But if we scroll down to
Python API, you can actually
see where the actual
functions are here too.
So let's try to start
integrating this into our code.
I'm going to go back to my render50.py.
And at the top, I'm going
to have another import here.
I'm going to go ahead and say,
from weasyprint import CSS
and HTML, both of which are
classes that come with WeasyPrint.
I'm going to skip some
error checking here,
because I think that's sufficient
to get us started here.
And let's go ahead now and actually use
WeasyPrint to create a PDF dynamically
out of this code.
So here, let me just scroll down
to where we do this in render50.
And let me pull up the
corresponding documentation.
I think if we go back a bit, let's
see if we have a simple example here.
Documentation.
Oh, that's where we were.
Documentation, Design,
Full Documentation.
Do I want to go to Samples?
Let's just see the samples and see if
they give us some sample Python code.
Nope, these are just sample HTML
files and the resulting PDFs,
which are pretty nice.
So I'm, indeed, going to go back to
Documentation and go to Read the Docs.
Not ours, but theirs.
And let's see, if I scroll here, here
we go, Tutorial as a Python library.
That's what I want.
So this is exactly where I just started.
So from weasyprint import HTML.
And you'll see that it can
be as simple as this, calling
the HTML class, the constructor, passing
in the URL, which isn't what we want,
but this is one usage.
And then just calling a
write_pdf method on it.
COLTON OGDEN: So it looks like
it takes either a URL or a file.
So it can't take, in this case,
necessarily, a block of string data.
DAVID MALAN: Not just yet.
It can, it can.
But that's not what they're
actually doing here.
So let's see, what we want to do
here is this is opening a file.
And that's the same as actually
using a named parameter.
Same as using a URL here.
And here we're just doing standard in.
But now it's getting interesting.
If you have a byte string or
Unicode string already in memory,
you can pass that in, but
the argument has to be named.
So using literally
string, you can pass in.
And this is great, because this is
what you and I have been making.
You can pass in raw HTML.
COLTON OGDEN: Nice, I like it.
DAVID MALAN: All right, so let's go
ahead and pull all this together.
Down here, I'm going to go ahead
and not print that out anymore.
I'm instead going to go ahead
and call this my document.
And I'm going to go ahead and
instantiate this HTML class.
And I'm going to pass in
a string of HTML, which
is exactly what I constructed.
And then we just have to figure
out how we take now this object
that we have in memory and
actually convert it to a PDF.
And you'll see here that
write_pdf will do that for us.
COLTON OGDEN: OK.
DAVID MALAN: So let's
go ahead and use that.
And let's see, write_pdf
takes a couple of arguments.
One is the file name.
One is optionally the
CSS that we want to use,
but we won't even bother with any CSS.
So let's just try that.
COLTON OGDEN: And it
looks like it's implying
you can use multiple style sheets too.
DAVID MALAN: Yeah, a whole
list of them, which is nice.
Which is consistent with
having multiple CSS files.
All right, so let's
go ahead and do that.
So let's call now document.write_pdf.
I want to make the file
name be ultimately cs50.pdf,
but remember that we
took that as an argument.
So we can just say write_pdf of output.
And then let's go ahead
and say done, and see
if we're actually done with this.
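A minimal sketch of the script at this point, assuming the markup is just the file's contents wrapped in a pre tag. The names `build_markup` and `render` are illustrative and not taken from render50, and WeasyPrint is imported inside `render` only so the sketch loads on a machine without it.

```python
def build_markup(source_text):
    """Wrap raw source text in a <pre> element (assumed page structure)."""
    return "<pre>" + source_text + "</pre>"


def render(input_path, output_path):
    """Read a source file and write it out as a PDF via WeasyPrint."""
    # Third-party import kept local so this module loads without WeasyPrint.
    from weasyprint import HTML

    with open(input_path) as f:
        markup = build_markup(f.read())

    # HTML already in memory has to be passed by name, as string=...
    document = HTML(string=markup)
    document.write_pdf(output_path)
    print("done")
```

Calling `render("cs50.c", "cs50.pdf")` would correspond to the command run next.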
All right, let's go ahead and save this.
I'm going to clear my screen.
And I'm going to go ahead and run
Python of render50, cs50.c, cs50.pdf.
Cross my fingers, as I
always do when coding.
It's taking a moment, taking a moment.
Maybe it's done.
We shall see.
Let's go into Preview.
And OK.
COLTON OGDEN: Nice.
DAVID MALAN: Now, it's ugly.
COLTON OGDEN: Back to where
we were, but we've reduced
our workload by a tremendous amount.
DAVID MALAN: Yeah, by writing code
for like half an hour, but yeah.
COLTON OGDEN: It's all
automated now, which is nice.
DAVID MALAN: Right, but we've
reintroduced those same problems.
So we need some CSS.
We want to rotate this thing 90 degrees.
And notice that other problem
is back, which is here.
COLTON OGDEN: Yeah, we're not
escaping the certain tags, yeah.
DAVID MALAN: Yeah, so let's
fix the tags thing first,
because this feels like
it should be pretty easy.
How do I go about doing this in Python?
Do you know?
COLTON OGDEN: Getting rid
of the tag, or are you
going to approach it from the
regular expressions point of view?
DAVID MALAN: Well, ideally.
But we want to use, ideally,
a library to do this.
COLTON OGDEN: Right, so you could
use re, yeah, in this case.
DAVID MALAN: Oh, well not even.
We don't have to do it.
COLTON OGDEN: Oh,
escaping HTML, yeah, yeah.
OK, sorry.
I thought they meant the same that
we did algorithmically before.
DAVID MALAN: No, there's a few
ways we can do this, for instance.
So this library will
apparently escape code for us.
And I'm trying to remember
what we do in render50.
Let me just do a quick check, because
it'd be nice to be consistent.
COLTON OGDEN: WEBSTREAM is saying,
finally, an example of HTML
in Python, very excited.
Is there a reason, says
MKLOPPENBURG, for importing
the libraries in a try in render50
instead of doing this on top?
DAVID MALAN: In a--
oh.
COLTON OGDEN: Like a try [INAUDIBLE].
DAVID MALAN: Oh, are you looking--
is [INAUDIBLE] looking at the actual--
COLTON OGDEN: Yeah, MKLOPPENBURG
is referring to, I think,
the actual final distro.
DAVID MALAN: Yes, because we wanted--
in the actual render50 source
code, which we'll look at in a bit,
we actually try to
import some libraries,
because we want to detect
in a user friendly way
if the user has forgotten to
install some of the dependencies.
And we can do that best
by trying to import
and then responding in a useful
way if the human has forgotten.
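One way the try-and-import idea might look, as a sketch rather than render50's actual code. The helper names `missing_modules` and `require` are hypothetical, and the suggested pip command assumes each module name matches its package name, which is not always true.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def require(names):
    """Exit with a readable message instead of a raw traceback."""
    missing = missing_modules(names)
    if missing:
        raise SystemExit(
            "Missing dependencies: " + ", ".join(missing)
            + ". Try: pip install " + " ".join(missing)
        )
```

A script could call `require(["pygments", "weasyprint"])` before doing any real work.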
COLTON OGDEN: Makes sense.
Is there something for Bash, like
what we did in the Dockerfile for Docker,
write a file and it downloads
every dependency that's needed?
Brian showed something like that for
Heroku in one of the recent streams.
DAVID MALAN: Indeed.
Skipping ahead a bit.
We'll come back to this.
But let me pluck that off now.
Yes, in render50 itself, in
the actual distribution, which
we'll look at in just a bit,
there's a setup.py file, which
is a standard Python packaging file.
And you can specify literally this.
What does this installation require?
And you'll see a list of all of
the libraries, the dependencies
that we want Python to pull in for you.
This is similar in spirit
to requirements.txt,
which pip uses as well.
So yes, this can all be automated.
COLTON OGDEN: Something
like node_modules,
too, in like a node environment even.
Like it's a pretty common paradigm.
DAVID MALAN: Exactly.
And let me just search for one more
thing, because I want to be consistent.
But we don't necessarily have to be.
I'm just flipping
through the source code
here for-- do you want to take a look at
any comments that are still coming in?
COLTON OGDEN: I think we've plucked
off all of the current questions.
I think we're all caught up today.
DAVID MALAN: All right.
So I'm going to go ahead here,
and what do we want to do?
OK, let's just do this another way.
Or even if it's not quite consistent.
String literal, here we go.
Our old friend Stack Overflow.
cgi.escape, it looks like we can use.
So let's see if this is good enough.
But this is not, I think,
what I use in render50.
We do it a little more robustly.
But let's go ahead and try this.
Let me go into my code.
And at the top, I'm going to
open up render50.py again.
At the top I'm going to go ahead and
import cgi, Common Gateway Interface.
And then in here, when I
concatenate on the lines,
I'm going to do cgi.escape
of those lines.
So I pass as input the lines, which are
from the raw file, which we read in,
and hopefully escape all
those dangerous characters.
Let's go ahead and rerun that
same render50 command as before.
OK, so this is why I
didn't use cgi.escape.
It's deprecated.
COLTON OGDEN: It's html.escape.
Well, that would cause issues,
though, because we're already using--
I guess you could do from html import
escape, and then I would [INAUDIBLE].
DAVID MALAN: Yeah, that's fine.
So let's do that, actually.
Let's do this properly,
like they're saying.
So from html import escape.
And you're right, now down here I don't
say cgi.escape, I just say escape.
COLTON OGDEN: Yep.
DAVID MALAN: Let's save that.
Clear my screen and
rerun Python on render50.
COLTON OGDEN: [INAUDIBLE]
DAVID MALAN: There you go.
Just a rabbit hole.
OK, done, question mark.
That's for me.
That's not my interpreter
questioning my abilities here.
Let's go back to this.
And scrolling, scrolling, scrolling.
All right.
COLTON OGDEN: Nice, looks good.
DAVID MALAN: So programmatically, right?
No find and replace like
we did in Vim before.
COLTON OGDEN: Yeah.
This will fix all issues.
This won't be just the
context of include statements.
DAVID MALAN: Exactly.
COLTON OGDEN: This will work for
Boolean expressions and other areas.
DAVID MALAN: Anywhere there's an
angle bracket that's dangerous.
Yep.
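A small sketch of the escaping step using the standard library's `html.escape`, which replaced `cgi.escape` (removed outright in Python 3.8). The sample lines are made up for illustration.

```python
from html import escape

lines = [
    "#include <stdio.h>",
    "if (a < b && c > d)",
]

# Angle brackets and ampersands become entities, so the renderer
# shows them as text instead of treating them as tags.
escaped = [escape(line) for line in lines]
for line in escaped:
    print(line)
```

Importing just `escape` also avoids clashing with a variable that happens to be named `html`.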
All right, so now let's
fix the CSS style.
And then we're almost there.
And frankly, in retrospect,
I'm wondering why
I struggled so long to write render50.
COLTON OGDEN: That's OK.
DAVID MALAN: But here we go.
Let's go in here and add some CSS.
So I'm going to go ahead
and preemptively say
that my CSS is going
to have a few strings.
And what did we want earlier?
Well, we had like the @page rule.
And I think we had a
margin of 0.5 inches.
And we had a page size of letter,
landscape for the US here.
And I'm just doing this as a one-liner.
I could pretty print it like I
did before, but again it's code.
I could put this in a
separate file, a template.
But honestly, it's so short, that ah,
it just feels fine, to me at least,
to bake this into my program itself.
COLTON OGDEN: Yeah, if this
were a massive set of style,
probably the separate page, or a
separate file would make sense.
DAVID MALAN: Yeah, and I'm
never going to change this.
It's not like I'm collaborating
with a colleague who's
going to make the CSS change over time.
This is kind of a one-off, so
I'm OK with this if you are.
But I am going to specify
that line break stuff.
So I want to concatenate
onto this for my pre tag
that I want to do-- what were
those lines that we had before?
COLTON OGDEN: It was--
I forget that exact style.
DAVID MALAN: Yeah, me too.
Here we go.
COLTON OGDEN: [INAUDIBLE] use it.
DAVID MALAN: It was overflow-wrap.
So overflow-wrap is going to
be break-word, break on words when you can.
And then lastly was white-space--
oops, white-- it's got weird
dashes in weird places.
white-space is pre-wrap.
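For reference, the rules described so far might be written as a Python string like this. It is a sketch: the variable name `properties` is the one adopted a little later in the stream, and the exact spacing is an assumption.

```python
# Two rules: page geometry, then wrapping behaviour for the <pre> block.
properties = (
    "@page { margin: 0.5in; size: letter landscape; } "
    "pre { overflow-wrap: break-word; white-space: pre-wrap; }"
)
print(properties)
```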
COLTON OGDEN: Tune in next
week for our CSS stream,
which we actually are
having on Thursday.
DAVID MALAN: Oh, yeah.
That would have been a
lot-- this is something
called CSS, Cascading Style Sheets.
So this will make sense
next week, I guess.
OK, I think that's enough.
And then as I recall, this
thing takes what, stylesheets?
Was that the argument?
COLTON OGDEN: Yeah.
And then it takes in a list.
DAVID MALAN: In a list.
So I'm going to use square
brackets because I only
have a list of one thing.
I could actually have declared
two variables, like CSS1 and CSS2,
though that feels a little
messy, and passed in a list.
But I'm just going to
make a list of one string.
COLTON OGDEN: We're on CSS3 currently.
DAVID MALAN: That's funny.
Version 3 of CSS.
All right, now I'm going to--
oh, wait.
But that goes in the print
function, the write function.
So let me move this over here.
I goofed.
I think that goes here instead.
COLTON OGDEN: JSNOBS,
I love inline CSS.
DAVID MALAN: Here we go.
I'm going to now exclamatory
emotion, done, hopefully.
This is where I'm going to
screw up my [INAUDIBLE].
COLTON OGDEN: You might be
shooting yourself in the foot.
DAVID MALAN: I know, I'm
going to screw up now.
Here we go.
Dammit!
OK, so what did I do wrong here?
No such file or directory.
Oh.
COLTON OGDEN: I think it might expect
it to be a file reference there,
maybe, instead of a string.
You might have-- oh, you have
to do the CSS constructor,
remember, for the CSS object.
DAVID MALAN: You're quite right.
COLTON OGDEN: Documentation.
DAVID MALAN: You're quite right.
Let's go back to the documentation.
See, I naively--
I looked into the-- stared into the sun.
COLTON OGDEN: I mean, I screwed up.
I forgot as well.
DAVID MALAN: I totally forgot this step.
As Colton's pointing out,
we need a CSS object.
So let's go ahead and do that.
Let's go ahead and call these
properties, just to distinguish them.
Properties.
And then let's, yes, go ahead and
create a CSS object called CSS.
And the input is called string.
And that string is going
to be my properties.
And now, down here, what I
pass in is that CSS object.
Thank you.
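A sketch of the corrected call, assuming WeasyPrint's `CSS(string=...)` constructor and the `stylesheets` argument of `write_pdf`. The function name `write_with_styles` is hypothetical, and the import is local only so the sketch loads without WeasyPrint installed.

```python
PROPERTIES = "@page { margin: 0.5in; size: letter landscape; }"


def write_with_styles(markup, output_path, properties=PROPERTIES):
    """Render markup to a PDF with one in-memory stylesheet applied."""
    from weasyprint import CSS, HTML  # third-party, imported lazily

    # A bare string in the stylesheets list is taken as a filename or URL,
    # which is consistent with the "No such file or directory" error above.
    # Wrapping it in CSS(string=...) marks it as stylesheet text.
    css = CSS(string=properties)
    HTML(string=markup).write_pdf(output_path, stylesheets=[css])
```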
So in retrospect, I don't like the
variable names I've come up with here.
Like I used document
for HTML and HTML here.
So we'll clean that up in just a bit.
Let me save this.
Let me go ahead and clear
my screen and rerun it.
And definitely-- I didn't cross my
fingers before, that was the problem.
COLTON OGDEN: That was your problem.
DAVID MALAN: All right, so now let's
go back to Preview to look at the PDF.
And OK, now it's getting pretty cool.
COLTON OGDEN: Yeah, so
definitely landscape.
DAVID MALAN: Landscape mode.
And decrease the font size, just
to demonstrate one other property.
Let's go into that.
COLTON OGDEN: We can test the wrapping
too if you go down a little bit.
DAVID MALAN: Oh, yeah.
Let's keep going, lest we be sweeping
something under the rug here.
Yep, it's wrapping down and
toward the middle there.
All right, let's just make a font size
now of font size 10 point, we decided.
Now even Vim is wrapping,
but that's fine.
Let me go ahead and re-render this.
OK, reload.
And now it's just a little smaller.
COLTON OGDEN: Nice, looks great.
So CSS is working.
DAVID MALAN: Yes, so I think we
just need to integrate Pygments
and then we're kind of done.
COLTON OGDEN: Yeah, yeah.
DAVID MALAN: So how do we do this?
Well, Pygments has a
pretty rich API too.
If we go into Pygments
here, let's go under Docs.
And Quick Start I like the sound of.
I tend to be a little impatient.
I don't want to read
all the documentation,
so just give me a quick
demo to get me up and going.
Let's zoom out a bit.
Oh, look at this.
Perfect.
So here's an example for how
you highlight Python code.
That's perfect, except C code.
So we import a highlight function.
We import what's called a lexer.
And we import what's called a formatter.
So a lexer is kind of
like a parser. It's
one step in parsing that
actually tokenizes the input
into recognizable, distinct things.
So like an argument to a function, a
variable declaration, a function call,
parentheses, semicolons.
So those kinds of units
of logic in a language.
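To make the tokenizing step concrete, here is a sketch that asks Pygments' C lexer for its tokens directly. The sample program is made up, and the exact token types printed can vary between Pygments versions.

```python
from pygments.lexers import CLexer
from pygments.token import Token

code = "int main(void)\n{\n    return 0;\n}\n"

# get_tokens yields (token type, text) pairs; drop whitespace-only ones.
tokens = [
    (ttype, value)
    for ttype, value in CLexer().get_tokens(code)
    if value.strip()
]
for ttype, value in tokens:
    print(ttype, repr(value))
```

Each pair is what a formatter later wraps in markup, one token at a time.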
COLTON OGDEN: It's going to sort of map
those, like the HTML formatter is going
to look at all those individual
tokens, wrap them with the HTML styling
that it needs to, or whatever,
however that's going to work.
DAVID MALAN: Yeah, exactly.
So let's go ahead and copy some of this.
But we don't have Python code, so we're
going to have to look up something
to figure out how to do this and see.
I'm going to go into my code up here.
And I'm going to go ahead and import
from Pygments those three things.
Maybe it's called
CLexer, but I'm not sure.
So I'm going to go into
Lexer and Formatter Help.
And I'm going to go down here.
I want to see available
lexers, but I don't
think I'm going to see it on this page.
Where are my lexers?
COLTON OGDEN: There's
a guess lexer function
which will sort of try and predict.
DAVID MALAN: There is.
And that's what we
actually do in render50.
We infer with high probability
what the language is.
We don't require you to tell us.
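One way to infer the language instead of hard-coding a lexer, sketched with Pygments' `guess_lexer_for_filename`. This is not necessarily how render50 does it; the `pick_lexer` helper and the plain-text fallback are assumptions.

```python
from pygments.lexers import TextLexer, guess_lexer_for_filename
from pygments.util import ClassNotFound


def pick_lexer(filename, text):
    """Guess a lexer from the filename and contents, else plain text."""
    try:
        return guess_lexer_for_filename(filename, text)
    except ClassNotFound:
        # No lexer claims this filename, so fall back to no highlighting.
        return TextLexer()


print(pick_lexer("cs50.c", "int main(void) { return 0; }\n").name)
```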
OK, lexers for C and C++ language.
OK, it is called CLexer.
So I could've guessed.
COLTON OGDEN: CPP would have been
a little bit trickier to guess.
Maybe somebody might do CPP all caps.
DAVID MALAN: Indeed.
So let's see if we can copy
that and go back up here.
It's probably exported
here, but let's see.
I might need to be a
little more specific
as to where we're importing it from.
But we'll come back to that.
And let me clean up some of my
variable names, because I really
don't like the mess that I made here.
So this, yes, is my HTML object.
But what can we go ahead and call this?
This is some of my properties.
Let's go--
COLTON OGDEN: To be clear,
this isn't your HTML object.
This is your string block.
DAVID MALAN: Exactly.
COLTON OGDEN: Document is your HTML--
DAVID MALAN: So I'm going
to call this markup here.
And let me go ahead and
do this programmatically.
So HTML-- whoops.
Do it again.
HTML space, a literal plus character.
Let me go ahead and change this
to markup and then a literal plus
character.
There--
COLTON OGDEN: Dammit.
DAVID MALAN: OK.
I'm not very good with my Vim
regular expression sometimes.
So we'll just fix that.
Sometimes doing things
manually is faster.
And then down here,
let me just fix this.
My string should be my markup.
So I like this now.
COLTON OGDEN: I like that better, yeah.
DAVID MALAN: My HTML--
and then actually, let's
call this my HTML variable
for concordance with the classes
in WeasyPrint that I'm using.
So now my HTML variable is the return
value of my HTML function call.
My CSS variable is the same
for my CSS function call.
And then I've got markup and properties
as my sort of intermediate steps.
I'm more OK with that.
COLTON OGDEN: Well,
the nouns are correct.
DAVID MALAN: Yes.
All right, so we've cleaned up the code.
Now we need to actually use Pygments.
And let's go back to that
demonstration a moment ago.
Looks pretty simple.
We just need the code
like this, and then we
need to go ahead and
highlight it in some way.
Print highlight code.
So how do I go about doing this?
So let me go ahead and say--
we don't want to print
the HTML just yet.
We want to go ahead and
syntax highlight it.
So I'm going to call
it, say highlighted.
COLTON OGDEN: Do we want to--
do we want to do this before
it gets passed to markup?
DAVID MALAN: Yes, indeed.
Right?
Because we don't want to
syntax highlight my HTML.
Very good point.
So let's go ahead and do that.
I'm going to go ahead
and escape my code, yes.
But I want to go ahead and
highlight that resulting code also,
passing in the code itself.
Now I need to specify
my lexer, so C lexer.
COLTON OGDEN: Will it be messed
up if you escape it first?
Because it's going to replace the
less thans with the less than--
DAVID MALAN: Potentially,
depending on the lexer.
But the problem here is if we render
it-- if we syntax highlighted it,
we're going to have a whole bunch
of spans we're going to see.
And the problem is if we escape that,
we're going to escape all of the spans.
And we're literally going to see all of
the HTML that Pygments is generating.
So we shall see.
And this is why it's not
necessarily 100% robust.
But let's see how far this gets us.
So the reason I'm doing this
is I'm passing to highlight
function the code that
I want to highlight,
saying parse it as though it's C, and
use the HTML formatter as the output.
Why?
Well, I'm just following the
documentation here, per this example.
COLTON OGDEN: Yeah,
that takes in the lexer
and the formatter constructors
as second and third parameters.
DAVID MALAN: Exactly.
And you know what?
I think I can retract.
Now I know why I couldn't
figure out before what
function I'm using to escape code.
I'm using Pygments.
I just realized.
So notice that, per the documentation,
it says that sample code like that
is going to print something like this.
So a div with a highlight
class, then the pre, then
some spans with these weird class names
surrounding things like the function,
surrounding things like the strings.
And notice, it's giving
me the quotes for free.
COLTON OGDEN: It
already does it for you.
DAVID MALAN: So I'm going
to go in and tidy this up.
I don't need the pre, because
I'm getting that for myself.
And I don't need the escape anymore.
I can just pass in the
lines of code, which means
I don't need to import this library.
Oh, I also broke it anyway up there.
And now, in retrospect,
I realize this is
why when I was Control-F-ing
before in the GitHub repo,
I couldn't find it because I'm getting
that functionality from Pygments.
COLTON OGDEN: That makes sense, OK.
That's much simpler.
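A short sketch of why the separate escape call becomes unnecessary: Pygments' HTML formatter escapes the source text as it wraps tokens in spans. The one-line sample is made up.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import CLexer

code = "#include <stdio.h>\n"

# The result is a <div class="highlight"> containing a <pre> of spans,
# with the angle brackets from the source already turned into entities.
fragment = highlight(code, CLexer(), HtmlFormatter())
print(fragment)
```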
DAVID MALAN: So let's save
this and go ahead and run it.
All right, so HTMLFormatter
is not defined.
I think I just must
have made a typo then.
So oh, it's HtmlFormatter, lowercase tml.
So that's my fault. I got a
little too nit-picky myself.
COLTON OGDEN: I would have done
it the way you did, all caps,
but I guess that makes sense.
DAVID MALAN: All right, so let's
now save it and rerun it again.
Oh, cross my fingers.
OK, good.
COLTON OGDEN: Good thing you did that.
DAVID MALAN: Just in time.
Open this up.
And now, interesting.
Not actually working.
COLTON OGDEN: Yeah, it's not
syntax highlighted at all.
DAVID MALAN: All right,
so this is interesting.
But how do we go about solving this?
Well, you know what, if the
Pygments library's purpose in life--
that was an amazing alliteration--
does not-- it's supposed
to generate HTML for me.
Let's introduce a new baby
step where we print out,
like we did a little bit
ago, like the actual HTML
we think we're turning into a PDF.
COLTON OGDEN: OK.
DAVID MALAN: So rather than just
write this PDF now, let's go ahead
and print out first,
if we may, the markup.
And then let's just sys.exit
right here and just get out.
OK, no done message this time.
All right, OK.
I've just printed it out.
And notice there's a huge amount
of CSS here-- or of HTML here now.
And it's weird stuff.
All these spans with all these classes.
But that's what Pygments
is doing for you.
It is using regular expressions
to match what looks like C code.
And it's just surrounding them with
these weirdly named classes, which
you can think of as just red,
green, blue, purple, yellow.
But they're configurable.
And I don't know what the
specific class names mean.
They're short and succinct, so that
you're not bloating the code too much,
but that's what Pygments
is doing for you.
So this is good.
This suggests that we are,
in fact, printing out--
generating actually-- all of this HTML.
But there's no colorization happening.
So what might be missing?
COLTON OGDEN: So I'm guessing we haven't
defined the colors for the classes.
DAVID MALAN: Yeah.
COLTON OGDEN: So the
classes are saying--
DAVID MALAN: Yeah, we
don't have a style sheet.
COLTON OGDEN: --we're here, but
yeah, we don't have a style sheet.
DAVID MALAN: Indeed.
So the way we do this-- let me see how
we do this in render50 for consistency.
If I go in here and go into render50
itself, actually, I do have it here.
Let me see if I can find it real quick.
COLTON OGDEN: MKLOPPENBURG
was hypothesizing,
are the comments messing it up?
Not quite.
DAVID MALAN: Are the
comments messing-- nope.
And in fact, this is just one
missing piece of code on my part.
And actually, here I just found it.
So I'm going to go into my code here.
I need another style sheet.
Right here I have just my one that
I had been making from before.
So I'm going to go ahead and
call this, let's say, CSS--
we'll call it CSS1 for now.
And we're going to do
one other, CSS2, get CSS.
And now I want to pass into this a
CSS that I'm going to get dynamically
from Pygments itself.
COLTON OGDEN: In the URL?
DAVID MALAN: Using-- no, it
doesn't have to be a URL.
COLTON OGDEN: [INAUDIBLE]
DAVID MALAN: Yep, I'm going
to go ahead and say this--
string equals HtmlFormatter, get_style_defs
of the, quote unquote, .highlight
class, using the highlight class.
So what does this mean?
Let me go ahead and just now
go ahead and print out CSS2
and see what we have.
And then exit immediately.
So I'm asking, hey Pygments, in
your HTML formatter class, go ahead
and get all of your style definitions.
COLTON OGDEN: Specifically for
the highlight class only, though.
DAVID MALAN: Indeed.
So let me go ahead and save
this and rerun my code.
Oh, and it's printing this out.
How best to pluck this out?
Let us see here.
That's going to go ahead and do that.
Let's see, line numbers,
text lexer, HTML formatter.
All right, just trying to figure out
the easiest way to get this stuff.
Let's go ahead, and can I print that?
Let me try casting it
to a string and see what
happens, but I'm not sure
this is going to work.
Let me go ahead and run this.
No, I want to see the actual HT--
oh, I'm being stupid here.
Let's just do this.
Let's just call this
my actual string, S.
COLTON OGDEN: Oh, yeah.
We're wrapping in the
constructor for CSS.
Right, that makes sense.
DAVID MALAN: And let me just do this.
OK, so now I have a temporary variable
called S. I was overthinking that.
My apologies.
And let's just see
what comes out of this.
So now if I rerun my program, ah--
COLTON OGDEN: There we go.
DAVID MALAN: There I now
have a default style sheet.
So this has nothing
to do with themes yet.
This is just some default
colorizations for C.
And you can see that everything
inside this dot highlight class, which
is on my div per the
documentation, there's
all these one and two
character class names that
now have these hex codes for color.
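The temporary-variable step can be sketched on its own. `get_style_defs` returns plain CSS text, and the selector passed in is prepended to every rule; the variable name `s` follows the stream.

```python
from pygments.formatters import HtmlFormatter

# Ask the formatter for its default style rules, scoped to .highlight.
s = HtmlFormatter().get_style_defs(".highlight")
print(s)
```

Passing a different selector, such as `.foo`, would scope the same rules to a different wrapper class.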
COLTON OGDEN: So highlight
wraps the entire thing,
and then it has subclasses,
the MH and the MI?
DAVID MALAN: In a sense, yes.
COLTON OGDEN: Or not subclasses,
but like sibling classes.
DAVID MALAN: Yeah, this
wrapper class highlight
you can change to
whatever-- dot foo, dot bar.
The motivation is so that you can use
Pygments in a subset of your document
and isolate it to just
one div, for instance.
All right, so what is the point of this?
I can clearly extract from Pygments,
by default, all of the styles
that I want to apply.
But previously I wasn't.
So in short, I didn't have Pygments
fully talking to WeasyPrint just yet.
So let's go ahead and fix this.
I'm going to go back to
the way it was before,
where I had my second CSS object--
CSS1 and CSS2.
I'm going to go ahead
and get rid of my prints.
And I'm going to pass in now CSS1
and CSS2, thereby telling WeasyPrint,
here's my CSS, and also,
here's Pygments' CSS,
please combine them together,
just as CSS supports.
COLTON OGDEN: Nice.
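Putting the pieces together might look like the sketch below. The helper names `build` and `render` are illustrative rather than render50's, the `PROPERTIES` text is reconstructed from the rules mentioned earlier, and WeasyPrint is imported lazily so the rest runs without it.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import CLexer

PROPERTIES = (
    "@page { margin: 0.5in; size: letter landscape; } "
    "pre { font-size: 10pt; overflow-wrap: break-word; "
    "white-space: pre-wrap; }"
)


def build(code):
    """Return (highlighted markup, Pygments' CSS text) for C source."""
    formatter = HtmlFormatter()
    markup = highlight(code, CLexer(), formatter)
    return markup, formatter.get_style_defs(".highlight")


def render(code, output_path):
    """Write highlighted code to a PDF using both stylesheets."""
    from weasyprint import CSS, HTML  # third-party, imported lazily

    markup, pygments_css = build(code)
    css1 = CSS(string=PROPERTIES)    # page layout and wrapping
    css2 = CSS(string=pygments_css)  # token colors from Pygments
    HTML(string=markup).write_pdf(output_path, stylesheets=[css1, css2])
```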
DAVID MALAN: All right, so now
I'm going to be really bold here--
double exclams.
Save it.
Clear the screen.
Double crossed fingers.
OK, there we go.
Two of us.
COLTON OGDEN: A lot of crossed fingers.
DAVID MALAN: All right,
that's a bold claim.
Here we go.
Oh, my goodness.
COLTON OGDEN: That looks nice.
DAVID MALAN: Beautiful.
Nicer, how about we'll say?
COLTON OGDEN: Even have a
little bit of a background too.
DAVID MALAN: Yeah, which
actually I find hideous.
So we'll have to fix that.
And actually, render50 does fix that.
I just think this is stupid
that there's a background
color when you're printing on paper.
COLTON OGDEN: It's a lot
of ink, too, over time.
DAVID MALAN: But this is easy.
And you'll see more on this
next week in our CSS stream.
You can do like a background-color
property and just change it to white,
or get rid of it all together.
COLTON OGDEN: Yeah, we'll do something
as complicated as background color
in our CSS.
DAVID MALAN: And so suffice it to
say, you could change those styles.
You could download
different style sheets
in order to get slightly
different colors, just as we do.
But in this case, you're
getting some defaults.
COLTON OGDEN: You probably have to--
if you have a particular
style sheet of your own,
you probably have to process it, right?
Because it's going to expect
the certain classes to exist.
DAVID MALAN: Yeah.
It does have to-- you do have to
define it in terms of those classes.
But this is all well
documented in Pygments.
COLTON OGDEN: OK,
that's very satisfying.
DAVID MALAN: Any questions
we should pluck off here?
COLTON OGDEN: A lot of people
thinking that it's really cool.
DAVID MALAN: Oh, thank you
for crossing your fingers too.
COLTON OGDEN: And then BELLA is
asking what CSS [INAUDIBLE]--
DAVID MALAN: Yeah, just did that.
COLTON OGDEN: --style
sheets, which we did.
MKLOPPENBURG was hypothesizing
that comments messed it up.
VERONI was saying that it's great
that you explain everything in detail.
DAVID MALAN: Oh, good.
Hope you appreciate when I screw up too.
COLTON OGDEN: And that was
about where we stopped reading.
DAVID MALAN: OK.
COLTON OGDEN: So
everybody, we've got a POG.
So this is what's called a POG
or POG champ, very excited.
DAVID MALAN: Nice.
COLTON OGDEN: And everyone
is saying it's really cool.
DAVID MALAN: Cool, all right.
So we're-- I mean, we're kind of there.
Let me propose that this
very simple render50 gets
us what, like 80% of the way there.
Because we've rotated to landscape mode.
The motivation for which is just to
have room on the paper for a teacher
to write or type comments.
I don't love the page
background, but you
know what, let's just get rid of that
because it's bothering me already.
COLTON OGDEN: You could easily do
that by swapping the CSS order,
couldn't you, and defining background
color in your own properties?
DAVID MALAN: We can.
Yeah, indeed.
So let me go ahead and do that.
Let me go ahead and say--
how did we do this here?
We stacked this.
Let me go ahead and do--
how best to do this cleanly?
Let's see what happens when we
go ahead and just add it here.
But more on this next week too.
I'm going to say inherit.
Give me the default page color, whatever
that is, which is white by convention.
Let me go ahead now
and rerun the program.
No fingers crossed, so no promises.
Reload.
It's still there, but it's--
Colton's alluding, cascading style
sheets is all about cascading.
And because my styles came before
Pygments', Pygments is overriding mine.
So honestly, the simplest fix right
now might be just to reverse those.
Let Pygments do its thing first and then
let me override anything that I want.
Let's save that.
Reload the PDF after re-rendering it.
Go back here.
Oh, and it's still here.
So why is that?
COLTON OGDEN: Strange.
Is it a different property?
It's not background color.
DAVID MALAN: Possibly.
Let's see.
I can-- background color.
Let me change it to--
oh, you know what it must be.
Background color must not
be applied to the pre.
It's probably applied to maybe
the div that's surrounding that.
So we're going to have to fix
this in a slightly different way.
COLTON OGDEN: Oh, I see.
Yeah, that would probably
be the div, yeah.
DAVID MALAN: Let me do
properties plus equals,
and then I'm going to say dot highlight.
And then, let me go ahead
in here and say-- oops--
background color inherit.
And actually, let's override
the whole background.
Inherit whatever the default is.
And let's see if this fixes it.
Honestly, I'm guessing now.
I'm just inferring based
on the documentation.
We could also just print out all the
HTML and figure this out ourselves.
We could even output the HTML,
save it in an actual HTML file,
open it in Chrome, and then tinker
with it in Chrome with developer tools.
Let's rerun this.
Let's go ahead and reload.
And voila!
COLTON OGDEN: Nice.
DAVID MALAN: It's
subtle, but now there's
no more gray box, which is definitely
better for printing's sake, and so forth.
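The ordering fix described above can be sketched roughly as follows, assuming the stylesheet is assembled as a single Python string. `get_style_defs` is how Pygments emits its CSS; the variable names and the exact override rule are illustrative and not necessarily what render50 does.

```python
from pygments.formatters import HtmlFormatter

# Pygments' own rules, scoped to the .highlight wrapper <div>.
pygments_css = HtmlFormatter().get_style_defs(".highlight")

# Our override (illustrative): reset the wrapper's background.
# It targets .highlight, not pre, because that is where Pygments
# sets the background.
overrides = ".highlight { background: inherit; }"

# Order matters: for selectors of equal specificity the later rule wins,
# so Pygments' rules go first and our overrides go last.
stylesheet = pygments_css + "\n" + overrides
```

Because both rules use the same `.highlight` selector, only their order decides which background applies.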
COLTON OGDEN: [INAUDIBLE],
what are you guys building now?
And we just finished, but it's
a source code to PDF file.
Can be used for a lot
of different purposes.
DAVID MALAN: But you know, can
we show off one more feature?
COLTON OGDEN: Yeah, let's do it.
DAVID MALAN: It turns out that
this highlight function that
comes with Pygments has
some fancier features too
that we make use of in render50.
So let me go into my highlight
function, which is up here.
And in addition to passing in
those arguments, my code, my lexer,
my HTML formatter, let me also go
ahead and say, line numbers equals--
I think I can say true to
give myself some line numbers.
COLTON OGDEN: Oh, this is cool.
Yeah, this is nice.
DAVID MALAN: And let me
go ahead and save that.
And if I did this right, rerun that.
OK, got unexpected keyword line numbers.
Oh, I'm sorry.
That is an argument
to the HTML formatter.
Line numbers equals true.
So that makes sense now, because you're
adding the line numbers to the HTML
file.
So that was just user error.
Now let me go ahead and rerun Python.
OK.
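A minimal sketch of the mistake and the fix just described. In Pygments the option is spelled `linenos`, and it belongs to `HtmlFormatter`, not to `highlight`. The sample code string is made up for illustration.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

code = 'print("hello, world")\n'

# highlight() accepts only the code, a lexer, a formatter, and an optional
# output file, so passing linenos to it raises a TypeError.
message = None
try:
    highlight(code, PythonLexer(), HtmlFormatter(), linenos=True)
except TypeError as error:
    message = str(error)

# Line numbering is an option of the formatter instead.
markup = highlight(code, PythonLexer(), HtmlFormatter(linenos=True))
```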
COLTON OGDEN: No errors is always nice.
DAVID MALAN: And nice.
OK, it's a little weird.
There's a little bit
of a weird offset here.
COLTON OGDEN: That's strange.
DAVID MALAN: And in fact,
now I really broke things.
Notice-- look, I can't scroll.
There's no more pages.
COLTON OGDEN: Ooh, I have
no idea what that is.
DAVID MALAN: So this is why
I think render50 took me
many days, if not weeks, early on,
because I ran into headaches like this.
So let me propose this.
Instead of saying true--
actually, let me show you--
what's the best way to do this?
Let me go ahead and show you what is
being outputted right now before we
actually look at this.
COLTON OGDEN: Oh, [INAUDIBLE] was saying
we didn't cross fingers this time.
DAVID MALAN: Oh, well that
also was the bug, yes.
Let's go ahead here
and just do a sys.exit
after printing my markup that results
from calling that highlight function.
So again, we're just debugging now.
Let me go ahead and rerun Python.
Whole bunch of stuff on the screen.
And let's go up to the top where
we can see the first line first,
just to make this a little
more straightforward.
And you can see this.
So you see all of the line numbers
inside of-- if we go up high enough--
a table.
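For this kind of debugging, an alternative to printing to the terminal is saving the markup to a file and opening it in a browser, as suggested earlier. A small sketch, where the helper name `dump_markup` and the default file name are hypothetical:

```python
from pathlib import Path


def dump_markup(markup, path="debug.html"):
    """Save generated markup as a standalone HTML file for inspection."""
    page = f"<!DOCTYPE html>\n<html>\n<body>\n{markup}\n</body>\n</html>\n"
    target = Path(path)
    target.write_text(page, encoding="utf-8")
    return target
```

The resulting file can be opened in Chrome and inspected with developer tools.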
COLTON OGDEN: Oh, is it going to
be like floated left or something,
and we're advancing
this to get it to work?
DAVID MALAN: Related in spirit.
COLTON OGDEN: It's absolute
position or something like that.
DAVID MALAN: WeasyPrint
does not handle HTML objects
that are taller than the page.
I mean, it's as simple as that.
And because this file is
so long, and because we
have so many line numbers, what's
happening is the whole second page
and beyond are just getting chopped off.
And just weird stuff is happening.
So there's another way.
And thank god that Pygments
handles this, because that
would've been a big pain to fix.
Let me go ahead and change line numbers
to be, quote unquote, inline instead.
Instead of true.
True gave me a table.
And that table was too tall.
And just because of the
library, it breaks on that.
So let me go ahead and save this,
rerun it, and look at this HTML now.
And you'll see now--
COLTON OGDEN: OK, you have a
span with the class line number
before everything.
DAVID MALAN: Exactly.
So in this way, you don't have
this massively tall table.
You still just have individual lines.
And WeasyPrint is
really good at breaking
on lines, really bad at
breaking on really tall tables.
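The difference between the two modes can be checked directly from the generated markup. A short sketch, using a made-up C snippet as input:

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import CLexer

code = """\
#include <stdio.h>

int main(void)
{
    printf("hello, world\\n");
}
"""

# linenos=True: the whole listing is wrapped in a single <table>,
# with the numbers in one cell and the code in another.
as_table = highlight(code, CLexer(), HtmlFormatter(linenos=True))

# linenos="inline": the numbers are emitted inside the <pre>, line by line.
as_inline = highlight(code, CLexer(), HtmlFormatter(linenos="inline"))

print("<table" in as_table)   # True
print("<table" in as_inline)  # False
```

With the inline form there is no single page-spanning element, which is what lets the listing break across pages.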
COLTON OGDEN: Yeah, apparently.
DAVID MALAN: So let's go ahead and
not just print that, of course.
So let's get rid of my debugging lines.
Let's go ahead and let
it finish its thing.
Let's be really bold now, three exclams.
I'm really done now.
And go ahead and run this.
Oh, three fingers, quick.
COLTON OGDEN: I've got them.
DAVID MALAN: OK.
And now let's open the PDF.
Now, that's pretty good.
COLTON OGDEN: That looks great.
OK, cool.
That must have been satisfying
when you discovered that.
Did that cause you a lot of
trouble when you first ran into it?
DAVID MALAN: If by trouble you
mean tears, yeah, absolutely.
Because that's the kind of
thing where you feel like you're
doing everything right.
You've read the documentation,
and it's still not working.
And so honestly-- but how
did I go about doing this?
It's been a while, but I'm pretty sure
I went to WeasyPrint's GitHub repo.
And honestly, I started searching
through the issues, open and closed,
to see if other people
have reported this bug.
And I do believe this
is well documented.
Just doesn't handle really
tall objects like that.
So you have to work around it.
COLTON OGDEN: I don't
know when you would ever
want to use the other
form instead of this one.
This seems just objectively
better in every way.
DAVID MALAN: Well, web pages.
So for instance, let me think
about a context for this.
COLTON OGDEN: [INAUDIBLE] is maybe.
DAVID MALAN: Well, so in web--
actually, I can give you
a very specific scenario.
When you're rendering code,
syntax highlighted in a web page,
and you want the human to be
able to highlight the code,
you want to use a table so that all
of your numbers are in separate TDs,
so that when you're
clicking and dragging
you have the ability to highlight
just the code and not the numbers.
Because in this version, these
are just spans all together.
There is no way to
highlight just the code.
Now, we claim that in
a PDF, not a big deal.
You're printing it,
you're writing on it.
You're not highlighting and copying.
After all, you have the
original source code.
But on a web page, like
this would be darn annoying.
COLTON OGDEN: I understand now.
DAVID MALAN: GitHub had to deal with
the same thing with their UI as well.
COLTON OGDEN: But in that case, I guess,
does it have the same issue in the web
if you use the table?
Does it have the--
does it truncate your line
numbers if you do it pages long--
or a web page that's certain wide--
DAVID MALAN: No, HTML is fine.
It's the PDF, because
WeasyPrint and libraries
like it have to do a bunch
of arithmetic to figure out,
for a given letter size
page, where do you chop it?
What does it mean, though,
to chop a table right there?
It just gets confused.
COLTON OGDEN: Real curious
here, Brenda was saying,
if by trouble you mean tears--
quoted that.
And then MKLOPPENBURG
followed up by David
showing his soft side right there.
DAVID MALAN: Yeah.
Oh, that's happened.
Late at night, 4:00 AM, you'll see
me curled up in a ball on the floor.
So here we have a very simple render50.
It's not that feature full.
So why don't we, in our
remaining time, take a look
at the real render50 for
just a couple minutes
and see what it does differently.
COLTON OGDEN: Yeah, let's do it.
DAVID MALAN: All right, so I'm going
to go over to GitHub here where we have
the source code for actual render50.
And you'll see that there's
a few files in this repo.
.gitignore, which is Git
specific, just prevents us
from checking in specific files.
.travis.yml, which is a
CI/CD technology, Continuous
Integration/Continuous Deployment
technology, a third party cloud
service that we use to automatically
deploy new versions of render50.
Readme is just a readme, of course.
Setup.py I mentioned earlier,
has to do with Python packaging
so that when you install
render50 with pip,
it knows to pull in those dependencies.
And then the real script is
right here, render50 itself.
COLTON OGDEN: Nice.
And you don't even
have an extension on it
just so that you can run it at the
command prompt like a Unix application.
DAVID MALAN: With a shebang, yeah.
This first line here,
a hash bang or shebang
actually says, please use
Python 3 to execute this file.
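A sketch of what such an extension-less script can look like. The file name `hello` and its contents are hypothetical; only the shebang idea comes from the discussion above.

```python
#!/usr/bin/env python3
# The line above must be the very first line of the file. Saved without an
# extension (as, say, "hello") and marked executable with `chmod +x hello`,
# the file can then be run as ./hello from the command prompt.

import sys


def main(argv):
    """Greet the name given as the first argument, or the world."""
    name = argv[1] if len(argv) > 1 else "world"
    print(f"hello, {name}")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```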
COLTON OGDEN: Nice.
DAVID MALAN: And yes,
hello to [INAUDIBLE].
Sorry for mispronouncing.
Best of luck with Pset 5.
And here we have render50.
Feel free to take a look at this in
more slow detail at your leisure.
But you'll see that we're importing
a whole bunch of libraries
to solve different problems
that we didn't even solve today.
But there's some familiar
ones, like here, for instance,
is all the Pygments stuff
for importing lexers.
Colton mentioned that
you can guess the lexer.
So I don't even have C
lexer or Python lexer.
I have all the lexers assembled here.
COLTON OGDEN: Yeah,
because you don't know,
somebody might want to use this
for a language we don't teach
or we haven't used before.
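One way to avoid hard-coding a lexer per language is to let Pygments choose by file name. This sketch uses `get_lexer_for_filename`, one of several lookup helpers Pygments provides; which one render50 actually calls is not stated here, and `pick_lexer` is a hypothetical wrapper.

```python
from pygments.lexers import get_lexer_for_filename
from pygments.util import ClassNotFound


def pick_lexer(filename):
    """Return a Pygments lexer for filename, or None if none matches."""
    try:
        return get_lexer_for_filename(filename)
    except ClassNotFound:
        return None


for filename in ("cs50.c", "cs50.py", "notes.zzzz"):
    lexer = pick_lexer(filename)
    print(filename, "->", lexer.name if lexer else "no lexer found")
```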
DAVID MALAN: Yeah, and Pygments supports
dozens, if not hundreds, of languages.
So render50 actually supports
dozens or hundreds as well.
There's another PDF library
in here-- fun fact--
for rendering PDFs side by side.
One here, one here on the same page.
We actually use a separate PDF library
that Colton actually figured out,
because the goal is to
essentially render two PDFs
and then concatenate them together
so that each one, in the US,
would be 5 and 1/2 inches wide.
And then you jam them
together like this.
So it was kind of a pain.
Can't do that alone with WeasyPrint.
Can't really do it well with HTML.
So we actually, for render50, create
two PDFs if you do side by side mode.
And then combine them.
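The arithmetic behind the 5 and 1/2 inch figure, as a small sketch. It assumes a US letter sheet turned sideways; the `@page` rule is just one illustrative way to request that size from a CSS-based renderer, not render50's actual stylesheet.

```python
POINTS_PER_INCH = 72  # PDF units per inch

# US letter in landscape orientation, in inches.
sheet_width, sheet_height = 11.0, 8.5

# Each of the two documents gets half of the sheet's width.
half_width = sheet_width / 2

# CSS page rule for one half (illustrative).
page_rule = f"@page {{ size: {half_width}in {sheet_height}in; }}"

print(page_rule)
print(half_width * POINTS_PER_INCH, sheet_height * POINTS_PER_INCH)
```

Two PDFs rendered at that size then fit next to each other on one landscape letter page.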
COLTON OGDEN: I'm just now
remembering adding that feature.
I had totally forgotten to this point.
DAVID MALAN: Yeah, so
you know this code.
COLTON OGDEN: A little bit.
DAVID MALAN: We have
a whole bunch of stuff
here for error checking
and line wrapping.
We use Beautiful Soup too, to extract
information as needed from some HTML
if we want.
And a few other features
like argument parser
to automate the process
of dealing with sys.argv,
which I just very easily did manually.
Here, for instance, we're
requiring Python 3.6,
just because that's what we
do in CS50 now so that we
can assume certain language features.
COLTON OGDEN: We're
up to 7, though, now.
DAVID MALAN: We are, but we don't
really use any of its features.
COLTON OGDEN: 3.7 doesn't have
any compelling features for us.
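A version requirement like this is commonly enforced with a check near the top of the script. A minimal sketch; the helper name and the error message are illustrative, not copied from render50.

```python
import sys

MINIMUM = (3, 6)  # the version mentioned above


def new_enough(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Stop early with a readable message instead of a confusing syntax error.
if not new_enough():
    sys.exit("This program requires Python 3.6 or higher.")
```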
DAVID MALAN: Get the version--
this is just so that if you do -v,
render50 can tell you
what version is installed.
We have some other nice things.
Here's our main function.
We handle Control-C, so if you want
to abort if it's taking too long,
that'll handle that cleanly.
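One common way to handle Control-C cleanly in Python is to catch `KeyboardInterrupt` around the main work. This is a sketch of that general technique, not necessarily how render50 does it; `run` and the message text are hypothetical.

```python
import sys


def run(task):
    """Run task(); turn Control-C into a short message and exit code 1."""
    try:
        return task()
    except KeyboardInterrupt:
        # No traceback: just tell the user the program stopped on request.
        print("Cancelled.", file=sys.stderr)
        return 1
```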
Here's all the code we have for
parsing command line arguments.
And you'll notice there's
a bunch of features
we didn't even talk about today.
We have a browser mode
with render50 so that you
can go fetch, just like WeasyPrint
can, a URL and download it.
We use it for downloading
GitHub code or Gist,
if you want to render a PDF of a gist.
You can turn off color if
you just don't want it,
and if on your black and white
printer, grayscale isn't very nice.
You can include multiple files or--
you can include multiple files so
as to render multiple files at once.
You can specify the size of the
page, depending on your country.
We default, I think, to letter just
because we are biased in the US.
But you can override
that in any country.
And then -y, like with diff,
is a side by side mode, which
similarly lets you do two side by side.
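Options like these are typically declared with Python's `argparse` module. The sketch below is loosely modeled on the features just listed; the program name, option spellings, and defaults are assumptions for illustration and should not be read as render50's actual interface.

```python
import argparse


def build_parser():
    """Build a command-line parser for a hypothetical code-to-PDF tool."""
    parser = argparse.ArgumentParser(
        prog="render", description="Render source code as a PDF (sketch)."
    )
    parser.add_argument("inputs", nargs="+", help="files or URLs to render")
    parser.add_argument("-o", "--output", required=True, help="PDF to write")
    parser.add_argument("-y", "--side-by-side", action="store_true",
                        help="render two inputs next to each other")
    parser.add_argument("--size", default="letter", help="page size")
    parser.add_argument("--no-color", action="store_true",
                        help="disable syntax coloring")
    return parser


# Parse an explicit list here; a real script would let argparse read sys.argv.
args = build_parser().parse_args(["-y", "-o", "colton.pdf", "cs50.c", "cs50.c"])
print(args.inputs, args.output, args.side_by_side, args.size)
```

Compared with indexing into `sys.argv` by hand, the parser also generates `--help` output and error messages for missing arguments.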
COLTON OGDEN: Best feature in the repo.
Just going to say that.
DAVID MALAN: Indeed, well,
why don't we show that off.
So let's go into now--
COLTON OGDEN: I think we're
good [INAUDIBLE] time here.
DAVID MALAN: No, we're not.
So let's go ahead and do this, and
do render50, the real render50,
which is installed in CS50 CLI for you.
I'm just going to go and
say this, and cs50.c.
And let's just do the same
file twice side by side.
And then -o.
We'll call it colton.pdf
because he made this possible.
Enter.
It's rendering cs50.c.
Rendering cs50.c.
So you'll notice I added some nice
pretty printing of progress messages.
Let's go into my Downloads folder.
Colton.pdf.
We'll go ahead and open it with Preview.
And voila.
COLTON OGDEN: It gets ugly
a little bit with the--
because the comments break so
much when it's that narrow.
But--
DAVID MALAN: But we would do this not--
to be clear-- to render the same file
side by side.
It's usually a file that's
been submitted as homework,
and then another file that that
student's work is awfully similar to.
And we don't use this
so much internally.
This is when we refer to other
administrators at the university,
if they--
often as less technical
folks who wouldn't
be as comfortable using a command line
software that we provide and such.
Can just literally print this out or on
their own Macs or PCs just open it up.
And even though it's not perfect,
it fits on a traditional page.
And therefore, just has a nice
compatibility with the target audience.
COLTON OGDEN: Easy to print and whatnot.
DAVID MALAN: But this is,
essentially, to Colton's point
about his feature, two PDFs that
we've just concatenated together
at the right size.
COLTON OGDEN: Yeah, makes it
pretty easy, the PyPDF2 library.
DAVID MALAN: Indeed.
And you'll notice a couple
other things, if I may.
If we use the real render50, and let's
execute it not just on two files,
but just instead on one,
outputting cs50.pdf this time.
You'll notice that we
have features like, do
you want to overwrite the existing
file, just as a user friendly feature.
And telling you, again in
green, what you've done.
And if I go and open cs50.pdf,
this is slightly cleaner,
I think, than the ones we created
today for the following reasons.
So one, I found it very obnoxious how
big and black the line numbers were.
They were kind of a distraction
when you really care about the code.
So render50 makes them gray.
We have this little border at the
top, a line with the file name,
just to make it obvious,
honestly, for people
like me when I'm trying to figure
out what file am I looking at.
COLTON OGDEN: Yeah, because
in lectures for example, too,
you mentioned printing
out a list of files,
you'll glob like an entire
list of maybe C or HTML files
that you want to print onto your thing.
And then you staple them all together.
It's easy to lose track
of what you're doing.
DAVID MALAN: Indeed.
And we actually did address
that highlighting feature.
Well, you'll see this still
happens in the PDFs here.
But notice we do have a second page.
So we are using the inline line
numbers to avoid that same issue.
But beyond that, we pretty much
implemented render50 entirely.
We just don't have additional
support for web pages and such.
But we can do that.
Let me go into, say, how about CS50's
Python library, called Python CS50.
Let me go into the source
directory and grab cs50.py.
COLTON OGDEN: Shout-outs to Kareem
[INAUDIBLE], by the way, [INAUDIBLE].
DAVID MALAN: Kareem is one
of our big contributors here.
You'll see that this on
GitHub is just the source code
now for our library for Python, which
has get_string and get_int and get_float,
and a couple of others.
Let me go ahead now
and just copy this URL.
And if we're not getting too greedy,
let's go ahead and render a whole URL.
Oops-- let's go ahead
and render a whole URL.
COLTON OGDEN: That's weird.
DAVID MALAN: My terminal is messed up.
Let's see if we can do that
again. render50 of that whole URL.
It's ugly, but that's just--
COLTON OGDEN: It's like an NES type bug.
DAVID MALAN: Yeah, it's a Docker
bug, actually, in this case.
Let me go ahead--
COLTON OGDEN: It looks like
it in NES, you would like--
tiles from different
maps would like move over
to the other side of the screen.
Yeah, it was a bit weird.
DAVID MALAN: But the code
is hopefully still working.
Yes, let's overwrite it.
And notice it's grabbing
that URL via HTTP.
And if I open this PDF
now, it looks similar,
but you'll notice we've rendered
an actual URL from the web.
And that URL is even clickable too.
So this is all thanks to HTML and CSS.
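Rendering from a GitHub page URL requires getting at the file's raw text rather than the HTML page around it. The sketch below shows one plausible mapping, based on GitHub's raw.githubusercontent.com URL layout. The function name and the example URL are illustrative, and how render50 itself resolves URLs is not described here.

```python
from urllib.parse import urlsplit, urlunsplit


def raw_url(url):
    """Map a github.com "blob" page URL to its raw-file URL; else return url."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path...>
    if (parts.netloc == "github.com" and len(segments) >= 5
            and segments[2] == "blob"):
        # Drop the "blob" segment and switch hosts.
        path = "/" + "/".join(segments[:2] + segments[3:])
        return urlunsplit(("https", "raw.githubusercontent.com", path, "", ""))
    return url


print(raw_url("https://github.com/cs50/python-cs50/blob/main/src/cs50/cs50.py"))
```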
COLTON OGDEN: That's awesome.
I love it.
DAVID MALAN: So that, then, is render50.
And I think what's cool about this
is, honestly, not the specific lines
and the specific use case, because
not many people in the world
have a need for syntax highlighted PDFs.
But I think it's this combination
of technologies, like you
understand a bit of HTML, a bit of CSS.
You need a library for generating
the HTML, the syntax highlighting.
You need to understand CSS to
stylize it the way you want.
And then you need a library for turning
that text, or HTML and CSS, into PDF.
And that's what WeasyPrint does for us.
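Put together, the combination of technologies summarized above might look roughly like this. It is a condensed sketch, not render50 itself: the function names are hypothetical, and WeasyPrint is imported inside `write_pdf` only so that the HTML step can be tried without it installed.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_for_filename


def build_page(code, filename):
    """Return (html, css) for a syntax-highlighted listing of code."""
    formatter = HtmlFormatter(linenos="inline")
    body = highlight(code, get_lexer_for_filename(filename), formatter)
    css = formatter.get_style_defs(".highlight")
    css += "\n.highlight { background: inherit; }"  # our override goes last
    return f"<html><body>{body}</body></html>", css


def write_pdf(html, css, output):
    """Convert HTML plus CSS to a PDF file with WeasyPrint."""
    # Imported here so build_page() works even without WeasyPrint installed.
    from weasyprint import CSS, HTML

    HTML(string=html).write_pdf(output, stylesheets=[CSS(string=css)])
```

Calling `build_page` on a file's contents and passing the result to `write_pdf` with an output path would produce the PDF.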
COLTON OGDEN: Yeah, I think it's great.
Learning the basics of
Python is one thing,
but actually going and using
it to accomplish something.
And something tangible too.
This is something that's physical.
I think it's great.
Yeah, it's awesome.
DAVID MALAN: So if you'd like to play
along at home or learn further, again,
the first URL we started
with today, we can
type it one more time for new folks.
cs50.readthedocs.io/render50.
You can install it there on
your own Mac or PC or Linux box,
so long as you have
Python and pip installed.
There's some documentation
there as to how it works.
And then you can follow the link
there at the bottom, source code.
Go to the GitHub repo if you
want to just kind of look through.
And suggest if you find bugs, or missing
features, or poor design, by all means,
submit a pull request if
you know how on GitHub.
But that, then, is render50.
COLTON OGDEN: That was awesome.
Thanks so much for
coming in and showing it.
DAVID MALAN: Thanks for hosting this.
Really nice to see everyone out there.
COLTON OGDEN: Yeah, absolutely.
There's a few comments
that we didn't grab.
DAVID MALAN: Sure, let's polish those off.
1:00 AM in Nepal.
Wow, OK.
Thanks for staying awake.
COLTON OGDEN: [INAUDIBLE].
And then [INAUDIBLE] was very happy.
Gave us a holiday present.
Very cute.
And then [INAUDIBLE] was
saying, used to find cheating.
And you're referring to
the two PDFs side to side.
DAVID MALAN: Yeah, not
so much used to find it.
We use other software to
actually find the similarities.
We then use render50 to
show the similarities.
COLTON OGDEN: Right.
Nice.
If anybody has any last
questions, definitely let us know.
But it's been about a little over
an hour and a half, actually.
It's been closer to two hours.
It cut after the first
drop in the stream.
So we've actually been streaming
for, I think, an hour and 50
or something like that.
What time is it?
2:40.
Hour and 40, I guess.
DAVID MALAN: Yeah.
COLTON OGDEN: Yeah, let us
know if you have any questions.
Otherwise, next week,
so you and I are going
to be doing the code
review on [INAUDIBLE].
DAVID MALAN: Yeah,
looking forward to that.
COLTON OGDEN: That'll be awesome.
DAVID MALAN: Send us your submissions.
Do we have that URL just in case?
COLTON OGDEN: So
bit.ly/cs50twitchcodereview.
DAVID MALAN: OK.
That's our long URL.
COLTON OGDEN: Yeah, I
couldn't find a good short one
that actually made sense.
DAVID MALAN: That's OK.
COLTON OGDEN: But if
you go there, you'll
be able to submit some source code
that you have on GitHub, or a gist.
I mean, both of them,
you'll find on GitHub.
Either a repo or a gist.
We'll take a look at it.
It's catered towards
beginner/intermediate programmers
more so than somebody who's maybe
working on a massive codebase
in a more professional environment.
But even if you are, we're
happy to take a look at it.
But it's more of a
style and design grade.
We're not going to be doing bug
testing or the like on camera.
But yeah, definitely let us know if you
want some feedback on your source code,
and we'll give it to you next Friday.
DAVID MALAN: Sure.
And there was--
COLTON OGDEN: [INAUDIBLE] thereafter.
DAVID MALAN: --one other
question, very library specific.
[INAUDIBLE] asked,
does GetString in cs50.c,
which we've been
printing today as a PDF,
free up allocated memory at the end?
Yes, we actually take advantage of
some more modern features of compilers
to actually do garbage
collection of memory
so that students do
not need to know about
free in the first few weeks of CS50.
COLTON OGDEN: Cool, cool.
BELLA'S saying, thank
you, David and Colton.
Stream was awesome.
Thanks BELLA for joining.
And thanks BARTLED for the follow.
And HEXADECIMAL.
I like that.
I like the 16 in the hexadecimal there.
That's awesome.
DAVID MALAN: That is cool.
COLTON OGDEN: [INAUDIBLE]
not just on the turntable.
Nice, OK.
Thinking [INAUDIBLE] referred to DJing.
DAVID MALAN: Colton does some DJing too.
COLTON OGDEN: A little bit.
It's been a little while.
Thanks for the awesome
stream says [INAUDIBLE].
Thank you for tuning in.
Thanks everybody who tuned in today.
We had very consistent and
good viewership that ramped up.
R3MOTE, thanks for the stream.
Awesome.
Oh yeah, and next week--
so we have the code review.
On Monday, I'll be doing Minesweeper
part 2, so I'll finish up Minesweeper.
We did a good solid 50% to
60% of it on the last stream.
Kareem's going to do Flask next week.
And shoot, I'm blanking
on the other stream.
DAVID MALAN: CSS.
COLTON OGDEN: Yes, I will be
doing CSS for next week as well.
OK.
And these all go on the Facebook page.
You'll get event notifications
if you're on that.
So go to facebook.com/cs50 to keep
up with all of our Facebook and live
video.
And cross your fingers
that Facebook doesn't
keep blocking our videos after today.
I have to figure out what caused
that, caused the dropped stream.
If you're watching this on YouTube,
follow us on Twitch, twitch.tv/cs50tv.
And subscribe to our
YouTube channel as well.
Anything else?
DAVID MALAN: No, I think that's it.
Thanks so much for having me.
COLTON OGDEN: Cool, I'm
going to switch us back here.
DAVID MALAN: Nice, this is--
oop, there we go.
COLTON OGDEN: Let me make sure
we didn't miss anything here.
[INAUDIBLE] will be on
the Facebook soon as well?
Yes, MKLOPPENBURG [INAUDIBLE]
will be on Facebook as well.
I will get all of them
posted today, actually.
And then this video will
also get posted onto--
oh, nice.
This is-- I like that.
Very nice.
What's the name of the woman
who does Wheel of Fortune?
DAVID MALAN: Vanna White?
COLTON OGDEN: Yeah.
DAVID MALAN: Vanna White, there we go.
But all the letters are already
there, although my head is perfectly
over the zero, so this is CS5.
Today will be a more
introductory stream.
COLTON OGDEN: OK, time for breakfast.
Have a great night
everyone, says Brenda.
Thanks for joining, Brenda.
DAVID MALAN: Take care, Brenda.
Bye everyone.
COLTON OGDEN: All right, everybody.
Have a great weekend, a
great rest of your day.
And we'll see you next
Monday for some Minesweeper.
This was CS50 on Twitch.
