All right
Apologies for the delay
Welcome to computer science S-75 - Building dynamic websites
My name is David. I'll be your instructor
this summer
and it's a pretty brief summer, so
we're gonna dive right in tonight into some
material, then we'll take a breath,
look at the structure of the course,
expectations thereof and then conclude with
some additional material
Along the way, please interject with any
questions that you might have but first
some questions from me
you go ahead on the internet on your
laptop or desktop you pull up your
favorite browser you type in www.google.com and hit enter
What happens?  Let's tell this story and
we can be as high level or low level
as we want, and I'll steer us in both
directions. So you've hit enter.
What happens?
Anything you got?
Oh. Good. So that's the whole story. That's very good.
Let's tease it apart a little bit now and
I'll repeat some of the answers
sometimes into the microphone so that
our folks who are taking the course
from afar can hear everything
So your computer makes a request that goes through your modem to your
ISP and reaches google.com's servers,
and they reply
with a response. So, good.
now let's dive in deeper there, and let's focus on
the act of hitting enter
Does someone want to propose, just give me one step
in more technical detail what happens
next and then we'll get to that same
endpoint eventually
Perfect. So we first need to translate
the name of the site in this case the
www.google.com
into an IP address and, someone else,
what is an IP address?
Good, so an IP address identifies
a server or computer on the internet
and an IP address is simply a number of
this form.  Let me go ahead and pull up a
little scratch pad for notes here
so an IP address, as you've probably
seen, is something of the form w.x.y.z,
and little internet trivia:
each of these placeholders can be a
digit from what to what?
...or number from what to what?
Perfect. 0-255, and there's
some restrictions on what numbers can
be where, but essentially you have number
dot number dot number dot number. And each
of those numbers can be again 0-255.
If we really wanna start pressing
deeper here, how many bits are used to
represent an entire IP address under
this scheme,
for those familiar with bits?
32. So why is that? Well, for
those less familiar, if you
want to represent the numbers 0 through 255,
which is a total
of 256 numbers
you need 8 bits because 2^8=256,
But we won't go into
too much detail along those lines.
But if you've seen that IP addresses
are just 32 bits,
it is because each of these four numbers
is 8 bits itself. So actually, let's
go here. There won't be much math in this
course after the following sentence,
really,
but if you have 32 bits:
how many possible IP addresses are there
for the world's computers?
So it's 2^32, which is roughly...
those who are good with math in their heads..?
So it's roughly 4 billion. So that's a lot.
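To make that arithmetic concrete, here is a small Python sketch. It is not from the lecture; the function name and the sample address are only illustrative. It packs four 8-bit numbers into one 32-bit value and counts how many addresses 32 bits allow.

```python
def dotted_quad_to_int(address: str) -> int:
    """Pack an IPv4 address like '1.2.3.4' into a single 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid w.x.y.z address: {address!r}")
    value = 0
    for part in parts:
        # Shift the bits collected so far left by 8, then append the next 8 bits.
        value = (value << 8) | part
    return value


BITS_PER_NUMBER = 8                           # each of w, x, y, z is 0-255
TOTAL_ADDRESSES = 2 ** (4 * BITS_PER_NUMBER)  # 2^32 = 4,294,967,296
```

The largest address, 255.255.255.255, maps to 2^32 - 1, which is why the count of distinct addresses is exactly 2^32.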
But these days most of you have
laptops. Most of you have desktops
Most of you have telephones in your pockets
or iPads or
the like. So there are more devices
these days that are consuming
IP addresses.
so if you follow the popular media
of late you'll find that people have
been freaking out that we're about to run
out of IP addresses but that's
because we've been using version 4
for far too long.
Thankfully version 6 (u.v.w.x.y.z) has begun
to get rolled out
and version 6 (u.v.w.x.y.z) will have
128-bit IP addresses,
..which is great, because that's
2^128
which is huge!  Barely pronounceable.
But it will also become a
little more complex to break these
things down so we can squeeze a few more
years of discussion out of these
addresses but realize the world is
transitioning
now just for the sake of the experience
for those at home let me actually pause here
just so we can plug in this recording
device so we can capture to another
format. So let's leave that as a
cliffhanger for just a minute or two
and I'll be right back.
Where did we leave off?
You've just hit enter.
We had proposed that your computer
had translated or needed to translate
the hostname www.google.com
into an IP address and then we talked
for a moment about various forms
of IP addresses
so let's now push a little harder on how
this translation happens
so Google has a numeric address of this
form (w.x.y.z)
and as an aside Google actually probably
has a whole bunch of IP addresses
of that form.  All of which lead to
the same experience but perhaps
different servers
so how does your little Mac or PC or
Linux computer know
what the IP address of www.google.com
actually is?
OK, good. So, it has to do a domain name
look up using a DNS server.
For those unfamiliar, DNS is domain name
system
and this is an infrastructure on the
internet that pretty much does exactly
that.  It converts domain names and
host names
to IP addresses
and vice versa, and we'll see tonight that
it does a few other things in terms of
helping with the routing of email
with validation of ownership of
domains and the like
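As a minimal sketch of what "doing a DNS lookup" looks like from a program's point of view, Python's standard `socket` module can ask the operating system's configured DNS server on your behalf. The wrapper name `resolve` is just for illustration, and which address comes back for a given name depends on where and when you ask.

```python
import socket


def resolve(hostname: str) -> str:
    """Ask the operating system's resolver for one IPv4 address of hostname."""
    # gethostbyname returns a dotted-quad string such as 'w.x.y.z'.
    # If hostname is already an IPv4 address, it is returned unchanged.
    return socket.gethostbyname(hostname)
```

On a machine with an internet connection, `resolve("www.google.com")` would return one of Google's addresses; a different machine may well get a different one.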
so there are these servers out there now
your computer or your home probably
doesn't have its own DNS server
but probably Harvard does if you're on
campus or Comcast does or Verizon
or your company does.
Now if you're at a small college for
instance and you're not visiting google.com
but you're visiting some random website.com.
It's very possible that you were
the first person on a campus to visit
that website ever
or at least in a long time
so what if your small little campus's
DNS server has no idea what
this IP address is?
Are you sort of out of luck because you
went to that school, and not one where
there are more people using that website?
or equivalently, it's kind of the chicken and the egg problem (which came first?) if you're the first person to
ever need to visit that website and
therefore your campus's DNS server
has no idea what that mapping is
how do you solve this problem?
Exactly.  So there's a hierarchy,
thankfully, to the DNS system, whereby
you might have your own
DNS server on campus or at your company,
but it doesn't necessarily store all
possible domain names and IP addresses in
the world.  In fact, that would be quite a
large database otherwise, and it's just
not efficient to keep all of them around
if they're not being accessed at all
or very frequently.
but your ISP
knows some bigger fish and maybe that
bigger fish knows an even bigger fish that
has its own DNS servers that might
know, but in the worst case
if no one along this hierarchy knows,
there also exists in the world what are
called root servers
which are spread out geographically
across the several continents
and it's those root servers that
essentially know
who does know what the IP address of
some random website.com is.
in other words those root servers know
who the authority is for instance for
all of the .com's in the world, for all
of the .net's or the like
so that you can have this initial
request from your little old computer
bubble up to these very high-level
servers and then bubble back down to
some authority
who does actually know
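The bubbling up and back down can be sketched as a toy model in Python. This is only an illustration under loud assumptions: the server names, the domain, and the address are all made up (192.0.2.x is a range reserved for documentation), and real DNS involves network messages, caching lifetimes, and more record types than shown here.

```python
# Toy model of the DNS hierarchy. Every name and address here is hypothetical.
ROOT_SERVERS = {"com": "authority-for-com"}  # root: who is the authority per TLD
TLD_AUTHORITIES = {
    "authority-for-com": {"somerandomwebsite.com": "dns-for-somerandomwebsite"},
}
AUTHORITATIVE_SERVERS = {
    "dns-for-somerandomwebsite": {"www.somerandomwebsite.com": "192.0.2.10"},
}


def lookup(hostname: str, local_cache: dict) -> str:
    """Resolve hostname, asking up the hierarchy only on a cache miss."""
    if hostname in local_cache:           # the campus DNS server already knows
        return local_cache[hostname]
    labels = hostname.split(".")
    tld, domain = labels[-1], ".".join(labels[-2:])
    tld_authority = ROOT_SERVERS[tld]                     # bubble up to the root
    name_server = TLD_AUTHORITIES[tld_authority][domain]  # who is the authority?
    address = AUTHORITATIVE_SERVERS[name_server][hostname]
    local_cache[hostname] = address       # remember it for the next person
    return address
```

The first person on campus to ask pays for the full trip; everyone after that is answered from `local_cache`.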
and the reason why that works
is because when you go and buy your own
domain name which is a
process we'll discuss in just a bit,
you have to tell the world what the IP
address is of
your DNS server, so someone has to
be informed proactively once really
and only once when you buy the domain so
for now let's come back to our story
We've hit enter
Google.com was in my browser's window.
My computer has somehow figured out that
it is 1.2.3.4 or something like that
so now my computer puts together a
message to send it across the internet
to Google.com.
What does that message look like?
Well, in its simplest form, it's a message
that pretty much looks like this. It is
literally the word GET in all caps
a space a forward slash ( / ) if you're just
requesting
the root of the web server marked
typically with /
and then HTTP /
version number
Now in reality, there are a few more headers,
so to speak, HTTP headers that get
sent
from browser to server, and we'll see
those in action in just a bit,
but this message captures
really the most important aspect of the
request
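Here is a minimal Python sketch of how that request text could be assembled. The function name is hypothetical, and the `Host` header is an addition of mine rather than something shown in the lecture so far: it is one of the "few more headers," and HTTP/1.1 requires it.

```python
def build_get_request(host: str, path: str = "/", version: str = "1.1") -> bytes:
    """Build the bytes of a bare-bones HTTP GET request."""
    lines = [
        f"GET {path} HTTP/{version}",  # the request line: method, path, version
        f"Host: {host}",               # which site on this server we want
        "",                            # a blank line ends the headers
        "",
    ]
    # HTTP separates lines with a carriage return plus a line feed.
    return "\r\n".join(lines).encode("ascii")
```

For example, `build_get_request("www.google.com")` produces the request line `GET / HTTP/1.1`, one header, and the blank line that tells the server the request is complete.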
so your little computer creates a
virtual envelope more technically called
a packet of some sort inside of that
packet
is a message like this
Put on the front of that virtual
envelope is a "To" address namely 1.2.3.4
or whatever Google's IP address is.
In the return field of this virtual
envelope, you know, just like you were
mailing something to a human, there's the
return address, which should be whose
IP address, probably?
Your own IP address, and your
computer does know that if you have an
internet connection
and then your computer sends it out on the
internet.  Now we can dive deeper and
deeper and deeper but for now assume
that your ISP
has what's called the default-gateway
also known as a router
and routers are the computers on the
internet that know how to get data from
point "A" to point "B", or if they don't
know precisely how to go from "A" to "B"
they know whom to pass it off to
who can then get it one step closer to
point "B"
so in reality a packet, this virtual
envelope,
might go from router to router to router
to router sometimes as many as thirty
different routers across the globe
until finally it gets to its actual
destination Google.com.
Google receives this virtual envelope,
sees that it's for its IP address,
opens the envelope up, sees this message
Google.com's server happens to be
running a web server and so that
webserver looks for the file called "/"
now "/" is typically a synonym for an
actual file name like index.html
or index.php or any number of other
default standard file names
so Google grabs that file from its hard
drives and then puts it in its own
virtual envelope,
flips the two IP addresses, the to and
the from,
sends it back to the internet via these
routers
it arrives on my computer. My computer,
unbeknownst to me, opens this envelope
sees a whole bunch of a language called
HTML
renders that HTML top to bottom and I
see
the search page for Google's main site
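The envelope metaphor can be sketched in a few lines of Python. This is a deliberately simplified model, not how real packets are laid out: the class, its field names, and the sample addresses are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """A 'virtual envelope': a from address, a to address, and the message."""
    source: str
    destination: str
    payload: str

    def reply(self, payload: str) -> "Packet":
        # The responder flips the two addresses so the answer finds its way back.
        return Packet(source=self.destination,
                      destination=self.source,
                      payload=payload)


# My computer (a made-up address) asks the server at 1.2.3.4 for its root page.
request = Packet(source="192.0.2.7", destination="1.2.3.4",
                 payload="GET / HTTP/1.1")
response = request.reply("<html>...</html>")
```

After `reply`, `response.destination` is the original sender's address, which is all the routers need in order to carry the HTML back.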
What is the function of the slash?
so whenever you type in a URL
There are several different
components to it. HTTP typically
followed by :// followed
by something like this (www.google.com/)
and so this is let's say a
representative URL, but we can
actually tease this apart into a few
components.
This is the protocol or scheme at the
beginning,
even though in a browser we almost
always use HTTP://
Have folks seen others?
HTTPS, similar, but different, in
that it uses cryptography - a topic we'll come back to.
FTP://
SFTP://
WEBCAL://
Some of these are more standardized than
others
but the scheme is typically an indicator
to some piece of software
how it should view the contents at
that address
so what comes after the ://
It typically has something called
a hostname
or subdomain name
followed by the domain name, which in
this case is google.com
or followed more precisely by a domain
name with a TLD - top-level domain
a   .com  .edu   .gov   .uk
would be the TLD and then you have
what we call a path
and a path specifies exactly what file
or folder you wanna access
A single slash means get me the root of
my hard drive
and if you come from the windows world
this is essentially equivalent to C:\
Or on a mac it's equivalent to
that, or on a Linux computer it's
equivalent to that.
So that is truly the root of your hard
drive,
the folder in which everything else on
your hard drive lives
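The pieces of a URL named above can be pulled apart with Python's standard `urllib.parse` module. This is a small sketch; the helper name `describe` is hypothetical, and treating the last dot-separated label as the TLD is a simplification that ignores multi-part suffixes.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL into the components discussed above."""
    parts = urlsplit(url)
    hostname = parts.hostname or ""
    return {
        "scheme": parts.scheme,               # e.g. http
        "hostname": hostname,                 # e.g. www.google.com
        "tld": hostname.rsplit(".", 1)[-1],   # e.g. com (simplified)
        "path": parts.path,                   # e.g. /
    }
```

For `http://www.google.com/` this yields the scheme `http`, the hostname `www.google.com`, the TLD `com`, and the path `/`.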
now it turns out in a browser these
days you don't have to type
most of that. you can omit the HTTP://
You can typically omit the www.
You can
omit the slash, and things just work
Why is that? Well, for the most part it's
because browsers have just gotten a lot
more user friendly.
Right? There was a time a few
years ago where advertisements
in print and on TV would actually have
HTTP://
but then the world kind of realized that
you know anytime you see www.
something
it's probably a website, so we started
omitting HTTP://
Now the world has gotten acclimated to
any mention of .com or .gov so
we don't even really need the www
anymore and so whether or not www
works or doesn't work
is actually completely configurable by
the system administrators of the website
and in fact
I don't have a
sort of a soapbox to hop on right now
but invariably during a semester, I'll
come across some website
for which foo.com or whatever the
domain is .com
just doesn't work you have to type in
www.something.com
and that's just a foolish technical
design decision on their part.  We'll talk
today about how you can configure things
to just work, and it involves a bit of
DNS
a bit of web server configuration
but typically you don't see that dead end
because browsers these days if you type
in foo.com and hit enter and
there is no foo.com IP address out there
the browser will presumptuously or helpfully
prepend "www." to the start of the
address
and then retry that one
some browsers if you just type foo
will automatically try foo.com, foo.net, foo.gov
some of the most popular ones so in
short
a lot of the technical processes that
are happening are being sort of hidden
now by browser user friendliness
for better or for worse
So, the story began with hitting enter
the story ended with your seeing the
home page of Google.
Any questions on the various steps in
between,
whether high level or lower level?
All right, so that's the story told from the
perspective of a user.
Why don't we tell the story from the
perspective now of someone who owns a
website or wants to operate a website so
suppose one of your goals in this class
or some other
is to actually have your own presence on
the web
to actually buy your own domain
name and have your own business or
personal home page or whatever the case
may be.
How do you go about doing that? You need
more than just a laptop and a browser now
you need a server
on the internet because even though
every computer on the internet,
your laptop included
has an IP address
it's not necessarily publicly accessible
because even that statement's a bit of an
oversimplification.
You do not necessarily have a
public IP address.  In fact if you go
home
and you have internet access at home,
especially wireless
you probably have a home router like an
Apple Airport Extreme, or you have a Linksys
router or some device with antennas
that gives you wireless internet
access but Comcast or Verizon or whoever
you're paying each month to give you
internet access into the house via your
cable modem or DSL modem
which in turn is probably connected to
that router
if it's not one and the same device, since
some of the ISPs provide these
all-in-one devices these days.
Odds are you have one IP address and if you
have 3 brothers and sisters
or parents or grandkids in the house
all of you are sharing
that one IP address
and yet the individual computers in the
home still need an IP address.
so what actually is the case is that
when you're in a home network you have
what's called generally a private
IP address, something of the form...
Anyone know what a popular internal IP
address is?
Exactly. Anything, in fact, starting with
192.168.x.y
is a private IP address, so the folks
who invented the internet
along the way decided "You know what? We
should probably have some IP addresses
that should never be given out."
So that within the company or home or a
little test network
you can have IP addresses that are
guaranteed not to exist on the public
internet
so what home routers typically use is
192.168.0 or 192.168.1
and then the last number, which can again be
between 0-255, but with some
exceptions. It really can't be 0 or 255,
so there are some constraints, but it gives you
roughly 250 or so possible IP addresses.
If you don't like that, there's:
172.16.x.y
There's a few more constraints
on this one, but then if you really need
a lot of internal IP addresses
you can have what's called a "class A"
private network
10.x.y.z is a private address
and this actually gives you millions of
IP addresses for your home or your
business
or your data center, but in short
any IP address beginning with one of these
few prefixes is considered
private.
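A short Python sketch using the standard `ipaddress` module can check whether an address falls in one of those private ranges. The exact boundaries below come from RFC 1918 rather than from the lecture: in particular, the 172 range runs from 172.16.0.0 through 172.31.255.255, which is one of the "few more constraints." The function name is just illustrative.

```python
import ipaddress

# The three private IPv4 ranges (RFC 1918), written as network/prefix-length.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.x.y.z
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.x.y through 172.31.x.y
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.x.y
]


def is_private(address: str) -> bool:
    """Return True if address lies in one of the private ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)
```

`PRIVATE_NETWORKS[0].num_addresses` is 2^24, about 16.7 million, which is where the "millions of IP addresses" for a 10.x.y.z network comes from.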
but the problem then is that even if
after this class you know HTML and CSS
all the better. You know PHP, and SQL, and Javascript
and you create a website and you've
run it on your laptop using software
we'll introduce you to.
A web server called Apache
no one in the world
is going to be able to visit your
website because
your address probably starts with one of
these prefixes
and your home router or cable
modem or DSL modem is not going to
let outside random people into your home
network
to access this IP address
because
frankly there's tens of thousands of
people who probably have that exact
same private IP address, so it's just not
uniquely identifiable,
and because your home router
or your cable modem is sometimes a
firewall unto itself, this traffic is not
gonna get into your home,
so in short that won't work..
but you have at least two options, two
alternatives, how can you get your website
out on the internet?
You can. Port forwarding. So let's go
there. For those unfamiliar when you use
a protocol like HTTP:// you're actually
using other protocols behind the
scenes and in fact you probably at least
heard
the buzzword TCP/IP: Transmission
Control Protocol / Internet Protocol.
It's actually two protocols, two different
standards or languages so to speak
that govern how data can be transmitted
on the internet and this is a bit of an
oversimplification but for today's
purposes assume
that IP, the internet protocol,
is just a set of conventions that
humans came up with years ago
that govern how you associate numeric
addresses with computers
so IP address derives from this
protocol so IP is just the standard for
assigning computers addresses. However, just
assigning someone an address
doesn't mean you can get data to that
address for that you need another
standard another protocol
and that's typically TCP transmission
control protocol
So TCP
is the standard
that web browsers and web servers speak
in order to actually physically move data
or electronically move data
from point "A" to point "B"
using the higher level notion of an
IP address to actually uniquely
identify points "A" and point "B"
so for those
who might want to go further in computer
science and in networking in particular
there's typically what's called the
TCP/IP stack
and so there are topics like
the transport layer down here,
the IP or addressing layer here,
the application layer. In short,
much of the internet is the result of
smart people having designed things and
then designed things on top of things on
top of things,
and so we just typically over simplify
and say TCP/IP.
So what's the point there?
TCP/IP allows not just the web to
work but all sorts of applications
There's the web. There's email.
There's instant messaging.
There's things like Spotify. There's dedicated
applications that are using the internet
but aren't necessarily inside of a
browser
so
a server can actually do multiple things.
It can receive email like Gmail can.
It can be a website and get HTTP:// traffic
so a server, because it can do multiple
things, somehow needs to be able to
uniquely identify
the various things that it can do
and so the world introduced this notion
of port numbers
and typically for a web server
Rather, for HTTP:// it uses this
protocol TCP and the world decided some
years ago
the number 80 will arbitrarily
but consistently identify this service.
so if you have a server and you have a
website, and a website uses, as you
probably know, HTTP://, but we'll look at
what that means in a bit
it is running so to speak on port 80
it is listening so to speak on port 80
and the motivation for that
is because you might also have
an email server on the same physical
box , right? Gmail, kind of an
oversimplification, but they are both a
website and an email service, and if you
want to be able to send email to Gmail
you can also use TCP but you have to
use port 25.
In other words, if you go to http://www.gmail.com
with a browser you obviously want a
web page back,
so even though you, the human, haven't typed 80, it's automatically
inserted for you by your browser, behind
the scenes
but if you send an email from Eudora
or apple mail or Outlook or whatever
you're using
you again probably don't have to care about
this detail but that program
is going to send data still to gmail.com
but specifically to port 25.
So, when a computer's on the internet, a
server, and it's listening for traffic all
of that traffic comes in on a specific
port
a specific like pathway into the server
so that it knows if it's a webpage or an email, right? Because especially email; emails can
contain HTML now
so you need some way of distinguishing
the two fundamentally
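One way to picture port numbers is as a lookup table on the server: the same machine, the same IP address, but a different service depending on the port the traffic arrives on. This Python sketch is a toy; the table and function names are hypothetical, and SMTP is the name of the mail protocol that conventionally uses port 25.

```python
# One server, several services, told apart only by port number.
SERVICES_BY_PORT = {
    80: "web (HTTP)",
    25: "email (SMTP)",
}


def service_for(port: int) -> str:
    """Say which service, if any, handles traffic arriving on this port."""
    return SERVICES_BY_PORT.get(port, "nothing listening")
```

Traffic for port 80 goes to the web server and traffic for port 25 goes to the mail server, even though both were addressed to the same machine.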
so when you propose port forwarding, what
does this mean?
Well, if your home network
has a public IP address, and you usually, again, get one from your ISP,
and that is some address of the form
w.x.y.z
and your individual laptop on which
you've created your final project that
you wanna make publicly available
is at one of these private IP addresses,
doesn't really matter what it is,
what you can do is configure your home
router AKA firewall AKA cable
modem, it depends on what make and model
you have,
but that device,
you can configure it to say
any internet traffic that comes from the
internet to my home
on my public IP address
destined for port 80
should be "port forwarded"
to IP address 192.168.x.y
port 80
in other words you can tell this machine
to take incoming data on that port
and then route it very specifically to this
computer, yours,
so that it just works.
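Conceptually, a port-forwarding rule is just one row in a table inside the router. Here is a toy Python sketch of that idea; the private address 192.168.1.10 is a made-up example, and a real router is configured through its own admin interface rather than code like this.

```python
# Router's table: public port -> (private IP address, private port).
FORWARDING_RULES = {
    80: ("192.168.1.10", 80),  # web traffic goes to my laptop
}


def forward(public_port: int):
    """Return where the router sends incoming traffic, or None to drop it."""
    return FORWARDING_RULES.get(public_port)
```

With only this one rule, a visitor hitting the public address on port 80 reaches the laptop, and traffic on any other port goes nowhere.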
Now, there is one gotcha here.
Especially if you have siblings, for instance or
other technically minded family members
or roommates
if you're doing port forwarding in this
way
only one of you can operate a 
webserver
behind your cable modem because you only
have one IP address to uniquely
identify your website and if you've
already claimed 80 as your own and
that's the default for the world's
browsers to use, pretty much only your
webserver can be accessed.
Now there is a workaround here. If your
roommate's really ticked off at you, you can
say "Fine, fine, fine, I will give you port 81."
but what does that mean?  That means the
entire world has to type out a URL like
let's say your address was indeed w.x.y.z
this would be your IP address
your URL
your roommate's, unfortunately, would be this
crazy looking thing (http://w.x.y.z:81/),
right, or any number really.
Now, there are some restrictions on the
numbers.  Probably can't use 81,
but the point is the same.
This is not standard, and you probably don't want your users having to remember such an
esoteric detail as an arbitrary number.
However if on the internet
you visit any website with :80,
odds are you will get to the website
with which you're familiar it's just the
browser is again for user convenience
inserting the port number
automatically for you.
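What the browser does here can be sketched with Python's standard `urllib.parse`: use the port in the URL if one was typed, otherwise fall back on the default for the scheme. The function name and the `example.com` URLs are illustrative only.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}  # the convention described above


def effective_port(url: str) -> int:
    """Return the port a browser would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:        # the human typed :81 or similar
        return parts.port
    return DEFAULT_PORTS[parts.scheme]  # otherwise insert the default
```

So `http://example.com/` and `http://example.com:80/` lead to the same place, while `http://example.com:81/` is a different door into the same machine.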
and little trivia for HTTPS, the secure version of HTTP, what port number does that use?
443, and you sometimes do see
that in the URL and you also see some other
ports commonly like :8080
:8080 is just a kind of arbitrary but
popular port that some companies use to
run certain services but in short using
anything non-standard these days
especially for commercial production web
sites where you're trying to make money
or trying to stay online one hundred
percent of the time
using non standard ports is bad, because there
are certain companies, there are certain
campuses that will pretty much block any
ports besides 80 and 443,
but thankfully there's a work around,
even if you wanna run some random server
like a bit torrent server, or something like
that
all you have to do is change the port
number to be 80 or 443
so the reality is that with firewalling, and we'll
have this conversation toward the end of the
semester, when we talk about security
more generally,
a lot of security mechanisms are kind
of a joke because all you need is a
modicum of savvy or, you know,
having listened to the past
30 seconds of words that I just said
you can circumvent these kinds of
restrictions.  Hotels do this a lot,
Starbucks does this a lot
the port numbers are really just this very
basic
mechanism, and the world has adopted some
standards, alright,
so, perfect!  We have a solution.  All you
have to do is somehow figure out
how to download the manual for your
Linksys router or Apple Airport
and you can configure all this port
forwarding stuff and run a website
from your own home,
so not quite. Because if you actually
have a popular website, Verizon and Comcast
might very well notice and just shut you off entirely,
because that huge disclosure agreement
you probably clicked through and never read when
you signed up for internet service
probably said you may not run
a website on your home computer
so, plus, this was a pain in the
neck to do anyway, plus I might
unplug my laptop sometimes, and so my
website's gonna go down anytime I
go out,
so not the best solution even if you
have a desktop so let's at least try to
push a little harder and assume that we
need to outsource this problem, or we
at least need to put your computer
on the internet itself, in a data center,
on the campus, where it can stay plugged in
perpetually, under your desk at work if
the system admins allow it,
and moreover I don't want my website to live at
w.x.y.z,
or any number for that matter,
I want it to live at david.com or
some URL
that is sort of distinctly my brand or
my name,
so that begs the question how do you go
about
getting your own
domain name?
Has anyone done this before? Yeah, how do you do it?
Okay, where do you purchase them?
Okay, so namecheap.com is a very
popular place, fairly inexpensive
Go Daddy is another very popular place
This one (Go Daddy) is kind of riddled
with
up-sell attempts, trying to get you to buy
everything in the kitchen sink,
but you don't need to do that.  There are
all sorts of
domain name registrars out there these
days. A bunch of years ago
Network Solutions was the only one,
but then the market was created and so
there's a lot of places to buy domain
names.
For the most part, it doesn't matter
where you buy your domain name from, but
you do sometimes get different features
in particular you get DNS features
sometimes, more control over your DNS
servers.
They might throw in free email accounts,
free hosting,
but for the most part, it doesn't matter a huge
amount in particular, you don't need to go
to someone like Network Solutions and
pay thirty dollars a year, when you could
go to someone like Go Daddy and pay
$9.99 a year or namecheap and
pay $4.99 a year
so in short paying more for a domain name
isn't necessarily giving you anything
more
in the way of
functionality. It depends on what the add-ons are.
So, how do we go about doing this? Well,
let's go to something like Go Daddy.
Go Daddy's kind of a...
Well, let's actually try namecheap.
Let's go to namecheap, see what they
look like,
many of my friends have indeed used this
website.
All right, so let's see: domain name to search.
I'll search for david.com,
probably taken. Oh, that is a good
price. Already doing better than Go Daddy.
All right. So as I expected it is taken
as are almost all forms of david.
*Ha* They've suggested I name myself "David
John", "David Smith", "David Johnson", "King
David", "David Photography.us"
So one of the hardest things, frankly, of
starting a business these days is
finding an available domain name, let alone
your own personal vanity domain names
for people's names
but if we found something we like.. Maybe
I do want DavidTV... Well, that's atrocious.
$6,000 for this domain but it's not yet taken.
It's probably one of the cheaper ones
up above so let's assume we found something
we're happy with
so we add it to our cart and we check out
I now own some domain name,
David something.com.
So what now do I do with it?
How do I associate it with my web server?
and for that matter, how do I get a web server?
Let's assume I have a web server, and we'll
cross that bridge in a moment,
but I have a domain name.
What do I need to do with it to start
using it? Well I need to tell the world what
my IP address is.
So I need to, somehow, tell the world that
my server.. I don't know who's going to be
hosting it, but I know it will have an
IP address, by nature of how the web works.
so let's assume I know the IP address is
going to be
w.x.y.z.
I somehow have to inform the whole world
that david.com's IP address is
w.x.y.z.
So one of the things I'll have to do at
namecheap.com or Go Daddy or
networksolutions.com
is I tell the registrar
not what my own computer's IP address
will be
but rather what the IP addresses
of my domain name's
DNS servers will be
and the convention is typically that
every domain name in the world should have
2 DNS servers: primary and secondary
so a main one and a backup one.
They can be one and the same, but the
world really pushes people toward having at
least two for the sake of up time and
redundancy
so I need to know not my own IP
address, per se,
but I need to know the IP address of one
and then a second DNS server
Now I don't have my own DNS servers
and I don't want to configure two more
computers in addition to my web
server
so this is where web hosting companies
come in.
So in addition to buying the domain name
I also wanna host my website somewhere
and it could very well be the same exact
company. Could be Go Daddy, it could be
Name Cheap
depending on the service that they
provide, but
we need to have
a web hosting
option.  So what's the web host going to give
us?  A web host is going to give us a
hard drive to put my files on, you know,
maybe not hard drive, per se, but some
illusion of storage space
they are going to have their own
connections to the internet
this web hosting company
they're hopefully gonna have a pool of
IP addresses so that I can have at least
one of them.
They're also going to have some
RAM. they're also going to have technical
support staff.  In short, they're gonna
have a server, and all of the things
necessary to keep a server alive on the
internet,
and hopefully they're also going to have
at least 2 of what..?
DNS servers.
So suppose I decide to host my website at, let's
say, dreamhost.com.  This is a very
popular sort of "El Cheapo" (basic)
kind of web hosting company
that I've used myself in the
past like $6.95 or $8.95 a month.
So that's pretty good but
again you get what you pay for.
I wouldn't necessarily build a big business on it
So for $8.95 a month
I have the ability to upload my HTML and
CSS files and soon PHP and Javascript
files to their server.
Their server has nearby 2 DNS
servers, each of which has its own
IP address.  So once I know what
Dream Host's IP addresses are for its
nameservers, I'd tell Name Cheap
or Go Daddy, or wherever I bought my domain
name
and that's it
the only time I have to talk to my
registrar again, most likely, is in a
year when they charge me another $5.99 or $99 for my
domain name.  Unfortunately, when "buying,"
you're really renting your domain name
from these registrars.
Now there's a whole bunch more involved
in setting up of the web server
and getting my files there
but at least now I've told the world
that if you want to know where david.com is
Ask these people. These two IP
addresses of the name servers.  Either one.
And those DNS servers
should hopefully know. Why?
Because so long as I keep paying
dreamhost or someone else $8.95
per month
they will ensure that both of those
DNS servers
know what my own website's IP address is,
and how will they know?
Because what I'm paying for is some
storage space and some internet
connectivity on one of their servers, one
of their servers has an IP address
so they just tell their DNS servers
that david.com's IP address,
is whatever the IP address is of the
server
they've told me to put my content on,
and we'll actually look in a little more
detail
what's involved in that.
..but any questions?
So in answer to the somewhat
frequent problem where a website
does work at www.something.com but not at
something.com:
How do you fix something like that?
There are usually two pieces to the
solution.
1. You have to make sure that there's a
DNS record for something.com,
that is, there's an IP address associated
with it, in addition to one being
associated with www.something.com.
2. You need to configure the web server
to accept requests for either something.com
or
www.something.com.
but really let's focus on just this
DNS piece for now.
so DNS... turns out the DNS is
relatively straightforward and once you
start operating a whole bunch of
services on your own website. Maybe you
have an email server,
maybe you want to use hosted
services like Google Calendar or
Google Docs.  You can do things like
actually for CS75, for this course,
the TFs (teaching fellows) and I
use Gmail essentially to host CS75.net's email
so that's the website as I'll soon reveal
if you haven't pulled up the website,
and we want to be able to have an email
list so that each of us can email
everyone else very easily.  So we want email addresses of the form malan@cs75.net
Now how do we do this? Well, we could
set up a mail server, we could pay
someone to do this, but an amazing service
out there is Google Apps, with which some of you
might be familiar, and for small fish
like us, where we only have a few people
on staff,
you can actually have hosted Gmail,
hosted Google Calendar, hosted
Google documents
for, I think, 20 or fewer people for free.
And what you do is you configure your
own DNS servers
to map
something like mail.cs75.net
to essentially gmail.com so that
whenever we send an email to
something of the form mail.cs75.net, it figures out
via DNS to actually go to Google.
We could have calendar.cs75.net,
and if you hit enter,
you actually end up at Google Calendar,
but our copy of
Google Calendar, and this is all thanks
to DNS, and there are only a few
settings with which you need to be familiar,
and we already talked about this one:
an NS record
is
a record in a DNS server
that tells the world what the name
servers are for that domain.
So, what's inside a DNS server?
Frankly it's a database, and you can
think of it as like a database of
Excel files, so spreadsheets that just
have rows and columns,
and those columns essentially represent...
In each row, rather, you would
have for instance a domain name and an IP address.
Domain name : IP address.
Domain name : IP address.
That's really all
that's underneath the hood in a DNS
server at least so far as we're concerned.
But there are different types of rows.
So one of those rows can be an official
record
that says the name server, NS, for this
domain is whatever IP address Dreamhost gave me.
For instance
Now, what else can I have?
Well there's an "A" record. So an "A"
record, a row of type "A" in this
spreadsheet of sorts,
is literally
Domain name : IP address
It's as simple as that. So if I had something.com
and its IP address should be
w.x.y.z,
that's what's called an A record.
And I can also have mail.something.com
or calendar.something.com,
and I can associate each with an IP address.
And how do I do this? That totally
depends on your registrar or on your
DNS provider, whether it's DreamHost or
GoDaddy or the like,
but these days it's usually a web
interface. Back in the day
it was a command line; you edited a text
file on the server. But these days it's
been made to be more user-friendly, but
it's essentially
a spreadsheet.
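
To make the spreadsheet picture concrete, here is a minimal sketch in Python. It is not how any real DNS server stores its data; the record rows, the `lookup` helper, and every name and address in it (`ns1.hosting.example`, `192.0.2.10`, and so on) are made up for illustration.

```python
# A toy "spreadsheet" of DNS records: each row is (name, type, value).
# All names and addresses below are hypothetical placeholders.
RECORDS = [
    ("something.com", "NS", "ns1.hosting.example"),
    ("something.com", "NS", "ns2.hosting.example"),
    ("something.com", "A", "192.0.2.10"),
    ("www.something.com", "A", "192.0.2.10"),
    ("mail.something.com", "A", "192.0.2.25"),
]


def lookup(name, record_type):
    """Return every value stored for the given name and record type."""
    return [value for (n, t, value) in RECORDS if n == name and t == record_type]


print(lookup("something.com", "A"))
print(lookup("something.com", "NS"))
print(lookup("calendar.something.com", "A"))  # no such row, so an empty list
```

Note that `something.com` and `www.something.com` each need their own row here, which is the first half of the "works with www but not without" fix described earlier.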
There are two slightly fancier features.
A CNAME,
or canonical name,
is an alias. So it turns out with a lot
of these web services like Google Apps,
where Google's providing the service,
you don't necessarily want to have to
know what Google's IP address is.
For one, you probably don't know anyone
who works there, and so you can't really
ask them, though frankly you could run a command
and figure it out.
But if you hard-code into your own DNS
server
the IP address of google.com, the
implication is that if they ever need to
change their IP address, which happens,
not every day, but every few months or
few years for whatever technical reasons,
now your website goes down.
So wouldn't it kind of be better,
at least conceptually, if
calendar.something.com
didn't resolve to Google's IP address,
but rather, what if calendar.something.com
could instead resolve more
generically to
calendar.google.com?
So don't have your domain name map to an IP
address;
have your domain name map to someone
else's domain name,
and then let their DNS server tell the
world
what the current IP address is of
calendar.google.com.
So in other words, if you want this layer
of abstraction where you don't care what
the IP address is, you just care that
your domain name is a synonym for someone
else's domain name,
then you use a CNAME record, and what
the two columns look like are
Domain name : Domain name
instead of Domain name : IP address.
So it's a wonderfully useful feature,
especially these days if you look into
hosted solutions, not just like Google
Apps,
but companies that have services like,
uh, you know, customer service forums.
If you go to a website, they'll often have
an address like support.dell.com
or the like.
Well, there are a lot of companies these days
that provide
hosted customer service websites,
but it would look kind of lame if I
go to dell.com and I get redirected
to customerservice.com.
Dell would probably rather rebrand
someone else's service
to look like Dell, even though someone
else implemented and is hosting it. So
via a CNAME, someone like Dell
could say support.dell.com
should actually resolve to
customerservice.com, but the user
should never know that, because the URL
stays support.dell.com. So that's
just one of the things you can do
with these things called CNAMEs.
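
As a rough sketch of why that alias layer helps, the Python below follows a CNAME row to someone else's name and only then looks up an address. The two tables, the `resolve` helper, and the names `calendar.provider.example` and the `198.51.100.x` addresses are all assumptions for illustration; a real resolver does this over the network against real name servers.

```python
# Toy records; every name and address here is hypothetical.
A_RECORDS = {"calendar.provider.example": "198.51.100.7"}      # provider's own table
CNAME_RECORDS = {"calendar.something.com": "calendar.provider.example"}  # our table


def resolve(name, max_hops=8):
    """Follow CNAME aliases until an A record is found; return the IP or None."""
    for _ in range(max_hops):
        if name in A_RECORDS:
            return A_RECORDS[name]
        if name not in CNAME_RECORDS:
            return None
        name = CNAME_RECORDS[name]  # hop to the alias target and try again
    return None  # gave up: too many aliases (possibly a loop)


print(resolve("calendar.something.com"))

# The provider renumbers its server. Our CNAME row needs no change at all.
A_RECORDS["calendar.provider.example"] = "198.51.100.99"
print(resolve("calendar.something.com"))
```

Had `calendar.something.com` been an A record holding the provider's old address, the second lookup would still return the stale IP, which is the "now your website goes down" scenario.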
And lastly, an MX record is a mail
exchange record,
and a mail exchange record simply states
what is the IP address
of the server or servers that should
handle inbound mail
for this domain.
And this is great
because when you use Eudora or
Gmail or Outlook
and you type in, uh,
malan@harvard.edu
and hit enter,
similarly there you have no idea what
Harvard's IP address is,
but your computer does. But it's not the
IP address of harvard.edu
per se that your email client needs;
it's the IP address of Harvard's mail server.
So thanks to DNS, your Mac or PC
can ask your ISP's DNS server, or
dot-com's, or whatever, this whole hierarchy we
discussed earlier,
and say, what is the MX record for
harvard.edu?
And harvard.edu's name
servers should be able to say, send all
mail to this IP address.
And what's nice about MX records is
you can have multiple ones with
priorities.
So websites, or rather,
uh, domains that have very large
numbers of users, where you really don't
want their mail servers going down,
could have two or three or ten
different mail servers,
and the DNS system will say try this
one, then this one, then this one, then
this one,
just in case any of those go offline.
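
Here is a small sketch of that "try this one, then this one" idea in Python. The MX rows and host names are invented, and the `is_up` callback is a stand-in for actually attempting a connection; the one real-world detail relied on is that in DNS a lower MX preference number means "try this server first".

```python
# Hypothetical MX rows for one domain: (preference, mail server name).
# In real DNS, a lower preference number is tried first.
MX_RECORDS = [
    (20, "mx2.something.com"),
    (10, "mx1.something.com"),
    (30, "mx3.something.com"),
]


def delivery_order(mx_records):
    """Return mail server names sorted by preference, lowest number first."""
    return [host for _, host in sorted(mx_records)]


def pick_server(mx_records, is_up):
    """Return the first reachable server in preference order, or None."""
    for host in delivery_order(mx_records):
        if is_up(host):
            return host
    return None


offline = {"mx1.something.com"}  # pretend the primary server is down
print(delivery_order(MX_RECORDS))
print(pick_server(MX_RECORDS, lambda host: host not in offline))
```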
It's all thanks to DNS, and while
most of us take all this for granted,
once you start developing your own
websites, maybe creating your own
companies, or contributing back to your own
school, having these abilities is
wonderfully powerful, and it really boils
down to
these basics.
Any questions?
All right, so that was kind of a lot. Why don't
we take a three or so minute break.
There are restrooms in the hallway,
there are soda machines, I think, around the
corner,
and we'll rejoin in about three minutes.
All right, we are back. So
why don't we take a look at the course
itself and what you are in for and what
the course's expectations are. So in terms
of prerequisites, the official
prerequisites are these: one or more
years of programming experience as well
as comfort
with HTML and CSS.
So what does this mean in real terms? So
summer school is very short; it's about
six weeks. And the course has three
not insubstantially sized projects,
and the goal really is to make sure that
at the end of this short semester
you feel quite comfortable going off and
doing much more on your own in the way
of website development, not just HTML
and CSS
in the form of static websites, but really
dynamic websites that are driven by
a language like PHP and JavaScript,
backed by a database like MySQL.
So it's a fairly intense course. Uh,
if you've only taken
something like Computer Science S-1, one
of the introductory computer science
classes,
or just one or two courses, I will say
from past experience
you'll probably find the course
challenging, to say the least. And
typically we estimate about thirty
hours per project. So there are three
projects, you'll usually have nine days for each
of them,
and that's about thirty hours each, but
that would be on average. So for students for
whom programming is a little less
familiar, or it's been a bunch of years
since you programmed, or you've only taken one
or two introductory courses but don't
really think of yourself as a programmer,
the course is definitely more
challenging. So do beware, uh, diving in.
I will say, if you're on that fence and
not sure if your comfort level and
background are there,
you can go to cs75.tv,
which is the OpenCourseWare site for
the course,
where we have several previous semesters'
worth of lecture videos, handouts, and
projects, some of which will be
similar to this summer's.
So by looking to the past you can
perhaps infer what this summer
will be like and get a sense then
of whether the PDFs of past terms' projects
completely overwhelm you
or completely excite you.
So I would try to use that as an
additional input tonight,
uh, before deciding whether this is
the course for you.
Um, in terms of expectations,
there are these three projects, and
attending, or watching if distant or
unable to make it in person, the
lectures.
The lectures will be structured as
follows. So tonight our focus is on
HTTP and the mechanics, the
underlying fundamentals of the internet
that for years you've probably taken for
granted.
But once you really start building your
own website and having to negotiate
things like configurations of servers
and coding databases, these details matter.
Tonight we'll start
looking at some of those more technical
details.
On Wednesday and on into next week we'll look
at PHP
itself. So one of PHP's most
compelling features these days is, one,
it's syntactically very familiar, or
very similar, to languages with which
many of the folks in this room and at
home are familiar. Syntactically it's
similar to C and C++
and other procedural languages.
It's very much in vogue, very popular.
It's pretty omnipresent these days in
terms of the web hosting companies that
are out there, and it's super easy to get
set up. Even Mac OS comes with PHP and
Apache, the web server, pre-installed,
and there are packages for Windows and
Linux and other computers that make it super
simple to get set up. In terms of
related languages, Python and Ruby
are probably the two closest
contenders in terms of popularity with
PHP,
and none of these is necessarily better
than the others; this could very quickly
devolve into a religious debate.
But one of the nice things about PHP
is, again, the omnipresence of support for
it out there.
And also, I think, pedagogically, the
documentation for PHP is outstanding,
and as you'll see,
the php.net online reference
manual for functions and whatnot is rich
with examples and intelligent discussions.
And so we've just found
that it's a very nice way of diving in
deeper to web programming, and from a
course like this you should be able to
continue on,
if you haven't come from that direction
already, to the likes of Python
or Ruby, or JSP for the Java
world, or ASP for the Windows world.
There are a lot of commonalities among
them.
We'll transition in lecture three
to looking at XML. So when it comes
time to actually store data, whether
statically or dynamically,
you don't need a full-fledged database.
You don't need MySQL, you don't need
Oracle, you don't need anything along
those lines. You could just use text
files.
But it'd be nice if it's easy to read
and write those text files, so XML
is a very popular language of sorts, a
metalanguage, with which to write
textual files, and it's representative more
generally of a topic we'll come back to
in our JavaScript lectures
on the Document Object Model, so there'll
be some commonalities there.
Uh, SQL, or Structured Query
Language, is what's used by many
relational databases these days, among
them MySQL,
Oracle, PostgreSQL. There are
also in vogue these days NoSQL
servers, document, uh, storage
engines, which we'll look at later in the
semester as well,
but we'll primarily use for the course's
projects MySQL.
We'll look in lectures six and seven
at JavaScript and at this more
general technique of Ajax, the ability to
use JavaScript
to query a server even after a page has
loaded
to get back more data. For instance,
Google Maps does this to get more
tiles, squares of mapping information,
when you click and drag.
Facebook does this to push live
updates to your news feed and the like.
We'll look toward the end of the
semester then
at some higher-level concepts like
security, which we'll interlace
throughout the semester, but we'll really
focus on it
in lecture eight, looking at common
attacks on web servers, on websites, on
databases,
so as to not necessarily acquaint you
with everything that can go wrong,
but to at least plant the seeds in your
mind of things you should be thinking
about. Indeed, there's so much code out
there
that is just vulnerable because
people don't think to do things like
sanitize user input, that is, they don't
check it for dangerous characters.
So we'll talk about things like SQL
injection attacks, we'll talk about,
uh, cross-site scripting attacks, and any
number of other attacks
that are so darn easy to avoid,
yet many people just
don't realize it or don't know how,
when typically
simple little function calls can fix it.
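
As a preview of how small that fix can be, here is a minimal sketch of a SQL injection and its remedy. The course's projects use PHP and MySQL; this sketch swaps in Python and the standard library's `sqlite3` module purely so it is self-contained, and the `users` table, the `login_unsafe` and `login_safe` helpers, and the sample credentials are all hypothetical.

```python
import sqlite3

# An in-memory database with one made-up user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")


def login_unsafe(name, password):
    # Dangerous: user input is pasted straight into the SQL text.
    query = ("SELECT COUNT(*) FROM users WHERE name = '%s' AND password = '%s'"
             % (name, password))
    return conn.execute(query).fetchone()[0] > 0


def login_safe(name, password):
    # Safer: placeholders keep the input as data, never as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


attack = "' OR '1'='1"
print(login_unsafe("alice", attack))  # the injected OR makes the check pass
print(login_safe("alice", attack))    # treated as a literal (wrong) password
```

The only difference between the two functions is whether the input becomes part of the query text or is passed alongside it, which is roughly the "simple little function call" scale of change being described.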
And in the last lecture we'll look at
scalability. So it would be a great problem
to have if you've got so much traffic
that, with all of the lessons you learned from
lectures zero to eight,
uh, your website starts
crumbling under the load. And so we'll
conclude the semester by looking at,
okay, now you have to build not for a few
dozen people, a few hundred people, your
school,
but several thousand or maybe even
several thousand people per second. How
do you actually scale
from one little web server to a bigger
one?
But then once you have the biggest and
most expensive available web server, what
do you do?
You start to scale, as they say,
horizontally. So you get multiple servers,
maybe even cheaper, slower web servers,
but you somehow figure out how to
balance load, so to speak, traffic,
across them.
How do you do that with databases? How do
you do that geographically?
Uh, how do you do that with cloud
computing, a buzzword that's all the rage
these days but has some very interesting
technologies underlying it?
We'll wrap up the semester looking at
those bigger-picture
issues.
In addition to said lectures,
we will have most weeks sections and
office hours. The course has,
uh, four teaching fellows, folks who
have either taught or taken the course
before,
who'll be with us in the form of
sections and office hours. Sections will be
slightly more intimate
opportunities, on Wednesdays and
Mondays, typically right after lecture,
if you'd like to stick around to dive a
little deeper into that week's project.
So in addition to the PDF specification
you'll get of a project,
one of the TFs will walk you through
the week's project to give
pointers,
offer some design tips, offer some helpful
direction, answer any confusion.
If I did something quite poorly in
lecture, we can revisit those kinds of
topics in section so that you get
another perspective altogether.
And then office hours, which are meant to
simply follow
section. So once section officially wraps
after an hour,
uh, office hours will be an
opportunity for one-on-one Q&A with
one or two TFs.
Uh, and this will be an opportunity in
particular
for questions on the projects, if
you're having trouble understanding
something, trouble chasing down some bug.
In addition to reaching out to us
online, we'll have these in-person opportunities
for those of you who are local. For those
of you who are distant,
more on the online opportunities in just
a moment.
In addition to,
um, the course's classes, there are
projects, three of them, and they will
follow
roughly the order of the topics on
the syllabus. So we'll start with
PHP, which will be new to some or
most people in the room.
Then we'll introduce mid-semester
databases and MySQL. Then we'll
introduce JavaScript and Ajax.
And so that, in a sense, will be the
tripartite approach
of the course's projects in terms of the
topics.
Um, in terms of grades, projects are
graded fairly holistically. Because
you'll be encouraged to make a lot of
design decisions on your own,
you won't necessarily have to implement
precisely
what we tell you to; rather, you'll have
to meet certain feature and technical
requirements.
So we'll evaluate the three projects
on these axes. Scope will be an axis,
and it'll be a numeric score that
captures how much of the project you
actually attempted.
Uh, correctness will capture how much
of your code works in accordance with the
spec.
If it's very buggy, that would not be
considered very correct.
Design is more subjective. Design is, okay,
it might work, it might work perfectly,
but does it look like a mess underneath
the hood? If you had like ten nested for
loops, that is not good design, for
instance.
And so design will be an opportunity for
particularly qualitative feedback
from the teaching fellows on your code.
And style
is more of the aesthetics. Are your
variables reasonably named? Is your
code well commented? Is it nicely
indented?
The sort of easy things that are good
habits to get into if you're not
yet in them.
That's what we define as style.
And then just for reference, things are
weighted in roughly the order of the amount
of time that's required to get things
right.
So for instance, this is the formula we'll
use to compute a total score for each of
the projects,
where correctness, for instance, is
weighted more heavily
than style, and, uh, that's to capture
the fact that
indenting your code shouldn't take you all
that long, but chasing down bugs
can certainly take
quite a bit of time.
So the formula is meant to capture that.
Um, the course's website, which I'll
open in just a moment, has everything that
you will need for the course, including
videos of lectures. If you can't make it
some evening, or it's tricky because of
full-time work,
it's totally fine to watch the course's
lectures online. All the handouts
similarly will be available there.
What we'll be rolling out over email this
week is access to a tool that actually,
uh, we've used in another class of
mine, CS50, but it's a
discussion tool that will allow you to
interact with classmates, with myself,
with the course's TFs online.
So online discussion forums of sorts, but
using some of the same technologies that
we'll talk about
in the class, including Ajax and similar.
So you will soon receive emails from
us with invitations of sorts to create
accounts within the, uh, website
so that you can start directing
questions to classmates, or privately
to the staff,
via CS50 Discuss.
So any questions on the structure,
expectations,
whether you're in the right place,
or the course itself?
In terms of
attendance, it's, uh, expected and
encouraged but not factored into grades,
so it is not, in a sense, tracked.
Uh,
then in terms of lecture, typically
we're slated for 3:15 to
6:15. I think we'll rarely go a full
three hours. Typically the same course
during the year only has two hours of
class, so we'll typically have a little bit of
wiggle room, and let me not commit to
just two hours per night,
but we will typically not go, I think,
as many as three hours. That's frankly a
lot to take in, twice a week no less.
Uh, so, uh,
we'll see where we end up exactly.
Any other questions?
Oh, and with regard to sections,
the implication of that detail
is that sections will not start
necessarily at a preordained time. What
we'll try to do is the TFs will come a
bit early,
so if we do end up wrapping up lecture early,
we'll take a short break and then dive
right in, immediately, to
section
and office hours so that you don't sit
here awkwardly just waiting for an
arbitrary time to come around.
For the distance students, sections will
be filmed as well, and we will be making
ample use of online interactions for
students who are primarily distance.
And we've also experimented in the past
with things like Skype and video
conferencing or
online chats, so we're quite flexible for
whatever works, uh, pedagogically for
folks.
Good question. And typically not for
distance students. With sections, we do film
them, but there is some latency in when
we post them. We may experiment with
trying to stream some things online,
but this room is not equipped for that,
so I shouldn't make promises to that effect
just yet.
But either way, things will be available
asynchronously after the fact.
The office hours will typically be right
after sections on Mondays and Wednesdays,
which are right after lectures.
Uh, the motivation being, especially
for folks who commute,
we figured we'd try to compact things into
Mondays and Wednesdays so you won't have to
come to campus yet again.
And we're flexible too. If, for instance,
you're really struggling in the class,
you have lots of questions, or
with your schedule you have a nighttime class
right after work,
um, we're happy to do things by
appointment as well.
So we'll meet you halfway as
best we can.
So, web hosting companies. We talked
earlier about DNS and sort of
getting traffic to some destination B,
but once they get there, what's waiting
for the user? Where are your HTML
and CSS and PHP files
actually stored?
So this is a little screenshot from this
one company, DreamHost, and I don't
necessarily recommend them over any
others,
but they're popular and
super cheap, and just to give you a sense
of what you get
and what you don't really get,
here's a screenshot of what you get,
apparently, for $8.95 per month.
So you apparently get unlimited
terabytes
of disk storage space.
Uh, you get unlimited terabytes of
monthly bandwidth. You get an unlimited
number of domains hosted, and you get an
unlimited number of user accounts, email
accounts, MySQL databases.
And that all
sounds
too good to be true.
So what is the catch here?
'Cause that's an amazing deal for
$8.95:
unlimited everything.
So what are some of the catches, or what
are they doing here technologically to
make this possible?
Exactly. So a lot of these web hosting
companies are shared services whereby
you might get this,
but they're also promising the exact
same thing to ten other people, to one
hundred other people.
Now, it turns out that in HTTP, the
protocol we discussed earlier,
there is a feature these days for
what's called virtual hosting.
So back in the day of the web, every
website needed a unique IP address,
essentially, so that when you typed in
something.com,
you went to one website, and that
website lived on a server, and that server had
an IP address, and if you wanted a
second website,
you'd better get a second server or at least
give that computer a second IP address.
However, in more recent versions of
HTTP,
as we'll see through some experimentation
with actual browsers,
browsers send another HTTP
header. They don't just send GET;
they also send a reminder to the web
server as to what the user typed into
the URL,
so that you can now have these days
multiple websites, foo.com, bar.com,
baz.com, all living on the same
physical server
at the exact same IP address.
And because the browsers remind the
server what the user typed in,
foo.com or bar.com or
baz.com,
the server,
even though it's receiving traffic for
three different websites,
can figure out from those so-called
headers what was requested
and then return the appropriate
domain's
home page.
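
The following Python sketch illustrates that dispatch step under simplifying assumptions: it only reads the `Host` header out of a raw request string and picks a page from a dictionary. The `SITES` table, its page contents, and the helper names are hypothetical, ports in the header are not handled, and a real web server such as Apache does considerably more.

```python
# Hypothetical sites sharing one server and one IP address.
SITES = {
    "foo.com": "<h1>Welcome to foo</h1>",
    "bar.com": "<h1>Welcome to bar</h1>",
    "baz.com": "<h1>Welcome to baz</h1>",
}


def parse_host(raw_request):
    """Extract the Host header's value from a raw HTTP request, if present."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:      # skip the request line (e.g. GET / HTTP/1.1)
        if line == "":
            break               # a blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


def home_page_for(raw_request):
    """Choose which site's home page to return based on the Host header."""
    host = parse_host(raw_request)
    return SITES.get(host, "<h1>Unknown site</h1>")


request = "GET / HTTP/1.1\r\nHost: bar.com\r\n\r\n"
print(home_page_for(request))
```

All three sites would arrive at the same IP address; the only thing distinguishing them here is that one header line.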
So in this case,
that's great 'cause it makes this
possible. We only have four billion IP
addresses in the world, and we are
legitimately running out, and so it's
great that we can multiplex servers in
this way and put multiple people,
multiple websites, on the same IP
address.
But there are a couple of gotchas. What's
the implication of, yes,
the fact that there are multiple customers on the
same machine?
Good, so if the machine crashes, now all of
you are affected rather than just the
one.
Contention for resources, right. So you're
kind of out of luck, in a bad place, if, for
instance, one of the other customers on the
web server is facebook.com or
something that takes off with unexpected
popularity all of a sudden.
Or maybe it's a website that's really
ticked someone off
and is getting some kind of
internet-based attack, like a denial-of-
service attack,
because people are going after that website,
and just because your
website is on the same server,
now you are down or otherwise offline as
well.
Moreover, one of the
ways in which these,
uh, companies offer such discounted
prices is 'cause it's not just you and
two other websites. It's probably not ten.
It could be a hundred, could be a
thousand other customers on the same
server.
And so there must be some fine print,
hopefully there's some fine print
somewhere,
that does say this is subject to
something, right? They don't have infinite
terabytes on their web servers or infinite
bandwidth. There's gotta be some catch
here.
Otherwise the world wouldn't be paying thousands
of dollars a month to host real
large-scale websites.
So you again sort of get what you pay
for.
And this is actual experience. Years
ago I signed up for some fly-by-night
operation for like $2.95 a
year,
um, to host my website, and it was a
website that I did not care much for, and
that was good, because it went down quite
a bit. Um, so what they're not
guaranteeing here is unlimited uptime,
for instance. So there are
some gotchas.
But frankly, if you're just starting
small, you just want to experiment, you need
a
place for a test website, or,
you know, $8.95 is
more compelling than several hundreds of
dollars or even more,
um, this is certainly quite
compelling.
But as an aside, for things like email and
calendar and whatnot, there are other
alternatives. You don't need to get those
through your web host when places like
Google exist.
But suppose you are
not so comfortable with that approach,
and suppose too
that you're not comfortable also with
the fact that
you do not have any control over a
DreamHost-like server,
because it's being shared by other
people and because it's being managed
by other people.
Which is to say, if they are running
PHP 5.2, which is a few
years old,
you're running PHP 5.2.
Even if you want to take
advantage of new language features that
were introduced in PHP 5.3
and more recently PHP 5.4,
you're out of luck. You're gonna have
to either find a new web host or just deal with
it. You can't just install it yourself,
typically.
So similarly, not only can you not upgrade to
different versions of software, you
can't necessarily reconfigure the web server
at will.
Now, they might give you some form of
control, but you'll reach a point, perhaps,
where it's just too frustrating not to have
administrative access to the server.
So you can still achieve that. So virtual
private servers, VPSes, are an
alternative to the shared web hosting model.
In the VPS world,
you get a dedicated server to yourself.
Well, sort of:
you get the illusion of a dedicated
server to yourself. So thanks to a
technology generically known as
virtualization,
these days you can buy a server
with like a bunch of CPUs or a bunch of
cores, lots of RAM, lots of disk space,
and then you can run virtualization
software on it, uh, something known
generally as a hypervisor,
like VMware or Parallels or VirtualBox.
There's a whole bunch of these products,
free and commercial alike, out there.
Once you install
and run them on the server,
on top of that software you can then
install multiple instances of Windows,
multiple instances of Linux, multiple
instances, if they allowed it, of Mac OS.
So you can create the illusion of
multiple distinct computers,
each of which has its own usernames and
passwords, its own administrative or
so-called root account.
And even though they're sharing the
physical hardware,
they are not sharing the same software.
So what you would get as the customer is
the root login or the administrator
login to your machine.
Now, there's still the risk of resource
contention,
because these places too can
over-provision, especially if you're
spending
$19.95 a month and not
$159.95
a month.
You're probably gonna be on a server
with fewer resources or with more
customers.
But at least here you gain something,
and if you've been following along,
what is it, fundamentally, you're gaining
from a VPS that you didn't get from a
web host?
Exactly:
control. You can keep things up to
date, you can install whatever you want.
And also,
if someone else's virtual server is
compromised,
yours
might not be,
whereas if the web hosting server is
compromised, everything on that server
is potentially vulnerable.
So still not perfect, because the reality
is, too,
even though you are the only one now
with root or administrative access,
because it's a dedicated, albeit virtual,
server for you,
that's actually kind of a white lie. Who
else has access?
The people there.
Even if they don't know your password,
they have physical access to the machines,
and if you have physical access to any machine in
the world, pretty much,
you can compromise it. A
Linux computer, for instance, can be
booted in what's called single-user mode
by pretty much anyone when it's
booting up,
and that sort of bypasses any request for a
password, at which point you can even
change it.
Um, even on PCs, you can
usually reset certain passwords by
opening the case up,
putting a little metal connector onto two
pins, and that short-circuits the
password and clears it out.
So in short, physical access is bad for
security. So you're not gaining more
security, uh, fundamentally;
you're just making it less likely that
someone else's compromise will affect
you.
And with some of these systems, some
software,
the system administrators will have
the password, or at least access to the
root account, on your server.
So in short, you should just assume that
this is for you,
but probably at least one other person
could physically access
your content.
So what do you get, uh, for the money?
Well, here, and frankly these numbers are
a little more compelling 'cause it's not
unlimited, so you're kind of
inclined to believe a bit more about the
quality of service you're getting here.
But twenty gigabytes of storage,
that's fine for a typical website, unless
your website
has a ridiculous amount of traffic and
database traffic and logs, which could
build up and start taking megs or
gigabytes of space.
Uh, or if you allow users to upload
files or photos, then you might need a
lot of space.
But for many websites, even if they are
dynamic, this is probably plenty.
transfers an interesting one twenty two
hundred gigabytes per month
for most websites that's probably fine
and less your website is a photo website
or worse a video website
then you have to start to do the math
and figure out exactly how much traffic
data will be coming in and out of your
server
based on users' patterns. And moreover
there's also corner cases, which we'll
discuss toward the end of the semester: you
gotta worry about the bad guys out there.
If someone just doesn't like you, or is bored,
or downloads some free piece of software
that bangs the heck out of your
computer,
they could just use up your monthly
allotment of bandwidth
just by sending bogus traffic or
downloading the same video again and
again and again.
So there's very interesting adversarial
attacks
when you have
finances somehow tied to usage, so you
need to be aware of that, especially with
cloud computing.
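To make "do the math" a bit more concrete, here is a rough sketch in Python. Every number in it (visits per day, pages per visit, page sizes) is a made-up assumption for illustration, not a figure from any hosting plan.

```python
# Rough monthly transfer estimate. All inputs below are hypothetical.

def monthly_transfer_gb(visits_per_day, pages_per_visit, kb_per_page, days=30):
    """Return the estimated outbound traffic in gigabytes per month."""
    total_kb = visits_per_day * pages_per_visit * kb_per_page * days
    return total_kb / (1024 * 1024)  # kilobytes -> gigabytes

# A mostly-text site: 5,000 visits a day, 4 pages each, 500 KB per page.
text_site = monthly_transfer_gb(5000, 4, 500)

# A video site: the same visitors, but one 50 MB video per visit.
video_site = monthly_transfer_gb(5000, 1, 50 * 1024)

print(f"text site:  {text_site:.1f} GB/month")
print(f"video site: {video_site:.1f} GB/month")
```

Under these assumptions the text site stays well inside a monthly allotment in the low thousands of gigabytes, while the video site blows past it, and an attacker who re-downloads the same video in a loop inflates the second number further.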
And let's see, you get some amount
of RAM, five hundred twelve megabytes
here,
and so forth. One of the things we'll look
at during the semester, as we start
playing with Apache,
is I'll give you a sense of how you
can assess how much RAM your computer's
using and how much disk space it's using.
I dare say one of the most common
platforms for web hosting these days,
whether it's a VPS
or it's a
shared web host,
is Linux in some form, whether it's Debian
or Fedora or Ubuntu or Red
Hat or CentOS or any number of versions
of Linux.
We'll happen to use Fedora for the class,
but it's representative of many
similar
operating systems.
You can use Mac OS, but it's not really
used commercially to host websites, just
'cause that's not really what OS X is
geared toward.
You can use Windows, but really
there's no good reason,
there's no
technically compelling reason, to
use a Windows machine to host things
like PHP or Python or Ruby,
'cause you're paying money for a Windows
license to run free software. So it's not
necessarily compelling unless you already
have the licenses and have the machines.
Generally, going with
these open-source tools is quite
compelling because none of the
software we use in the course
costs any money whatsoever, and
it's nonetheless quite popular and
robust.
So,
the
appliance. We won't introduce it tonight,
but we will in the form of the first
project, so that you have an experience
in the class that is as realistic as
possible. What we'll actually have each of
you do
is run your own web server and run your
own database server and actually run
your own copy of Linux itself.
For this we'll use another tool that I
use in another course of mine. This is
called the CS50 Appliance, and this will be a
downloadable file
inside of which is an
installation of Linux, Fedora Linux
specifically,
but also installed for you in advance
will be Apache, which is web server
software, MySQL, which is database
software,
PHP, as well as support for a bunch
of other languages and standard tools and
the like.
And the upside of this is that rather
than have you connect, for instance, to
some random Harvard servers on which
you only have
temporary access,
this is a virtual machine that you'll
have on your computer for as long as you
want to keep it around, and it's very
representative of the configuration
you would find at a VPS
or at a commercial web host. And because
you'll have root access on it,
and because it will be live on your
machine only, you'll have perfectly
secure access to it,
unless your laptop or desktop is
compromised,
and you'll be able to configure Apache
and PHP and really tinker with things.
And best yet,
if you screw it up,
that's fine: you just download a new one
and you're
back to sort of the beginning, so long as
you've saved your code somewhere, and
we'll
encourage you as to how and where to do
so. More on that in a week or so.
And how would you connect to this kind
of thing?
So, SSH.
Has anyone used
SSH?
So SSH, specifically, stands for
secure shell,
and this is a sort of old-school way,
but now much more secure,
of connecting to a remote server and
executing commands on it. So this is just
a free program that comes with Mac
OS, Terminal,
and there's analogues for the Windows
world: PuTTY is a free program for
Windows that a lot of people like to use.
And it allows you to open up essentially
a black-and-white window or
white-and-black window
and SSH:
type in a username and password,
connect to some remote server, and
execute commands on it. And those
commands can be to create files, remove
files, configure the web server, turn the
database on or off, or the like.
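As a minimal sketch of "connect to some remote server and execute commands on it," here is how that could be scripted in Python by handing the work to the standard `ssh` command-line client. The username `student` and the address `192.168.56.50` are hypothetical placeholders, not real course settings.

```python
import shlex
import subprocess

def ssh_command(user, host, remote_command, port=22):
    """Build the argument list for running one command over SSH."""
    return ["ssh", "-p", str(port), f"{user}@{host}", remote_command]

def run_remote(user, host, remote_command, port=22):
    """Run the command on the remote server and return what it printed."""
    argv = ssh_command(user, host, remote_command, port)
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical account and address; nothing is executed here, we only
# show the command line that would be run.
argv = ssh_command("student", "192.168.56.50", "ls -l")
print(shlex.join(argv))
```

Calling `run_remote(...)` with a server you actually have an account on would prompt for a password (or use your key) exactly as typing the same command in Terminal would.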
So what you'll find once we start using
the CS50 Appliance, a virtual
machine, albeit running on your own
computer, not some server,
is you will be able to connect to that
appliance
as though it's a remote server, such
that you'd never even need to see the
appliance itself. Literally once you turn
it on, you can minimize it and pretend
that it's a server somewhere else on the
internet,
because once you install the
appliance
into a virtual machine,
it's gonna have its own IP address.
But it's going to be what type of IP
address?
So it'll be private to your own laptop,
so no one else can access it, but
you'll be able to go through precisely
the same motions that you would if you
were actually paying some third party to
host your website, or you owned some server
elsewhere on the internet. So SSH
will be one of the techniques that
we use.
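On the question of what "private" means for an IP address, a small sketch: Python's standard `ipaddress` module can tell you whether an address falls in one of the reserved private ranges. The sample addresses below are arbitrary examples, not the appliance's actual address.

```python
import ipaddress

def is_private(address):
    """Return True if the address is in a private (non-routable) range."""
    return ipaddress.ip_address(address).is_private

# Arbitrary sample addresses for illustration.
for candidate in ["192.168.1.10", "10.0.0.5", "172.16.4.2", "8.8.8.8"]:
    kind = "private" if is_private(candidate) else "public"
    print(f"{candidate:>14} is {kind}")
```

An address in the 10.x.x.x, 172.16.x.x through 172.31.x.x, or 192.168.x.x ranges is only reachable from inside its own network, which is why a virtual machine with such an address is visible to your laptop but not to the internet at large.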
SFTP, for those unfamiliar: this is
a screenshot of a popular Windows
client for transferring files called
SecureFX, but others exist, free
ones in particular.
It just lets you drag and drop files from
your computer to a server,
but in this case the server
is going to be a virtual machine running
on your own computer that
maybe or maybe not is minimized. But again,
the experience is precisely the same.
So where does that leave us? So it
turns out when you are writing
HTML,
you have
fairly static content, but you do have
these mechanisms, and I'm guessing most
people in the room have some or a lot
of experience with
HTML and basic websites and the
like.
But these ultimately are the
basic input mechanisms by which we can
start
making dynamic websites. In other words,
we have text fields, password fields,
hidden fields, checkboxes, radio buttons,
drop-down menus.
So these are the mechanisms by which we
can start to get input from users,
so that when they interact with our
website they don't necessarily see the
same thing;
rather they might see different things
every time they visit that website. So
let's do a little
example here.
I'm gonna go ahead and,
don't download this on your own just yet
because we will be posting a newer
version soon, but I'm gonna go ahead and
open up
a program called Fusion,
VMware Fusion. This is what's called,
generally, again, a hypervisor,
virtualization software,
and what I've just done is essentially
run Linux on my own Mac,
and you too can do this. I'm gonna
actually use the Linux desktop that's
here, but I could similarly minimize
it, as I will in just a few minutes, and
we'll connect to it as well. So now I'm
running Linux on my Mac here, and notice I
currently have no IP address, but that
should change in a few seconds once
it... there we go.
So my Linux computer, the virtual machine,
has just asked the network, give me an
IP address, and it came.
The protocol
that computers use to get IP addresses
dynamically, does anyone know?
DHCP, or Dynamic Host Configuration
Protocol, that's what did that.
It's also how your own personal computer
works at home and gets an IP address
from your Linksys router or AirPort
Extreme. So I'm gonna go ahead and do
this. First let me go back to
my Mac.
I'm gonna open up the simplest of
programs, TextEdit, which is what we
used earlier,
and I'm gonna just make a very simple web
page. First thing I do: DOCTYPE
html. So in the course we'll use HTML5,
which is sort of the latest and
greatest
version of HTML.
And so I'm gonna say hello,
now let's do this,
let's call this
and then body
and close body,
and then I'm gonna say Google.
OK, gonna save this. I'm gonna go ahead
and save it on my desktop as google.html.
I'm gonna say yes, use .html instead of
the text file extension.
And now I'm going to go ahead and pull
up
google.html.
So it's not exactly Google just yet; I can
do a little better. So let's make this a
little different. So let me go in here:
div style text-align center.
So if any of this looks completely
cryptic or new, these are the kinds of
things that we will take for granted in the
course, so hopefully the stuff looks familiar.
So let me save, reload. OK, now it looks a
little more like Google, but it's certainly
lacking in some key features, among them
a search bar, right? So let's go there.
So let me start to make a simple web
page that, again, I'm sure many of you
could have done already 'cause you know
HTML and forms and whatnot. So
let's go ahead and do this. Down here
I'm gonna say form,
close the form, and then go in here,
and I need an input,
type equals text,
and I probably need a submit button, so
let me do
input type equals submit.
Let's do it as best we can. Save,
reload. OK, so now it's getting there.
This isn't the prettiest thing, so let
me go ahead and do style, width, let's say
two hundred pixels.
Let's go back here and see. It still
looks a little small. Three hundred
pixels.
Looks more Google-like.
And if we really want to be anal here
we could do this:
line break.
So now we've got roughly Google, but in
black and white.
So unfortunately it takes way more work to
implement the back end of this website,
right? The front end's pretty easy; we're
pretty much done, other than some colors
and some other features these days.
But what about the back end? So if I
actually wanted to implement Google,
let's see if we can now revisit that
conversation we started earlier, when I
typed www.google.com
and hit Enter.
What really is happening? Let's take a
look underneath the hood. Let's look at
the HTTP traffic and think about
what it is we're gonna start building
next week in terms of the actual back
end. So let's
suspend this mental thread,
pull up the actual google.com,
and take a look at what's here.
I'm gonna go ahead first, so that this is
as enlightening as I think it is:
we have to disable this annoying
Instant search feature.
Select this
icon here,
Search settings,
Google Instant, disabling:
never show Instant results. So the
reason I wanna do this is we're not ready
to talk about JavaScript and Ajax, the
technologies that underlie this
annoying or beneficial Instant search
feature. We want to do sort of old-school
HTTP
searches right now.
So I've disabled that, so now hopefully I
can save.
OK, now I'm gonna go back to
google.com, and this is what it was like
five years ago when you wanted to search
for something on the internet.
So now I'm going to go ahead and type,
for instance, harvard.
So it's still doing autocomplete, but
it's not immediately showing me the
search results.
So now notice, before, here's the URL:
www.google.com.
And now after I hit Enter,
now notice the URL.
So this is now hinting
at the fundamental functionality of
HTTP. We have just issued one of those
GET requests. We had two of them, in fact:
the first one came up when I visited the
home page; then I hit Enter and it appears
that another GET request has been sent.
Why? Because the URL changed. So
generally, anytime the
keyword GET is involved,
it's because the URL's changed,
or equivalently, if the URL changes,
you just did a GET,
most likely.
So there's a whole lot of distracting
stuff up there, but what is relevant,
what looks familiar up there
in the URL?
I have no idea what hl is,
site, I don't know, source, I don't know,
but what looks familiar?
OK, harvard.
So let me delete manually all the stuff
that I have no idea what it means,
at least not yet. So, I don't know
what this is.
q equals harvard, that I did.
OK, oq equals harvard, I don't know.
I'm just gonna
presumptuously whittle it down to that
and hit Enter.
So interestingly it still works,
and what's nice is that there's much
less distraction. We can have the same
story but with fewer distracting
details. So it looks like
when hitting Enter on the previous page,
if I throw away the distractions,
I have now visited not slash but
slash search,
question mark,
q equals harvard. So what is q? This is
generally known as an HTTP
parameter, so it is an input to a web
server
that generally comes from a form, but
as we'll see in a few weeks it can also
come from JavaScript; it doesn't have to
come from a form per se.
harvard is obviously what I typed in. So
what is slash search?
Well, it's not obvious here what
programming language Google uses. If we
were on Facebook we would probably see
search.php; Facebook is known
for using PHP. They're also known for
not hiding their file extensions, which is
very easy to do, but they just don't for
some reason.
Google does hide their file extensions, but
a lot of Google's code, at least the front
end, is written, apparently, in Python
or in some other languages. So
it's not clear what language is on the
server, but slash search
is referring to some file or some
folder on the server.
What is the question mark denoting?
The start of the parameters. So anytime
you have a question mark in your URL,
that demarcates the path
and the preceding part of the URL
from all of the parameters, and
parameters are key-value pairs, something
equals something. And if you have multiple
parameters, what separates them, even
though I already deleted the others?
The ampersand symbol. So if I
hadn't deleted all of that,
recall that we saw something like this
just a moment ago:
and oq equals harvard, and I don't know
what oq is, but
that's how you would separate parameters,
with ampersands.
So this means we have submitted
a key of q and a value of harvard to
the server.
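To see that split of path, question mark, and ampersand-separated key-value pairs in code, here is a small sketch using Python's standard `urllib.parse` module on a URL shaped like the one in the address bar.

```python
from urllib.parse import urlsplit, parse_qs

# A URL like the one in the address bar, with two parameters.
url = "http://www.google.com/search?q=harvard&oq=harvard"

parts = urlsplit(url)
params = parse_qs(parts.query)  # each key maps to a list of values

print("host:  ", parts.netloc)
print("path:  ", parts.path)
print("query: ", parts.query)
print("params:", params)
```

Everything before the `?` identifies the host and path; everything after it is the query string, which `parse_qs` breaks apart at each `&` and `=`.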
So now let's use a fairly common tool
built into Chrome; it's also built into
Safari.
Firefox has something similar, and we
recommend something called Firebug on
the course's website
with which to do the same kind of thing.
But I'm gonna go to View,
Developer,
and Developer Tools. And I will say these
days, certainly when using LAMP, Linux,
Apache, MySQL, and PHP, which is this
course's focus,
many people are increasingly using
Chrome: one, because it's popular; two,
because it's fast; and because it comes
with some developer tools.
I would say Firefox is also wonderfully
convenient for doing development.
You must certainly test on multiple
browsers as well,
as we'll require in the
first project's spec.
On Windows, or rather, Internet
Explorer's getting better about having
some integrated development tools. From
the course's perspective,
we don't care what browser you use,
because you'll be using, again, the
appliance as a server.
You can use whatever browser, whatever
operating system, on your own computer
that you're most comfortable with. But
if you are coming to this with,
I'd say,
less familiarity with various
tools,
Chrome is pretty popular, and Firefox tends
to be perhaps better for development
purposes, and you need to test on all of
them.
So what am I seeing here? I just opened the
developer tabs,
and now I have Elements, Resources,
Network, Scripts, Timeline, Profiles, Audits,
and Console. We're not gonna use all of
these, but a few of them are quite
helpful. One,
the Elements tab shows you the page's
HTML,
but it pretty-prints it for you and it
makes it hierarchical, so that with those
little triangles
you can dive in deeper and see.
Whereas if we look at View Source for
the page,
over here,
View Page Source,
this is what came back from the server,
and I would argue this is not very
readable.
Even the HTML's not that readable;
color-coded, maybe, but still not useful.
So what is this tool?
The developer tool actually parses
it for you so you can start to navigate,
and this is actually wonderfully compelling
whether it's your site or someone else's.
If it's someone else's,
it's a wonderful way of learning how
they did something or how they stylized
something. If it's your own site, it's a
wonderful way of chasing down bugs
and also, as you'll see, changing on the
fly some of the aesthetics
without having to change actual files
and then reload
or re-upload. So we don't care so much
about Elements right now.
We do care about Network. So let me go to
the Network tab, and what this tab will
do first is
sniff
all of the network traffic between my
browser and the server,
and it will show each HTTP request,
one per line, at the bottom.
So I'm gonna leave this window open and
click reload,
and again, this is my URL.
That's a lot. I only hit reload once,
so why in the world did so many rows
appear down here
by clicking once?
But look how much stuff just happened.
Each of those again represents an HTTP
request, or virtual envelope, from
browser to server and back. Yet
what does that mean behind the scenes?
what does that mean behind the scenes
okay good civil winnie support other
things to me and concrete example of
something as the polar
so inside of the
htmldiff initially downloaded there
could be an image tag outsource tackling
two tak two si es esta javascript images
could be flash files
could be a whole bunch of other assets
so to speak
and to get those the browser is
predefined
to sort of a personal ego get those
assets so if it sees a source tagore
image tag
it will send another virtual envelope
requesting that file specifically it
might do it over the same network
connection the same t_c_ p socket so to
speak
but each of these rose represents a
different file that was downloaded
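As a rough sketch of how a browser decides what else to fetch, here is a tiny scanner built on Python's standard `html.parser` that lists the files a page refers to. The sample page and its file names (`style.css`, `app.js`, `logo.png`) are invented for illustration, and a real browser looks at many more tags than these three.

```python
from html.parser import HTMLParser

class AssetFinder(HTMLParser):
    """Collect the URLs that a few asset-bearing tags point at."""

    # tag name -> attribute that names the file to fetch
    ASSET_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ASSET_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.assets.append(value)

# A made-up page with one stylesheet, one script, and one image.
page = """<!DOCTYPE html>
<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body><img src="logo.png"></body></html>"""

finder = AssetFinder()
finder.feed(page)
print(finder.assets)
```

Each entry in the resulting list would become its own HTTP request, which is why one reload produces many rows in the Network tab.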
Ironically it seems that Harvard's not
behaving well in terms of the auto
previews, but that's for another day; we
can look at why.
But let's look at the first one
because that's the one that will be the
most enlightening for now.
And when I click on this,
there's a few details. So one,
the preview is just what was returned,
and here's another big mess of results
from the web server, but we don't care so
much about that.
I care about the headers.
So let me zoom in on this,
and rather than look at this fairly
pretty-printed version of it, I want to
look at the raw source, so we're diving in
deep, sort of intellectually, here.
So let me look at the source.
Now this
is what was literally sent
in that virtual envelope
that we started tonight's discussion
with.
So there's the top line: GET slash
search question mark q equals
harvard, space, HTTP slash version
number. So that was in the envelope,
and we did promise some other stuff
in there.
The second line
is a reminder to the server
as to what the user typed in, so what is
the hostname.
Now frankly Google is not sharing their
servers with other companies, most likely,
so this doesn't really matter there. But
for shared web hosting companies, the
fact that I'm being reminded what the
URL was means I can serve up foo.com
or bar.com or baz.com. So
thankfully
HTTP does that.
There's some arcane information here
related to caching and connections
and efficiency;
I'll wave my hand at that
for now. User-Agent is interesting.
You might know this already, but if you
don't: every web page you've ever visited,
every website you've ever visited,
knows what computer you have and what
operating system you're running and what
browser you were using.
Why is that? Well, browsers by default
reveal precisely that information. I
have just told Google
behind the scenes that I have a Mac
running Mac OS X 10.7.4,
and if I scrolled
further they would be able to infer
that I was using Chrome version
something or other.
So why in the world is that useful?
Good, so I do believe this is useful
for debugging purposes, useful for
demographic purposes, to know who your
users are.
Good.
So there's some features that could be
dictated by what type of OS or browser
someone's using. For instance, if you go
to a website that lets you download
software,
it's not necessary that you detect what
the user's operating system or browser are,
but it's kind of a nicer user experience if
the server only shows you the Mac software
because you clearly have a Mac, as
opposed to me having to figure out which
of the links to click for Linux or
Windows or Mac OS.
Another argument, frankly, is that
this is completely unnecessary and we
should never have gotten into this habit in
the first place, because it's
not necessarily used all that much.
And indeed,
writing websites that require knowing
what the user's browser is is actually
generally bad practice, because there
will be certain privacy tools
that users can install on their computer
which just hide this information
altogether,
for better or for worse, and if you're
relying on certain headers to be sent,
your own website could misbehave.
So that works, but it turns out there's
other tricks for doing detection,
and typically, as we'll see in JavaScript,
it's better generally to detect
whether a browser has a certain feature
rather than whether it's a specific
operating system or specific version of a
browser. However, databases, for better or
for worse, exist
that allow you to figure out, based on
the so-called user agent strings,
what version
of browser and operating system someone
has, because frankly this is a little
hard to read. So software exists that
simplifies this
so you can just check a boolean variable,
is Mac or is PC,
or whatnot.
So, more arcane details that I'll wave my
hand at.
Cookies we'll come back to in a week or
two, when we have to start using them to
our advantage, but we'll also talk about
the security implications of them.
But in a nutshell,
all of these headers, just text, is what
was inside of
that virtual envelope, and the most
important one arguably was the very
first one, 'cause that tells Google what
to return.
But now we see that it's not just slash;
it's a full path.
So Google has hopefully parsed that
string,
so to speak, slash search question mark
q equals harvard,
and then
used q equals harvard as input
to its database or whatnot
to return customized results to me.
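Here is a minimal sketch of what "parsing that string" could look like on the server side: splitting the text of the virtual envelope into its first line and its headers. The User-Agent value is a made-up placeholder, and real servers handle many edge cases this sketch ignores.

```python
def parse_request(raw):
    """Split raw HTTP request text into (method, target, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]      # everything before the blank line
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

# The envelope's contents; the User-Agent value is a placeholder.
raw = (
    "GET /search?q=harvard HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n"
)

method, target, version, headers = parse_request(raw)
print(method, target, version)
print(headers)
```

The first line carries the method and the full path with its parameters; every later line is a `Name: value` header, and a blank line marks the end.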
Now if I scroll down, let's see how
Google replied.
So this is just a Chrome thing;
it is just kind of a dumbed-down display of
the query string parameters, just as useful,
especially for a developer: you can see it
more easily this way.
But let me go ahead and view source now
for the response headers.
This is what the server responded with.
So it turns out
many of you have seen numbers returned by
servers. Who has ever seen message 404
come back?
What does 404 mean?
File not found, right. So it's an HTTP
status code; it's an arbitrary number
the world decided on years ago that
means file not found. What are some
others you might have seen?
Yes?
500, so internal server error of
some sort.
503 is another server
error.
Or resource forbidden;
that's 403, rather, which is
forbidden.
301 and 302 are
redirects, which are actually quite
useful; we'll start using those in the
next lecture or two. So there are some
codes that you've probably seen; 404
may be the most popular.
200 you might not have ever seen,
because it's the best one of all:
200 is literally OK, and it means
everything worked out well. You just
don't see it
because it indicates success, as does that
little green icon that we
saw a moment ago before I expanded
this. So this is the server's response: 200
means found what you're looking
for, here it is.
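If you want to look these numbers up rather than memorize them, Python's standard library ships the official names. A quick sketch that prints the codes that came up in discussion:

```python
from http import HTTPStatus

# The status codes mentioned in the discussion above.
for code in (200, 301, 302, 403, 404, 500, 503):
    status = HTTPStatus(code)
    print(code, status.phrase)
```

The first digit gives the family: 2xx means success, 3xx means redirection, 4xx means the client asked for something it shouldn't or that isn't there, and 5xx means the server itself had a problem.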
Now what else comes down?
We have the date from the server, which
might be useful.
Expires and Cache-Control, so
directives to the browser saying do or
don't cache this, even though these are
not necessarily reliable. We'll talk about
this when we get to PHP.
This is interesting: Set-Cookie.
Set-Cookie is amazingly powerful, if
a little unsettling, especially with regard
to advertising and tracking, but we'll talk
about that in the context of PHP.
Notice that the server's telling us that
it supports gzip,
which is a compression utility, and this
just means, hey, you can expect compressed
data from me.
The name of the server, gws, is probably
Google Web Server,
and then some headers that they use for
some security things we'll talk about
later in the semester. So that's what
Google has returned
in addition to
the content that has come back from the
server.
So let's see this outside the scope of a
browser. I'm gonna go open
a program called Terminal,
which just comes with Mac OS. For
those of you with Windows, PuTTY is
another option; we'll look at that,
or recommend that, for future projects. And
I'm gonna run a program called telnet.
Telnet is like SSH but unencrypted; now
that's a bit of an oversimplification.
I'm gonna go ahead and telnet to
google.com,
and nothing actually happens, but it did
figure out what the IP address was, so
that's interesting.
But telnet by default uses port
twenty...
it's been so long...
23.
It uses TCP port 23 by default,
and there's no telnet
server there. Telnet used to be used to
send messages and connect to email
servers and the like.
But what if I instead say 80?
So there's no colon in this program;
there is in a browser.
But this is going to connect from my
laptop to google.com on TCP port
80.
So this is interesting.
Now I've connected to their server.
Why, or how do I know that?
It's saying "Connected to
www.l.google.com."
Where did this all come from?
They do DNS trickery, typically for
load-balancing purposes. They have
multiple servers, so I've probably gotten
one of them specifically.
But I'm gonna pretend to be a browser,
and let's say GET slash,
using HTTP version 1.1,
and then Enter.
And then if that's the last of my
headers, I have to hit Enter twice,
and voilà.
What do I see?
Well, the font's kind of big, and the HTML
and JavaScript's kind of minified, but
that's exactly what my browser got back.
But if I keep going up and up and up,
notice I can see
exactly what the server's response was.
So I see my HTTP headers that came
back from the server,
Set-Cookie and all those same
lines, exactly what I saw in the browser.
So I just pretended, really,
to be a browser, and we can do this with
any website.
And it's more than just a curiosity: it can
actually help with debugging,
actually seeing what's coming back from
a server.
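The same "pretend to be a browser" trick can be sketched in Python with a plain TCP socket instead of telnet. Two things here are additions of this sketch rather than what was typed in the demo: the request always includes a Host header, and it adds `Connection: close` so the server hangs up when it's done.

```python
import socket

def build_request(host, path="/"):
    """Return the bytes a very simple browser would send for a GET."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",   # ask the server to close when finished
        "",                    # blank line: end of headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch_raw(host, path="/", port=80, timeout=5.0):
    """Connect on TCP port 80, send the request, return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

# Usage (needs a network connection, so it is left commented out):
# reply = fetch_raw("www.google.com")
# print(reply.split(b"\r\n\r\n", 1)[0].decode("latin-1"))

print(build_request("www.google.com").decode("ascii"))
```

The commented-out usage would print just the status line and response headers, the same lines you'd scroll up to find in the telnet session.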
I can do telnet www.harvard.edu
80,
GET slash
HTTP 1.1,
Enter, Enter.
So, interesting: Bad Request.
Now why is that? Well, we see some HTML,
since the web server assumes that a
browser will typically be doing this.
Why might this be a bad request?
I'm actually gonna guess here. Let's try
this: GET slash HTTP 1.1,
Host:
www.harvard.edu.
So it didn't like the fact that I did
not send
the Host header,
which means Harvard's web server's
probably using something called virtual
hosting, which is that feature I alluded
to earlier, whereby one
web server can
support multiple websites.
But for that to work, browsers have to
cooperate, and the fact that I did not
send that header meant that the server
didn't know whose home page to return,
so it gave me that
400 response of I don't know
what to do.
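A toy sketch of virtual hosting, assuming a made-up server that hosts two invented sites, `foo.example` and `bar.example`. It only illustrates the idea that the Host header picks the site; how a real web server treats an unrecognized host varies by configuration.

```python
# Hypothetical sites sharing one server (and one IP address).
SITES = {
    "foo.example": "<h1>Welcome to Foo</h1>",
    "bar.example": "<h1>Welcome to Bar</h1>",
}

def respond(headers):
    """Pick which site's home page to return based on the Host header."""
    host = headers.get("host")
    if host is None:
        # No Host header: the server can't tell whose page is wanted.
        return 400, "Bad Request"
    body = SITES.get(host.lower())
    if body is None:
        return 404, "Not Found"   # an assumption of this sketch
    return 200, body

print(respond({}))
print(respond({"host": "foo.example"}))
print(respond({"host": "bar.example"}))
```

Two requests arriving at the same address get different pages purely because of the one header the browser chose to send.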
Now let's try one other thing. I'm gonna
cancel this
and let me do
telnet to, not www,
harvard.edu. Let's try this one to see
what happens.
So GET slash HTTP 1.1,
Enter, Enter.
They didn't like that, so let's fix this
again.
So GET slash HTTP 1.1,
Host: harvard.edu, Enter, Enter.
Interesting.
This is not the home page that I get
this time;
some message about
it's moved:
harvard.edu has moved permanently.
You know that, and if I scroll up,
more esoterically, in the headers is a
different status code, 301, which
we mentioned earlier. 301 means
permanent redirect.
If a browser receives a 301,
it should never ask that question
again;
it should just remember harvard.edu
moved.
And it moved to where?
To the value of the Location field, which
should also be included in the response
headers.
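Here is a small sketch of what a browser does with that response: check the status code and, if it's a redirect, pull out the Location header. The sample response text is a simplified stand-in for what the server sent, not a verbatim copy.

```python
def redirect_target(raw_response):
    """Return the Location of a 301/302 response, or None otherwise."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status = int(lines[0].split(" ", 2)[1])   # "HTTP/1.1 301 Moved..." -> 301
    if status not in (301, 302):
        return None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            return value.strip()
    return None

# A simplified stand-in for the redirect response.
raw = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://www.harvard.edu/\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

print(redirect_target(raw))
```

A browser would then issue a brand-new GET to whatever address comes back, and for a 301 it may remember the move so it can skip the first request next time.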
How did that happen?
Well, some system administrator at
Harvard just decided arbitrarily,
but reasonably,
that we don't want to standardize on
harvard.edu in our URLs in
people's browsers; we want them
automatically to be redirected to
www.harvard.edu.
Why?
One, branding; it's
perfectly reasonable.
Two, more technologically, it can be better
for,
security's a bit of an overstatement, but
for technical reasons: having the www
means your cookies can be isolated to
www.harvard.edu,
whereas if your cookies were instead
sent to harvard.edu, that
means your cookies could be read really
by any website under it, so including
cs.harvard.edu or summer.harvard.edu.
So by saying
www,
you're also forcing, at least by
default, cookies to be more precisely
defined. So there's some technical reasons
as well.
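The cookie-scoping point can be sketched as a tiny matching function. This is a simplified version of the rule browsers apply, written for illustration: a cookie tied to one exact host goes only to that host, while a cookie set for a whole domain also goes to every name underneath it.

```python
def cookie_sent_to(cookie_domain, host_only, request_host):
    """Simplified check: would the browser send this cookie to this host?"""
    cookie_domain = cookie_domain.lower().lstrip(".")
    request_host = request_host.lower()
    if host_only:
        # Cookie is tied to one exact hostname.
        return request_host == cookie_domain
    # Cookie is scoped to a domain and everything beneath it.
    return (request_host == cookie_domain
            or request_host.endswith("." + cookie_domain))

# Cookie isolated to www.harvard.edu: other hosts never see it.
print(cookie_sent_to("www.harvard.edu", True, "www.harvard.edu"))
print(cookie_sent_to("www.harvard.edu", True, "cs.harvard.edu"))

# Cookie scoped to all of harvard.edu: every subdomain receives it.
print(cookie_sent_to("harvard.edu", False, "cs.harvard.edu"))
```

That second case is the concern raised above: once a cookie belongs to the bare domain, any site living under that domain gets a copy of it.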
Only a year or two ago was this problem
fixed. A few years back
someone new came to Harvard to run
the news office, and one of her first
things,
one of her first acts, was to fix
this ridiculous omission: for years
harvard.edu did not exist.
www.harvard.edu
existed, and they weren't even
redirecting. So that is a bug that's
now been
fixed.
Any questions, then,
on what just happened?
And while I have the terminal window open,
let me offer up some other troubleshooting
tips.
nslookup, name server lookup, is a
wonderful way
of doing those DNS lookups we talked
about before. What I've just done is
asked the nearest DNS server, which
happens to be this
because that's how Harvard's configured
the campus; that's the DNS server.
And I've asked what's the IP address of
harvard.edu, and it's given me this
IP address.
So if we get curious and we do that,
http colon slash slash
IP address...
interesting.
Why is this not working? Well,
again, virtual hosting: the website is not
configured to understand IP addresses
by default.
However, let's try another one: nslookup
cnn.com.
What's cnn.com?
Interesting.
So it turns out with DNS you can also
do what's called round robin: you can
return multiple IP addresses
for a web server, and those can rotate,
literally in the order in which they're
returned,
to do load balancing.
And we'll discuss that topic again toward
the end of the semester with scalability.
But let me just copy one of these,
and CNN is pretty big; I'm guessing
they don't really share with some other
websites, so let's just go to their IP
address,
and indeed
theirs works.
Now notice my URL hasn't changed.
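A rough sketch of the round-robin idea: a pretend DNS server that hands back the same list of addresses but rotates the order on every lookup. The addresses are from a range reserved for documentation, not any real site's servers.

```python
# Hypothetical server addresses (documentation range, not real hosts).
ADDRESSES = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

def round_robin(addresses):
    """Yield the address list, rotated by one more position each lookup."""
    n = 0
    while True:
        k = n % len(addresses)
        yield addresses[k:] + addresses[:k]
        n += 1

lookups = round_robin(ADDRESSES)
first = next(lookups)
second = next(lookups)
third = next(lookups)

for answer in (first, second, third):
    print(answer)
```

Since most clients simply connect to the first address in the answer, rotating the list spreads visitors across the servers without any of them knowing about the others.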
So now if I really want to get sort of
fun,
I really want to get sort of
creative, I'm gonna do this on my
Mac, and you can do this on a Windows
machine as well. There's typically a file
on Macs and Linux computers called
/etc/hosts, which is a text file
that maps, that hard-codes,
IP addresses
for
domain names.
This is useful generally for internal
corporate use, for development purposes,
so we'll be able to do this with projects
as well.
I'm gonna go ahead and edit
/etc/hosts here.
It's just a text file, and notice there's
some basic ones that come with the
system. This is my IPv6, version 6,
address written in a
crazy form.
I'm gonna go ahead and paste in, not that
URL,
but the IP address of CNN, and I'm
gonna say this is
davidnews.com.
So this is like manually overriding
the mapping of an IP address to
something else here, only for my own
computer. I'm not running, again, a server;
it's just that my
operating system, Mac OS or Windows, is
supposed to look at a file like this
before asking a DNS server.
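Here is a minimal sketch of that lookup order: read a hosts-style file first, and only ask DNS if the name isn't listed. The file contents and the address `192.0.2.20` are invented for illustration, and the `dns_lookup` argument is a stand-in for a real DNS query.

```python
def parse_hosts(text):
    """Turn hosts-file text into a {name: address} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table

# Invented file contents; 192.0.2.20 is a placeholder address.
HOSTS = """\
127.0.0.1   localhost
::1         localhost   # IPv6 loopback
192.0.2.20  davidnews.com
"""

table = parse_hosts(HOSTS)

def resolve(name, dns_lookup):
    """Check the hosts table first, then fall back to DNS."""
    return table.get(name.lower()) or dns_lookup(name)

def fake_dns(name):
    """Stand-in for a real DNS query in this sketch."""
    return "198.51.100.7"

print(resolve("davidnews.com", fake_dns))
print(resolve("cnn.com", fake_dns))
```

The first name never reaches DNS at all, which is exactly why the override only affects the one computer whose file was edited.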
so now let's see
if this works doesn't work with all
websites but let me go to
h_t_t_p_ colon slash slash
david news dot com
kamon
areas
i'd just made my own
or new site
frankly this is kind of stupid of them
um...
they go i was just joking with some
friends the other day that you could
kind of have fun with this and make
fairly offensive domain names of the
only to c_n_n_ somehow uh... and why is
this
So this is trivial to fix, frankly, in a
web server. A web server could be
configured, as you will be able to do
with features of Apache before
long,
to check,
upon receipt of one of those virtual
envelopes, what was in the To field.
If the To field does not match something
that we're happy with,
redirect the user. How? You respond with
what status code?
301.
So it's actually trivial to fix this
kind of thing. They
can't stop davidnews.com from
leading to cnn.com,
but they can stop the browser from
staying there by encouraging it
with that 301 to redirect
elsewhere.
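The check just described can be sketched in a few lines of Python. This is not how Apache is configured; it is a minimal illustration, assuming the "To" field corresponds to the HTTP Host header and assuming www.cnn.com is the one name the server is happy with. The function name respond is hypothetical.

```python
CANONICAL_HOST = "www.cnn.com"  # assumed canonical name for this sketch

def respond(host_header, path):
    """Return (status, headers), redirecting requests sent to any other name."""
    host = host_header.split(":", 1)[0].lower()   # ignore any :port suffix
    if host != CANONICAL_HOST:
        # 301 tells the browser to go to the canonical URL instead
        return 301, {"Location": f"http://{CANONICAL_HOST}{path}"}
    return 200, {"Content-Type": "text/html"}

print(respond("davidnews.com", "/"))   # redirected
print(respond("www.cnn.com", "/"))     # served normally
```

The server still receives the request for davidnews.com; the 301 just ensures the browser's address bar ends up on the canonical name.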
And this redirection is super common,
not just for harvard.edu
but even the course's own website. If I go
to http://
cs75.net
and hit Enter,
notice what the URL changes to.
A few things happened there.
So this is the course's website. What are
some of the things that got inserted
automatically, it seems?
Yeah, indeed. So I didn't just go to the
www version; I also went to the
secure version. Why?
We've just gotten into the habit, and I've
personally gotten into it, of using SSL for
everything. It's relatively cheap to do,
it's relatively trivial to turn on,
and it's only getting cheaper as
computers are getting faster.
Some of you might be familiar: about
a year and a half ago, a tool
called Firesheep was released,
which was a wonderfully free proof of
concept of something called a
session hijacking attack, something we'll
talk about in a few weeks' time in the
context of security.
Long story short, if you are visiting a
website that uses http://,
it is fairly trivial
for someone in your nearby wireless
vicinity, whether in this room, at
Starbucks, or even in your own home if you
have adversarial siblings or
roommates,
to then log in to your Facebook account
or your Google account or
Twitter account or any
website that's not using HTTPS.
And that is because,
if you're not using HTTPS,
nothing's encrypted. You
probably knew that, but among the things
that are not encrypted are things called
cookies. So if you are just broadcasting
cookies,
and cookies, it turns out, as we'll see in a
week or two, are the mechanism by which
users are remembered as being logged
into websites, if you're just sending
that cookie to a website again and again
to remind it, "I'm logged in, I'm logged in,"
and that's unencrypted, anyone in Starbucks
can sniff that cookie
and, with the right technical savvy, as
you will soon have, send it as their own
and thereby log into whatever you were.
It doesn't mean they know your password,
but it does mean they can hijack your
current session, so to speak.
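To make that concrete, here is a small Python sketch of why a cookie sent over plain HTTP is readable by anyone who captures the bytes. Nothing here touches a network; the host example.com, the cookie name session, the value abc123, and both function names are assumptions invented for illustration.

```python
def build_request(host, path, cookies):
    """Build the bytes of a plain-HTTP GET request carrying cookies."""
    cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Cookie: {cookie_header}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def sniff_cookies(raw):
    """Recover cookies from captured request bytes, as an eavesdropper could."""
    cookies = {}
    for line in raw.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        if name.lower() == "cookie":
            for pair in value.split(";"):
                key, _, val = pair.strip().partition("=")
                cookies[key] = val
    return cookies

victim = build_request("example.com", "/home", {"session": "abc123"})
stolen = sniff_cookies(victim)                       # read straight off the wire
attacker = build_request("example.com", "/home", stolen)
print(stolen)
print(attacker == victim)  # the server cannot tell the two requests apart
```

The password never appears in the request, which matches the point above: the eavesdropper gets the session, not the credentials.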
So, props to Google, because Google,
about a year or two ago, thanks to some
of the issues in China they had with
hacking and whatnot,
transitioned all their services to
HTTPS, at least if you opt in to it.
Facebook also finally offers this.
But again, we'll come back to this.
A few more weeks of insecurity in your
lives, if you don't mind,
but we'll come back to this and talk
about how you can have certain defenses
up.
And tragically,
even websites that redirect
from
the unencrypted version to the encrypted
version might still be vulnerable,
because many of those websites will
first do a redirect
that sends your cookie in the clear, and
then it realizes, oh, this should be
secure.
By then it's too late. So even though
banking websites almost always use SSL,
there have been certain banks
known
to be not too technically
savvy
who are still leaking cookies,
for reasons that we'll soon
reveal. So it added the www, the
HTTPS, and then this stupid Main_Page, which
is a MediaWiki thing, which is the
tool we use for the course's website. It's
free wiki software,
and that's not really intellectually
interesting; that's just their own software.
Any questions on HTTP?
I'll delete davidnews.com,
and again, that wouldn't work with all
websites. In fact, Harvard's probably
would not work for us.
So, which is to say, these HTTP headers
will become an invaluable resource when
it comes time to chase down bugs or
features in your own code.
Let me take another quick look at
something within Chrome's
developer toolbar, and then one
last thing with regard to Google, and see
if we can implement a little more of our
own version of Google.
So let me go to
the Elements tab,
and just as a proof of concept here, Google's
a little complex underneath
the hood,
but let me go ahead and right-click on
Harvard University and choose
Inspect Element.
And Inspect Element is nice because it's
going to jump me right to
the part of the HTML
that relates to that portion of the page,
which is wonderfully useful for diving in
deeper to a specific place.
So that is the a href that got me that
Harvard University link.
Now suppose I'm actually Google's
designer,
and we're not quite happy with the shade
of blue or the font size or font face.
So I want to tinker with the website,
but frankly I don't want to have to log
into the server and change the font size
and save the file, then reload the browser,
and go through this loop.
I'd kind of like to do it
inline in the browser, albeit without
saving any changes.
So notice here on the right, if you've
not used Chrome before,
or Chrome's developer toolbar,
notice on the right you have a summary
of all of the styles that relate to that
specific element in the web page.
So from top to bottom, here's the C,
there's the C in CSS,
cascading, top to bottom:
here are all of the rules that apply to that
element.
There's apparently, somewhere in Google's
CSS, apparently in a file called search,
on line 6,
a link class mentioned in
all of this where they're specifying the
color of the link and the cursor that
should be used over it.
So let me just edit the text right
here.
Let me just change this to,
let's say, something
random like orange.
So that is what I've done: I've changed
Google's link color, of course non-
permanently, and only for the links in
the page for which that CSS rule
applies. But the point here is that this
is just a wonderfully quick and dirty
way of experimenting, especially if
you're quite particular
and you want to get the pixel
alignment perfect in something.
Being able to just tweak it ever so
slightly here, and then figure out what the
values are, and then write them down in the
actual file on the server,
is just wonderfully useful.
Also, too, if you're trying to figure out
what font a website uses,
I'd look at the computed styles, as
this can be a little overwhelming, like,
my God, there are so many rules that apply
to this element because of the cascading
nature of CSS.
Let me just look at the computed styles,
which is a summary of the end result of
all of these styles that have been
applied.
And if I look for font-family,
indeed, this is,
it looks like, Arial followed by
sans-serif. So I know what font Google
uses now. Just a debugging trick if you've
not used it.
This and the Network tab you'll be using
quite a bit,
most likely.
So now back to our own version of Google.
Unfortunately, if I type in Harvard
and hit Google Search,
it doesn't go anywhere. Well, it kind of
did. Where did it end up?
The URL's almost the same. It's a crazy-
looking URL because this is obviously a
file on my hard drive, not on the
internet, but what did change in the URL?
Yeah?
Good, so the question mark got appended,
but no parameters got sent.
So let me go ahead and take a look at my
file. Oh,
well, no parameters were sent because I
didn't give any of them a name. So let me
go back and fix this, shrink the font so we
can see more at once,
and I need to give the input name="q"
here.
And let me go back over here and reload
the page,
and now let me go ahead and type Harvard
and now click Google Search.
So now we have some progress.
So this is interesting.
Now unfortunately, google.html is not
a website,
nor is it dynamic. It's literally just
a static file, so you can send any
parameters you want;
it's just going to ignore them every time.
but what is byproducts each year
I'm kind of in a hurry to implement my
own search engine. You know, I already
knocked off my own news site;
now I want my own search engine.
Well, I can actually do this: the form's
action should actually go to, let's say,
www.google.com/
search,
and I'm going to say method equals, and
get in all lowercase here, a
slight inconsistency
with what we've seen before.
Now let me go back to this page,
reload,
and now let me type Harvard and watch the
URL as I hit Enter.
Now I have implemented my own
version of Google, but how?
Right, all I did was I constructed a form,
and I specified a method of get and
an action that happens to be pointing
elsewhere.
But because of HTTP, and because a browser
knows how to handle forms, it compiled
all of the
key-value pairs, in this case just one,
q equals something, put it in the URL,
and sent it
to that action
attribute, sent it to that particular
URL.
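What the browser did there can be approximated in Python with the standard library's urllib.parse.urlencode. This is a sketch, not the browser's actual code; the function name submit_get_form is hypothetical, and the http:// prefix on the action is an assumption so that the result is a complete URL.

```python
from urllib.parse import urlencode

def submit_get_form(action, fields):
    """Return the URL a browser would request for a method="get" form."""
    # Named fields become key=value pairs after the question mark;
    # with no named fields, only the bare "?" is appended.
    return f"{action}?{urlencode(fields)}"

print(submit_get_form("http://www.google.com/search", {"q": "harvard"}))
print(submit_get_form("http://www.google.com/search", {}))
```

The second call mirrors the earlier attempt, where the input had no name and the URL gained a question mark but no parameters.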
So now we have implemented
our own version of Google. Now of course,
I can't say that they were our search
results, and we're completely cutting
corners here,
but that's where we'll need something
like PHP to do things
server-side.
Let me pause for just a moment. Peter, do you
want to say hello to the class? Peter's one
of our four teaching fellows; the others are
at work right now.
Do you want a mic, too?
Actually, hold up, I have to come near you
with a microphone. Or do you want to
come this way so the camera's
a little more readily available?
Hello, all. I look forward to
working with all of you, and I will see
you on Wednesday.
Thank you very much.
Any questions, then, about the web,
HTTP,
PHP?
Good question.
Had it not been named q, would this be
broken? So let me go back into here and
let's just call it query, thinking that's
a reasonable name to give a parameter.
Let me reload my HTML, let me type
Harvard, Enter.
And interesting: they support query. So
let me try a misspelling,
since that's probably not supported.
Reload, type Harvard, Enter.
Okay, that doesn't work.
So they support query or q, for
whatever backwards-compatibility reasons.
Other questions?
Yes, on the entirety of Harvard's
campus your sessions can be hijacked.
So if you have malicious roommates or
otherwise nefarious, technical
people around you, you are vulnerable to
this. Session hijacking can happen anywhere
where encryption is not used, whether
between you and the access point,
the wireless access point, or between
you and the endpoint, the web server.
And I should say, especially for those of
you who are from out of town, realize that
Harvard does have fine print in its rules
about not doing this to other people.
Otherwise,
you can solve this problem by
expelling people. That's one way, if you
can't do it technologically.
So this is one of these things where
we're trying to educate as to how it can be
done, but don't go trying this in the dorms
on campus.
Wait till you go home, on your own home
network.
Other questions?
So where does this leave us? We started
by talking about Google, and we keep
talking about Google because it's so
popular, but the story applies
to really any website out there.
So
we talked about DNS and the
process of not only looking up a web
page's URL, or rather its IP address,
but also getting your own IP address and
getting your own web server and your own
domain name.
We talked a little bit about HTML forms,
with which you might be familiar
already, and the tools with which you
can get a little more comfortable with
diagnosing things and debugging things.
And we've written, so to speak,
a form.
All that we haven't done yet is actually
implement a dynamic website. For that,
we've completely outsourced to CNN
and Google today.
One of the first things we will do this
coming Wednesday is dive in deeper to
PHP and things like GET and POST
and sessions and shopping carts,
and also the security implications around
them. We'll dive into the CS50 Appliance,
this virtual machine environment with
which you get to
play in terms of Apache and
PHP and MySQL.
So tonight we'll adjourn a bit early. I'll
stick around with Peter for anyone with
questions you might have. Otherwise we
will see you again on Wednesday, and
after Wednesday's lecture we'll flow
right into section and office hours, if you
have questions about
content, even though the first project
won't be released for a week or so.
See you in a couple of days.
