Hello, ladies and gentlemen of the Internet.
My name is Phuc Duong, Senior Data Engineer
for Data Science Dojo. And I'm here to teach
you how to web scrape with Python.
So in front of you, you see a website that employs web scraping. This site actually scrapes the storefront of a website called Steam. Steam sells video games.
And the cool thing about Steam is that they
do flash sales every day. So the user has
to come back every day and study this page.
What is a good deal? What is not a good deal?
And it's a lot of information. This is how
they've gamified shopping online.
Now there's a website that actually scrapes Steam's front page in real time, shows you the best deals, and ranks them. OK. So a lot of people ask me, how do I get all of my data? In the absence of APIs, web scraping is actually a very important tool for data scientists and data engineers to know, because the entire internet becomes your database.
So-- I can scrape any storefront-- Nordstrom,
Macy's. Study the sales. Web scrape reviews.
I can web scrape baseball stats, baseball
players in real time. Wikipedia is also a
good place to web scrape. For example, you
can see that this frame over here of this
Harry Potter character, Ron Weasley, it's
very standardized. I could write a web scrape
script and then loop over every single Harry
Potter character, very quickly, and create
a data set.
All right. Today we're going to learn how
to do that. So today I'm on Windows. You can install Python normally if you're on Linux, but if you're on Windows, I highly recommend installing Anaconda instead. So if you go to Google and just type in Anaconda, it should be the Continuum site, and you should just download the installer based upon your operating system.
OK.
Next thing I'll be using, I'll be using a
text editor called Sublime Text. So you can
just go ahead and go to Google and type in
Sublime Text and then install that. I like
using Sublime Text 3. OK. All right. That's
where you get those things.
All right. So once you've installed this -- if you're using Anaconda, it's actually a pretty big file. It's like 500 megabytes, OK? So be warned of that.
All right. So what I'm going to do is I'm
going to go ahead and open up my command line.
And for those of you who don't know, if you
go to a folder, any folder, and then just
hold down the Shift button and right click
and say Open Command Window Here, this opens
up the command line for you. And this is where
you can work with Python. So if you type in python right here, and if you've installed either Python or Anaconda, this will show up, right? So notice that I'm using Python 3.5 with Anaconda. And if I just do a very quick two plus two, it should equal 4. That's how I know I'm inside of my console.
All right, next thing: now I know that if I push down Control and hit C -- Control plus C, basically the copy shortcut on Windows -- it will exit this console. OK, and I get back to, basically, the Windows command line.
So what I'm going to do now is, I'm going
to go ahead and install a package called Beautiful
Soup. That's the package that we're going
to use to web scrape, actually. It's a very
powerful package. I encourage those of you
who want to go further beyond this introduction
to go ahead and learn this package. So all you've got to do is run pip install bs4. OK, bs4 stands for Beautiful Soup 4.
So here we are. So Beautiful Soup has been
installed. And how do I know if it's been
installed? Well, if I type in python, and I type in import bs4, right? It should just not error. OK. Awesome. So that's how I know that the package is installed and ready to go.
Next thing I want is, I need a web client.
So Beautiful Soup is a good way to parse HTML text. That's all it is. It's a good way to traverse HTML text within Python. Now I actually need a web client to grab something from the Internet. And how you do that in Python is, actually, you would use a package called urllib. And inside of urllib, there is a module called request, and inside of that module is a function called urlopen. OK? I know it's a lot to
take in. But settle down, we're going to do
step by step.
I'm going to do a really quick import, all-in-one-line kind of step. All right. So I can do from urllib.request. So I'm calling a package called urllib. If you're on Python 2, this is a different package; it's called urllib2. So I'm calling a module within that. So notice, I'm importing only what I need. I don't need all of urllib. I just need the request module. And I'm going to import out of that.
OK, urlopen, the one function that I need, basically. And it's going to import all the basic dependencies, as well. And I'm going to give it a name, because I don't want to type in urlopen every time. I want to say uReq for short. That's how I tend to do things. And I can also modularize the import of Beautiful Soup, as well. So
I can do from bs4 import. And this is important: capital B Beautiful, and then capital S for Soup. And then I'm going to just call it soup, so I don't have to type out BeautifulSoup every time I want to use this package.
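In a script, those two import lines look like this -- a sketch of the standard Python 3 form (on Python 2 the package was urllib2 instead):

```python
# Web client: urlopen fetches a page from the internet.
from urllib.request import urlopen as uReq

# HTML parser: BeautifulSoup traverses the HTML text we download.
from bs4 import BeautifulSoup as soup
```

The as aliases are just the shorthand names I'll use for the rest of the script.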
And this is me working in the console. This
is me playing around. So if you want to, you
can actually start typing it into a script.
So in this case, I have Sublime open. And
I'm going to do a Control Shift P to open
up the command console. And then I'm going
to say set syntax is equal to Python. OK.
Beautiful.
So now I can do the same commands in here. So if I just select this in the command line and hit the Enter button, that will copy it, so that I can paste it into my script here. OK? So there you have it, the first two lines of this.
So now I'm ready to go. So Beautiful Soup
is going to parse the HTML text, and then
URL lib is actually going to grab the page
itself.
But what do we want to web scrape? Well I
like graphics cards. I'm going to web scrape
graphics cards off newegg.com. So some of
you might know it. It's basically Amazon, but for hardware and electronics.
So I'm going to type in, for example, graphics
cards. So these are a bunch of graphics cards
that have shown up in my search bar. And it
would be nice to basically tabularize and
turn it into a data set. And notice that, if a new graphics card is introduced tomorrow, or if ratings change tomorrow, or prices change tomorrow, I run the script again and it updates whatever it is that I load it into. I can load it into a database, a CSV file, an Excel file, it doesn't matter.
So in this case, I'm going to grab this URL.
OK. That's all I'm going to do. So basically I'm going to copy this URL and paste it into my script. So, in this case, I can do my_url is equal to-- so that is the URL I want to use for this.
And in this case, I will actually run it in
my console. So when I'm web scraping, I like
to also prototype it into the command line,
as well, so I know that the script is going
to work. And then once I know that it works,
I will go ahead and paste that back into my
Sublime. OK so this is my URL. So I've gone
ahead and called a variable and placed a string
of the URL into it. Now this is going to be
good. So now I will actually open up my web
client. So in this case, I would do U request,
right?
So notice I'm calling urllib, and I'm calling it by the shorthand name that I gave it earlier. So notice I wrote from urllib.request import urlopen as uReq. So I'm actually calling the function called urlopen right now, inside of a module called request, inside of a package called urllib.
So the next thing is, I'm going to throw my
URL into this thing. So what this is going
to do, it's going to open up, basically, a
connection, it's going to open up this connection,
grab the web page, and basically just download
it. So it's a client. So I'm going to call it uClient: uClient is equal to uReq of my_url. It's going to take a while depending on your Internet connection, because it's actually downloading the web page. Notice that. OK, it's done. So now I can do a read: uClient.read. If I do read, it's going to dump everything out of this right away; I can't reuse it. So before it gets dumped, I want to store it into something, a variable. So I'm going to call it, I guess, page underscore-- since this is the raw HTML, I'm just going to call it HTML-- page_html is equal to uClient.read.
I can go ahead and show you this thing, but depending on how big the HTML file is, it might actually crash the console. So I'm going to show it to you once it's inside of Beautiful Soup. Bear with me here. And with any web client, since this is an open Internet connection, I want to actually close it when I'm done with it. So uClient.close is what I'm going to do.
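Put together, the open/read/close dance looks like this. As a sketch, I'm using a tiny data: URL as a stand-in for the real Newegg search URL so it runs without a network connection; in the real script, my_url is the address you copied from the browser:

```python
from urllib.request import urlopen as uReq

# Stand-in URL so this sketch runs offline; in the real script this
# would be the Newegg search URL copied from the browser.
my_url = "data:text/html,<html><body><h1>Video%20Cards</h1></body></html>"

u_client = uReq(my_url)      # opens the connection and grabs the page
page_html = u_client.read()  # dump everything into a variable (bytes)
u_client.close()             # close the open connection when done

print(page_html[:40])
```

The same three lines work unchanged for a real http URL.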
And knowing that all of these lines of code
have worked so far, I can just go ahead and
copy them into my script. So my URL is that.
And uClient is-- and I'll just add some comments: opening up the connection, grabbing the page. OK. And then what this does is, it offloads the content into a variable. And then what this is going to do is close the client.
Then the next thing I need to do is I need
to parse the HTML, because right now the HTML
is a big jumble of text. So what I need to do right now is call the soup function that I imported earlier. So notice I wrote from bs4 import BeautifulSoup as soup. So if I call soup as a function, it's going to call the BeautifulSoup function within the bs4 package.
So in this case, I will do soup of, basically, my page_html. And then if I do a comma here, I have to tell it how to parse it, because it could be an XML file; in this case, I will tell it to parse it as HTML, with the html.parser. And I need to store it into a variable or else it's going to get lost. So in this case, I'll call it page_soup.
I know it's kind of weird that they call it a soup, but it's standard notation. Now, when you say soup, people understand what the data type is: it's derived from the Beautiful Soup package.
All right. So in this case this does my HTML
parsing.
OK. So now, if I go to the page soup, and
I just try to look at the H1 tag, page soup
dot H1, I should see the header of the page.
So this does say video cards and video devices.
So I should see that somewhere. So notice
that they grab this header right here.
And just for good measure, let's just see what else is in there. So page_soup dot-- maybe there's a p tag in there I can look at. So: newegg.com, a great place to buy computers. I think that might be at the very bottom. Actually, no, it might be something that's hidden. It might be just a tagline.
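Here's that parsing step as a minimal, self-contained sketch. I'm feeding it a small hand-written HTML string instead of the real downloaded page, so the h1 and p lookups are illustrative:

```python
from bs4 import BeautifulSoup as soup

# A small hand-written stand-in for the downloaded Newegg HTML.
page_html = """
<html><body>
<h1>Video Cards &amp; Video Devices</h1>
<p>Newegg.com - A great place to buy computers</p>
</body></html>
"""

# "html.parser" tells Beautiful Soup how to interpret the text
# (it could also be XML); store the result so it isn't lost.
page_soup = soup(page_html, "html.parser")

print(page_soup.h1.text)  # first <h1> on the page
print(page_soup.p.text)   # first <p> on the page
```

Dot access like page_soup.h1 always returns the first matching tag in the document.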
All right. But I am on this page. So now what we need to do is traverse the HTML. Basically what I'm going to do is convert every graphics card that I see into a line item in a CSV file. To do that-- now that I have a Beautiful Soup data type, I can actually traverse, basically, the DOM elements of this HTML page.
So let me show you how to do that real quickly.
So if I inspect the element of this page,
so if I go find the body tag, for example.
I think it starts off with a body tag. So if I do page_soup.body, I can keep going, dot by dot, within it. So notice that this body tag can go even further, into an a tag or a span tag. So if I type in the span tag-- body dot span-- I should find this span tag. See that? Span class no-css skip-to. See that? That's awesome.
So the next thing I'm going to do, let me
just make this HTML a little bit bigger so
you guys can see it even further. All right.
So what I want is, if I'm in Chrome-- you can also use Firefox's Firebug-- to inspect the HTML elements of the page. So I'm going to just
select this, the name of this graphics card
right here, and try to inspect that element.
It jumps me directly to this a tag. And I want to
grab the entire container that the graphics
card is in, because I know that graphics card
container contains other goodies, such as
the original price, its sale price, its make,
its review type, and the card image itself.
So I go out. So since HTML is an embedded
kind of tagging language, I can go out until
I find what it is that is containing all of
this. So notice that this div right here with
the class of item dash container, contains
and houses all of the items inside of this
thing. So basically I would need to set a
loop. I would write my script first on how
to parse one graphics card, and then once
I'm done with that, I can loop through all
of the class containers, and go ahead and
parse out every single graphics card into
my data file.
So I need this class. I want to grab everything that has this class. So I want to go ahead and do that right now. So I want to go to my page_soup. There is a function called findAll, with a capital A. And what do I want to find? I want to find all divs that have the class item-container. So I would say: find me all divs, comma, and then I would feed it an object. And the object specifies the name of the attribute you're looking for-- it's a class here; if it was an ID, I would put id. And then I would go ahead and paste in item-container, which is what the class is called.
So in this case, I will feed this into a variable called, I guess, containers. We'll name it by what the class is. I'm going to copy this, as well, and paste it into my script. Hopefully it works. And I'll add a comment: grabs each product. So notice that even though I'm writing this for graphics cards, I'm betting that Newegg has actually standardized its HTML enough that I can parse any page, any product, on Newegg, if I just run the script over it.
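As a sketch, here's findAll on a toy page with the same item-container structure I'm assuming from inspecting Newegg (the class names here are illustrative):

```python
from bs4 import BeautifulSoup as soup

# Toy page: two product containers plus one unrelated div.
page_html = """
<div class="item-container"><a class="item-title">Card One</a></div>
<div class="item-container"><a class="item-title">Card Two</a></div>
<div class="other">not a product</div>
"""
page_soup = soup(page_html, "html.parser")

# Grab every div whose class is item-container -- one per product.
containers = page_soup.findAll("div", {"class": "item-container"})
print(len(containers))  # 2 on this toy page; 12 on the real search page
```

The second argument is a dictionary of attribute name to value, which is how you narrow findAll down to one class.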
So I've called this containers. So let's check the length of containers to see how many things it found. So it found 12 objects. So if I count-- one, two, three, four, five-- it found 12 graphics cards, basically, is what that did. And look, counting them on the page, yes, that is true.
OK, so let's look at the first one. So if I go to containers at the zero index, I should see the HTML for this thing. So I am actually just going to copy this out into my text file and read it in there, because sometimes when you load a page, there is some post-load rendering done via JavaScript, and some things will show up, some things won't show up.
So just to be sure, I'm just going to paste
it into my Sublime. And from my Sublime, I
can go ahead and figure out what is actually
in there. So I'm going to do Control N in Sublime and paste it in. But notice, it's not very pretty. So we'll deal with that in a minute. I'm going to set my syntax to HTML. OK, it's in HTML now. But that's still not pretty. I want to use an external service called JS Beautifier. It's going to do all the spacing where there needs to be spacing. So with JS Beautifier, you basically just paste in ugly code, and it turns it pretty. See that? Everything is now nicely spaced and delimited.
Here we are. Now let's read what's actually
in this thing. So if I open this up now, I
know it's going to be a little bit hard to
read. What kind of things do we want out of
this thing? If we go through, we can see that
there's some pretty useful things. We can
see that the items have ratings.
It has a product name. We want to grab the product name for sure. Let's see, there is its brand. I can grab its brand. So notice that they gave the image the name of the brand as its title, which is useful. Notice that the image itself says EVGA, but that's an image; I can grab the image, I just can't parse what it says unless I use image recognition. But the title attribute encodes what the brand is for us. So that's very convenient. So this is something that we want to grab.
And also, I want to be sure to grab things that are true of everything. If not, I'm going to run into some corner-case if-else statements. So notice that this guy right here is special. He doesn't have any egg reviews. So if I wrote something to parse reviews, I'm going to need to write an if-else statement, or I'll have to do a try/except catching an index-out-of-range error. OK. And then notice that it doesn't even have this number-- I think it's the number of reviews here.
So I'll let you guys go ahead and handle the
scraping of that, but I'm going to scrape
things that are present in all of them. Notice
that I'm going to scrape the names. All of
them seem to have the names of the brand or
the names of the product. And then I'm going
to go ahead and scrape the product itself.
And not all of them have a price. You see
that? I have to add it to the cart to see
the price.
And let's see what else is good. And they
all seem to have shipping. So I'm going to
grab shipping to see how much they all cost.
So once you learn how to scrape one, it's
the same really for all of it. Now if you
want to loop through all of it, you have to
do those if else statements to catch all the
loose cases that aren't there.
So notice that if I do a container right now-- containers of zero-- I'm going to throw containers[0] into just a variable called container. Later I'm going to do a for loop that says: for every container in containers. Right, so right now I'm prototyping the loop before I build the loop. I want to make sure it works once before I even build the loop.
So this container contains a single graphics
card in it. I will call it container instead
of contain. So container dot, dot what? Let's
see what is in here. Notice that container
dot A will bring me this thing back. So if
I do container dot A, this brings me back
exactly what I thought it would. It would
bring me the item image. So the item image,
not that useful to us.
Let's see if there's anything else we can use in here. The title-- we might be able to use the title, but it seems that we can also grab that down here, which I think might be the more efficient way to grab it. So let's get it from there instead, because that's what the customer sees. That's what you will see when you go and visit the page. So instead of doing dot a, we will do dot div. We'll jump from this a directly into this div.
So I'll go ahead and push up, and say container
dot div. So that will jump me into this div
right here, and everything inside of it. OK.
Boom.
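Dot traversal like container.a or container.div is easiest to see on a toy tag; each dot hops to the first matching descendant. The markup below is an illustrative stand-in for a Newegg item container:

```python
from bs4 import BeautifulSoup as soup

# Toy markup with the same shape I'm assuming for a Newegg item.
html = """<div class="item-container">
<a class="item-img"><img src="card.png" title="EVGA"/></a>
<div class="item-info">
<div class="item-branding"><a href="#">EVGA</a></div>
</div>
</div>"""

container = soup(html, "html.parser").div

print(container.a["class"])    # the first <a>: the image link
print(container.div["class"])  # the first <div> inside the container
```

So container.a lands on the image link, and container.div skips past it into the info div, exactly the jump described above.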
OK. So if I go into that container dot div, I will just probably assume this is the right one. I know web-scraped HTML tends to be hard to read because it hurts your eyes, unless you know how to read HTML very well. But it's something you just get used to.
So I know that I'm in this div, and I want to go into another div called item-branding. So div dot div. And inside of that div there is, I think, an a tag. This a tag actually contains something we want, which is this guy right here: the make of this graphics card. So div dot div dot a. And there we have it.
So here's the href of the link. So what I'm grabbing is this guy right here, this EVGA thing. Notice when I hover, it's a clickable link. That link is this guy right here. But what I really want is the title of this link.
So what do I want? I want to do container dot a dot img. So I want to grab this image tag now. Notice I'm just using these handles, referencing them as if it was a JSON file. And notice that I'm inside of the image now. So the image is here. Now I need to grab this title. So this is an attribute inside of the image tag. And how do you grab an attribute? You reference it as if it was an index-- or, I mean, a dictionary key. So I would say the title of this is equal to EVGA.
So now that I have prototyped it, I can go
ahead and add that to my script. So I can
go ahead and copy this right here, and paste
that into my script.
Inside of my script, this is where I can write that loop now, the one I planned earlier. So: for container in containers. It's going to loop through, and it's going to grab container dot div dot div dot a, then that image, then that title, and that is going to equal the brand, or the make. So that's the first thing I grabbed.
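The brand grab, end to end, looks like this on toy markup mirroring the branding structure seen in the inspector (the class names and the EVGA value are illustrative):

```python
from bs4 import BeautifulSoup as soup

# Illustrative stand-in for one Newegg item container.
html = """<div class="item-container">
<div class="item-info">
<div class="item-branding">
<a href="#"><img title="EVGA" alt="EVGA logo"/></a>
</div>
</div>
</div>"""

container = soup(html, "html.parser").div

# Attributes are referenced like dictionary keys on a tag.
brand = container.div.div.a.img["title"]
print(brand)  # EVGA
```

Each dot steps one tag deeper, and the final ["title"] pulls the attribute rather than a child tag.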
So who makes this graphics card? That's the
first thing it's going to do. So what else
do I want to grab while I'm inside of this
thing? So let's grab two more things, just to have a really good file, because a CSV file with one column seems a little bit pointless.
All right the next thing I want to do is,
I want to go ahead and grab the name of this
graphics card, which is right here. Notice
that it's embedded within this A tag, and
this A tag is embedded within this div tag.
And this div tag is embedded within this div
tag. In theory, if we do container dot div dot div dot a, it seems like it brings out the item brand instead. So the item brand is actually this a tag, which is not what we wanted. We wanted this a tag.
So notice that it's having trouble finding this particular a tag. So what I want to do instead is a findAll, and find just the exact class that I want. So in this case, I can say: find me all the a tags that have the class item-title. So I can do container dot findAll-- I want the a tag, comma, and then I throw it an object. And the object says: look for all classes that match item-title.
So this will give me a data structure back that has everything it found. Hopefully it's only one thing, so we don't have to loop over it. So in this case, I'll assign that to a variable called title underscore container. If I look at title_container, I should have what I'm looking for. Beautiful.
So the name of the graphics card is somewhere in this thing. I'm going to put this into my script so I can run it later. So going back-- the title container, notice this isn't the actual title yet. I still have to extract the title out of this thing. So in my title container, notice that it's inside of the bracket bracket, which means it's inside of an array-- or, in this case, a list, since we're in Python. So if I go to index zero, I grab the first object. And inside of that first object-- nope, it's not inside of the i tag, it's actually the text inside of the a tag. So if I do dot text, this should get me what I want. Yes. So I do title_container of zero dot text, and that gives me exactly what I want.
So I'm going to place that in there, and I want to call this the title-- the product name. So product_name is equal to title_container of zero dot text. So that is that.
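Here's that title extraction as a runnable sketch. The toy container below has two a tags, so plain dot access would hit the brand link first, which is exactly why findAll targets the item-title class directly (the product name string is made up for illustration):

```python
from bs4 import BeautifulSoup as soup

# Illustrative container with a brand link and a title link.
html = """<div class="item-container">
<a class="item-brand" href="#"><img title="EVGA"/></a>
<a class="item-title" href="#">EVGA GeForce GTX 1070 8GB Graphics Card</a>
</div>"""

container = soup(html, "html.parser").div

# findAll returns a list; take the first (and only) match, then its text.
title_container = container.findAll("a", {"class": "item-title"})
product_name = title_container[0].text
print(product_name)
```

The [0] index unwraps the one-element list, and .text pulls the string between the tags.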
So I've got the brand, the make of the graphics
card, and the name of the graphics card again.
And now we can go ahead and grab shipping,
because shipping seems like something else
that they might all have.
So what we're going to do is figure out where this shipping tag is inside of all of it. How much does it cost for shipping? Because I think some of them have different shipping costs. Yes, this one is $4.99 shipping. So in this case, I need to find all li tags-- li stands for list item-- with the class price-ship. So I want to go ahead and do that.
I'm going to copy this class. And I want to do container dot findAll of li, comma, class is equal to price-ship. And this will give me, hopefully, a shipping container: shipping underscore container. And hopefully there should only be one tag in this thing that has shipping in it. And I need to close that function. So my shipping_container-- if I can just copy this-- shipping container. You will see that it gives me back an array of things that qualify. So in this case, only one thing came back. So I can do the same thing I did earlier, where I reference the first element, and then I think it's also in the text again, right? So I can do dot text again. And this brings me back-- it looks like there's a lot of open space.
Notice there's a return, and then there's a new line; a return, and then a new line. So in this case, I want to clean it up a little bit, because I just want the text. So in this case I will say strip. So strip removes whitespace before and after: newlines, all that good stuff. So it just says Free Shipping now. So I can go ahead and grab this and throw it into my script, as well.
So now I've grabbed three things. So in this
case, I also need the findAll that I did earlier. So if I go up a few times, I can find it. So the shipping container itself will be placed in here. And then I close the findAll function, and there we go.
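The shipping grab, including the whitespace cleanup, looks like this on toy markup (the price-ship class and padding mimic what the transcript describes):

```python
from bs4 import BeautifulSoup as soup

# Illustrative markup: the shipping price lives in <li class="price-ship">,
# padded with the returns and newlines seen on the real page.
html = """<div class="item-container">
<ul>
<li class="price-ship">
Free Shipping
</li>
</ul>
</div>"""

container = soup(html, "html.parser").div

shipping_container = container.findAll("li", {"class": "price-ship"})
# .text keeps the surrounding newlines; .strip() trims the whitespace.
shipping = shipping_container[0].text.strip()
print(shipping)  # Free Shipping
```

Without the strip() you'd carry the stray returns and newlines straight into your CSV.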
So now there are the three things that I want.
So the product name, the brand, and the shipping
container will be actually shipping.
OK. So cool. So now this is ready to be looped through. But before that, I want to print it out. And I want to show you why Sublime is my favorite editor: it does multi-line editing. So in this case, I'm going to go ahead and enter three blank lines. I'm going to copy my three variables-- copy, copy, copy-- and paste them in here, and just go ahead and make it nice and formatted.
So I will print all of these things out to the console, just so I can see. So in this case I will copy this, as well, so that I can go ahead and just say quote, and then paste that in. So I can see what it is when it actually does print out. And then I can do a plus for string concatenation. It's going to print each of these three things out for me: the brand, and the product name, and the shipping.
And basically, before I throw this into a CSV file, I want to just make sure that this loop works. So I want to save this web scrape thing, too. I'm going to call this my first web scrape dot py. OK. So if I open this, there should be a file here. If I right click and open up another console-- so notice I have the console from before, but that one is running Python. I want to open up this new one.
And I want to tell it-- so notice that I'm inside of the file path now, the folder that contains this script. So what I need to do is just type python, to tell it to run Python, and then tell it: now that I'm in Python, execute this script-- my first web scrape dot py. Hit Enter. And then, hopefully-- look at that. It went through. It did that loop. And it grabbed every single graphics card for me.
So all I have to do now is throw this into
a CSV file. And I can then open it in Excel.
So let's go ahead and do that real quick.
Just finish up our code. And I don't really
need the prototype for this, because I know
that the script works now.
To open up a file, you just use the built-in open. And then, in this case, I need a file name. So filename is equal to, I guess, products dot CSV. OK, so I want to open up that file name, and I need to give it a mode-- in this case w, for write. So I want to open up a new file and write into it. This will be called f; the normal convention for a file writer is f.
And I want to write the headers to this thing. So, in this case, f dot write-- a CSV file usually has headers. In this case, headers will equal brand, then product name-- let's call it product_name, because if you load this into a SQL database later, name is a keyword in SQL-- and then I'll call this shipping. OK. And then I also need to add a new line, because CSV rows are delimited by new lines.
So I'm going to tell it to write the first line as a header. And then the next thing is, every time it loops through, I want it to write to the file. So instead of only printing to the console-- which I'll let it keep doing, actually-- I'm going to do f dot write. So f.write is going to write these three things: brand, product name, shipping. I paste that in there, and that's going to write all three of them for me.
But what I need to do is actually concatenate
them together. And I need to concatenate them
with a comma in the middle. So comma. And
let me just double check something real quick.
See if my strings are clean. And no it is
not. So notice that the product names have
commas inside of them. So what that's going
to do is it's going to create extra columns
inside of my CSV file.
So before I write the product names out, I actually need to do a string replace. So I need to call the replace function: every time you see a comma, replace it with something else. I like to use a pipe, but you can delimit it with anything you want. This is programming; you can do whatever you want as long as it doesn't error. So in this case, I'll go ahead and do that. And also, don't forget: each row needs to be delimited by a new line.
So every time it loops through, it's going to grab and parse all of the data points, and then write them to the file as a line. And what I need to do is, once it's done looping, I have to close the file. Because if you don't close the file, you can't open it elsewhere-- only one thing can have the file open at a time.
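Here's a minimal sketch of that whole CSV-writing step, with made-up rows standing in for the values scraped inside the loop:

```python
# Minimal sketch of the CSV-writing step; the rows are made-up
# stand-ins for the brand/product_name/shipping scraped in the loop.
filename = "products.csv"
headers = "brand,product_name,shipping\n"  # first line of the CSV

rows = [
    ("EVGA", "EVGA GeForce GTX 1070, 8GB", "Free Shipping"),
    ("MSI", "MSI GeForce GTX 1060, 6GB", "$4.99 Shipping"),
]

f = open(filename, "w")
f.write(headers)
for brand, product_name, shipping in rows:
    # Commas inside the product name would create extra columns, so
    # swap them for pipes; end each row with a newline.
    f.write(brand + "," + product_name.replace(",", "|") + "," + shipping + "\n")
f.close()  # close it, or nothing else can open the file
```

In the real script, the loop body is the parsing code from earlier, and the f.write line sits right next to the print statements.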
All right. So I will run the script again. So notice, if I just push up, it re-runs the last command. But you have to save the script first-- I'm going to do Control S to quickly save it. And-- syntax error! I forgot the concatenation before the newline. So I need to add a plus before the "\n" to tell it to concatenate that. Save, then python my first web scrape again. It went through.
So after running that script, it's gone ahead and scraped everything and printed everything to the console. But more importantly, it wrote everything to the CSV file, like I told it to. So if I open it up right now, you can see that it has gone ahead and scraped the entire page and thrown every product in as a row into this CSV file.
So you can go ahead and scrape the other details, like whether or not it's a sale price, or what the image tag might be. And then there are multiple pages. If you go to Amazon, for example, there are probably multiple pages of products. So you can start looping through them. Usually up here in the URL, there's a page equals something, so you can just do a loop and say: do page two instead of page one.
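That page loop is just string-building. This is a hypothetical sketch-- the base URL below is made up, but many storefront search pages put the page number in a query parameter exactly like this:

```python
# Hypothetical base URL; real sites name the query parameters differently.
base_url = "https://www.example.com/search?q=graphics+cards&page="

# Build one URL per page; each would be fetched and parsed as before.
page_urls = [base_url + str(page) for page in range(1, 4)]
for url in page_urls:
    print(url)
```

Inside that loop you would run the same open/read/parse code from earlier against each url.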
And that concludes today's lesson on how to
web scrape with Python. And I hope you guys
learned a lot and had fun doing it.
Now I want to really know from you guys, did
you guys enjoy this kind of video? Do you
guys want more coding videos? More data science
videos? And if there's a better way to code
something, also let me know. I'm always happy
to hear from you guys. What do you guys enjoy?
I want to make this content for you guys.
All right. Now I'll see you guys later, and
happy coding.
