Hi, I'm Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today I want
to talk about graphs and common
operations, like breadth-first search and
depth first search. So to go back to the
beginning, a graph is basically a
collection of nodes where each node
might point to other nodes and these
edges can be directed so like one-way
streets or undirected so kind of like two-way
streets. Now suppose you want to go walk
through this graph, and specifically
suppose you want to do something like,
figure out is there a path from one node
to another. There are two common ways of
doing this and we'll talk about both of
them. The first one is a little more
simplistic and it's called depth-first
search, and it's a typically recursive
algorithm and the way it looks like it's
saying, okay we have have this initial
node that I'm gonna call s.
Now you basically are asking a question, hey S, do you have a path to node T, and S
says, hmm, I'm not sure, let me go ask my
children. And first S goes to node A
and it says hey A, do you have a path to node T? If you do then hey I'm done I can
give you my answer.
If you don't then let me go ask B and
then C and then D. And so,
the first person we ask is A, so A gets asked, hey A do you
have a path to T? I'm not sure let me go
ask my children. And eventually we might get
to a node who says why yes of
course I have a path, I am T. And so then
we go and say boopboopboop all the way back
up.
Yes there's a path and that's the basis
of depth-first search.
It's called depth-first search because
we go deep into some node before you
even ask any of the children. Now the
problem with this is that we might run
really really far away.
So imagine for example that B actually
has a path directly to node T has edge
directly to node T.
A might go to all of its children, all
of its children, all of its children, before
you even get to B, and he was right
there. There could have been a really
fast connection so that's why we might
often prefer to use breadth first search instead.
Breadth-first search says hey go a
level by level out. So first we ask S hey
do you have a path to T? And S will say
well let me check if any of my nodes are T.
No, there's no edge right there.
So each of those get in line. And then we
ask the second level out, and then the
third level out, and the next level and the next
level and the next level. And so we go level
by level, breadth, we go wider before you
go deep.
That's why it's called breadth-first
search. So I want to talk at a high-level
about the implementation of each before
I dive into some of the details. So
depth-first search is implemented with a
recursive algorithm. It's probably the
simpler algorithm to implement. The only
little trick is that we have to make
sure to use it is visited flag so that
you don't wind up in some sort of
infinite loop where there is, you know, a
cycle and you keep asking each node of
it has, you start running around
in a circle. With breadth-first search,
the main trick need to remember is you
want to use a queue. So when you look,
when you look at S, you say hey, do you have a path to T? You're going to go add all of its children to the
queue, and rather than going recursively
you pull out the first element from
the queue, check if it has a path, check if it is this final element and if not
go add all its children to it. So use a
queue so that you go through things in the
correct order.
So that's the high-level, how it works.
Let's turn to the real details of the
implementation now. I've gotten a bit of
a head start on the implementation but
I'll show you just quickly what I've
done. So first we have this node class
here, and it's going to have some sort
of ID that represents the node ID and
what I've done is give ourselves a
mapping of from node ID to the actual
node and this is mostly going to be used
for things like get node and add edge.
This way we can actually
just go and get, get immediate access
to the node of the particular ID. And then I've
also given us this has paths
depth-first search method and it's going
to call after this recursive method. So
if you remember with depth
first search, we need to have a way of
flagging nodes to say hey I've already
visited this don't retry it.
So one thing we can do is we can
actually modify the node class to give
ourselves and is visited flag, but that
requires then making sure we clear
that flag later on. Another way of doing
it is giving ourselves a hash set that
lists all of the IDs that I've already,
of the nodes that I've already visited.
It's sort of a replacement for a flag so I'm gonna
do it this way, this way I don't have to
modify and add a whole bunch of flags in
and then make sure to clear them later
on. So I get the source node, I get the
destination node, and then I get this,
create this visited hash set and then I
go out and call this recursive
method. So now is where the fun begins.
So first if I've already visited this
node, if visited dot contains the source ID
then return false because there is no
path then. Okay otherwise, then what I
want to do is, so now I want to go and
make sure I update this, mark this node
as visited, and then I want to say okay
if I'm at my source then
if I'm at my destination rather, then
return true because yes there certainly is a
path then. Otherwise go and check
all my children and see any of them have a path
because if there's a path from a child
to me then then there's certainly a
final, then there's certainly a path from me
to my destination. So for each node
child in source dot adjacent, if there's
a path from child to destination passing
in again visited, then return true and
that will bubble all the way up the
stack. If I get down to the very end and I
haven't found a path yet then there is
no path from me to my destination. Now
let's turn to how breadth-first search
works. Ok so breadth-first search, so what I
need with breadth-first search is I need
a linked list of sort of what I'll call
the like
next step. So these are the, I'll call this
next to visit, these are the nodes that I need to
visit next. And just as before I need
this visited hash set that represents
everything I've already visited, and then
I want to say next to visit,
because the first I need to visit is in
fact my source. Then as long as there is
nothing in, so while visited, sorry
next to visit dot is empty so while
it's not empty,
keep going. Alright so first thing, I need
to look, grab my very first node to visit so
node node equals next to visit dot, I'll
call this remove so remove the very
first node in that list, if this is my
destination then there is certainly a
path.
Now I also want to do my visited
checking so if visited dot contains of
node ID, then actually just continue,
let's go to the next value, otherwise
visited dot add of node dot ID so mark it as
visited, and then go and actually add my
children so for each node child in node
dot adjacent go and add each of those to my next to
visit. So next to visit dot add of child. Right.
And that's all there is to depth and
breadth first search. And then of course
if i get down to the very end and I
haven't found a path yet return false.
Let's walk through this code again and
make sure that this makes sense. So has
path DFS takes in the source and
destination IDs and then I get those
nodes and then I create this hash sets,
that should be hash set not a hashmap.
And then goes and actually does this
recursive method, so the recursive method
has this visited thing so we use a
visited hash set instead of marking the
actual nodes with a flag. So visited, check
if it contains the source ID, if it
does contain that, if I've already visited this,
return false because there's no path.
Then otherwise go and add this visited,
mark this node as visited, read you know,
check my sources, my destination children
already there
return true. Otherwise go and search all
of my children. Then so that, and then if
I haven't found a path go to return
false.
So with BFS we are taking in a source
and a destination. So here are views nodes,
I'll switch this actually to be symmetric and
use IDs again and so I'll make that public
and this private now.
Ok so I take in the source and
destination IDs and then I go and call
this recursive method. My recursive method
are, sorry this other just helper method,
so this takes in the source and
destination, I create this list of the
next nodes I have to visit, and this
visited hash set, that should be hash set again,
and this marks all the nodes that
I've already visited. So I say okay next
one to visit is my source then as long
as there's something left to
visit pull out the next node to visit,
check if I'm where I should be
if so, return true, otherwise check and
update my visited, who I've visted
and then go and queue up my
children at the very end of the queue to
be visited next. And those, and this will ensure that my children not
visited immediately, but are visited once, sort of everything, everything
scheduled has already been visited, so it will ensure it
will visit level by level by level. And then
if I get down to the very end and I haven't
found my destination after all of this,
then just return false.
So that's how breadth-first search
operates. In many cases when we want to
find if there's actually a path,
breadth-first search is often the better
approach, because otherwise we could wind
up searching really really far away when
there's actually a very short connection,
and it's certainly better when we want
to find the shortest path. So now that
you've seen breadth first and depth-first search
why don't you try these out on a new problem.
Good luck.
