[MUSIC PLAYING]
BRIAN YU: OK, welcome back
everyone to an Introduction
to Artificial Intelligence with Python.
And now, so far, we've taken a look at
a couple of different types of problems.
We've seen classical
search problems where
we're trying to get from
an initial state to a goal
by figuring out some optimal path.
We've taken a look at adversarial search
where we have a game-playing agent that
is trying to make the best move.
We've seen knowledge-based
problems where
we're trying to use logic and inference
to be able to figure out and draw
some additional conclusions.
And we've seen some probabilistic
models as well where we might not
have certain information
about the world,
but we want to use the knowledge
about probabilities that we do have
to be able to draw some conclusions.
Today we're going to turn our attention
to another category of problems
generally known as
"optimization problems"
where optimization is really
all about choosing the best
option from a set of possible options.
And we've already seen
optimization in some contexts,
like game playing where we're trying
to create an AI that chooses the best
move out of a set of possible moves.
But what we'll take a look
at today is a category
of types of problems and
algorithms to solve them
that can be used in order to deal with a
broader range of potential optimization
problems.
And the first of the algorithms
that we'll take a look at
is known as a "local search."
And local search differs from
search algorithms we've seen before
in the sense that the search
algorithms we've looked at so far,
which are things like breadth-first
search or A* search, for example,
generally maintain a whole
bunch of different paths that
we're simultaneously exploring, and
we're looking at a bunch of different
paths at once trying to find
our way to the solution.
On the other hand, in
local search, this is
going to be a search
algorithm that's really
just going to maintain a single
node and looking at a single state,
and we'll generally run this algorithm
by maintaining that single node
and then moving ourselves
to one of the neighboring
nodes throughout this search process.
And this is generally useful in
contexts not like these problems,
which we've seen before, like a
maze-solving situation where we're
trying to find our way from
the initial state to the goal
by following some path.
But local search is most applicable
when we really don't care about the path
at all, and all we care about
is what the solution is.
And in the case of a solving a maze,
the solution was always obvious.
You could point to the solution.
You know exactly what the goal is.
And the real question is,
what is the path to get there?
But local search is
going to come up in cases
where figuring out exactly
what the solution is,
exactly what the goal looks like is
actually the heart of the challenge.
And to give an example of one
of these kinds of problems,
we'll consider a scenario where we have
two types of buildings, for example,
and we have houses and hospitals.
And our goal might be, in a world
that's formatted as this grid where
we have a whole bunch of houses,
a house here, a house here,
two houses over there, maybe
we want to try and find
a way to place two hospitals on
this map, so maybe a hospital here
and a hospital there.
And the problem now is we want to
place two hospitals on the map,
but we want to do so with
some sort of objective.
And our objective in this
case is to try and minimize
the distance of any of the
houses from a hospital.
So you might imagine, all right,
what's the distance from each
of the houses to their nearest hospital?
There are a number of ways we
could calculate that distance,
but one way is using a heuristic
we've looked at before,
which is the Manhattan distance, this
idea of how many rows and columns would
you have to move inside of this grid
layout in order to get to a hospital,
for example.
And it turns out if you take
each of these four houses
and figure out, all right, how close
are they to their nearest hospital,
you get something like this where this
house is three away from a hospital.
This house is six away, and these
two houses are each four away.
And if you add all those numbers up
together, you get a total cost of 17,
for example.
So for this particular configuration
of hospitals, a hospital here
and a hospital there, that state,
we might say, has a cost of 17.
And the goal of this
problem now that we would
like to apply a search
algorithm to figure out
is, can you solve this problem to
find a way to minimize that cost,
minimize the total amount if you
sum up all of the distances from all
the houses to the nearest hospital?
How can we minimize that final value?
And if we think about this
problem a little bit more
abstractly, and abstracting
away from this specific problem,
and thinking more generally
about problems like it,
you can often formulate these
problems by thinking about them
as a state-space landscape,
as we'll soon call it.
Here in this diagram of a state-space
landscape, each of these vertical bars
represents a particular state
that our world could be in.
So for example, each
of these vertical bars
represents a particular
configuration of two hospitals.
And the height of this
vertical bar is generally
going to represent some function of
that state, some value of that state.
So maybe, in this case, the
height of the vertical bar
represents what is the cost of this
particular configuration of hospitals
in terms of, what is
the sum total of all
of the distances from all of the
houses to their nearest hospital?
And generally speaking, when we
have a state-space landscape,
we want to do one of two things.
We might be trying to maximize
the value of this function,
trying to find a global maximum, so to
speak, of this state-space landscape,
a single state whose value is
higher than all of the other states
that we could possibly choose from.
And generally, in this case, when
we're trying to find a global maximum,
we'll call the function
that we're trying
to optimize them some
"objective function,"
some function that measures
for any given state
how good is that state such
that we can take any state,
pass it into the objective function, and
get a value for how good that state is.
And ultimately, what
our goal is is to find
one of these states that has
the highest-possible value
for that objective function.
An equivalent but reverse
problem is the problem
of finding a global minimum, some state
that has a value after you passed it
into this function that is lower
than all of the other possible values
that we might choose from.
And generally speaking, when we're
trying to find a global minimum,
we call the function that we're
calculating a "cost function."
Generally, each state
has some sort of cost,
whether that cost is a
monetary cost, or a time cost,
or, in the case of the
houses and hospitals
we've been looking at
just now, a distance
cost in terms of how far away each
of the houses is from a hospital.
And we're trying to minimize the
cost, find the state that has
the lowest possible value of that cost.
So these are the general
types of ideas that we
might be trying to go for
within a state-space landscape,
trying to find a global maximum or
trying to find a global minimum.
And how exactly do we do that?
We'll recall that in
local search, we generally
operate this algorithm by
maintaining just a single state,
just some current state
represented inside of some node,
maybe inside of a data
structure where we're keeping
track of where we are currently.
And then, ultimately,
what we're going to do
is, from that state, move to
one of its neighbor states,
so in this case represented
in this one-dimensional space
by just the state immediately to
the left or to the right of it.
But for any different
problem, you might define
what it means for there to be a
neighbor of a particular state.
In the case of a hospitals, for
example, that we were just looking at,
a neighbor might be moving one hospital
one space to the left, or to the right,
or up, or down, some state that
is close to our current state
but slightly different
and, as a result, might
have a slightly different value
in terms of its objective function
or in terms of its cost function.
So this is going to be our
general strategy in local search,
to be able to take a state,
maintaining some current node,
and move where we're looking at in
this state-space landscape in order
to try to find a global maximum
or a global minimum somehow.
And perhaps the simplest
of algorithms that we
could use to implement
this idea of local search
is an algorithm known
as "hill climbing."
And the basic idea of hill
climbing is, let's say,
I'm trying to maximize
the value of my state.
I'm trying to figure out
where the global maximum is.
I'm going to start in the state.
And generally, what hill
climbing is going to do
is it's going to consider the neighbors
of that state, that from this state
I could go left, or I could go right.
And this neighbor happens to be higher,
and this neighbor happens to be lower.
And in hill climbing, if I'm
trying to maximize the value,
I'll generally pick
the highest one I can.
Between the state to the left and
right of me, this one is higher.
So I'll go ahead and move myself
to consider that state instead.
And then I'll repeat
this process, continually
looking at all of my neighbors
and picking the highest neighbor,
doing the same thing,
looking at my neighbors,
picking the highest of my neighbors
until I get to a point like right here
where I consider both of my
neighbors, and both of my neighbors
have a lower value than I do.
This current state has a value that
is higher than any of its neighbors.
And at that point, the
algorithm terminates.
And I can say, all right, here
I have now found the solution.
And the same thing works
in exactly the opposite way
for trying to find a global
minimum, but the algorithm
is fundamentally the same.
If I'm trying to find a global minimum
and, say, my current state starts here,
I'll continually look at my
neighbors, pick the lowest value
that I possibly can until
I eventually, hopefully,
find that global minimum,
a point at which, when
I look at both of my neighbors,
they each have a higher value,
and I'm trying to minimize
the total score, or cost,
or value that I get as a result of
calculating some sort of cost function.
So we can formulate this graphical
idea in terms of pseudocode.
And the pseudocode for hill
climbing might look like this.
We define some function called
"hill climb" that takes as input
the problem that we're trying to solve.
And generally, we're going to start
in some sort of initial state.
So I'll start with a
variable called "current"
that is keeping track
of my initial state,
like an initial
configuration of hospitals.
And maybe some problems lend
themselves to an initial state,
some place where you begin.
In other cases, maybe not, in which
case we might just randomly generate
some initial state just by choosing
two locations for hospitals at random,
for example, and figuring out from
there how we might be able to improve.
But that initial state, we're
going to store inside of "current."
And now here comes our loop,
some repetitive process
we're going to do again and again
until the algorithm terminates.
And what we're going to do is
first say, let's figure out all
of the neighbors of the current state.
From my state, what are all of the
neighboring states for some definition
of what it means to be a neighbor?
And I'll go ahead and choose the
highest valued of all of those neighbors
and save it inside of this
variable called "neighbor," so keep
track of the highest-valued neighbor.
This is in the case where I'm
trying to maximize the value.
In a case where I'm trying
to minimize the value,
you might imagine here you'll
pick the neighbor with the lowest
possible value.
But these ideas are really
fundamentally interchangeable.
And it's possible, in some cases,
there might be multiple neighbors
that each have an equally high
value or an equally low value
in the minimizing case.
And in that case, we can just
choose randomly from among them.
Just choose one of them, and save it
inside of this variable "neighbor."
And then the key question
to ask is, is this neighbor
better than my current state?
And if the neighbor, the best
neighbor that I was able to find
is not better than my current state,
well, then the algorithm is over,
and I'll just go ahead and
return the current state.
If none of my neighbors are better,
then I may as well stay where I am
is the general logic of the
hill-climbing algorithm.
But otherwise, if the
neighbor is better,
then I may as well
move to that neighbor.
So you might imagine setting
"current" equal to "neighbor"
where the general idea is
if I'm at a current state
and I see a neighbor that is better than
me, then I'll go ahead and move there.
And then I'll repeat the process,
continually moving to a better neighbor
until I reach a point at which none
of my neighbors are better than I am.
And at that point, we'd say the
algorithm can just terminate there.
So let's take a look at
a real example of this
with these houses and hospitals.
So we've seen now that if we put the
hospitals in these two locations,
that has a total cost of 17.
And now we need to
define, if we're going
to implement this
hill-climbing algorithm,
what it means to take this particular
configuration of hospitals,
this particular state and
get a neighbor of that state.
And a simple definition of
"neighbor" might be just let's
pick one of the hospitals and move it
by one square to the left, or right,
or up, or down, for example.
And that would mean we
have six possible neighbors
from this particular configuration.
And we could take this
hospital and move it
to any of these three possible
squares, or we take this hospital
and move it to any of those
three possible squares.
And each of those would
generate a neighbor.
And what I might do is
say, all right, here
is the locations and the
distances between each
of the houses and
their nearest hospital.
Let me consider all of the
neighbors and see if any of them
can do better than a cost of 17.
And it turns out there are a couple
of ways that we could do that,
and it doesn't matter if we
randomly choose among all the ways
that are the best.
But one such possible way is by
taking a look at this hospital
here and considering the
directions in which it might
move if we hold this hospital constant.
If we take this hospital and move
it one square up, for example,
that doesn't really help us.
It gets closer to the house up here,
but it gets further away from the house
down here, and it doesn't
really change anything
for the two houses along
the left-hand side.
But if we take this hospital on the
right and move it one square down,
it's the opposite problem.
It gets further away from the house up
above, and it gets closer to the house
down below.
The real idea, the goal should be
to be able to take this hospital
and move it one square to the left.
By moving it one square
to the left, we move
it closer to both of
these houses on the right
without changing anything
about the houses on the left.
For them, this hospital is still the
closer one, so they aren't affected.
So we're able to improve the situation
by picking a neighbor that results
in a decrease in our total cost.
And so we might do that, move ourselves
from this current state to a neighbor
by just taking that
hospital and moving it.
And at this point, there's
not a whole lot that
can be done with this
hospital, but there's still
other optimizations we can make,
other neighbors we can move to that
are going to have a better value.
If we consider this
hospital, for example,
we might imagine that right now it's a
bit far up, that both of these houses
are a little bit lower, so
we might be able to do better
by taking this hospital and
moving it one square down,
moving it down so that now, instead of
a cost of 15, we're down to a cost of 13
for this particular configuration.
And we can do even better
by taking the hospital
and moving it one square to the left.
Now, instead of a cost of
13, we have a cost of 11
because this house is one
away from the hospital.
This one is four away.
This one is three away, and
this one is also three away.
So we've been able to do much
better than that initial cost
that we had using the
initial configuration just
by taking every state and
asking ourselves the question,
can we do better by just making
small, incremental changes,
moving to a neighbor, moving to a
neighbor, and moving to a neighbor
after that?
And now, at this point, we can
potentially see that, at this point,
the algorithm is going to terminate.
There's actually no neighbor
we can move to that is
going to improve the situation,
get us a cost that is less than 11.
Because if we take this hospital and
move it up or to the right, well,
that's going to make it further away.
If we take it and move it down, that
doesn't really change the situation.
It gets further away from this
house but closer to that house.
And likewise, the same story
was true for this hospital.
Any neighbor we move it to,
up, left, down, or right,
is either going to make it further away
from the houses and increase the cost,
or it's going to have no
effect on the cost whatsoever.
And so the question we might now ask
is, is this the best we could do?
Is this the best placement of the
hospitals we could possibly have?
And it turns out the answer is "no"
because there's a better way that we
could place these hospitals.
And in particular, there are a
number of ways you could do this.
But one of the ways is by taking
this hospital here and moving it
to this square, for example,
moving it diagonally by one square,
which was not part of our
definition of "neighbor."
We could only move left,
right, up, or down.
But this is, in fact, better.
It has a total cost of 9.
It is now closer to
both of these houses.
And as a result, the total cost is less.
But we weren't able to find it.
Because in order to
get there, we had to go
through a state that actually wasn't
any better than the current state
that we had been on previously.
And so this appears to be
a limitation or a concern
you might have as you go
about trying to implement
a hill-climbing algorithm
is that it might not always
give you the optimal solution.
If we're trying to maximize the
value of any particular state,
we're trying to find the
global maximum, a concern
might be that we could get stuck
at one of the local maxima,
highlighted here in blue, where a local
maxima is any state whose value is
higher than any of its neighbors.
If we ever find ourselves
at one of these two states
when we're trying to maximize
the value of the state,
we're not going to make any changes.
We're not going to move left or right.
We're not going to move left here
because those states are worse.
But yet we haven't found
the global optimum.
We haven't done as best as we could do.
And likewise, in the case of the
hospitals, what we're ultimately
trying to do is find a
global minimum, find a value
that is lower than all of the others.
But we have the potential to get
stuck at one of the local minimum, any
of these states whose value is
lower than all of its neighbors
but still not as low
as the local minimum.
And so the takeaway here
is that it's not always
going to be the case that
when we run this naive,
hill-climbing algorithm
that we're always
going to find the optimal solution.
There are things that could go wrong.
If we started here,
for example, and tried
to maximize our value
as much as possible,
we might move to the
highest possible neighbor,
move to the highest possible neighbor,
move to the highest possible neighbor,
and stop, and never realize
that there is actually
a better state way over there that
we could have gone to instead.
And other problems
you might imagine just
by taking a look at this
state-space landscape
are these various different
types of plateaus, something
like this flat local maximum here
where all six of these states each
have the exact same value.
And so, in the case of the algorithm
we showed before, none of the neighbors
are better, so we might just get
stuck at this flat local maximum.
And even if you allowed yourself
to move to one of the neighbors,
it wouldn't be clear which neighbor
you would ultimately move to,
and you could get stuck here as well.
And there's another one over here.
This one is called a "shoulder."
It's not really a local
maximum because there's still
places where we can go higher--
not a local minimum
because we can go lower.
So we can still make
progress, but it's still
this flat area where if you
have a local search algorithm,
there is potential to get
lost here, unable to make
some upward or downward progress
depending on whether we're trying
to maximize or minimize and,
therefore, another potential for us
to be able to find a solution that might
not actually be the optimal solution.
And so because of this
potential, the potential
that hill climbing has to not
always find us the optimal result,
it turns out there are a number of
different varieties and variations
on the hill-climbing algorithm that
help to solve the problem better
depending on the context.
And depending on the specific type
of problem, some of these variants
might be more applicable than others.
What we've taken a look at so far is
a version of hill climbing generally
called "steepest-ascent hill climbing"
where the idea of steepest-ascent hill
climbing is we are going to choose the
highest-valued neighbor in the case
where we're trying to maximize or the
lowest-valued neighbor in cases where
we're trying to minimize.
But generally speaking,
if I have five neighbors
and they're all better
than my current state,
I will pick the best one of those five.
Now, sometimes, that
might work pretty well.
It's sort of a greedy approach
of trying to take the best
operation at any particular time step.
But it might not always work.
There might be cases
where actually I want
to choose an option that
is slightly better than me
but maybe not the best one
because that, later on, might
lead to a better outcome ultimately.
So there are other variants
that we might consider
of this basic hill-climbing algorithm.
One is known as
"stochastic hill climbing."
And in this case, we
choose randomly from all
of our higher-valued neighbors.
So if I'm at my current state
and there are five neighbors that
are all better than I am,
rather than choosing the best
one as steepest ascent would do,
stochastic will just choose randomly
from one of them, thinking that
if it's better, then it's better,
and maybe there's a potential
to make forward progress
even if it is not locally the best
option I could possibly choose.
First-choice hill climbing
ends up just choosing
the very first highest-valued
neighbor, but it
follows behaving on a similar idea.
Rather than consider
all of the neighbors,
as soon as we find a neighbor that
is better than our current state,
we'll go ahead and move there, so
maybe some efficiency improvements
there and maybe has
the potential to find
a solution that the other
strategies weren't able to find.
And with all of these
variants, we still suffer
from the same potential risk,
this risk that we might end up
at a local minimum or a local maximum.
And we can reduce that risk by
repeating the process multiple times.
So one variant of hill
climbing is random-restart hill
climbing where the general idea is we'll
conduct hill climbing multiple times.
If we apply a steepest-ascent
hill climbing, for example,
we'll start at some random
state, try and figure out
how to solve the problem, and
figure out what is the local maximum
or local minimum we get to.
And then we'll just randomly
restart and try again,
choose a new starting
configuration, try and figure out
what the local maximum or minimum
is, and do this some number of times.
And then after we've done
it some number of times,
we can pick the best one out of all of
the ones that we've taken a look at.
So there's another option
we have access to as well.
And then, although I said the
generally local search will usually
just keep track of a single node and
then move to one of its neighbors,
there are variants of hill climbing
that are known as "local beam
searches" where, rather than keep
track of just one current best state,
we're keeping track of k
highest-valued neighbors, such
that rather than starting at one
random initial configuration,
I might start with
three, or four, or five,
randomly generate all the neighbors,
and then pick like the three, or four,
or five best of all of the neighbors
that I find and continually
repeat this process with
the idea being that now I
have more options that I'm
considering and more ways
that I could potentially navigate
myself to the optimal solution that
might exist for a particular problem.
So let's now take a
look at some actual code
that can implement some of
these kinds of ideas, something
like steepest-ascent hill
climbing, for example, for trying
to solve this hospital problem.
So I'm going to go ahead and
go into my hospital's directory
where I've actually set up the
basic framework for solving
this type of problem.
I'll go ahead and go into
hospitals.py, and we'll take a look
at the code we've created here.
I've defined a class that is going
to represent the state space.
So the space has a height, and a width,
and also some number of hospitals.
So you can configure,
how big is your map?
How many hospitals should go here?
We have a function for adding a new
house to the state space and then
some functions that are going to get
me all of the available spaces for if I
want to randomly place hospitals
in particular locations.
And here now is the
hill-climbing algorithm.
So what are we going to do in
the hill-climbing algorithm?
Well, we're going to start
by randomly initializing
where the hospitals are going to go.
We don't know where the
hospitals should actually be,
so let's just randomly place them.
So here, I'm running a loop for
each of the hospitals that I have.
I'm going to go ahead and add a new
hospital at some random location.
So I basically get all
of the available spaces,
and I randomly choose one
of them as where I would
like to add this particular hospital.
I have some logging output and
generating some images, which we'll
take a look at a little bit later.
But here is the key idea.
So I'm going to just keep
repeating this algorithm.
I could specify a maximum of
how many times I want it to run,
or I could just run it up until it hits
a local maximum or a local minimum.
And now, we'll basically consider all
of the hospitals that could potentially
move, so consider each of the
two hospitals or more hospitals
if there are more than that,
and consider all of the places
where that hospital could
move to, some neighbor
of that hospital that we can move
the neighbor to and then see,
is this going to be better
than where we were currently?
So if it is going to
be better, then we'll
go ahead and update our best neighbor
and keep track of this new best
neighbor that we found.
And then afterwards,
we can ask ourselves
the question, if best
neighbor cost is greater
than or equal to the cost of the
current set of hospitals, meaning
if the cost of our best neighbor
is greater than the current cost,
meaning our best neighbor is
worse than our current state,
well, then we shouldn't
make any changes at all.
And we should just go ahead and
return the current set of hospitals.
But otherwise, we can
update our hospitals
in order to change them to
one of the best neighbors.
And if there are multiple
that are all equivalent,
I'm here using random.choice to say,
go ahead and choose one randomly.
So this is really just
a Python implementation
of that same idea that we
were just talking about,
this idea of taking a current state,
some current set of hospitals,
generating all of the
neighbors, looking at all
of the ways we could take one hospital
and move it one square to the left,
or right, or up, or down,
and then figuring out,
based on all of that
information, which is the best
neighbor or the set of all the best
neighbors, and then choosing from one
of those.
And each time, we go ahead and
generate an image in order to do that.
And so now what we're doing is
if we look down in the bottom,
I'm going to randomly generate a
space with height 10 and width 20.
And I'll say, go ahead and put three
hospital somewhere in the space.
I'll randomly generate 15
houses that I just go ahead
and add in random locations.
And now I'm going to run this
hill-climbing algorithm in order
to try and figure out where we
should place those hospitals.
So we go ahead and run this program
by running "python hospitals."
And we see that we started-- our
initial state had a cost of 72,
but we were able to
continually find neighbors
that were able to decrease that
cost, decrease to 69, 66, 63,
so on and so forth, all the way
down to 53 as the best neighbor
we were able to ultimately find.
And we can take a look
at what that looked
like by just opening up these files.
So here, for example, was
the initial configuration.
We randomly selected a location for
each of these 15 different houses
and then randomly
selected locations for 1,
2, 3 hospitals that were just located
somewhere inside of the state space.
And if you add up all the
distances from each of the houses
to their nearest hospital, you
get a total cost of about 72.
And so now the question
is, what neighbors can we
move to that improve the situation?
And it looks like the first
one the algorithm found
was by taking this house that
was over there on the right
and just moving it to the left.
And that probably makes sense.
Because if you look at the
houses in that general area,
really these five houses
look they're probably
the ones that are going to be
closest to this hospital over here.
Moving it to the left decreases
the total distance at least
to most of these houses, though it does
increase that distance for one of them.
And so we're able to make these
improvements to the situation
by continually finding ways that
we can move these hospitals around
until we eventually settle at this
particular state that has a cost of 53.
Or we figured out a position for
each of the hospitals, and now none
of the neighbors that we
can move to are actually
going to improve the situation.
We can take this hospital, and
this hospital, and that hospital
and look at each of the
neighbors, and none of those
are going to be better than
this particular configuration.
And again, that's not to say that
this is the best we could do.
There might be some other configuration
of hospitals that is a global minimum.
And this might just be
a local minimum, that
is, the best of all of its
neighbors but maybe not
the best in the entire
possible state space.
And you could search through
the entire state space
by considering all of the possible
configurations for hospitals.
But ultimately, that's going
to be very time intensive,
especially as our
state space gets bigger
and there might be more
and more possible states.
It's going to take quite a long
time to look through all of them.
And so being able to use these
sort of local search algorithms
can often be quite good for trying
to find the best solution we can do.
And especially if we don't care
about doing the best possible
and we just care about doing
pretty good and finding
a pretty good placement
of those hospitals,
then these methods can
be particularly powerful.
But of course, we can try and mitigate
some of this concern by instead
of using hill climbing to use
random restart, this idea of
rather than just hill climb one time,
we can hill climb multiple times
and, say, try hill climbing a whole
bunch of times on the exact same map
and figure out, what is the best
one that we've been able to find?
And so I've here implemented
a function for random restart
that restarts some
maximum number of times.
And what we're going to do is repeat
that number of times this process
of just go ahead and run
the hill-climbing algorithm,
figure out what the cost is of getting
from all the houses to the hospitals,
and then figure out, is this
better than we've done so far?
So I can try this exact same idea
where instead of running hill climbing,
I'll go ahead and run random_restart.
And I'll randomly restart
maybe 20 times, for example.
And we'll go ahead, and now
I'll remove all the images
and then rerun the program.
And now we started by
finding a original state.
When we initially ran hill climbing, the
best cost we were able to find was 56.
Each of these iterations
is a different iteration
of the hill-climbing algorithm.
We're running hill climbing
not one time but 20 times here,
each time going until we find
a local minimum, in this case.
And we look and see each
time, did we do better than we
did the best time we've done so far?
So we went from 56 to 46.
This one was greater, so we ignored it.
This one was 41, which was less,
so we went ahead and kept that one.
And for all of the
remaining 16 times that we
tried to implement hill
climbing and we tried
to run the hill-climbing algorithm, we
couldn't do any better than that 41.
Again, maybe there is a way to do
better that we just didn't find,
but it looks like that
way ended up being
a pretty good solution to the problem.
That was attempt number 3
starting from counting at zero.
So we can take a look at
that, open up number 3,
and this was the state that
happened to have a cost of 41,
that after running the
hill-climbing algorithm
on some particular, random initial
configuration of hospitals,
this is what we found was
the local minimum in terms
of trying to minimize the cost.
And it looks like we did pretty
well, that this hospital is
pretty close to this region.
This one is pretty close
to these houses here.
This hospital looks
about as good as we can
do for trying to capture those
houses over on that side.
And so these sorts of algorithms
can be quite useful for trying
to solve these problems.
But the real problem with
many of these different types
of hill climbing, steepest ascents,
stochastic, first choice, and so forth
is that they never make a move
that makes our situation worse.
They're always going to take
ourselves in our current state,
look at the neighbors, and consider,
can we do better than our current state,
and move to one of those neighbors.
Which of those neighbors
we choose might vary
among these various different
types of algorithms,
but we never go from a current
position to a position that
is worse than our current position.
And ultimately, that's
what we're going to need
to do if we want to be able to find
a global maximum or a global minimum.
Because sometimes if
we get stuck, we want
to find some way of dislodging ourselves
from our local maximum or local minimum
in order to find the global
maximum or the global minimum
or increase the probability
that we do find it.
And so the most popular
technique for trying
to approach the problem from
that angle is a technique
known as "simulated
annealing," simulated
because it's modeling after a real
physical process of annealing where you
can think about this in terms
of physics, a physical situation
where you have some system of particles.
And you might imagine that when you
heat up a particular physical system,
there's a lot of energy there.
Things are moving around quite randomly.
But over time, as the system
cools down, it eventually
settles into some final position.
And that's going to be the general
idea of simulated annealing.
We're going to simulate that process
of some high-temperature system
where things are moving around randomly
quite frequently but, over time,
decreasing that temperature
until we eventually
settle at our ultimate solution.
And the idea is going to be if we
have some state-space landscape that
looks like this and we
begin at its initial state
here, if we're looking
for a global maximum
and we're trying to maximize
the value of the state,
our traditional hill-climbing
algorithms would just take the state,
and look at the two neighbor
ones, and always pick
the one that is going to
increase the value of the state.
But if we want some chance of being
able to find the global maximum,
we can't always make good moves.
We have to sometimes make
bad moves and allow ourselves
to make a move in a direction that
actually seems, for now, to make
our situation worse such that
later we can find our way up
to that global maximum in terms
of trying to solve that problem.
Of course, once we get up
to this global maximum,
once we've done a whole
lot of the searching,
then we probably don't
want to be moving to states
that are worse than our current state.
And so this is where this
metaphor for annealing
starts to come in where we want
to start making more random moves
and, over time, start to make
fewer of those random moves
based on a particular
temperature schedule.
So the basic outline
looks something like this.
Early on in simulated annealing,
we have a higher temperature state.
And what we mean by a
"higher temperature state"
is that we are more
likely to accept neighbors
that are worse than our current state,
that we might look at our neighbors,
and if one of our neighbors is
worse than the current state,
especially if it's not
all that much worse,
if it's pretty close
but just slightly worse,
then we might be more likely to
accept that and go ahead and move
to that neighbor anyways.
But later on as we run
simulated annealing,
we're going to decrease
that temperature.
And at a lower temperature, we're going
to be less likely to accept neighbors
that are worse than our current state.
Now, to formalize this and put a
little bit of pseudocode to it,
here is what that
algorithm might look like.
And we have a function called
"simulated annealing" that
takes as input the problem
we're trying to solve
and also potentially some
maximum number of times
we might want to run
the simulated annealing
process, how many different neighbors
we're going to try and look for.
And that value is going to vary based
on the problem you're trying to solve.
We'll again start with
some current state
that will be equal to the
initial state of the problem.
But now, we need to repeat this process
over and over for max number of times,
repeat some process some number
of times where we're first
going to calculate a temperature.
And this temperature function
takes the current time t,
starting at 1 going
all the way up to max,
and then gives us some temperature
that we can use in our computation
where the idea is that this temperature
is going to be higher early on,
and it's going to be lower later on.
So there are a number of ways this
temperature function could often work.
One of the simplest ways is just to
say it is like the proportion of time
that we still have remaining.
Out of max units of time, how
much time do we have remaining?
You start off with a lot
of that time remaining.
And as time goes on,
the temperature is going
to decrease because you have less
and less of that remaining time still
available to you.
So we calculate a temperature
for the current time.
And then we pick a random
neighbor of the current state.
No longer are we going to be picking
the best neighbor that we possibly can
or just one of the better
neighbors that we can.
We're going to pick a random neighbor.
It might be better.
It might be worse.
But we're going to calculate that.
We're going to calculate delta E,
"E" for "energy" in this case, which
is just, how much better is the
neighbor than the current state?
So if delta E is positive,
that means the neighbor
is better than our current state.
If delta E is negative,
that means the neighbor
is worse than our current state.
And so we can then have a
condition that looks like this.
If delta E is greater than 0,
that means the neighbor state
is better than our current state.
And if ever that situation
arises, we'll just
go ahead and update "current"
to be that neighbor.
Same as before, move
where we are currently
to be the neighbor because the neighbor
is better than our current state.
We'll go ahead and accept that.
But now the difference is
that whereas before we never,
ever wanted to take a move
that made our situation worse,
now we sometimes want
to move, [? go ?] make
a move that is actually going
to make our situation worse.
Because sometimes we're going
to need to dislodge ourselves
from a local minimum or a local maximum
to increase the probability that we're
able to find the global minimum or
the global maximum a little bit later.
And so how do we do that?
How do we decide to sometimes
accept some state that
might actually be worse?
Well, we're going to accept a
worse state with some probability.
And that probability needs to
be based on a couple of factors.
It needs to be based, in part, on the
temperature where if the temperature is
higher, we're more likely
to move to a worse neighbor,
and if the temperature
is lower, we're less
likely to move to a worse neighbor.
But it also, to some degree,
should be based on delta
E. If the neighbor is much
worse than the current state,
we probably want to be
less likely to choose that
than if the neighbor is just a little
bit worse than the current state.
So again, there are a couple of
ways you could calculate this.
But it turns out one of
the most popular is just
to calculate E to the power of delta
E over t where E is just a constant.
Delta E over t are based
on delta E and t here.
We calculate that value, and that'll
be some value between 0 and 1,
and that is the probability with
which we should just say, all right,
let's go ahead and
move to that neighbor.
And it turns out that if you do the
math for this value when delta E is such
that the neighbor is not that much
worse than the current state, that's
going to be more likely that we're going
to go ahead and move to that state.
And likewise, when the
temperature is lower,
we're going to be less likely to move
to that neighboring state as well.
So now this is the big picture
for simulated annealing,
this process of taking the problem
and going ahead and generating
random neighbors.
We'll always move to a neighbor if
it's better than our current state.
But even if the neighbor is
worse than our current state,
we'll sometimes move there depending
on how much worse it is and also based
on the temperature.
And as a result, the hope,
the goal of this whole process
is that as we begin to try and find our
way to the local-- the global maximum
or the global minimum,
we can dislodge ourselves
if we ever get stuck at a local
maximum or a local minimum
in order to eventually make our way to
exploring the part of the state space
that is going to be the best.
And then as the temperature
decreases, eventually we
settle there without
moving around too much
from what we've found to be the globally
best thing that we can do thus far.
So at the very end, we
just return whatever
the current state happens to be.
And that is the conclusion
of this algorithm.
And we've been able to figure
out what the solution is.
And these types of algorithms have
a lot of different applications.
Anytime you can take a problem
and formulate it as something
where you can explore a particular
configuration and then ask,
are any of the neighbors better
than this current configuration,
and have some way of
measuring that, then there
is an applicable case for these
hill-climbing, simulated-annealing
types of algorithms.
So sometimes it can be for
facility location-type problems,
like for when you're
trying to plan a city
and figure out where
the hospitals should be.
But there are definitely
other applications as well.
And one of the most famous
problems in computer science
is the traveling salesman problem.
Traveling salesman problem
generally is formulated like this.
I have a whole bunch of cities
here indicated by these dots.
And what I'd like to
do is find some route
that takes me through all of the cities
and ends up back where I started,
so some route that starts here,
goes through all these cities,
and ends up back where
I originally started.
And what I might like to do
is minimize the total distance
that I have to travel in order
to-- or the total cost of taking
this entire path.
And you can imagine
this is a problem that's
very applicable in situations
like when delivery companies are
trying to deliver things to a
whole bunch of different houses.
They want to figure out, how
do I get from the warehouse
to all these various
different houses and get back
again all using as minimal time, and
distance, and energy as possible?
So you might want to try to
solve these sorts of problems.
But it turns out that solving
this particular kind of problem
is very computationally difficult and
is a very computationally expensive task
to be able to figure it out.
And this falls under the category
of what are known as "NP-complete
problems," problems that there is no
known efficient way to try and solve
these sorts of problems.
And so what we ultimately
have to do is come up
with some approximation, some ways of
trying to find a good solution even
if we're not going to find the globally
best solution that we possibly can,
at least not in a feasible
or tractable amount of time.
And so what we could do is take
the traveling salesman problem,
and try to formulate
it using local search,
and ask a question
like, all right, I can
pick some state, some configuration,
some route between all of these nodes.
And I can measure the cost of that
state, figure out what the distance is.
And I might now want to try to
minimize that cost as much as possible.
And then the only question
now is, what does it
mean to have a neighbor of this state?
What does it mean to take
this particular route
and have some neighboring route that
is close to it but slightly different
in such that it might have
a different total distance?
And there are a number
of different definitions
for what a neighbor of a traveling
salesman configuration might look like.
But one way is just
to say, a neighbor is
what happens if we pick two of these
edges between nodes and switch them,
effectively.
So for example, I might
pick these two edges here,
these two that just happen
across-- this node goes here.
This node goes there--
and go ahead and switch them.
And what that process
will generally look
like is removing both of these edges
from the graph, taking this node,
and connecting it to the
node it wasn't connected to,
so connecting it up here instead.
We'll need to take these arrows
that were originally going this way
and reverse them, so move them going
the other way, and then just fill
in that last remaining blank, add
an arrow that goes in that direction
instead.
So by taking two edges
and just switching them,
I have been able to consider
one possible neighbor
of this particular configuration.
And it looks like this
neighbor is actually better.
It looks like this probably
travels a shorter distance in order
to get through all the cities through
this route than the current state did.
And so you could imagine
implementing this idea
inside of a hill-climbing or
simulated-annealing algorithm
where we repeat this process to try and
take a state of this traveling salesman
problem, look at all of the neighbors,
and then move to the neighbors
if they're better, or maybe
even move to the neighbors
if they're worse until we eventually
settle upon some best solution
that we've been able to find.
And it turns out that these types
of approximation algorithms,
even if they don't always
find the very best solution,
can often do pretty well at trying to
find solutions that are helpful too.
So that then was a look at local search,
a particular category of algorithms
that can be used for solving a
particular type of problem where
we don't really care about
the path to the solution.
I didn't care about the steps I took to
decide where the hospitals should go.
I just cared about the solution itself.
I just care about where
the hospitals should be
or what the route through the traveling
salesman journey really ought to be.
Another type of algorithm
that might come up
are known as these categories of
linear-programming types of problems.
And linear programming often
comes up in the context
where we're trying to optimize
for some mathematical function.
But oftentimes, linear
programming will come up
when we might have real real
numbered values so that it's not
just like discrete, fixed
values that we might have
but any decimal values that we
might want to be able to calculate.
And so linear programming
is a family of types
of problems where we might
have a situation that
looks like this where the
goal of linear programming
is to minimize a cost function.
And you can invert the numbers
and, say, try and maximize it,
but often we'll frame it as trying
to minimize the cost function that
has some number of
variables, x1, x2, x3,
all the way up to xn, just some
number of variables that are involved,
things that I want to
know the values to.
And this cost function
might have coefficients
in front of those variables.
And this is what we would call a
"linear equation" where we just
have all of these variables that
might be multiplied by a coefficient
and then added together.
We're not going to square
anything or cube anything
because that'll give us
different types of equations.
With linear programming,
we're just dealing
with linear equations in addition
to linear constraints where
a constraint is going to look
something like if we sum up
this particular equation that is
just some linear combination of all
of these variables, it is less
than or equal to some bound b.
And we might have a whole number of
these various different constraints
that we might place onto our
linear programming exercise.
And likewise, just as
we can have constraints
that are saying this linear equation
is less than or equal to some bound b,
it might also be equal to something.
But if you want some sum of
some combination of variables
to be equal to a value,
you can specify that.
And we can also maybe specify that each
variable has lower and upper balance,
that it needs to be a
positive number, for example,
or it needs to be a number that
is less than 50, for example.
And there are a number
of other choices that we
can make there for defining what
the bounds of a variable are.
But it turns out that if
you can take a problem
and formulate it in these
terms, formulate the problem
as your goal is to
minimize a cost function,
and you're minimizing that cost function
subject to particular constraints,
subjects to equations that
are of the form like this,
of some sequence of variables
is less than a bound
or is equal to some
particular value, then
there are a number of
algorithms that already exist
for solving these sorts of problems.
So let's go ahead and
take a look at an example.
Here's an example of a
problem that might come up
in the world of linear programming.
Often, this is going to come up when
we're trying to optimize for something,
and we want to be able
to do some calculations,
and we have constraints on
what we're trying to optimize.
And so it might be something like this.
In the context of a factory,
we have 2 machines, x1 and x2.
x1 costs $50 an hour to run.
x2 costs $80 an hour to run.
And our goal, what we're
trying to do, our objective
is to minimize the total cost.
So that's what we'd like to do.
But we need to do so subject
to certain constraints.
So there might be a
labor constraint that X1
requires 5 units of labor per hour.
x2 requires 2 units of
labor per hour, and we
have a total of 20 units of
labor that we have to spend.
So this is a constraint.
We have no more than 20 units
of labor that we can spend,
and we have to [INAUDIBLE] spend
it across x1 and x2, each of which
requires a different amount of labor.
And we might also have
a constraint like this
that tells us x1 is going to
produce 10 units of output per hour.
x2 is going to produce 12
units of output per hour.
And the company needs
90 units of output.
So we have some goal,
something we need to achieve.
We need to achieve 90 units of
output, but there are some constraints
that x1 can only produce 10
units of output per hour.
x2 produces 12 units of output per hour.
These types of problems
come up quite frequently.
And you can start to notice patterns in
these types of problems, problems where
I am trying to optimize for some
goal, minimizing cost, maximizing
output, maximizing profits,
or something like that.
And there are constraints that
are placed on that process.
And so now we just need
to formulate this problem
in terms of linear equations.
And so let's start
with this first point.
Two machines, x1 and x2, x costs
$50 an hour. x2 costs $80 an hour.
Here we can come up with an objective
function that might look like this.
This is our cost function, rather--
50 times x1 plus 80 times
x2 where x1 is going
to be a variable representing how
many hours we run machine x1 for.
x2 is going to be a variable
representing how many hours
are we running machine x2 for.
And what we're trying to
minimize is this cost function,
which is just how much it costs to
run each of these machines per hour
summed up.
This is an example of a linear
equation, just some combination
of these variables plus coefficients
that are placed in front of them.
And I would like to
minimize that total value,
but I need to do so subject
to these constraints--
x1 requires 50 units of labor
per hour, x2 requires two,
and we have a total of 20
units of labor to spend.
And so that gives us a
constraint of this form--
5 times x1 plus 2 times x2
is less than or equal to 20.
20 is the total number of units
of labor we have to spend.
And that's spent across
x1 and x2, each of which
requires a different number of units
of labor per hour, for example.
And finally, we have
this constraint here.
x1 produces 10 units of output
per hour, and x2 produces 12,
and we need 90 units of output.
And so this might look something
like this, that 10x 1 plus 12x 2,
this is amount of output per hour.
It needs to be at least 90.
If we can do better, great,
but it needs to be at least 90.
And if you recall from
my formulation before, I
said that generally speaking
in linear programming,
we deal with equals constraints or
less-than or equal-to constraints.
So we have a greater-than
or equal-to sign here.
That's not a problem.
Whenever we have a
greater-than or equal-to sign,
we can just multiply the
equation by negative 1,
and that will flip it around to a
less than or equals negative 90,
for example, instead of a
greater than or equal to 90.
And that's going to be an
equivalent expression that we
can use to represent this problem.
So now that we have this cost
function and these constraints
that it's subject to,
it turns out there are
a number of algorithms that
can be used in order to solve
these types of problems.
And these problems go a little bit
more into geometry and linear algebra
than we're really going to get into.
But the most com-- popular
of these types of algorithms
are simplex, which was one of
the first algorithms discovered
for trying to solve linear programs.
And later on, a class of
interior-point algorithms
can be used to solve this
type of problem as well.
The key is not to understand
exactly how these algorithms work
but to realize that these algorithms
exist for efficiently finding solutions
anytime we have a problem
of this particular form.
And so we can take a look, for
example, at the production directory
here where here we have a
file called production.py
where here I'm using
scipy, which is just
a library for a lot of science-related
functions within Python.
And I can go ahead and just
run this optimization function
in order to run a linear program.
.linprog here is going to try and solve
this linear program for me where I
provide to this expression, to this
function call all of the data about
my linear program.
So it needs to be in a
particular format, which
might be a little confusing at
first, but this first argument
to scipy.optimize.linprog
is the cost function,
which is, in this case,
just an array or a list
that has 50 and 80 because
my original cost function was
50 times x1 plus 80 times x2.
So I just tell Python, 50 and
80, those are the coefficients
that I am now trying to optimize for.
And then I provide all
of the constraints.
So the constraints-- and I wrote
them up above in comments--
is the constraint 1 is 5x_1 plus
2x_2 is less than or equal to 20.
And constraint 2 is negative 10x_1
plus negative 12x_2 is less than
or equal to negative 90.
And so scipy expects these constraints
to be in a particular format.
It first expects me to provide
all of the coefficients
for the upper-bound equations.
"ub" is just for "upper bound"
where the coefficients of the
first equation are 5 and 2
because we have 5x_1 and 2x_2.
And the coefficients
for the second equation
are negative 10 and
negative 12 because I have
negative 10x_1 plus negative 12x_2.
And then here we provide
it as a separate argument
just to keep things separate
what the actual bound is.
What is the upper bound for
each of these constraints?
Well, for the first constraint,
the upper bound is 20.
That was constraint number 1.
And then for constraint number
2, the upper bound is 90.
So a bit of a cryptic
way of representing it.
It's not quite as simple as just
writing the mathematical equations.
What really is being expected
here are all of the coefficients
and all of the numbers
that are in these equations
by first providing the
coefficients for the cost function,
then providing all the coefficients
for the inequality constraints,
and then providing all of the
upper bounds for those inequality
constraints.
And once all of that
information is there,
then we can run any of these
interior-point algorithms
or the simplex algorithm.
Even if you don't understand how it
works, you can just run the function
and figure out what
the result should be.
And here, I said, if
the result is a success,
we were able to solve this
problem, go ahead and print out
what the value of x1 and x2 should be.
Otherwise, go ahead and
print out no solution.
And so if I run this program by
running python production.py,
it takes a second to calculate.
But then we see here is what
the optimal solution should be.
x1 should run for 1.5 hours.
x2 should run for 6.25 hours.
And we were able to do
this by just formulating
the problem as a linear
equation that we were
trying to optimize, some cost
that we were trying to minimize,
and then some constraints
that were placed on that.
And many, many problems fall
into this category of problems
that you can solve if
you can just figure out
how to use equations and use
these constraints to represent
that general idea.
And that's a theme that's going
to come up a couple of times
today where we want to be
able to take some problem,
and reduce it down to some problem
we know how to solve in order
to begin to find a solution, and
to use existing methods that we
can use in order to find the solution
more effectively or more efficiently.
And it turns out that these types of
problems where we have constraints
show up in other ways too.
And there is an entire
class of problems that's
more generally just known as
"constraint satisfaction" problems.
And we're going to now take a look at
how you might formulate a constraint
satisfaction problem and how you
might go about solving a constraint
satisfaction problem.
But the basic idea of a
constraint satisfaction problem
is we have some number of variables
that need to take on some values.
And we need to figure out what values
each of those variables should take on.
But those variables are subject
to particular constraints
that are going to limit what values
those variables can actually take on.
So let's take a look at a
real-world example, for example.
Let's look at exam scheduling,
that I have four students here,
students 1, 2, 3, and 4.
Each of them is taking some
number of different classes.
Classes here are going to
be represented by letters.
So student 1 is enrolled
in courses A, B,
and C. Student 2 is enrolled in courses
B, D, and E, so on and so forth.
And now, say, university, for
example, is trying to schedule exams
for all of these courses.
But there are only three exam slots
on Monday, Tuesday and Wednesday.
And we have to schedule an
exam for each of these courses.
But the constraint now, the constraint
we have to deal with the scheduling
is that we don't want anyone to have
to take two exams on the same day.
We would like to try and minimize that
or eliminate it if at all possible.
So how do we begin to
represent this idea?
How do we structure this in a way
that a computer with an AI algorithm
can begin to try and solve the problem?
Well, let's in particular
just look at these classes
that we might take and
represent each of the courses
as some node inside of a graph.
And what we'll do is we'll create an
edge between two nodes in this graph
if there is a constraint
between those two nodes.
So what does this mean?
Well, we can start with
student 1 who's enrolled
in courses A, B, and C. What
that means is that A and B can't
have an exam at the same time.
A and C can't have an
exam at the same time.
And B and C also can't have
an exam at the same time.
And I can represent that in this
graph by just drawing edges,
one edge between A and B, one between
B and C, and then one between C and A.
And that encodes now the
idea that between those nodes
there is a constraint.
And in particular,
the constraint happens
to be that these two can't
be equal to each other.
So there are other types
of constraints that
are possible depending on the type
of problem you're trying to solve.
And then we can do the same thing
for each of the other students,
that for student 2 who's enrolled
in courses B, D, and E, well,
that means B, D, and E,
those all need to have
edges that connect each other as well.
Student 3 is enrolled
in courses C, E, and F.
So we'll go ahead and take
C, E, and F and connect those
by drawing edges between them too.
And then, finally, student 4 is
enrolled in courses E, F, and G.
And we can represent that by
drawing edges between E, F, and G
although E and F already
had an edge between them.
We don't need another one because this
constraint is just encoding the idea
that course E and course F cannot
have an exam on the same day.
So this then is what we might
call the "constraint graph."
There's some graphical representation
of all of my variables, so to speak,
and the constraints between
those possible variables where,
in this particular case, each of the
constraints represents an inequality
constraint, that an edge between B and
D means whatever value the variable B
takes on cannot be the value that
the variable D takes on as well.
So what then, actually, is a
constraint satisfaction problem?
Well, a constraint satisfaction
problem is just some set of variables,
x1 all the way through xn, some set of
domains for each of those variables.
So every variable needs
to take on some values.
Maybe every variable
has the same domain,
but maybe each variable has
a slightly different domain.
And then there's a set of constraints.
So we'll just call a set C
that has some constraints that
are placed upon these variables,
like x1 is not equal to x2,
but there could be other forms too,
like maybe x1 equals x2 plus 1 if you--
if these variables are taking on
numerical values in their domain,
for example.
The types of constraints are going to
vary based on the types of problems.
And constraint satisfaction
shows up all over the place
as well in any situation
where we have variables that
are subject to particular constraints.
So one popular game is
Sudoku, for example,
this 9-by-9 grid where you need to
fill in numbers in each of these cells,
but you don't want to
make sure there's--
you want to make sure there is
never a duplicate number in any row,
or in any column, or in any grid
of 3-by-3 cells, for example.
So what might this look like as a
constraint satisfaction problem?
Well, my variables are all of
the empty squares in the puzzle,
so represented here is just like
an x, y-coordinate, for example,
as all of the squares where
I need to plug in a value
where I don't know what
value it should take on.
The domain is just going to be all of
the numbers from 1 through 9, any value
that I could fill in
to one of these cells.
So that is going to be the domain
for each of these variables.
And then the constraints are going to
be of the form like this cell can't
be equal to this cell, can't be
equal to this cell, can't be--
and all of these need to
be different, for example,
and same for all of the rows, and
the columns, and the 3-by-3 squares
as well.
So those constraints
are going to enforce
what values are actually allowed.
And we can formulate the same idea
in the case of this exam scheduling
problem where the variables we have are
the different courses, A up through G.
The domain for each of
these variables is going
to be Monday, Tuesday, and Wednesday.
Those are the possible values
that each of the variables
can take on that, in this
case, just represent,
when is the exam for that class?
And then the constraints
are of this form--
A is not equal to B.
A is not equal to C,
meaning A and B can't have
an exam on the same day.
A and C can't have an
exam on the same day.
Or more formally, these two variables
cannot take on the same value within
their domain.
So that then is this formulation of
a constraint satisfaction problem
that we can begin to use to
try and solve this problem.
And constraints can come in
a number of different forms.
There are hard constraints,
which are constraints that must
be satisfied for a correct solution.
So something like in the Sudoku puzzle,
you cannot have this cell and this cell
that are in the same row
take on the same value.
That is a hard constraint.
But problems can also
have soft constraints
where these are constraints that
express some notion of preference,
that maybe A and B can't
have an exam on the same day,
but maybe someone has a preference
that A's exam is earlier than B's exam.
It doesn't need to be the
case, but some expression
that some solution is better
than another solution.
And in that case, you
might formulate the problem
as trying to optimize for
maximizing people's preferences.
You want people's preferences to
be satisfied as much as possible.
In this case though, we'll mostly
just deal with hard constraints,
constraints that must be met in order to
have a correct solution to the problem.
So we want to figure out some
assignment of these variables
to their particular
values that is ultimately
going to give us a
solution to the problem
by allowing us to assign some
day to each of the classes
such that we don't have any
conflicts between classes.
So it turns out that we can classify
the constraints in a constraint
satisfaction problem into a
number of different categories.
The first of those
categories are perhaps
the simplest of the
types of constraints,
which are known as
"unary constraints" where
a unary constraint is a constraint
that just involves a single variable.
For example, a unary constraint
might be something like,
A does not equal Monday, meaning course
A cannot have its exam on Monday.
If for some reason the instructor for
the course isn't available on Monday,
you might have a
constraint in your problem
that looks like this, something that
just has a single variable A in it,
and maybe says, A is not equal to
Monday, or A is equal to something,
or, in the case of numbers, greater
than or less than something.
A constraint that just has one variable
we consider to be a unary constraint.
And this is in contrast to something
like a binary constraint, which
is a constraint that involves
two variables, for example.
So this would be a constraint like
the ones we were looking at before.
Something like A does not equal B
is an example of a binary constraint
because it is a constraint that has
two variables involved in it, A and B.
And we represented that using
some arc or some edge that
connects variable A to variable B.
And using this knowledge of, OK,
what is a unary constraint, what
is a binary constraint,
there are different types
of things we can say about a particular
constraint satisfaction problem.
And one thing we can say is we can try
and make the problem node consistent.
So what does "node consistency" mean?
Node consistency means that we have all
of the vari-- values in a variable's
domain satisfying that
variable's unary constraints.
So for each of the variables inside of
our constraint satisfaction problem,
if all of the values satisfy
the unary constraints
for that particular variable, we can
say that the entire problem is node
consistent, or we can even say
that a particular variable is
node consistent if we just want to
make one node consistent within itself.
So what does that actually look like?
Let's look at now a
simplified example where
instead of having a whole
bunch of different classes,
we just have two classes,
A and B, each of which
has an exam on either Monday,
or Tuesday, or Wednesday.
So this is the domain
for the variable A.
And this is the domain
for the variable B.
And now let's imagine we
have these constraints--
A not equal to Monday,
B not equal to Tuesday,
B not equal to Monday, A
not equal to B. So those
are the constraints that we
have on this particular problem.
And what we can now try to do
is enforce node consistency.
And node consistency
just means we make sure
that all of the values
for any variable's domain
satisfy its unary constraints.
And so we can start by trying
to make node A node consistent,
like is it consistent?
And does every value inside of A's
domain satisfy it's unary constraints?
Well, initially, we'll see that Monday
does not satisfy A's unary constraints.
Because we have a constraint,
a unary constraint here
that A is not equal to Monday.
But Monday is still in A's domain.
And so this is something that is
not node consistent because we
have Monday in the
domain, but this is not
a valid value for this particular node.
And so how do we make
this node consistent?
Well, to make the node
node-consistent, what we'll do
is we'll just go ahead and
remove Monday from A's domain.
Now A can only be on
Tuesday or Wednesday
because we had this constraint
that said A is not equal to Monday.
And at this point now,
A is node consistent.
For each of the values that A can
take on, Tuesday and Wednesday,
there is no constraint that
is a unary constraint that
conflicts with that idea.
There is no constraint that
says that A can't be Tuesday.
There is no unary constraint that
says that A cannot be on Wednesday.
And so now we can turn our attention
to B. B also has the domain Monday,
Tuesday, and Wednesday.
And we can begin to see
whether those variables satisfy
the unary constraints as well.
Well, here is a unary constraint--
B is not equal to Tuesday.
And that does not appear to be satisfied
by this domain of Monday, Tuesday,
and Wednesday.
Because Tuesday, this possible value
that the variable B could take on,
is not consistent with
this unary constraint
that B is not equal to Tuesday.
So to solve that problem, we'll go ahead
and remove Tuesday from B's domain.
Now B's domain only contains
Monday and Wednesday.
But as it turns out, there's
yet another unary constraint
that we placed on the variable B, which
is here, B is not equal to Monday.
That means that this value,
Monday inside of B's domain,
is not consistent with B's unary
constraints because we have
a constraint that says
that B cannot be Monday.
And so we can remove
Monday from B's domain.
And now we've made it through
all of the unary constraints.
We've not yet considered
this constraint, which
is a binary constraint,
but we've considered
all of the unary constraints,
all of the constraints
that involve just a single variable.
And we've made sure that
every node is consistent
with those unary constraints.
So we can say that now we have
enforced node consistency,
that for each of these
possible nodes, we
can pick any of these
values in the domain,
and there won't be a unary constraint
that is violated as a result of it.
So node consistency is
fairly easy to enforce.
We just take each node, make
sure the values in the domain
satisfy the unary constraints.
And where things get a
little bit more interesting
is what we consider different
types of consistency, something
like arc consistency, for example.
And arc consistency refers to when all
of the values in a variable's domain
satisfy the variable's
binary constraints.
So when we're looking at trying
to make A arc-consistent,
we're no longer just considering
the unary constraints
that involve A. We're trying to consider
all of the binary constraints that
involve A as well, so
any edge that connects A
to another variable inside
of that constraint graph
that we were taking a look at before.
Put a little bit more formally,
arc consistency-- and "arc"
really is just another word for like an
edge that connects two of these nodes
inside of our constraint graph--
we can define "arc consistency" a
little more precisely like this.
In order to make some variable
x arc-consistent with respect
to some other variable y, we need to
remove any element from x's domain
to make sure that every choice
for x, every choice in x's domain
has a possible choice for y.
So put another way,
if I have a variable x
and I want to make x
an arc-consistent, then
I'm going to look at all
of the possible values
that x can take on and make sure that,
for all of those possible values,
there is still some choice
that I can make for y,
if there's some arc between
x and y to make sure
that y has a possible option
that I can choose as well.
So let's look at an example of that
going back to this example from before.
We enforced node consistency
already by saying
that A can only be on
Tuesday or Wednesday
because we knew that A
could not be on Monday.
And we also said that
B's only domain only
consists of Wednesday because we
know the B does not equal Tuesday,
and also B does not equal Monday.
So now let's begin to
consider arc consistency.
Let's try and make A
arc-consistent with B.
And what that means is to make A
arc-consistent with respect to B means
that for any choice
we make in A's domain,
there is some choice we can make in B's
domain that is going to be consistent.
And we can try that.
For A, we can choose Tuesday
as a possible value for A.
If I choose Tuesday for
A, is there a value for B
that satisfies the binary constraint?
Well, yes, B-- Wednesday
would satisfy this constraint
that A does not equal B because
Tuesday does not equal Wednesday.
However, if we chose
Wednesday for A, well,
then there is no choice in B's domain
that satisfies this binary constraint.
There is no way I can choose something
for B that satisfies A does not equal B
because I know B must be Wednesday.
And so if ever I run into
a situation like this
where I see that here is a
possible value for A such
that there is no choice of the value for
B that satisfies the binary constraint,
well, then this is not arc-consistent.
And to make it arc-consistent, I would
need to take Wednesday and remove it
from A's domain.
Because Wednesday was not
going to be a possible choice I
can make for A because it wasn't
consistent with this binary constraint
for B. There was no way I
could choose Wednesday for A
and still have an available solution
by choosing something for B as well.
So here now, I've been able
to enforce arc consistency.
And in doing so, I've actually
solved this entire problem.
They've given these
constraints where A and B
can have exams on either Monday,
or Tuesday, or Wednesday.
The only solution, as it would appear,
is that A's exam must be on Tuesday,
and B's exam must be on Wednesday.
And that is the only
option available to me.
So if we want to apply our
consistency to a larger graph,
not just looking at one particular
pair of arc consistency,
there are ways we can do that too.
And we can begin to formalize
what the pseudocode would
look like for trying to
write an algorithm that
enforces arc consistency.
And we'll start by defining
a function called "revise."
Revise is going to take as
input a csp, otherwise known
as a "constraint satisfaction problem,"
and also two variables, X and Y.
And what revise is going
to do is it is going
to make X arc-consistent
with respect to Y,
meaning remove anything
from X's domain that doesn't
allow for a possible option for Y.
And how does this work?
Well, we'll go ahead
and first keep track of
whether or not we've made a revision.
Revise is ultimately going
to return true or false.
It'll return true in the event that
we did make a revision to X's domain.
It'll return false if we didn't
make any change to X's domain.
And we'll see in a moment why
that's going to be helpful.
But we start by saying
"revised equals false."
We haven't made any changes.
Then we'll say, all
right, let's go ahead
and loop over all of the
possible values in X's domain,
so loop over X's domain for
each little x in X's domain.
I want to make sure that
for each of those choices,
I have some available choice in Y that
satisfies the binary constraints that
are defined inside of my csp, inside
of my constraint satisfaction problem.
So if ever it's the case that
there is no value y in Y's domain
that satisfies the constraint for
X and Y, well, if that's the case,
that means that this value x
shouldn't be in X's domain.
So we'll go ahead and
delete x from X's domain.
And I'll set revised equal to true
because I did change X's domain.
I changed X's domain
by removing little x.
And I removed a little x because
it wasn't arc-consistent,
and there was no way I could
choose a value for Y that
would satisfy this XY constraint.
So in this case, we'll go ahead
and set revised equal true.
And we'll do this again and again
for every value in X's domain.
Sometimes it might be fine.
In other cases, it might not allow for
a possible choice for Y, in which case
we need to remove this
value from X's domain.
And at the end, we just return
revised to indicate whether or not
we actually made a change.
So this function then,
this revise function
is effectively an implementation of what
you saw me do graphically a moment ago.
And it makes one variable, X,
arc-consistent with another variable,
in this case Y. But
generally speaking, when
we want to enforce
arc consistency, we'll
often want to enforce arc
consistency not just for a single arc
but for the entire constraint
satisfaction problem.
And it turns out there's an
algorithm to do that as well.
And that algorithm is known as AC-3.
AC-3 takes a constraint
satisfaction problem,
and it enforces arc consistency
across the entire problem.
How does it do that?
Well, it's going to basically
maintain a queue or basically just
a line of all of the arcs that
it needs to make consistent.
And over time, we might
remove things from that queue
as we begin dealing
with arc consistency.
And we might need to
add things to that queue
as well if there are more things
we need to make arc-consistent.
So we'll go ahead and
start with a queue that
contains all of the arcs in the
constraint satisfaction problem, all
of the edges that connect
to nodes that have
some sort of binary
constraint between them.
And now, as long as the queue is
not empty, there is work to be done.
The queue is all of the things that
we need to make arc-consistent.
So as long as the queue is not empty,
there's still things we have to do.
What do we have to do?
Well, we'll start by dequeuing from the
queue, remove something from the queue.
And strictly speaking, it
doesn't need to be a queue,
but a queue is a traditional
way of doing this.
We'll dequeue from
the queue, and that'll
give us an arc, X and Y, these
two variables where I would
like to make X arc-consistent with Y.
So how do we make X
arc-consistent with Y?
Well, we can go ahead and
just use that revise function
that we talked about a moment ago.
We call the revise function, passing
as input the constraint satisfaction
problem, and also
these variables X and Y
because I want to make X
arc-consistent with Y, in other words,
remove any values from X's domain that
don't leave an available option for Y.
And recall, what does revise return?
Well, it returns true if we actually
made a change, if we removed something
from X's domain because there wasn't
an available option for Y, for example.
And it returns false if we didn't
make any change to X's domain at all.
And it turns out if revise returns
false, if we didn't make any changes,
well, then there's not a whole lot
more work to be done here for this arc.
We can just move ahead to the
next arc that's in the queue.
But if we did make a change, if we did
reduce X's domain by removing values
from X's domain, well,
then what we might realize
is that this creates
potential problems later on,
that it might mean that some arc that
was arc-consistent with X, that node
might no longer be
arc-consistent with X.
Because while there used to be an
option that we could choose for X,
now there might not be because now
we might have removed something
from X that was necessary for some
other arc to be arc-consistent.
And so if ever we did
revise X's domain, we're
going to need to add some things
to the queue, some additional arcs
that we might want to check.
How do we do that?
Well, first thing we want to check is
to make sure that X's domain is not 0.
If X's domain is 0, that means there
are no available options for X at all,
and that means that there is no way you
can solve the constraint satisfaction
problem.
If we've removed
everything from X's domain,
we'll go ahead and
just return false here
to indicate there's no way to
solve the problem because there
is nothing left in X's domain.
But otherwise, if there are things
left in X's domain but fewer things
than before, well, then
what we'll do is we'll
loop over each variable Z that is in
all of X's neighbors except for Y. Y
we already handled.
But we'll consider all
X's others neighbors
and ask ourselves, all right, will
that arc from each of those Zs to X--
that arc might no longer
be arc-consistent.
Because while for each Z there
might have been a possible option
we could choose for X to correspond
with each of Z's possible values,
now there might not be because we
removed some elements from X's domain.
And so what we'll do
here is we'll go ahead
and enqueue, adding something
to the queue, this arc,
Z, X for all of those neighbors' Zs.
So we need to add back
some arcs to the queue
in order to continue to
enforce arc consistency.
At the very end if we make
it through all this process,
then we can return true.
But this now is AC-3, this algorithm
for enforcing arc consistency
on a constraint satisfaction problem.
And the big idea is really just
keep track of all of the arcs
that we might need to
make arc-consistent.
Make it arc-consistent by
calling the revise function.
And if we did revise it,
then there are some new arcs
that might need to be
added to the queue in order
to make sure that everything
is still arc-consistent
even after we've removed
some of the elements
from a particular variable's domain.
So what then would happen if we
tried to enforce arc consistency
on a graph like this, on a graph
where each of these variables
has a domain of Monday,
Tuesday and Wednesday?
Well, it turns out that by
enforcing arc consistency
on this graph, while it can
solve some types of problems,
nothing actually changes here.
For any particular arc just
considering two variables,
there's always a way
for me to adjust for any
of the choices I make for one of
them, make a choice for the other one
because there are three
options, and I just
need the two to be
different from each other.
So this is actually quite
easy to just take an arc
and just declare that
it is arc-consistent.
Because if I pick
Monday for D, and then I
just pick something that isn't
Monday for B, in arc consistency,
we only consider consistency between
a binary constraint between two nodes.
And we're not really considering
all of the rest of the nodes yet.
So just using AC-3, the
enforcement of arc consistency,
that can sometimes have
the effect of reducing
domains to make it
easier to find solutions.
But it will not always
actually solve the problem.
We might still need to somehow
search to try and find a solution.
And we can use classical, traditional
search algorithms to try to do so.
You'll recall that a search problem
generally consists of these parts.
We have some initial
state, some actions,
a transition model that takes me
from one state to another state,
a goal test to tell me have I
satisfied my objective correctly,
and then some path cost function.
Because in the case of maze-solving,
I was trying to get to my goal
as quickly as possible.
So you could formulate a csp, or
a constraint satisfaction problem,
as one of these types
of search problems.
The initial state will
just be an empty assignment
where an "assignment"
is just a way for me
to assign any particular
variable to any particular value.
So if an empty assignment is no
variables are assigned to any values
yet, then the action I can take
is adding some new variable
equals value pair to that assignment
saying, for this assignment,
let me add a new value
for this variable.
And the transition model just defines
what happens when you take that action.
You get a new assignment that has
that variable equal to that value
inside of it.
The goal test is just checking to
make sure all the variables have
been assigned and making sure all
the constraints have been satisfied.
And the path cost function
is sort of irrelevant.
I don't really care about
what the path really
is, I just care about finding some
assignment that actually satisfies
all of the constraints.
So really, all the paths
have the same cost.
I don't really care about
the path to the goal.
I just care about the solution itself,
much as we've talked about now before.
The problem here though is that if
we just implement this naive search
algorithm just by implementing
like breadth-first search
or depth-first search, this is
going to be very, very inefficient.
And there are ways we can
take advantage of efficiencies
in the structure of a constraint
satisfaction problem itself.
And one of the key ideas is that we
can really just order these variables.
And it doesn't matter what
order we assign variables in.
The assignment A equals
2 and then B equals
8 is identical to the assignment
of B equals 8 and then A equals 2.
Switching the order doesn't
really change anything
about the fundamental
nature of that assignment.
And so there are some ways
that we can try and revise
this idea of a search algorithm
to apply it specifically
for a problem like a constraint
satisfaction problem.
And it turns out the search
algorithm we'll generally
use when talking about
constraint satisfaction problems
is something known as
"backtracking search."
And the big idea of
backtracking search is
we'll go ahead and make assignments
from variables to values.
And if ever we get stuck,
we arrive at a place
where there is no way we can make any
forward progress while still preserving
the constraints that we
need to enforce, we'll
go ahead and backtrack and
try something else instead.
So the very basic sketch of what
backtracking search looks like is it
looks like this, a function called
"backtrack" that takes as input
an assignment and a constraint
satisfaction problem.
So initially, we don't have
any assigned variables.
So when we begin backtracking
search, this assignment
is just going to be the empty assignment
with no variables inside of it.
But we'll see later this is
going to be a recursive function.
So backtrack takes as input
the assignment and the problem.
If the assignment is complete,
meaning all of the variables
have been assigned, we just
return that assignment.
That of course won't be
true initially because we
start with an empty assignment.
But over time, we might add
things to that assignment.
So if ever the assignment actually
is complete, then we're done.
Then just go ahead and
return that assignment.
But otherwise, there is
some work to be done.
So what we'll need to do is
select an unassigned variable
for this particular problem.
So we need to take the problem,
look at the variables that
have already been assigned, and
pick a variable that has not yet
been assigned.
And I'll go ahead and
take that variable.
And then I need to consider all of
the values in that variable's domain.
So we'll go ahead and call
this "domain-values" function--
we'll talk a little
more about that later--
that takes a variable
and just gives me back
an ordered list of all
the values in its domain.
So I've taken a random,
unselected variable.
I'm going to loop over all
of the possible values.
And the idea is, let me
just try all of these values
as possible values for the variable.
So if the value is consistent
with the assignment so far,
it doesn't violate any of the
constraints, well, then let's go ahead
and add variable equals
value to the assignment
because it's so far consistent.
And now let's recursively
call backtrack to try and make
the rest of the assignments
also consistent.
So we'll ahead and call
backtrack on this new assignment
that I've added this newest--
the variable equals value to.
And now I recursively call backtrack
and see what the result is.
And if the result isn't a failure, well,
then let me just return that result.
And otherwise, what else could happen?
Well, if it turns out the
result was a failure, well,
then that means this value
was probably a bad choice
for this particular variable.
Because when I assigned this
variable equal to that value,
eventually, down the road, I ran into a
situation where I violated constraints.
There was nothing more I could do.
So now I'll remove variable
equals value from the assignment,
effectively backtracking to say,
all right, that value didn't work.
Let's try another value instead.
And then at the very
end, if we were never
able to return a complete
assignment, we'll
just go ahead and return failure because
that means that none of the values
worked for this particular variable.
This now is the idea for backtracking
search, to take each of the variables,
try values for them, and recursively
try backtracking search, see
if we can make progress.
And if ever we run into a dead
end, we run into a situation
where there is no possible
value we can choose
that satisfies the constraints, we
return failure, and that propagates up.
And eventually, we
make a different choice
by going back and trying
something else instead.
So let's put this
algorithm into practice.
Let's actually try and use backtracking
search to solve this problem now where
I need to figure out how to assign
each of these courses to an exam slot
on Monday, or Tuesday, or Wednesday
in such a way that it satisfies these
constraints, that each of these edges
mean those two classes cannot have
an exam on the same day.
So I can start by just
like starting at a node.
It doesn't really matter
which I start with.
But in this case, we'll
just start with A.
And I'll ask a question
like, all right, let me loop
over the values in the domain.
And maybe, in this case, I'll just
start with Monday and say, all right,
let's go ahead and assign A to Monday.
We'll just go in order,
Monday, Tuesday, Wednesday.
And now let's consider node B. All
right, so I've made an assignment to A,
so I've recursively called backtrack
with this new part of the assignment.
Now I'm looking to pick another
unassigned variable like B.
And I'll say, all right, maybe I'll
start with Monday because that's
the very first value in B's domain.
And I ask, all right, does
Monday violate any constraints?
And it turns out, yes, it does.
It violates this constraint
here between A and B
because A and B are now both on Monday.
And that doesn't work because B
can't be on the same day as A.
So that doesn't work, so we might
instead try Tuesday, try the next value
in B's domain.
And is that consistent
with the assignment so far?
Well, yeah, B-Tuesday, A-Monday.
That is consistent so far because
they're not on the same day.
So that's good.
Now we can recursively call backtrack.
Try again.
Pick another unassigned variable,
something like D and say,
all right, let's go through
its possible values.
Is Monday consistent
with this assignment?
Well, yes it is.
B and D are on different
days, Monday versus Tuesday.
And A and B are also on different
days, Monday versus Tuesday.
So that's fine so far too.
We'll go ahead and try again.
Maybe we'll go to this variable here,
E, say, can we make that consistent?
Let's go through the possible values.
We've recursively called backtrack.
We might start with
Monday and say, all right,
that's not consistent because D and
E now have exams on the same day.
So we might try Tuesday instead,
going to the next one, ask,
is that consistent?
Well, no, it's not because B and E,
those have exams on the same day.
And so we try, all right,
is Wednesday consistent?
And it turns out, all right, yes it is.
Wednesday is consistent because D and
E now have exams on different days.
B and E now have exams
on different days.
All seems to be well so far.
I recursively call backtrack,
select another unassigned variable,
we'll say maybe to a C this
time and say, all right,
let's try the values
that C could take on.
Let's start with Monday.
And it turns out that's not consistent
because now A and C both have
exams on the same day.
So I try Tuesday and say, that's not
consistent either because B and C now
have exams on the same day.
And then I say, all right, let's
go ahead and try Wednesday.
But that's not consistent either because
C and E each have exams on the same day
too.
So now we've gone through all of the
possible values for C, Monday, Tuesday
and Wednesday, and none
of them are consistent.
There is no way we can have
a consistent assignment.
Backtrack, in this case,
will return a failure.
And so then we'd say, all right,
we have to backtrack back to here.
Well, now for E, we've tried all
of Monday, Tuesday, and Wednesday,
and none of those work.
Because Wednesday, which seemed to
work, turned out to be a failure.
So that means there's no possible way we
can assign E. So that's a failure too.
We have to go back up to
D, which means that Monday
assignment to D, that must be wrong.
We must try something else.
So we can try, all
right, what if D is Tue--
what if instead of
Monday, we try Tuesday?
Tuesday, it turns out, is not
consistent because B and D now
have an exam on the same day.
But Wednesday, as it turns out, works.
And now we can begin to make
some forward progress again.
We go back to E and say, all
right, which of these values works?
Monday turns out to work by
not violating any constraints.
Then we go up to C now.
Monday doesn't work because
it violates a constraint.
It violates two actually.
Tuesday doesn't work because it
violates a constraint as well.
But Wednesday does work.
Then we can go to the next variable, F,
and say, all right, does Monday work?
Well, no, it violates a constraint.
But Tuesday does work.
And then, finally, we can
look at the last variable, G,
recursively calling
backtrack one more time.
Monday is inconsistent, and
that violates a constraint.
Tuesday also violates a constraint.
But Wednesday, that doesn't
violate a constraint.
And so now, at this
point, we recursively
call backtrack one last time.
We now have a satisfactory
assignment of all of the variables.
And at this point, we can
say that we are now done.
We have now been able to successfully
assign a variable or a value
to each one of these
variables in such a way
that we're not violating
any constraints.
We're going to go ahead and have classes
A and E have their exams on Monday.
Classes B and F can have
their exams on Tuesday.
And classes C, D, and G can
have their exams on Wednesday,
and there's no violated constraints
that might come up there.
So that then was a graphical
look at how this might work.
Let's now take a look at some code we
could use to actually try and solve
this problem as well.
So here, I'll go ahead and go
into the scheduling directory.
We're here now.
We'll start by looking at schedule0.py.
We're here.
I define a list of variables,
A, B, C, D, E, F, G.
Those are all of the different classes.
Then underneath that, I
define my list of constraints.
So constraint A and B,
that is a constraint
because they can't be on the same
day, likewise A and C, B and C,
so on and so forth, enforcing
those exact same constraints.
And here then is what the
backtracking function might look like.
First, if the assignment
is complete, if I've
made an assignment of
every variable to a value,
go ahead and just
return that assignment.
Then we'll select an unassigned
variable from that assignment.
Then for each of the possible values in
the domain, Monday, Tuesday, Wednesday,
let's go ahead and create
a new assignment that
assigns the variable to that value.
I'll call this consistent function,
which I'll show you in a moment.
That checks to make sure this
new assignment is consistent.
But if it is consistent, we'll
go ahead and call backtrack
to go ahead and continue trying
to run backtracking search.
And as long as the result is not
None, meaning it wasn't a failure,
we can go ahead and return that result.
But if we make it through all
the values and nothing works,
then it is a failure.
There's no solution.
We go ahead and return None here.
What do these functions do?
select_unassigned_variable
is just going to choose a
variable not yet assigned.
So it's going to loop
over all the variables.
And if it's not already assigned, we'll
go ahead and just return that variable.
And what does the
consistent function do?
Well, the consistent function
goes through all the constraints.
And if we have a situation
where we've assigned
both of those values to
variables but they are the same,
well, then that is a
violation of the constraint,
in which case will return False.
But if nothing is inconsistent,
then the assignment
is consistent and will return True.
And then all the
program does is it calls
backtrack on an empty assignment,
an empty dictionary that
has no variable assigned and no values
yet, save that inside a solution,
and then print out that solution.
So by running this now, I
can run python schedule0.py.
And what I get as a result
of that is an assignment
of all these variables to values.
And it turns out we assign A to Monday,
as we would expect, B to Tuesday,
C to Wednesday, exactly
the same type of thing
we were talking about before, an
assignment of each of these variables
to values that doesn't
violate any constraints.
And I had to do a fair amount of work
in order to implement this idea myself.
I had to write the
backtrack function that
went ahead and went through
this process of recursively
trying to do this backtracking search.
But it turns out the constraint
satisfaction problems
are so popular that there
exist many libraries that
already implement this type of idea.
Again, as with before, the specific
library is not as important as the fact
that libraries do exist.
This is just one example of
a Python constraint library
where now, rather than having
to do all the work from scratch
inside a schedule1.py, I'm just taking
advantage of a library that implements
a lot of these ideas already.
So here, I create a new
problem, add variables
to it with particular domains.
I add a whole bunch of
these individual constraints
where I call addConstraint and
pass in a function describing
what the constraint is.
And the constraint basically says, it's
a function that takes two variables,
x and y, and makes sure
that x is not equal to y,
enforcing the idea that these two
classes cannot have exams on the same
day.
And then for any constraint
satisfaction problem,
I can call getSolutions to get
all the solutions to that problem
and then, for each of
those solutions, print out
what that solution happens to be.
And if I run python
schedule.py, I now see
there are actually a number
of different solutions
that can be used to solve the problem.
There are, in fact, six
different solutions,
assignments of variables
to values that will give me
a satisfactory answer to this
constraint satisfaction problem.
So this then was an implementation
of a very basic backtracking search
method where, really, we just went
through each of the variables,
picked one that wasn't assigned, tried
the possible values the variable could
take on.
And then if it was-- if it worked,
if it didn't violate any constraints,
then we kept trying other variables.
And if ever we hit a dead
end, we had to backtrack.
But ultimately, we might be able to be
a little bit more intelligent about how
we do this in order to
improve the efficiency of how
we solve these sorts of problems.
And one thing we might
imagine trying to do
is going back to this
idea of inference, using
the knowledge we know to
be able to draw conclusions
in order to make the rest of the
problem-solving process a little bit
easier.
And let's now go back to where we got
stuck in this problem the first time.
When we were solving this constraint
satisfaction problem, we dealt with B,
and then we went on to D. And we went
ahead and just assigned D to Monday
because that seemed to work
with the assignment so far.
It didn't violate any constraints.
But it turned out that, later on, that
choice turned out to be a bad one,
that that choice wasn't consistent
with the rest of the values
that we could take on here.
And the question is,
is there anything we
could do to avoid getting
into a situation like this,
avoid trying to go down a
path that's ultimately not
going to lead anywhere by
taking advantage of knowledge
that we have initially?
And it turns out we do have
that kind of knowledge.
We can look at just the
structure of this graph so far.
And we can say that, right
now, C's domain, for example,
contains values Monday,
Tuesday, and Wednesday.
And based on those values, we can say
that this graph is not arc-consistent.
Recall that arc consistency
is all about making
sure that for every possible
value for a particular node
that there is some other value
that we are able to choose.
And as we can see here, Monday and
Tuesday are not going to be possible
values that we can choose for C. They're
not going to be consistent with a node
like B, for example, because
B is equal to Tuesday,
which means that C cannot be Tuesday.
And because A is equal to
Monday, C also cannot be Monday.
So using that information by making
C arc-consistent with A and B,
we could remove Monday and
Tuesday from C's domain
and just leave C with
Wednesday, for example.
And if we continued to try
and enforce arc consistency,
we'd see there are some other
conclusions we can draw as well.
We see that B's only option is Tuesday,
and C's only option is Wednesday.
And so if we want to
make E arc-consistent,
well, E can't be Tuesday because that
wouldn't be arc-consistent with B.
And E can't be Wednesday because that
wouldn't be arc-consistent with C.
So we can go ahead and say E, and just
set that equal to Monday, for example.
And then we can begin to do
this process again and again,
that in order to make D arc-consistent
with B and E, then D would have to be
Wednesday.
That's the only possible option.
And likewise, we can make the same
judgments for F and G as well.
And it turns out that without having
to do any additional search, just
by enforcing arc consistency, we
were able to actually figure out
what the assignment of
all the variables should
be without needing to backtrack at all.
And the way we did that is
by interleaving the search
process and the inference
step by this step of trying
to enforce arc consistency.
And the algorithm to do this is
often called just the "maintaining
arc-consistency algorithm," which just
enforces arc consistency every time
we make a new assignment of a
value to an existing variable.
So sometimes we can
enforce arc consistency
using that AC-3 algorithm at the
very beginning of the problem
before we even begin searching in order
to limit the domain of the variables
in order to make it easier to search.
But we can also take advantage
of the interleaving of enforcing
arc consistency with search such
that every time in the search process
when we make a new
assignment, we go ahead
and enforce arc consistency
as well to make sure
that we're just eliminating possible
values from domains whenever possible.
And how do we do this?
Well, this is really
equivalent to just every time
we make a new assignment
to a variable X,
we'll go ahead and call
our AC-3 algorithm,
this algorithm that enforces
arc consistency on a constraint
satisfaction problem.
And we go ahead and call that starting
it with a queue not of all of the arcs,
which we did originally, but
just have all of the arcs
that we want to make
arc-consistent with X, this thing
that we have just made
an assignment to, so all
arcs Y, X where Y is a neighbor
of X, something that shares
a constraint with X, for example.
And by maintaining our consistency
in the backtracking search process,
we can ultimately make our search
process a little bit more efficient.
And so this is the revised version
of this backtrack function.
Same as before-- the changes
here are highlighted in yellow--
every time we add a new variable
equals value to our assignment,
we'll go ahead and run this
inference procedure, which
might do a number of different things.
But one thing it could do is call the
maintaining arc-consistency algorithm
to make sure we're able to enforce
arc consistency on the problem.
And we might be able to draw new
inferences as a result of that process,
get new guarantees of this variable
needs to be equal to that value,
for example.
That might happen one time.
It might happen many times.
And so long as those
inferences are not a failure,
as long as they don't
lead to a situation
where there is no possible way to
make forward progress, well, then
we can go ahead and add those
inferences, those new knowledge,
that new pieces of knowledge
I know about what variables
should be assigned to what values.
I can add those to the assignment in
order to more quickly make forward
progress by taking advantage of
information that I can just deduce,
information I know based on the rest
of the structure of the constraint
satisfaction problem.
And the only other change
I'll need to make now
is if it turns out this value
doesn't work, well, then down here,
I'll go ahead and need to remove not
only variable equals value but also any
of those inferences that I made, remove
that from the assignment as well.
And so here then we're often able to
solve the problem by backtracking less
than we might originally have needed
to just by taking advantage of the fact
that every time we make a new
assignment of one variable
to one value, that might reduce the
domains of other variables as well.
And we can use that information to
begin to more quickly draw conclusions
in order to try and solve the
problem more efficiently as well.
And it turns out there
are other heuristics
we can use to try and improve the
efficiency of our search process
as well.
And it really boils down to a couple of
these functions that I've talked about,
but we haven't really talked
about how they're working.
And one of them is this function
here, select unassigned variable
where we're selecting some variable
in the constraint satisfaction problem
that has not yet been assigned.
So far, I've sort of just been
selecting variables at random,
just like picking one variable
and one unassigned variable
in order to decide, all
right, this is the variable
that we're going to assign
next, and then going from there.
But it turns out that by being a
little bit intelligent, by following
certain heuristics, we might be able
to make the search process much more
efficient just by choosing
very carefully which
variable we should explore next.
So some of those heuristics include
the Minimum Remaining Values, or MRV
heuristic, which
generally says that if I
have a choice between which
variable I should select,
I should select the variable
with the smallest domain,
the variable that has the fewest
number of remaining values left,
with the idea being if there are
only two remaining values left, well,
I may as well prune one of them very
quickly in order to get to the other
because one of those two has got to be
the solution if a solution does exist.
And sometimes, minimum remaining values
might not give a conclusive result
if all the nodes have the same number
of remaining values, for example.
And in that case, another heuristic
that can be helpful to look at
is the degree heuristic.
The degree of a node
is the number of nodes
that are attached to that node, the
number of nodes that are constrained
by that particular node.
And if you imagine which
variable should I choose,
should I choose a variable
that has a high degree that
is connected to a lot of
different things or a variable
with a low degree that is not
connected to a lot of different things?
Well, it can often make sense
to choose the variable that
has the highest degree,
that is connected
to the most other nodes as the
thing you would search first.
Why is that the case?
Well, it's because by choosing a
variable with a high degree, that
is immediately going to constrain
the rest of the variables more,
and it's more likely
to be able to eliminate
large sections of the
state-space that you
don't need to search through at all.
So what could this actually look like?
Let's go back to this
search problem here.
In this particular case,
I've made an assignment here.
I've made an assignment here.
And the question is, what
should I look at next?
And according to the minimum remaining
values heuristic, what I should choose
is the variable that has the
fewest remaining possible values.
And in this case, that's
this node here, node C
that only has one variable left in
this domain, which, in this case,
is Wednesday, which is a variable
reas-- very reasonable choice
of a next assignment to make because I
know it's the only option, for example.
I know that the only possible
option for C is Wednesday.
So I may as well make that assignment
and then potentially explore
the rest of the space after that.
But meanwhile, at the
very start of the problem
when I didn't have any knowledge of
what nodes should have what values yet,
I still had to pick what node should
be the first one that I try and assign
a value to.
And I arbitrarily just chose the
one at the top, node A, originally.
But we can be more
intelligent about that.
And we can look at
this particular graph.
All of them have domains of the
same size, domain of size 3,
so minimum remaining values
doesn't really help us there.
But we might notice that node
E has the highest degree.
It is connected to the most things.
And so perhaps it makes
sense to begin our search,
rather than starting at node
A at the very top, start
with the node with the highest
degree, start by searching from node
E. Because from there, that's
going to much more easily allow
us to enforce the
constraints that are nearby,
eliminating large portions
of the search space
that I might not need to search through.
And in fact, by starting
with E, we can immediately
then assign other variables.
And following that, we can actually
assign the rest of the variables
without needing to do any
backtracking at all even if I'm not
using this inference procedure.
Just by starting with a
node that has a high degree,
that is going to very
quickly restrict the values
that other nodes can take on.
So that then is how we
can go about selecting
an unassigned variable
in a particular order
rather than randomly picking a variable.
If we're a little bit intelligent
about how we choose it,
we can make our search
process much, much
more efficient by making
sure we don't have
to search through portions of
the search space that ultimately
aren't going to matter.
The other variable we
haven't really talked about,
the other function here is this domain
values function, this domain values
function that takes a
variable and gives me back
a sequence of all of the values
inside of that variable's domain.
The naive way to approach it is what we
did before, which is just go in order,
go Monday, then Tuesday, then Wednesday.
But the problem is that
going in that order
might not be the most efficient
order to search in, that sometimes it
might be more efficient to choose values
that are likely to be solutions first
and then go to other values.
Now, how do you assess whether a value
is likelier to lead to a solution
or less likely to lead to a solution?
Well, one thing you can
take a look at is how many
constraints get added, how
many things get removed
from domains as you make this
new assignment of a variable
to this particular value.
And the heuristic we can use here is
the least constraining value heuristic,
which is the idea that we
should return variables in order
based on the number of choices that
are ruled out for neighboring values.
And I want to start with the least
constraining value, the value that
rules out the lea--
fewest possible options.
And the idea there is that
if all I care about doing
is finding a solution, if I
start with a value that rules out
a lot of other choices, I'm
ruling out a lot of possibilities
that maybe is going to make it less
likely that this particular choice
leads to a solution.
Whereas on the other
hand, if I have a variable
and I start by choosing a value
that doesn't rule out very much,
well, then I still have a lot of
space where there might be a solution
that I could ultimately find.
And this might seem a little bit
counterintuitive and a little bit
at odds with what we were talking
about before where I said, when you're
picking a variable, you should pick
the variable that is going to have
the fewest possible values remaining.
But here, I want to pick the
value for the variable that
is the least constraining.
But the general idea is that
when I am picking a variable,
I would like to prune large
portions of the search space
by just choosing a variable that is
going to allow me to quickly eliminate
possible options.
Whereas here, within
a particular variable,
as I'm considering values that
that variable could take on,
I would like to just find a solution.
And so what I want to
do is ultimately choose
a value that still leaves open the
possibility of me finding a solution
to be as likely as possible.
By not ruling out many options,
I leave open the possibility
that I can still find a solution without
needing to go back later and backtrack.
So an example of that might be,
in this particular situation here,
if I am trying to choose a variable
for-- a value for node C here,
that C is equal to either
Tuesday or Wednesday.
We know it can't be Monday because
it conflicts with this domain
here where we already
know that A is Monday.
So C must be Tuesday or Wednesday.
And the question is,
should I try Tuesday first,
or should I try Wednesday first?
And if I try Tuesday,
what gets ruled out?
Well, one option gets ruled out here.
A second option gets ruled out here.
And a third option gets ruled out here.
So choosing Tuesday would rule
out three possible options.
And what about choosing Wednesday?
Well, choosing Wednesday would
rule out one option here,
and it would rule out one option there.
And so I have two choices.
I can choose Tuesday that
rules out three options
or Wednesday that rules out two options.
And according to the least constraining
value heuristic, what I should probably
do is go ahead and choose
Wednesday, the one that rules out
the fewest number of possible
options, leaving open
as many chances as possible
for me to eventually find
the solution inside of the state-space.
And ultimately, if you
continue this process,
we will find a solution,
an assignment of variables
to values that allows us to
give each of these exams--
each of these classes an exam
date that doesn't conflict
with anyone that happens to be enrolled
in two classes at the same time.
So the big takeaway
now with all of this is
that there are a number of different
ways we can formulate a problem.
The ways we've looked at today
are we can formulate a problem
as a local search problem, a problem
where we're looking at a current node
and moving to a neighbor based
on whether that neighbor is
better or worse than the current
node that we are looking at.
We looked at formulating
problems as linear programs
where just by putting things in
terms of equations and constraints,
we're able to solve problems
a little bit more efficiently.
And we saw formulating a problem as
a constraint satisfaction problem,
creating this graph of
all of the constraints
that connect to variables that
have some constraint between them,
and using that information to be able to
figure out what the solution should be.
And so the takeaway
of all of this now is
that if we have some problem
in artificial intelligence
that we would like to use
AI to be able to solve them,
whether that's trying to figure
out where hospitals should be,
or trying to solve the
traveling salesman problem,
and trying to optimize productions,
and costs, and whatnot,
or trying to figure out how to
satisfy certain constraints,
whether that's in a Sudoku puzzle, or
whether that's in trying to figure out
how to schedule exams for a university,
or any number of a wide variety
of types of problems, if we can
formulate that problem as one
of these sorts of problems,
then we can use these known
algorithms, these algorithms
for enforcing arc consistency
and backtracking search,
these hill-climbing and
simulated annealing algorithms,
these simplex algorithms and
interior-point algorithms that
can be used to solve
linear programs, that we
can use those techniques to begin to
solve a whole wide variety of problems
all in this world of optimization
inside of artificial intelligence.
This was an Introduction to Artificial
Intelligence with Python for today.
We will see you next time.
