Making decisions and finding insights from your data is an essential part of data science, and one algorithm which is widely used for this purpose is the decision tree. Keeping the importance of decision trees in mind, we have come up with this comprehensive course.
Now, before we go ahead with the session, I'd like to inform you guys that we've launched a completely free platform called Great Learning Academy, where you have access to free courses such as AI, Cloud, and digital marketing. You can check out the details in the description below.

Now let's have a quick glance at the agenda. We'll start off by understanding what exactly machine learning is and what the decision tree algorithm in machine learning is. Then we'll go ahead and implement the decision tree algorithm with the R programming language. After that, we'll understand the concepts of entropy and the Gini index. Going ahead, we'll understand what exactly overfitting is and how we can prevent overfitting in decision trees, and finally we'll understand Shannon's entropy. So let's start off the session.

All right, so we'll understand machine learning
with this little example over here. So what do you see in this? What is this, exactly? Well, it's a fish, isn't it? And what is this? Well, this again is a fish. And how about this? Well, this too is a fish. Now, how do you know all of these are fish? Well, as a kid you might have come across a picture of a fish, and you would have been told by your kindergarten teacher or your parents that this is a fish, and your brain learns that anything which looks like that is a fish, right? That is how our brain functions. But what about a machine? If I take this image of a fish and feed it to a machine, will it be able to understand that this is actually a fish? Well, this is where machine learning comes in. So what I'll do is take all of these images of fish and keep on feeding them to this machine until it learns all the features associated with a fish. Now, when I say all the features associated with a fish, the features would be, let's say, a fish has two eyes, it has two fins, it has a tail, it has gills, and so on. So this machine would learn all of these features associated with the fish.

Now, once the training is done, it is given new data to determine how much it has learned. So when we say training, first the machine is given training data. Once the training is done, and once it learns all of the features associated with the training data, it is given the test data. This is where we are actually testing how much the machine has learned. So we have given this new image of a fish to the machine, and we're trying to estimate whether the machine is able to correctly label this image as a fish or not. So this is the underlying concept of machine learning.

Right, now that we know what exactly machine learning is, we'll have a look at the categories of machine learning.
Machine learning can be broadly categorised into supervised learning and unsupervised learning, and we'll start with the first one, which is supervised learning. In supervised learning you basically have output variables and input variables. The output variable is denoted by y and the input variable is denoted by x. The output variable is also known as the dependent variable, and the input variable is also known as the independent variable, and we're trying to determine how the output variable changes with respect to the input variable, or basically how the dependent variable varies with respect to the independent variable. That's why you see y = f(x) over here: y would be the dependent variable and x would be the independent variable, and we're trying to determine whether y varies with x or not.

Again, supervised learning can be broadly categorised into regression and classification, and we'll start with classification. As you see over here, classification is basically the process of predicting the class of a new data point. Now let's take this example to understand classification better. Let's say you want to determine whether a person has cancer or not on the basis of whether the person smokes or not. So over here, whether the person has cancer or not would be the dependent variable, and whether the person smokes or not would be the independent variable. So you'd basically have a data set which would comprise two columns: one column would be whether the person has cancer or not, and the next column would be whether the person smokes or not. On the basis of this column, we're trying to determine whether the person has cancer or not. This is basically the underlying method of classification.
There is another method, known as regression. As stated over here, this method is basically used to find a linear relationship among different entities. So over here we have a dependent variable and an independent variable. The dependent variable is denoted by y and the independent variable is denoted by x. Now what happens in regression is, let's say I put in some arbitrary value of x; we would have to get a value of y for that. So this is what is basically known as regression: for an arbitrary value of x, we're trying to find the corresponding value of y.

Now we'll head on to the most important part of the session, which is the decision tree algorithm. The decision tree algorithm is a supervised learning method which is used for both classification and regression purposes. So you can classify the data, and you can also use it to find out the relationship between y and x. Let's understand the decision tree algorithm with this image over here. Let's say I have this very simple question: I want to know whether I can watch the movie Avengers or not. For this I have a set of decisions to make. So what you have is basically a tree-like structure, and this tree-like structure is the decision tree over here.
So we'll start off with the first condition over here, which is whether I like Marvel movies or not. If I like Marvel movies, I'll come onto the left side of it, and then I'll check the second condition, which is whether I am a fan of Robert Downey Jr. Now, if I am a fan of Robert Downey Jr., I will definitely watch Avengers. But if I don't like Robert Downey Jr., then I will not watch Avengers. Similarly, if I don't like Marvel movies, then the next condition would be whether I'm a DC fanboy or not. If I am a DC fanboy, then I will definitely not watch Avengers, because I'm anti-Marvel, so I will not go and watch Avengers. But on the other hand, if I'm not a DC fanboy but really like Scarlett Johansson, then I will go and watch Avengers. And if I don't like Marvel movies, I'm not a DC fanboy, and I also don't like Scarlett Johansson, then I'll not watch Avengers. So this is what happens in the decision tree algorithm: we have a set of decisions which we are making over here, and on the basis of this set of decisions it will go ahead and classify the data. So this is how the decision tree algorithm works. All right, so this is the basic idea behind the decision tree algorithm.
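
To make the idea concrete, the Avengers tree described above can be written out by hand as nested conditions. This is only a sketch: the function name and argument names are made up for illustration, and a real decision tree learns its conditions from data instead of having them typed in.

```r
# Hand-written version of the "should I watch Avengers?" tree.
# watch_avengers and its argument names are illustrative, not from the video.
watch_avengers <- function(likes_marvel, likes_rdj, dc_fanboy, likes_scarlett) {
  if (likes_marvel) {
    # Left branch: Marvel fans decide on Robert Downey Jr.
    if (likes_rdj) "Yes" else "No"
  } else if (dc_fanboy) {
    # Right branch: a DC fanboy does not watch
    "No"
  } else if (likes_scarlett) {
    # Not a Marvel fan, not a DC fanboy, but likes Scarlett Johansson
    "Yes"
  } else {
    "No"
  }
}

watch_avengers(TRUE, TRUE, FALSE, FALSE)   # Marvel fan who likes Robert Downey Jr.
watch_avengers(FALSE, FALSE, FALSE, TRUE)  # not Marvel, not DC, likes Scarlett Johansson
```

Each path from the first condition down to a "Yes" or "No" corresponds to one terminal node of the tree.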
Now we'll have a comprehensive demo of this decision tree algorithm in both R and Python, and we'll start off with R. What we'll be working with is the Carseats data set, and this Carseats data set is actually part of the ISLR package, so I'll go ahead and load the ISLR package first. To load the package, I will type in library and then give the name of the package, which is ISLR. So I have loaded the package. Now, this ISLR package basically has some data sets with it, so let's have a glance at the different data sets which are part of the ISLR package. As you see over here, these are the different data sets: you've got Auto, Caravan, Carseats, College, Credit, Default, and so on. Out of all of these data sets, I want to work with the Carseats data set, which is basically about the sales of child car seats. Now I have to load this data set. So this is Carseats; as you see over here, this is actually a capital C. I will go ahead and store this in a new object, and I will name the object carseats over here. So this is the small c and this is the capital C: I am loading the Carseats data set, which is part of the ISLR package, storing it in a new object, and naming that object carseats. Now let me have a glance at this data set first, so I'll just type in carseats over here. Right, so this is my data set.

Now let's have a proper look at this data set, Carseats. We have the description of this data set over here. As you see, this is a simulated data set containing sales of child car seats at 400 different stores. So we have 400 observations, or 400 records, in total, and there are 11 variables, or 11 columns, in total.
These are the different columns. We've got the Sales column over here. The Sales column basically has the unit sales in thousands at each location, so this 9.5 would be 9.5 thousand, which is around 9,500 sales. Then we've got the CompPrice column, which is the price charged by the competitor at each location. Then we've got the Income column over here, so the Income column would obviously tell you the community income level in thousands of dollars. Then we've got the Advertising column, which denotes the local advertising budget for the company at each location. Then we've got the Population column over here, so this will tell you the population size in thousands. Then we've got Price, which is the price the company charges for car seats at each site. Then we've got the shelf location, ShelveLoc. This basically tells you that there are three levels for the shelf location: it is either Good, Bad, or Medium, and this indicates the quality of the shelving location for the car seats at each site. And then you've got all of these other columns over here: the Age column, the Education column, the Urban column, and the US column. You can just go through the description of this entire data set over here.
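
The loading steps above would look roughly like this in R. It is a sketch that assumes the ISLR package is already installed; the object name carseats follows the narration.

```r
# Assumes the ISLR package is installed: install.packages("ISLR")
library(ISLR)

# List the data sets shipped with ISLR (Auto, Caravan, Carseats, ...)
data(package = "ISLR")$results[, "Item"]

# Copy the built-in Carseats data (capital C) into a working object (small c)
carseats <- Carseats

dim(carseats)            # 400 rows, 11 columns
names(carseats)          # Sales, CompPrice, Income, Advertising, ...
head(carseats$Sales, 5)  # first few unit-sales values, in thousands
```

Working on a copy keeps the original Carseats data untouched, which is handy later when the two are compared.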
So now that we have looked at the data set, we will start with some data manipulation over here. Now, as you see, we have the Sales column, and I want to add a new column which is dependent on the Sales column. Let's say that wherever the sales value is greater than 8, and this 8 actually means 8,000, so wherever the sales are greater than 8,000, I want to denote it as high, and wherever the sales are actually less than 8, I want to denote it as low. That is what I'm doing with this particular command over here. I will use this ifelse clause, and over here I am checking whether the number of sales is greater than or equal to 8. If the number of sales is equal to or greater than 8, then I want it to be denoted with Yes; on the other hand, if the number of sales is less than 8, then I want it to be denoted with No. I'm storing this in a new object and naming that object High.
Let me just show you the head of High over here, so I will type in head of High. Let me go ahead and actually run this again over here. Let me check the class of High: right, so this is actually a character vector. So I'll just print High over here, and I'll get maybe the first five elements. What you see is Yes, Yes, Yes, No, and No. Now let me compare this with this particular column over here. As you see, the number of sales is 9.5, and since 9.5 is greater than 8, you have a Yes value. Similarly, you have 11.22 over here, and since 11.22 is greater than 8, you have Yes. Then you have 10.06, which is greater than 8 again, so you have a Yes. After that we have 7.40 and 4.15, which are less than 8, and that is why we have a No value over here.
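
The labelling rule can be tried on just the five sales values quoted above. This is a small stand-alone sketch in base R; the vector name sales is illustrative.

```r
# First five Sales values from the walkthrough (unit sales in thousands)
sales <- c(9.50, 11.22, 10.06, 7.40, 4.15)

# "Yes" where sales are at least 8 (thousand units), "No" otherwise
High <- ifelse(sales >= 8, "Yes", "No")

High         # "Yes" "Yes" "Yes" "No" "No"
class(High)  # "character"
```

On the full data set the same call would use carseats$Sales in place of the short vector.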
Right, so we have basically created a new object where we are getting Yes and No values on the basis of the Sales column. Now that this is done, we will add this new object into the entire data frame. I am using the data.frame function over here; using this, I will basically add this particular object into the carseats data frame. So I am attaching this new object as a new column to this data set, and I am storing it back into the carseats data frame over here. Now let me have a glance at carseats: View of carseats. If you have a glance at this data set, you see that we have added this new column, right? Let me show you the original data set: View of, let me put in capital C over here. So this was the original data set, which did not have the High column, and if you look at this particular data set over here, we have this High column. All of these Yes values denote those cases where the sales value is 8 or more, and all of these No values denote where the sales value is actually less than 8. So we have done this: we have added a new column to this. Now this would be our dependent variable, right?
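
Here is a minimal sketch of attaching the label as a new column, using a toy data frame so that it runs without any packages. The Price values are made up for illustration only. Converting the labels with as.factor is my addition: recent R versions keep strings as plain character inside data.frame, and classification functions generally expect a factor response.

```r
# Toy stand-in for Carseats: Sales values are the ones quoted in the walkthrough,
# Price values are invented for illustration.
toy <- data.frame(Sales = c(9.50, 11.22, 10.06, 7.40, 4.15),
                  Price = c(100, 110, 90, 120, 130))

# Label derived from Sales; as.factor() is an added precaution (see lead-in)
High <- as.factor(ifelse(toy$Sales >= 8, "Yes", "No"))

# data.frame() binds the vector on as an extra column named "High"
toy <- data.frame(toy, High)

names(toy)  # "Sales" "Price" "High"
toy
```

The same pattern, applied to carseats, gives the data frame with the extra High column shown in the viewer.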
So our problem statement for this particular data set would be: we want to understand whether the sales are high or not on the basis of all of these different factors. So I have created this column. Now, since this column is actually based on the Sales column, I will go ahead and remove the Sales column; otherwise the model could simply read the answer off Sales. Let me copy this and paste it over here. As you see, what I'm basically doing is selecting all of the rows from this carseats data set, and I am also selecting all of the columns except the first column. That is why I put in minus one over here. When I put in minus one, I'll be getting all of the columns except the first column, and I am storing the result back to the same data set over here. So this is done. Now let me have a glance at this data set: View of carseats. As you see, I have removed the first column, right? The first column was the Sales column, which I have successfully removed
from over here. Now it's finally time to build our decision tree model, so we'll go ahead and build a decision tree. To build the decision tree we require the tree package, so first you have to install the package: you'll click on Packages, click on Install, and just type in tree over here, and when you click on Install the package will be installed. Since I have already installed it, I don't need to do that again, so I'll type in library and go ahead and load this package. As you see, I have successfully loaded the tree library. Now that this is done, it's time to build my model.
As I have told you, the problem statement is this: I want to understand whether the sales are high or not on the basis of the rest of the columns, right? That is, I want to understand whether the sales are high or not on the basis of income, advertising, population, price, and so on. So I will go ahead and build this model. The tree package gives us this tree function, and inside it we are giving two parameters. The first parameter is basically the formula, where we give the dependent variable and all of the independent variables. Over here our dependent variable is this High column, and for the independent variables I have put in this dot symbol. This dot symbol basically means that I am selecting all other variables as the independent variables, except this particular variable over here. Now let's take this tilde symbol. Whatever is given on the left side of the tilde symbol is taken as the dependent variable, and whatever is given on the right side of the tilde symbol would be taken as our independent variables.
So here High would be the dependent variable and the rest of the columns will be independent variables. I want to build this model on top of the carseats data set, right? So this is the second parameter. I am storing this in a new object, and I am naming that object tree_model. So I have successfully built this model. Now let me have a glance at the summary of this model: I'll use the summary function and pass in the object which I have just built, so summary of tree_model, and this is what we have. As you see, this basically gives us the formula which we had just given, and there are 27 terminal nodes. So there were multiple conditions, and after those multiple conditions we basically reach the stage where we have 27 regions, or 27 conclusions. Then we've got the misclassification error rate and the residual mean deviance. You have to keep in mind that the misclassification error rate and the residual mean deviance have to be as low as possible, right?
We have successfully created this model. Now let me go ahead and also make a plot of this model. I'll type in plot, and I will pass in the same object over here, which is tree_model, inside the plot function. So you see, this is what we get. Now let me zoom this and count the number of terminal nodes: 1, 2, 3, and so on, up to 27. Right, so as was stated in the summary, we have 27 terminal nodes over here. Now I'll go ahead and add the text to this, so I will use the text function and pass in the tree_model object inside it, and I have successfully added the text to this plot as well.
The first split is based on the shelf location column. If the shelf location is Bad or Medium, then we'll come to the left side of this node, the root node, and if the shelf location is actually Good, then we'll come to the right side of it. So, as you see, at the first node, if the shelf location is Bad or Medium, we'll come to the left side. Here we are checking whether the price is less than 92.5, and if the price is less than 92.5 we'll come over here and then check whether the income is less than 57. Then again we'll check the competitor price, right? So the competitor price has to be less than something, and then we'll come down and get this. Again, over here on the right side, we are checking whether the population is less than 207.5. Whether the population is less than or greater than 207.5, you basically get a Yes. This Yes means that the sales are actually high. So in the terminal nodes, wherever you have Yes values, all of those Yes values indicate that the sales are high, and wherever you have No values, this indicates that the sales are not high.

Similarly, if I come down to this part, where the shelf location is actually Good, then my condition is whether the price is less than 135. Then again I come down over here, and here I'm checking whether the store is based in the US or not. If it is not based in the US, I'll come down to the left side, and again I'll check whether the price is less than 109. If the price is less than 109 over here, then yes, the sales will be high. Similarly, I come down: if the price is less than 135 and the income is less than 46, then obviously, since the income is low, the sales will not be high over there. When people are not earning much, they won't purchase a lot of car seats, right? On the other hand, if the income is actually greater than 46, then the number of sales would be high over here. So this is what the Yes basically denotes.
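
Putting the steps so far together, a rough end-to-end sketch of the first model could look like this. It assumes the ISLR and tree packages are installed. Two things here are my additions rather than part of the narration: the labels are converted with as.factor, because recent R versions keep strings as character in data.frame and the tree function expects a factor response for classification, and pretty = 0 is passed to text so that full level names are printed.

```r
# Assumes the ISLR and tree packages are installed.
library(ISLR)
library(tree)

carseats <- Carseats

# Label each store: "Yes" if Sales >= 8 (thousand units), else "No".
# as.factor() is an addition to the narrated steps.
High <- as.factor(ifelse(carseats$Sales >= 8, "Yes", "No"))
carseats <- data.frame(carseats, High)

# Drop the first column (Sales), since High was derived from it
carseats <- carseats[, -1]

# High ~ . means: High is the response, every remaining column is a predictor
tree_model <- tree(High ~ ., data = carseats)

summary(tree_model)           # formula, terminal nodes, deviance, error rate
plot(tree_model)              # draw the tree skeleton
text(tree_model, pretty = 0)  # label the splits (pretty = 0 is optional)
```

The exact number of terminal nodes and the split values printed by summary depend on the data and package version, so they may not match the figures quoted in the video exactly.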
So we have built this model, and we have also visualized this particular model over here. Now, you should know that we have done this without dividing our data set into the testing set and the training set. What happens when we don't divide the data into testing and training sets is that we actually go ahead and overfit our model onto this data. To understand this, I'll give you an example. Let's say you have an exam tomorrow and you have not studied anything at all. But then again your friend comes to your rescue: he tells you that he has basically bought the question paper somewhere; obviously the question paper was leaked. He tells you that there would be only five questions in tomorrow's exam, and he gives you the question paper, and you go ahead and basically learn, or memorise, all of the answers to those questions. Now there are two scenarios. If in tomorrow's exam the same set of questions comes, then you are going to ace the exam; you're going to get five out of five. But on the other hand, what if none of those questions comes? Maybe you were fooled by your friend, or maybe your friend just took money from you, right? So it could also be the case that none of the questions comes: you have only learned those five questions, but in the exam there were totally new questions. This is the case where you will totally fail.
So that's what I'm trying to tell you: you have to divide your data set into training and testing sets. First you go ahead and train the model, and once your machine knows properly about your data set, that is when you will go ahead and test its capacity. So it's extremely important for you to divide your data set into training and testing sets, and that is what we're going to do over here.

To divide the data set into training and testing sets, I'll be needing the caTools package, so I will go ahead and load this library, which is caTools. Now the caTools package gives me the sample.split function over here, so let me go ahead and use the sample.split function. This again takes in two parameters. The first parameter is basically the column on the basis of which you want to split your data; that particular column would be High, because it is our dependent variable. The second is the split ratio, which I'm giving over here as 0.65; that is, 65% of the records would have the TRUE value and the remaining 35% of the records would have the FALSE value. I am storing this result in a new object, and I am naming the object split_tag. Now I'll just show you what is there in split_tag, so I'll show you maybe the first five results of this: you have TRUE, FALSE, TRUE, TRUE, TRUE. So, as you see, with 0.65, 65% of the records have been assigned this TRUE value and the remaining 35% of the records have been assigned this FALSE value
over here. So now that we have the TRUE and FALSE tags, we are going to subset our data on the basis of these TRUE and FALSE labels. Wherever we have the TRUE labels, we are going to select all of those records and store them in our training set. So inside the subset function I am going to give the data set, which is basically carseats; after that I will specify that split_tag is equal to TRUE. So from this entire carseats data set, wherever the value of split_tag is equal to TRUE, I am going to select all of those records and store them in the train set. Similarly, from this entire carseats data set, wherever split_tag is equal to FALSE, I am going to select all of those records and then store them in the test set. So now we've got the testing and the training sets ready. Let me show you the number of rows present in the testing and the training sets: there are 260 rows present in the training set and there are 140 records present in the test set. All right, so that was the splitting of our data into training and testing sets. So now that this is done,
we are again going to build a model. Now the difference is that we are going to build our model on the training set and not on the entire data set. So I will use the tree function again, and this time this would be the function, right? The first parameter is the same: I am passing in the High column, which is our dependent variable, and after the tilde symbol I'll just put in a dot over here. This dot means that all other columns except the High column would be my independent variables. Once I have given the formula over here, I will give the data on which I want to build this model, so I will build this model on top of the train set. So yeah, I have successfully built this model. Now let me go ahead and clear this out, and I'll have a look at the summary of this: summary of the model which I have just created. As you see, now I have only 21 terminal nodes over here, and I also have the residual mean deviance and misclassification error rate given. Right, so I have built the model. Now let me go ahead and visualize this model: plot of tree_model. Let me again count the number of terminal nodes over here: 1, 2, 3, and so on, up to 21. So as you see, there are 21 terminal nodes, which is what was stated in the summary of this.
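
The split-then-train steps can be sketched as follows, assuming the ISLR, tree and caTools packages are installed. The set.seed call and the as.factor conversion are my additions: the seed only makes the random split repeatable, and the factor conversion is there because the tree function expects a factor response for classification. The counts you get may differ from those in the video, since the split is random.

```r
# Assumes the ISLR, tree and caTools packages are installed.
library(ISLR)
library(tree)
library(caTools)

# Same preparation as before: add the label column, then drop Sales (column 1)
carseats <- data.frame(Carseats,
                       High = as.factor(ifelse(Carseats$Sales >= 8, "Yes", "No")))
carseats <- carseats[, -1]

set.seed(1)  # arbitrary seed so the split is repeatable (not in the narration)

# TRUE for about 65% of the rows, FALSE for the rest
split_tag <- sample.split(carseats$High, SplitRatio = 0.65)

train <- subset(carseats, split_tag == TRUE)
test  <- subset(carseats, split_tag == FALSE)
nrow(train)
nrow(test)

# Fit the tree on the training rows only
tree_model <- tree(High ~ ., data = train)
summary(tree_model)
```

Because sample.split is given the label column, it keeps the proportion of Yes and No labels roughly the same in both sets.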
Now let me go ahead and add text to this: text, and I'll pass in tree_model inside it, and let me zoom this over here. So again, as you see, the first split is being decided on the basis of the shelf location column. If the shelf location is Bad or Medium, I'll come onto the left side. Next I'll be checking whether the price is less than 92.5. If the price is less than 92.5, I'll come over here, and then I'll check whether the population is less than 207.5. If the population is less than 207.5, again I'll come to the left side and check whether advertising is less than 9, and if this is the case, then the result would be No; that is, the sales will not be high. For the other two cases, it will be Yes.

Now let me come to the right side of it. If the shelf location is actually Good, then I'll come to the right side of it and check whether the price is less than 156.5 or greater than that. If it's less than that, then I'll check whether the advertising is greater than 6, and if the advertising is greater than 6, then it will be Yes. On the other hand, if the advertising is less than 6, then again I'll be on this side and I'll check whether the age is less than 55 or greater than 55. If the age of the people in the region is less than 55, then the sales of the car seats would be high, and otherwise the sales of the car seats wouldn't be high. So this is the sort of inference which we can draw from this. So we have added the text and we have visualized it. Now it's time to predict the values on top of the test set.
We have successfully done this, and to predict the values, let me just go ahead and copy this and paste it over here. I'll be using the predict function now. This predict function takes in these three parameters over here. The first parameter would be the tree model, so this is basically the model which we have just built. The second parameter would be the test set; this is basically the data set on which we want to make the predictions. Then we will give something new: this parameter, type, specifies the type of prediction that we want, and since we're doing classification, the type of prediction over here will be class. So I have used this predict function, and I am storing the predicted results in tree_pred. Now I will go ahead and build the confusion matrix for this. To build the confusion matrix I will be using this table function. As you see over here, inside the table function I'll be passing in the actual values and the predicted values. The actual values are stored in test$High and the predicted values are stored in tree_pred, right?
I have created this confusion matrix. Now inside this you basically have a table, right, and you have these No and Yes values over here. So what does this mean, actually? What this tells us is that, of all the records where the sales value was actually No, 66 of them were correctly predicted to be No. And this value tells us that, of all the records where the value was actually No, 17 of them were incorrectly predicted to be Yes. When we come down over here, of all the records where the actual value was Yes, 25 of them were incorrectly predicted to be No. Similarly, of all the records where the actual value was Yes, 32 of them were correctly predicted as Yes. So this left diagonal you see over here indicates all of the records which were correctly predicted, and this right diagonal indicates all of the values which were incorrectly predicted.
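
To see how the table is read, here is a small base-R sketch that rebuilds a confusion matrix with exactly the counts quoted above. The vectors actual and predicted are stand-ins for test$High and tree_pred, constructed by hand so that the example runs without the model.

```r
# Label vectors that reproduce the quoted counts
# (83 rows that are actually "No", 57 rows that are actually "Yes")
actual    <- factor(c(rep("No", 83), rep("Yes", 57)))
predicted <- factor(c(rep("No", 66), rep("Yes", 17),   # actual No:  66 right, 17 wrong
                      rep("No", 25), rep("Yes", 32)))  # actual Yes: 25 wrong, 32 right

# Rows are actual classes, columns are predicted classes
conf_mat <- table(actual, predicted)
conf_mat

# Accuracy = correctly predicted records (main diagonal) / all records
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy  # (66 + 32) / 140 = 0.7
```

Using sum(diag(...)) avoids typing the counts by hand, so the same two lines work for any confusion matrix built with table.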
To look at the accuracy, you will basically divide this left diagonal by all of the values present in this matrix. Let me do that over here: I'll type in 66 plus 32, divided by 66 plus 32 plus 25 plus 17, and we get an accuracy of 0.7 over here. Now let me actually store this in acc1 over here, and let me paste this over here. All right, so I have got the accuracy of this. Now I'll show you the plot of this again, and let me also add the text over here. As you see, we have created a very big decision tree over here, right? There are a lot of splits, and this is a tree which has overgrown. Now the question is this: do we actually need such a big tree to get the right accuracy? What if we get the best accuracy if we end the splitting at this part, or end the splitting at this part? So how do we find that out? This is where the mean deviance comes in, and the misclassification error rate comes in. There could be a case where, if we stop splitting at this particular stage, the misclassification error rate will be the lowest, or if we stop splitting at this other stage, the misclassification error rate could be the lowest. So let's go ahead and find it out.
For that purpose we'll actually be doing cross-validation for pruning our tree. We have this cv.tree function, which is again part of the tree package. Now let me go ahead and use this function over here. This takes in two parameters. The first parameter is the tree model, the model which we have built, and the second parameter is the function, which is prune.misclass. So we are trying to prune this tree by finding out the right value of the misclassification error rate, and I am storing it into a new object, and that new object is basically cv_tree. Now that this is done, let me just print out cv_tree over here, so I'll just type in cv_tree. Just give me a second, guys. All right, so I have successfully printed this over here. Now what you see over here is that you have size, deviance, the k value, and the method. So we started off with one root node, then we had two nodes, three, five, and so on, and the initial deviance was 107, then 82; the deviance kept on decreasing. So this deviance kept on decreasing till 57, but after that the deviance increased. So this would be the right value of the number of terminal nodes; that is, 1, 2, 3, 4, 5, and 6; 1, 2, 3, 4, 5, 6. So this is where we have to stop, right? When the number of terminal nodes is actually nine, that is when we have the lowest value of deviance. Let me show you again: 1, 2, 3, 4, 5, 6, and over here you have 1, 2, 3, 4, 5, and 6. Right, so at the value of 57 we have the lowest deviance. So let me check that
out So no what I will do us
I will go ahead than the prune the
tree over here and to prune the tree
I am using this prune dot miss class
function and then this prune dot Miss class
function again I am passing in two parameters
So the first parameter as tree model And
then I will give the best to be
equal to nine that does this will And
when the number off terminal knows that actually
nine So now that we have given this
we will store this in a new object
and will name that object to be pruned
model Now I will go ahead and blood
The pruned model Over here It's a plot
off prune model And let me show you
the number off Now the number off dominant
loads of years 12345678 and nine So as
you see you have nine terminal nodes over
here Now let me go ahead and also
at the text over here Right So 123456789
We have also added the text over here
right? So again, at the root node the split is with respect to shelf location. If the shelf location is bad or medium, we come onto the left side, and then we check whether the price is less than 92.5. If the price is less than 92.5, then the number of sales of car seats would be high. On the other hand, if the price is greater than 92.5, we come to the right side, and again we are checking multiple conditions over here. So this is how you can prune your tree and figure out whether you actually have to overfit your tree, or whether you can cut your tree at a particular stage.
Now, earlier we had got an accuracy of 0.7 with the fully grown tree. After pruning this tree, let me go ahead and predict again. This time the model which I have built is the pruned model, so I am predicting the values with respect to this pruned model on top of the test set, and the type of prediction is class, since this is classification. I'll be storing this in tree_pred. Right, so I have done this over here. Now again, let me go ahead and build the confusion matrix. I will be using the table function to build this confusion matrix, and this again takes in two parameters. The first parameter is all the actual values, which are stored in test$High, and then I will give in the predicted values, which are stored in tree_pred. Now let me check the accuracy over here. As you see, the left diagonal indicates all of those values which have been correctly classified, and the right diagonal indicates all of those values which have been incorrectly classified. So it would be 64 plus 40, divided by 64 plus 40 plus 19 plus 17, and we get an accuracy of 74%.
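The accuracy arithmetic above can be restated as a short sketch. The talk itself works in R; this is plain Python for illustration only. The four counts are the ones quoted in the talk, while the placement of 19 and 17 in the two off-diagonal cells is an assumption (the accuracy does not depend on it), and the function name `accuracy` is hypothetical.

```python
# Counts quoted in the talk: 64 and 40 on the main diagonal, 19 and 17 off it.
# Which off-diagonal cell holds 19 and which holds 17 is an assumption here;
# the accuracy is the same either way.
confusion = [
    [64, 19],
    [17, 40],
]


def accuracy(matrix):
    """Share of records that sit on the main diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


print(round(accuracy(confusion), 2))  # 0.74
```

That is 104 correct predictions out of 140 test records.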
Right, so with a fully grown tree we had an accuracy of 70%, but when we actually pruned our tree we got an accuracy of 74%. So this means that you don't have to overfit your tree. You can prune your tree at a particular stage, and that can give you better accuracy than what your fully grown tree can give you. So this is our confusion matrix over here. Right, so we are done with our first case study, where we were trying to classify whether the number of sales would be high or low on the basis of all of these different columns. Now that this is done, we will go ahead to the second part. That was binary classification. Now instead of binary classification I want to do multi-class classification, and to do multi-class classification I'll be working with the iris data set. So let me show you the
iris data set over here. I'll type in View and give in the name of the data set, which is iris. So this is the iris data set over here. I've got all of these columns: sepal length, sepal width, petal length, petal width, and the species column. This is multi-class classification because I am trying to determine the species of this iris flower with respect to the rest of these columns. Let me just show you: I'll use the table function and type in iris$Species. As you see, we've got three different species, so the species of the flower could be setosa, versicolor or virginica. And there are 50 records of setosa, 50 records of versicolor and 50 records of virginica over here. Right, so as I told you, I want to find out the species of these iris flowers with respect to the rest of the four columns over here. So
now that this is done, again I want to divide this entire iris data set into two parts, the training set and the testing set. For this purpose I'll not be using caTools; I'll be using the caret package. Also, to build the decision tree we have got different packages: we can use the tree package, or the party package, or the rpart package. The first demo was with respect to the tree package, and the second demo is with respect to the party package. So let me go ahead and load this library: I'll type in library of party over here. Right, so I have successfully loaded the party package. Now, to divide this data set into training and testing sets, I will be loading the caret package. The caret package gives me the createDataPartition function. I will be using this createDataPartition function, and this takes in three parameters. The first parameter is the column on the basis of which I want to divide this data. Next is the split ratio, so I am giving the split ratio to be equal to 0.65, and list is equal to FALSE. If you want the result in the form of a list you would give in TRUE, but I don't want the result in the form of a list, so that is why I'll set list to be equal to FALSE. I am storing this result in a new object, and I am naming the object split_tag. So this is done.
Now let me actually show you what we have in split_tag. I'll show the first few records. What you see is 1, 5, 7, 8, 9, so these belong to the 65% records: row number 1 and row numbers 5, 7, 8 and 9 are part of those 65% records over here. All right, so now that we have all those 65% records stored in split_tag, from the entire iris data set I will select only those row numbers which are present in split_tag, and I will store all of them in the training set. Right, so this is done. Similarly, from the entire iris data set I will select those records which are not present in split_tag, and I will store them in the test set. So now I have my testing and training sets ready. I'll type nrow of train, so I have 99 records in the training set, and then I'll also type in nrow of test, so I have 51 records in the testing set. So I have my training and testing sets ready.
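A rough sketch of the split described above, written in plain Python rather than R. It is not caret's implementation: the function name `stratified_split` is hypothetical, and sampling a ceiling-rounded 65% of the rows within each species is an assumption, chosen because it is consistent with the 99 and 51 row counts in the talk (33 of the 50 rows per species).

```python
import math
import random


def stratified_split(labels, p, seed=0):
    """Return (train_idx, test_idx), sampling ceil(p * n) rows from each class.

    The ceiling rounding is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train = []
    for indices in by_class.values():
        k = math.ceil(p * len(indices))
        train.extend(rng.sample(indices, k))

    train_set = set(train)
    # Every row that was not sampled for training goes to the test set.
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


# Stand-in for the species column: 50 rows of each of the three species.
labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
train_idx, test_idx = stratified_split(labels, p=0.65)
print(len(train_idx), len(test_idx))  # 99 51
```

Splitting within each class keeps the three species equally represented in both sets, which a plain random 65% sample would not guarantee.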
So now I'll go ahead and build a model on top of the training set. I'll be using the ctree function, which is part of the party package. To this ctree function I am giving two parameters. The first parameter is again the formula. I want to determine the species of the flower, so Species is my dependent variable and the rest of the columns should be the independent variables. That is why I have given the Species column on the left side of the tilde symbol and the dot on the right side of the tilde symbol, so the Species column is my dependent variable and the rest of the columns are my independent variables. I am building this model on top of the train set, and I am storing the result in the mytree object. So we have successfully built this model. Now I will go ahead and plot mytree. Let me remove this old plot over here and build this up, and let's see what we get. So this
is what we have. Over here the first split is being determined by the petal length column. If the petal length is less than or equal to 1.9, then the species of the flower would definitely be setosa. With 100% certainty you can say that if the petal length is less than or equal to 1.9, the species of the flower would be setosa. On the other hand, if the petal length is greater than 1.9, then you'll have the second condition. The second condition is whether petal width is less than or equal to 1.7. If it is less than or equal to 1.7, again you'll have the next condition, where you check whether petal length is less than or equal to 4.7. If petal length is less than or equal to 4.7, then your species is definitely versicolor. On the other hand, if your petal length is greater than 4.7, then there could be two options: there is roughly a 60% chance that it is versicolor and a 40% probability that it is virginica. And if petal width is greater than 1.7, then you come down the right side, and you can see that with 90% probability the species of the flower would be virginica and with 10% probability the species of the flower is versicolor.
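The splits just read off the plot amount to a small set of nested rules. Here is a sketch of them in plain Python (the talk itself uses R's ctree). The thresholds and leaf percentages are the ones quoted in the talk; the function names and the sample measurements are hypothetical.

```python
def leaf_probabilities(petal_length, petal_width):
    """Follow the splits described in the talk and return the leaf's class mix."""
    if petal_length <= 1.9:
        return {"setosa": 1.0}
    if petal_width <= 1.7:
        if petal_length <= 4.7:
            return {"versicolor": 1.0}
        # Mixed leaf: roughly 60% versicolor and 40% virginica in the talk.
        return {"versicolor": 0.6, "virginica": 0.4}
    # Petal width above 1.7: about 90% virginica, 10% versicolor.
    return {"virginica": 0.9, "versicolor": 0.1}


def predict_species(petal_length, petal_width):
    """Predict the most probable class of the leaf the flower lands in."""
    probs = leaf_probabilities(petal_length, petal_width)
    return max(probs, key=probs.get)


print(predict_species(1.4, 0.2))  # setosa
print(predict_species(5.0, 1.5))  # versicolor
print(predict_species(5.5, 2.1))  # virginica
```

Only petal length and petal width appear in these rules; the sepal columns are never consulted.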
So we have built a model and we have also plotted the model. Now it's time to predict the values. To predict the values I'll again be using the predict function, and this takes in three parameters. The first parameter is the model which we have just built, the second parameter is the test set on which we are trying to predict the values, and then we have the type of prediction, which is response. I am storing this result in the mypred object. Right, so we have predicted the values and stored the result in this mypred object. Now I will go ahead and create a confusion matrix again. The actual values are stored in test$Species and the predicted values are stored in the mypred object. So I have successfully created this confusion matrix over here. Let's understand this confusion matrix. We see that all of those records where the species is setosa have been correctly classified as setosa. If we come to versicolor, again all the records where the species of the flower was versicolor have been correctly classified as versicolor. Similarly, of the records where the species of the flower was virginica, 15 have been correctly classified as virginica, but two of them have been incorrectly classified as versicolor. So you have only this part where there is misclassification, and for the rest of the cases, setosa and versicolor, there is 100% correct classification. So let's go ahead and find out the accuracy. That would be 17 plus 17 plus 15, which is your left diagonal over here, divided by all of the values, which are 17 plus 17 plus 15 plus 2, and you get an accuracy of 96% with this model which you've just built, right?
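The three-class confusion matrix described above can be checked with a few lines of plain Python (again only a sketch; the talk builds the matrix with R's table function). The counts are those quoted in the talk, with rows taken as actual species and columns as predicted species. The per-class figure computed at the end is an addition for illustration and is not something the talk reports.

```python
species = ["setosa", "versicolor", "virginica"]

# Rows are actual species, columns are predicted species.
confusion = [
    [17, 0, 0],   # actual setosa
    [0, 17, 0],   # actual versicolor
    [0, 2, 15],   # actual virginica: two predicted as versicolor
]

correct = sum(confusion[i][i] for i in range(len(species)))
total = sum(sum(row) for row in confusion)
accuracy = correct / total

# Share of each actual species that was predicted correctly.
per_class = {
    name: confusion[i][i] / sum(confusion[i])
    for i, name in enumerate(species)
}

print(round(accuracy, 2))  # 0.96
```

That is 49 correct out of 51 test records, and `per_class` shows that all of the error comes from the virginica row.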
Now let's understand something about this model. As you see, only two columns have been used for the prediction of the species of the flower: only petal length and petal width have been used, and the rest of the columns have not been used. That is why we are going to build this model again, where we'll use only petal length and petal width as our independent variables. So I am building my second model over here, and in this, Species would be my dependent variable and the independent variables would be petal length and petal width. I am building this model on top of the train set, and I am storing the result in the mytree2 object. So I have built this model. Now again, let me go ahead and visualize this. I'll use the plot command and pass the mytree2 object inside it. This is what you get, and it is the same plot which we got for the previous one, right? The only difference is that instead of using all the columns as our independent variables, we have used only petal length and petal width as our independent variables. All right,
so this is done. Now we will go ahead and predict the values. We have built a model; now it's time to predict the values. This time I am again predicting the values, where I'm passing in the first parameter as mytree2, the model which we have just built. I am predicting the values on top of the test set, and the type of prediction is response, which I am storing in mypred2. So I have made the prediction. Now let me go ahead and create the confusion matrix. I am passing in the actual values, which are present in test$Species, as the first parameter, and the second parameter is the predicted values, which are stored in mypred2. All right, so again we have the same result over here. So instead of using all the columns, if you just use the petal length and petal width columns, that is more than enough for us to determine what species this particular iris flower is. That's quite an interesting observation, isn't it? Right, so we
are done with binary classification, and we are also done with multi-class classification. Now we'll go ahead and do regression with a decision tree. For that we'll be working with the Boston data set and we'll be using a different function: we'll be working with the rpart library over here. rpart stands for recursive partitioning. Let me go ahead and load up this Boston data set. I will be using the read.csv function, passing in the path over here, and storing the result in a new object named Boston. Let me just type in View of Boston in here. Right, so this is my data set. Now let me also show you that this data set is part of the MASS package. So let me load the MASS package. Now let me put in a question mark and type in Boston over here. Right.
So this is our Boston data set, which comprises all of these columns. You've got crim, which is the per capita crime rate by town. Then you've got zn, which is the proportion of residential land zoned for lots over 25,000 square feet. Then you've got indus, the proportion of non-retail business acres per town. Out of all of these, our column of interest would be the medv column. This medv column stands for the median value of owner-occupied homes in $1000s. So I am trying to determine the median value of the house with respect to the rest of these columns over here. In your free time you can go through the description of this entire data set; you have all of these columns, and you can go through the description of this particular data set for yourself. So now that this is
done, and we have the details and have understood the data set, it's time to divide this data set into training and testing sets. For that we require the createDataPartition function, which is part of the caret package, so I am loading the caret package first. Once I have loaded the caret package, I am using this createDataPartition function. Again, I want to divide this data set into two parts on the basis of this medv column, and the split ratio which I'm giving is 0.70. So I want 70% of the records to be in the train set and 30% of the records to be in the test set, and I am storing this in split_tag.
Now I will go ahead and take all of those rows in the Boston data set which are present in split_tag and store them in the training set. Similarly, I am selecting all of those rows from the Boston data set which are not present in split_tag and storing them in the test set. So I've got my training and testing sets ready. Again, I will type nrow of train, and I see that there are 356 records. Now let me type in nrow of test over here: I've got 150 records in the test set. So I have divided this data set into training and testing sets. Now it's time to build a model on top of the train set, so I'll be using the rpart function, which is part of the rpart package. The formula is pretty much the same. In the first parameter I am giving the medv column as my dependent variable, putting it on the left side of the tilde symbol, and I am taking the rest of the columns as my independent variables. I am building this model on top of the train set, and I am storing it in a new object named mytree. So I have successfully built this model. Now I will go ahead and visualize the model which I've just built, and to visualize it we have a package called rpart.plot. This rpart.plot package gives me the rpart.plot function, and I am passing the mytree object inside this. So let me see what we get. Right, so
over here, as we see, the first split at the root node is based on the lstat column. We are checking whether the value of lstat is greater than or equal to 7.7. If it is greater than or equal to 7.7, then we come over here and again check whether lstat is greater than or equal to 15. If this is true we do another check, where we find out whether crim is greater than or equal to 6.3. If that is true, the median value of the house would be $12,000, and if crim is not greater than or equal to 6.3, the median price would be $17,000. Let's come to this higher end over here, where the median price of the house is $46,000, and see how we get $46,000. First is your lstat value: if your lstat value is not greater than or equal to 7.7, then you come to this side and check whether rm is less than or greater than 7.5. If rm is greater than 7.5, then you come over here and say that the median price of the house would be $46,000. So this is how you find out the median price of the house. Similarly, take this case over here: lstat is not greater than or equal to 7.7, so we come over here; again, if rm is less than 7.5, we come over here; and if it is greater than 92, we come to the right side and say that the price of the house would be equal to $42,000. All right, so we are done with the
visualization part of it. Now we will go ahead and predict the values. We have built a model; now it's time to predict the values. The predict function takes in two parameters: the first parameter is the model which we have just built, and the second parameter is the data set on which we are trying to predict the values, which is the test set. I am storing the predicted values in predict_tree. So I have successfully predicted the values as well. Now that this is done, it's time for me to bind the actual values and the predicted values. I'll be using the cbind function. The actual values are stored in the medv column of the test set, and the predicted values are stored in the predict_tree object which we have just created. I am binding these two and storing them in a new object named final_data. Let me have a glance at this: View of final_data. These are the actual values of the median price of the house, and these are the predicted values of the median price of the house. For example, where the actual was 24 it was predicted as 25, and so on for the other rows. So as you see, you basically have an error in prediction for all of these. And if you want to find out the average error in prediction, we have to get the root mean square error. So this is
actually a matrix. Let me show you the class of this: I'll type in class of final_data, and you see that this is a matrix. I have to convert this matrix into a data frame, so for that I'll use as.data.frame, pass final_data into it, and store the result; I'm actually storing it in the same object, which is final_data. Now let me check the class again. As you see, this is a data frame now. So now I will go ahead and subtract the predicted values from the actual values, and that will give me the error in prediction. When I subtract this particular value from this particular value I get the error in prediction, and that is what I'm doing: I have final_data$predicted and final_data$actual, I'm subtracting these two, and I am storing the result in something called error. Now I will bind the error back to final_data, so I use cbind of final_data and error, and I am storing this in the same final_data. Let me also View final_data: actual values, predicted values, and the error in prediction over here. Now I will go ahead and find out the root mean square error. First I will take the square of the error, so final_data$error and I take the square of this. Then I will take the mean, so mean square, and when I take the square root I get the root mean square error. I am storing this in rmse1. Let me show you what the value of rmse1 is. So I get a value of 4.99.
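The three steps just described (square the errors, take the mean, take the square root) can be sketched in plain Python as follows. The talk does this in R on the real test set; the four actual and predicted values below are hypothetical stand-ins, so the result is not the 4.99 reported in the talk.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, with error defined as actual minus predicted."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_square = sum(e * e for e in errors) / len(errors)
    return math.sqrt(mean_square)


# Hypothetical median home values in $1000s, for illustration only.
actual = [24.0, 30.0, 18.0, 21.0]
predicted = [25.0, 28.0, 18.0, 24.0]

print(rmse(actual, predicted))
```

Because the errors are squared before averaging, the sign of each error does not matter and large misses count for more than small ones.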
All right, that is my first model, and I've got the RMSE value, the root mean square error, to be 4.99. Now one interesting thing over here is that we are not actually using all of the columns, right? We are only using lstat, rm, age, dis and crim. So instead of giving all of the columns, let me just use these particular columns and build a second model. So I'm building this second model where my dependent variable is again medv, but my independent variables would be only lstat, rm, age, crim and dis. So I have added all of these as my independent variables, I am building this model on top of the train set, and I am storing it in a new model named mytree2. Right, so I have built this model. Now let me go ahead and predict the values. I'll use the predict function again and pass in mytree2 as the first parameter, and the second parameter would be the test set, because I want to predict the values on top of the test set. I will store this in a new object named predict_tree2. All right,
now I will go ahead and bind the actual values and the predicted values. The actual values are the medv column from the test set, and the predicted values are this object predict_tree2. I am naming one as actual and the other as predicted, and I am storing this in final_data2. Now I'll have a glance at final_data2, so View of final_data2. I've got my actual values and the predicted values over here. Now it's time to convert this matrix into a data frame, so I use the as.data.frame function, pass final_data2 inside it, and store the result in final_data2. Now let me go ahead and calculate the error in prediction. The error in prediction is actual minus predicted, so I'll type final_data2$actual minus final_data2$predicted and store this in error2. So I have the error in prediction. Now let me go ahead and bind the error back to the original data, and I am storing this in the same final_data2 again. Right, so View of final_data2: I've got the actual values, the predicted values, and the error in prediction. Now let me find out the root mean square error for it. I'm taking the square of the error, then I'll take the mean of it, then I'll take the square root of it, and I'll store it in rmse2. Let me print out rmse2, which is 4.99. Let me also print out rmse1, which is 4.99.
So as you see, the result is basically the same. Instead of including all of the columns, if we just include these columns we get the same result. That is why you don't have to overfit your model; it's not necessary to build a model on top of all of the columns. By visualizing, you got to know that if I include just these particular columns, I get the same sort of accuracy, or rather the same sort of error in the model at the time of
building. Decision tree is one of the most commonly used algorithms in the world, in practical work, for some reasons that we'll soon see. It may not be one of the best algorithms under all situations. Under some situations you will see that logistic regression probably gives better results than a decision tree. But still, there is something about the decision tree which makes it stand out, so let's see what that something is. And before I go too far: there is no one single algorithm which is considered the best. In production we might use a mix of algorithms, put together in one group. That group is called an ensemble, a French word, and we use the group as one single model. The ensemble techniques we will be covering later. So don't form any impression that this algorithm is better than that algorithm. Some algorithms perform well in some situations, other algorithms perform well in others, so we always have to choose. Have you seen decision trees before?
For people who come with a lot of management experience, the decision tree is a very standard management tool that is used under certain specific conditions. Can you tell me what those conditions are, in which fields decision trees are used? What could be those situations where they use decision trees? In fact, decision trees are used in our day-to-day life. We all use them every day. What should be
a started the ocean So that many my
slippers it is a kind of floater defense
It helps you to take decisions in those situations where multiple states are possible and you don't know what's going to happen. You want to minimize your risk. So the decision tree is one of the standard techniques that is employed to decide what action to take, what sequence of steps to take. People who have used decision trees, you'll find a lot of similarity between what I'm going to do and how they have been used already. Shall we look at this now? Decision tree. For people who have not been exposed to decision trees, let me give some brief idea about them. They can be used for classification as well as regression; most of the time they are used for classification. Since you people have
built your model already, what was the data set on which you built it earlier? Was it mpg, the cars mpg data set? What accuracy did you get on it, around 74%? Something in the range of 74 to 76% is what you'll get for that particular mpg data set: a linear model will give you in the range of 74 to 76. Use a decision tree on that and you get 85 plus percent accuracy, a phenomenal jump in terms of accuracy percentage. 85% plus: that is the beauty of this. So it can be used for classification and it can be used for regression, both, though it is mostly used for classification. It
consists of what we call the nodes and the branches, where the nodes represent some decision you're taking or some function you're going to execute, some logical function. Those functions are executed on your independent attributes, the columns in your data. For people who have not seen a decision tree before, this is the overall structure. A decision tree can continue to any level; I'm just drawing a few levels. This is the root node. All the things which are in a, what is the shape called, oval, all these oval shapes are my nodes. Nodes represent any function call that you're going to make on your independent attributes. The result of the function can be this or this, a binary result, true or false. So this is the result of the function executed here, and these are the results of the function executed here. So at each node we execute a function.
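As a minimal sketch of what "executing a function at a node" means, the Python below routes each record to one of two branches according to a true/false test on a single independent attribute. Everything here is hypothetical and for illustration only: the attribute names, the threshold of 40, the four records, and the helper names `split` and `income_below_40`.

```python
def split(records, test):
    """Route each record to the true branch or the false branch of a node."""
    true_branch = [r for r in records if test(r)]
    false_branch = [r for r in records if not test(r)]
    return true_branch, false_branch


# Hypothetical records with two independent attributes.
records = [
    {"income": 20, "age": 25},
    {"income": 55, "age": 40},
    {"income": 35, "age": 31},
    {"income": 80, "age": 52},
]


def income_below_40(record):
    """The node's logical function: a binary test on one attribute."""
    return record["income"] < 40


left, right = split(records, income_below_40)
print(len(left), len(right))  # 2 2
```

Applying another test to `left` or `right` in the same way gives the next level of the tree.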
All these are functions. This one is called the root node, and all those nodes which are not further divided are leaf-level nodes. In between the leaf nodes and the root node you have these function nodes. The algorithm that we're going to use is called CART, Classification and Regression Trees. It is a binary tree creator: it creates only two branches at every node. There are some commercially available algorithms which can create multiple branches at each node. You might be wondering why two branches, why not three branches. There are commercial algorithms available, but the one which we use in scikit-learn is open source; it is called CART. Some of you might be wondering whether that means it can be used only for binary classification. A decision tree can be used for character recognition in the English alphabet. How many characters are there in the English alphabet? 26. We can use a decision tree to classify those characters into 26 classes. What will happen is it will do A versus others, B versus others, C versus others. So even though every split is binary, it can be used for multi-class classification; it doesn't have to be binary classification. Are you with me? Just because the nature of the algorithm is to split into two every time does not mean it can be used only for binary classification. It can be used for multi-class classification. All right. What
happens is that the nodes are of three types: root node, branch nodes and leaf nodes. At the branch nodes we have the functions and then the results of the functions. The root node represents your entire data set. Suppose you're using a decision tree for classification. In a classification problem you're given a record, say an image that belongs to an airplane or a ship or a zebra, and you want to classify the given image into one of these three categories. What the decision tree will do is tell you the probabilities: the probability that this is an airplane is 70%, the probability that this is a zebra is 2%, and so on for a ship. Those percentages are given at this level, the leaf level. Those percentages are called posterior probabilities. Have any of you covered probability-based analysis like Naive Bayes? Not yet, right? There they will talk about something called posterior probability. When you do those analyses you talk about prior probabilities and posterior probabilities. Posterior means: after knowing all these things, what is the probability that this belongs to a particular class.
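One common way to obtain such leaf-level percentages is to take the share of each class among the training records that end up in the leaf. The sketch below does that in plain Python. The leaf contents are hypothetical: the 70% airplane and 2% zebra shares follow the talk, while the leaf size of 50 and the ship share (taken as the remainder) are assumptions, as is the function name `leaf_posterior`.

```python
from collections import Counter


def leaf_posterior(labels):
    """Class proportions among the training records that reached one leaf."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical leaf holding 50 training images. The ship count is an
# assumption: it is simply the remainder after airplane and zebra.
leaf = ["airplane"] * 35 + ["ship"] * 14 + ["zebra"] * 1

posterior = leaf_posterior(leaf)
majority = max(posterior, key=posterior.get)
print(posterior["airplane"], posterior["zebra"], majority)  # 0.7 0.02 airplane
```

The class with the largest share, here airplane, is the label the leaf would predict.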
Hence the name posterior. So the question is, what do we mean when we say a leaf node belongs to the majority class? Suppose you have a data set, and for simplicity's sake let's take only two classes. You have a data set where these are the independent attributes, I1, I2, I3 and so on, and the target variable is a class. In the class there can be two types: one is defaulter, the other is non-defaulter. You're working in a bank, and the management in the bank has asked you to create a model where the model should take as input the parameters about the various customers, their income, the age category, where the person lives, so on and so forth, and the model should tell you what is the probability that this person will be a defaulter versus a non-defaulter. Are you with me? Okay, so the
bank gives you all the historical data of people who have taken loans. Some of them were defaulters, some of them were non-defaulters. Are you with me? Okay. Now, when I create a decision tree, what happens is that this entire data set is used at the root node. There's a lot of detail to it; I'll come to those details gradually. Then, based on some function which is executed on one of these independent attributes, your data gets split into two. When the data is split into two, you have some records here and some records here. Find out what the majority is here, defaulters or non-defaulters. Find out what the majority is here, defaulters or non-defaulters. Suppose the majority here is defaulters; then this node gets the label of defaulters. Suppose the majority here is non-defaulters; then this gets the label of non-defaulters. But please understand, it is not 100% non-defaulters and it is not 100% defaulters. It's a mix, but the majority is that. All right. Since it's still
a mix, I apply some other function, on another attribute, to this data
set. So this smaller data set gets further split into smaller sets. In
this data set, suppose all of them are defaulters only; then this node
does not need to be split any further. It's already completely
homogeneous. So I know that when somebody follows this path he's likely
to be a defaulter, and I don't need to grow this part of the tree. On
the other hand, this one has still got some mix of defaulters and
non-defaulters. Suppose the majority of them are non-defaulters. So, on
some other function which is executed on one of the other attributes,
this data gets split again, and suppose the split is such that all
these are non-defaulters and all these are defaulters; then you don't
need to split any further. Now I know under what combination a person
becomes a defaulter: maybe this combination of functions, or this
combination of functions. So at
every node we find out what the majority class is, and the label of the
majority class is assigned to the node. But the fact is that unless
you're at the leaf level, whenever you're in between the root and the
leaf, the nodes will always be mixed, contaminated; they are not pure
nodes. We want to reach a stage in our decision tree where we have pure
nodes. Pure nodes means nodes which belong completely to defaulters
only or to non-defaulters only. When I reach that stage, I know under
what combinations somebody becomes a defaulter and under what
combinations somebody becomes a non-defaulter. So I know all those
combinations of characteristics which lead somebody to become a
defaulter, and the other combinations which lead somebody to become a
non-defaulter. This is the path that I'm looking for. This is the model
I'm looking for.
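The labelling rule described above (majority class per node, with the
class proportions doubling as probabilities) can be sketched in a few
lines of Python. This is only an illustration: the function name
`summarize_node` and the node contents are made up, not taken from the
lecture's data.

```python
from collections import Counter

def summarize_node(labels):
    """Return (majority_label, class_probabilities, is_pure) for one node."""
    counts = Counter(labels)
    total = len(labels)
    majority_label, _ = counts.most_common(1)[0]
    # The share of each class in the node is the probability the node reports.
    probabilities = {label: n / total for label, n in counts.items()}
    return majority_label, probabilities, len(counts) == 1

# Hypothetical node contents (illustrative only).
mixed_node = ["defaulter"] * 7 + ["non-defaulter"] * 3
pure_node = ["non-defaulter"] * 5

print(summarize_node(mixed_node))
print(summarize_node(pure_node))
```

A mixed node still gets a single label, but its probabilities are below
1; a pure node reports probability 1 for its only class.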
Unlike linear models, which find the best-fit surface for you, the
best-fit line that expresses the relationship between the independent
variables and the target variable, the decision tree expresses the
relationship between the independent variables and the target class
using these paths. The paths tell you how the independent variables are
related to the target class. That is the model. And the beauty of
decision trees is this: not in scikit-learn, but in the other language,
R, I can convert this whole decision tree into English rules. It will
print the rules for me: if the person's income group is this, and his
age is that, and his education qualification is this, very likely he is
going to be a non-defaulter. Those English rules I convert into Java
code and deploy. That's my model. There is much more to it than what I
have told you; I'll take you into the details. But to answer your
question, that is what this means. All right. So if you're doing
classification, then the objective of the decision tree algorithm is to
create nodes that are as pure as possible. Pure means the node should
belong to only one class. That is the objective. In case you're using
decision trees for regression, calculating the mpg values, then the
objective is that at each node the variance of the data points should
be minimized. Now
look at this. All of you know what variance is? What is variance? Don't
give me the formula; I'm not asking for the formula. Understand the
concept. Variance is, on average, how dissimilar the data points are.
Variance is, on average, how dissimilar a data point is from the
central value. That is what we call variance. Variance, standard
deviation: all of these are nothing but ways of expressing how
dissimilar a point is from the central value. Okay. The objective of
the decision tree is to collect all those records under the different
cells such that the variance within the cells is minimized. All similar
records come together.
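To make the regression objective concrete, here is a minimal sketch
that compares the variance of a target column before a split with the
size-weighted variance inside the two cells after it. The mpg values
and the split position are invented for illustration.

```python
def variance(values):
    """Average squared distance of each value from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def weighted_child_variance(left, right):
    """Variance inside the two cells, weighted by how many records each holds."""
    total = len(left) + len(right)
    return (len(left) * variance(left) + len(right) * variance(right)) / total

mpg = [18.0, 20.0, 19.0, 33.0, 35.0, 34.0]  # made-up target values
left, right = mpg[:3], mpg[3:]               # one candidate split

print("variance before split:", variance(mpg))
print("variance within cells:", weighted_child_variance(left, right))
```

A good split is one where the second number is much smaller than the
first, because similar records have ended up in the same cell.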
At the leaf level, the variance within the cell is minimized, right? It
might sound slightly Greek or Latin to you right now. Let's go further
down the road and do some hands-on; bring up all the questions that are
bothering you. Right, so this is the Gini index, and this is called the
entropy index. I'll show you a picture of the relationship between the
two. What you see here in the red dots, these red dashes, is the curve
which represents Gini. Gini ranges from 0 to 0.5, and entropy goes from
0 to 1. That is the only difference between these two curves. If I
scale down the entropy curve, then this green curve represents the
scaled-down entropy. As you can see, there is not much difference
between the red and the green. Green represents entropy, red represents
Gini. You can use either of the formulas; they give you much the same
result. The remaining curve we don't use in decision trees, so I'm not
going to talk about it. It is more related to your confusion matrix.
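The two curves can be reproduced numerically. The sketch below
evaluates the standard two-class Gini and entropy formulas at a few
values of p (the share of one class in a node) and also prints entropy
divided by two, which is the scaled-down curve from the picture. The
chosen values of p are arbitrary.

```python
import math

def gini(p):
    """Two-class Gini impurity when one class has proportion p."""
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)

def entropy(p):
    """Two-class entropy in bits when one class has proportion p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

for p in [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]:
    print(f"p={p:.2f}  gini={gini(p):.3f}  entropy={entropy(p):.3f}  "
          f"entropy/2={entropy(p) / 2:.3f}")
```

Both measures are zero for a pure node and peak at p = 0.5, where Gini
reaches 0.5 and entropy reaches 1.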
If you have done the confusion matrix, you will see it there. Okay.
"That 0.5 point for Gini: does it mean that is the level up to which we
should be using it? If the impurity is more than that, we won't be able
to use the Gini index formula?" No, no. What is the significance? These
are two ways of measuring your impurity. Okay? In entropy, the impurity
ranges from zero to one. If you use the Gini formula for measuring, the
maximum you get is 0.5; 0.5 is the maximum impurity in Gini. Okay? They
are two different ways of measuring the same thing: one maxes out at 1,
the other at 0.5. One more thing; it's important
that I bring it out here itself. This relationship, entropy going up to
1 and Gini going up to 0.5, will hold only when you have two classes,
one versus the other. If you have three classes, then this bar will go
further and this one will go further. Okay? I don't want to confuse you
with that, so right now let's keep it out of the picture. But keep this
in mind: this relationship on the scale is true only for two classes,
one versus the other: A versus the others, B versus the others. We call
it OvR; OvR stands for one-versus-rest. You will see this when we run
the algorithm.
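A quick way to see why the 1 and 0.5 ceilings apply only to two classes
is to evaluate both measures on a perfectly even mix of k classes. This
sketch uses the general multi-class formulas; the choice of k = 2, 3, 4
is arbitrary.

```python
import math

def gini(probs):
    """Gini impurity for a list of class proportions."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Entropy in bits for a list of class proportions."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

for k in (2, 3, 4):
    uniform = [1.0 / k] * k  # the most mixed node possible with k classes
    print(k, "classes:",
          "max gini =", round(gini(uniform), 3),
          "max entropy =", round(entropy(uniform), 3))
```

With k classes the maximum Gini is 1 - 1/k and the maximum entropy is
log2(k), so both ceilings move up as soon as a third class appears.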
These are the different decision tree algorithms available. ID3, which
stands for Iterative Dichotomiser 3, is very old; we don't use it
anymore. C4.5 overcame some of the limitations of ID3. Our CART is
based on this; in Python, the CART algorithm is based on this. C5.0 is
the latest available, but it is a commercial version, not open source,
so we can't use it. In Python what we have implemented is CART, which
is conceptually close to C4.5; C4.5 is open source. Now, advantages and
disadvantages. Just as all algorithms have advantages and
disadvantages, so do decision trees. The advantages of decision trees:
Simple. You don't need to go into hi-fi statistics; you don't need to
worry about p-values and all those things. Simple. It's very fast in
production, when you're classifying. Why is it very fast? Because the
algorithm slashes your uncertainty by half at the very first level. At
every level it slashes the number of records to check by roughly half,
so its performance is very fast. It does well with noisy data and
missing data. It does well with missing data: I treat missing data as
another category, and you don't need to process it. "It does well with
noisy data" is something I take with a pinch of salt, because this is
what you will come across when reading books and articles, but no
algorithm can really handle noise, honestly. And as I told you, in a
decision tree even a slight change in the distribution of the
underlying data set can change the tree structure completely. So I
would take this with a slight pinch of salt. But this is true: it can
handle numerical and categorical variables. You don't need hi-fi
knowledge of statistics or mathematics to implement, to make use of, a
decision tree. The formula I showed you is only conceptual; you don't
even need to know that. There are some disadvantages which we need
to be aware of. As it says, it's biased towards those columns which
have a large number of distinct values. Why? Because that gives the
decision tree that much more room to experiment. It may not be very
optimal in modelling relationships, because it is axis-parallel. Let me
tell you what that means. It's simple, nothing to worry about. Are you
all familiar with the feature space? Have you heard of this term,
feature space? Okay, so you have not heard of feature space before.
Nothing to worry about. Okay. All right. Now, you have a data frame
with you, and in this data frame let's assume you have three
independent variables and one target: I1, I2, I3. Suppose the values in
the first record are 5, 7 and 3, and the target value is some mpg. Then
you have another record with 10, 2 and 1, and its own target value, and
so on. That is your data frame. Whenever you learn any machine learning
algorithm, all machine learning algorithms are taught in the context of
a virtual concept. It's not real. That virtual concept is called a
feature space, or a mathematical space. Now, these three dimensions
which you see here: this is your I1, this is your I2, this is your I3.
The independent variables become the dimensions of this mathematical
space. We call this a data vector. Now look at this: 5, 7, 3. Find 5 on
this axis, 7 on this, and 3 on this. Where these points meet, this blue
dot represents this record. Similarly, when I plot 10, 2, 1 (10 on
this, 2 on this and 1 on this), where these points meet, this blue dot
represents the second record. So every record becomes a dot in your
mathematical space; every record of the data frame becomes a point in
the mathematical space. All right, so now let
us assume you have done the best-fit model, the linear best-fit line
model. Yes or no? Yes. Okay, I know, in the last session only. So you
have this mathematical space, and in this mathematical space you have
data points scattered all over. You said you did it for mpg, right?
This is mpg, and this is the weight of the car. Maybe you have a third
dimension, but let's take only one dimension right now. Otherwise you
have the weight of the car, the acceleration, the horsepower and
various other attributes. Let's imagine it's the weight of the car. Now
this weight of the car has some distribution, and this mpg has some
distribution. The best-fit line that you found will always go through
the point where x-bar and y-bar meet. The linear model which you came
out with will always go through the point where x-bar and y-bar meet in
the mathematical space. That will be the hinge point. Okay?
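The claim that the best-fit line passes through the point (x-bar,
y-bar) is easy to check numerically. The sketch below fits a
one-variable least-squares line with the textbook formulas and then
evaluates it at x-bar. The weight and mpg values are invented for the
check.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c with a single predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx            # slope
    c = y_bar - m * x_bar    # intercept
    return m, c, x_bar, y_bar

weight = [2.0, 2.5, 3.0, 3.5, 4.0]     # made-up car weights
mpg = [34.0, 30.0, 27.0, 22.0, 18.0]   # made-up mpg values

m, c, x_bar, y_bar = fit_line(weight, mpg)
print("slope:", m, "intercept:", c)
print("line at x-bar:", m * x_bar + c, "y-bar:", y_bar)
```

The last two printed numbers agree, because the intercept is defined as
y-bar minus slope times x-bar.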
Why am I showing this to you? We are in decision trees. In a decision
tree, what happens is something slightly different. The linear model
gives you the best-fit line by reducing the sum of squared errors. Now
look at what the decision tree does. We're into classification right
now, but we can do regression also. So first let me explain
classification using this. Imagine you have two classes, and say the
classes are like this. We have been asked to build a
decision-tree-based model which separates these two classes, the
defaulters and the non-defaulters. So the first thing that you do is,
suppose I take income. This is income. Your model says: if the income
is less than some amount, the node is most likely to be defaulters; if
the income is greater than some amount, the node is likely to be
non-defaulters. In reality it is just the opposite. Reality
is the opposite. So the way it works, what the model is doing, is that
it tries to draw a perpendicular line. It tries to break your data into
two sets by dropping a perpendicular line at a particular threshold. It
says anybody less than this is a defaulter, anybody more than this is a
non-defaulter. Look at this. When I start, in the mathematical space
you have the root node. The root node reflects the entire data set:
maximum entropy. This entire data set is then further broken into two,
where you say income less than some amount comes to this side, and
greater than some amount comes to this side. So it drops a
perpendicular line here. With income less than this, the majority are
defaulters; in this one, the majority are non-defaulters. So this node
is labelled non-defaulter and this one defaulter. But you're not happy,
because there is still some mix-up, some heterogeneity. Then you look
at age. You see
that people above a certain age, say for example I take this age:
anybody above this age is likely to be a defaulter, anybody below this
age is likely to be a non-defaulter. So look at this node. This node
gets split into two based on age: all these are defaulters, these are
non-defaulters. This block that you see here in the space is this block
in the tree, and the block that got split is this block. As you can
see, this block is perfectly homogeneous; it doesn't need to be split
any further. But this block needs to be split. I can't show you more
than three dimensions, but it keeps on putting boundaries which are
parallel to the axes. That is why it is called an axis-parallel
algorithm. It puts boundaries in the mathematical space to separate out
the reds from the blues, and in the process you keep getting these
nodes. Each one of these nodes is a particular compartment in the
mathematical space. But as you can see, when the distribution is
angular, the decision tree will find it difficult to separate, whereas
if you had used a linear classifier, the linear classifier would simply
put an angled line and get much better results than the decision tree.
You would need a very complex decision tree to get the same accuracy as
a simple linear model. That is a limitation of decision trees, the
second point here: it may not be optimal, as it is axis-parallel.
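The axis-parallel limitation can be seen on a tiny synthetic example.
In the sketch below the two classes are separated by the diagonal
x1 = x2. It tries every single axis-parallel cut (one feature, one
threshold, either direction) and compares the best of them with one
angled boundary. The grid data are invented, and a real tree would of
course stack many cuts rather than use just one.

```python
def accuracy(predict, points, labels):
    """Share of points for which predict(point) matches the label."""
    hits = sum(1 for pt, y in zip(points, labels) if predict(pt) == y)
    return hits / len(points)

# Synthetic grid; the class boundary is the diagonal x1 = x2.
points = [(i, j) for i in range(10) for j in range(10) if i != j]
labels = [1 if x1 > x2 else 0 for x1, x2 in points]

# Best single axis-parallel cut: try every feature, threshold and direction.
best_stump = 0.0
for feature in (0, 1):
    for threshold in range(10):
        for above_is_one in (True, False):
            def stump(pt, f=feature, t=threshold, a=above_is_one):
                return int((pt[f] > t) == a)
            best_stump = max(best_stump, accuracy(stump, points, labels))

# One angled boundary: predict class 1 when x1 - x2 > 0.
angled = accuracy(lambda pt: int(pt[0] - pt[1] > 0), points, labels)

print("best single axis-parallel cut:", round(best_stump, 3))
print("one angled line:", angled)
```

No single perpendicular cut separates the classes, while the angled
line does it in one step; a tree needs a staircase of many cuts to
approximate the same boundary.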
"You said the leaf node is going to have only one class. So how do we
get the probability here? We want probability values as output. But if
that leaf is going to have only one class, what is the purpose of
calculating a probability? It's going to be 100%. Only if it is a
mixture of red and green can I calculate the probability." Yeah. To
answer this question you just have to wait for two more slides, where
I'll talk about this line: "Trees can become difficult to interpret."
I'll talk about this in elaborate detail, and it's simple. Just hang
on. The question is: if the leaf level has perfect homogeneity, does
calculating any probability make sense? I feel it is a beautiful
question. It has just come a bit early, so we'll take it up. So can I
say that uncertainty is a function of heterogeneity? The more
heterogeneity there is, the more uncertainty there is. Uncertainty is
the inverse of homogeneity. Heterogeneity is also called impurity;
homogeneity is also called purity. So when you're studying decision
trees you'll come across two words. Heterogeneity is also called
entropy, E-N-T-R-O-P-Y; you'll come across the word entropy. Then
you'll come across purity. Be aware that the two words, entropy and
purity, are opposite to each other. One more game. Okay, we'll not go
too far into
this. You get 10 chances to guess which famous personality I'm thinking
about. You have ten questions to ask. Okay. "Is the person male or
female?" "Indian or from another nation?" So look at these two
questions. I'm thinking of a famous personality, and the famous
personality can be from anywhere across the globe, right? So if I say
"not Indian", then you're left with so many other personalities across
the globe to look for; we don't know which one. He asked a question,
and the question was "male or female?". The moment I answer that
question, half of the uncertainty is removed. The moment I answer that
question, half of the uncertainty is gone. So in a decision tree, what
the algorithm does is try to ask the first question in such a way that
the drop in entropy is maximum. The decision tree algorithm is built in
such a way that at every level it tries to ask a suitable question,
just like you asked a question to me, resulting in the maximum drop in
entropy. Let me give you the detail of how this is done; we'll see
that. Yeah, half of the entropy is gone. So this is an experiment
which I wanted to show you: a bag full of coloured balls. The bag is
covered, inside a black bag, and you are asked to pull out one of these
balls. You don't know which ball is going to come out, and each colour
can take you to a different situation. So you don't know what situation
you're going to end up in: maximum uncertainty. On the other hand, if I
give you a bag like this and ask you to pull out a ball, you know what
colour is going to come. There is no uncertainty. In short,
heterogeneity means entropy; homogeneity means zero entropy. "At every
split in a decision tree, can we safely assume that the function is
applied such that the entropy has gone down to the maximum extent
possible when that function is applied at that node?" I'll answer that
question in a way. Okay?
I'll give you one fictitious example. Don't think too much about this
example; it is a fictitious example. The data set is borrowed from the
world of big data, but it's not really big data. This is a sales data
set. It has various columns in it: the order priority, the order
number, the location and all these things. In this data set there is a
column called province. The province is the place from where the person
has placed the order. You're a data scientist working in one of these
retail giants, and the task assigned to you is: can you build a model
which, by looking at the various other attributes, will help me decide
where this customer is located? So the last column is the location
column. The screen size is limited, so I've shown you only a subset of
the data; otherwise this is a long data set. In the last column you
have the various locations. Okay. Now the objective is to find out, by
looking at the other attributes, whether I can guess where my customer
is located. The target column is the last column, and the target column
is the equivalent of the bag which I showed you previously. The target
column has a mix of various locations in the original data set: some
records are from one province, some records are from another, and so on
and so forth. They represent the different coloured balls that you saw
here. This is the situation you are in when you take the entire data
set, and this is the target. So this target column is the bag. Are you
okay?
Are you all with me on this? What this decision tree algorithm, CART,
will do is find out for you, amongst these independent columns, a
particular column, and within that particular column it will also
select the break point. When it breaks the data on that column at that
very point, it results in two smaller files. These two smaller files
are your two nodes, the child nodes. Each one of them is one single
file. But in the process of doing this, what it has achieved is this:
whatever entropy was there in the target column of the original file,
when you look at these two files, the sum of the entropy of these two,
weighted by their sizes, will be less than the entropy of the parent,
the original file. The entropy of the target column in the child nodes
will be less than the entropy of the target column in the parent. Even
then, there may be a majority of some class in each child, but it is
still a mix-up. It's not pure yet; it has still not got 100% purity.
Then,
in that case, what it will do is take this file, this node, and find
again one of these independent attributes based on which it will break
that node into two smaller nodes: this node and this node. Now what
happens is that the entropy of the target column at this level will be
less than the entropy of the target column at the level above. The
entropy of the target column at this level will be less than the
entropy of the target column in this node. So the objective of the
decision tree is to break your data into smaller and smaller sets in
such a way that the entropy of the target column in the child nodes is
less than the entropy of the target column in the parent node. This is
exactly what we do when we play "guess the personality in 10
questions"; it is exactly what we do.
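The "pick the column that drops the entropy the most" step can be
sketched as follows. The records, column names and province values are
hypothetical stand-ins for the sales data on the slide, and the sketch
groups rows by category value, which is a simplification of the binary
break points that CART actually searches over.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def split_entropy(rows, column, target):
    """Size-weighted entropy of the target after grouping rows by `column`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[target])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

# Hypothetical sales records; names and values are illustrative only.
rows = [
    {"priority": "high", "ship_mode": "air",   "province": "North"},
    {"priority": "low",  "ship_mode": "air",   "province": "North"},
    {"priority": "high", "ship_mode": "truck", "province": "South"},
    {"priority": "low",  "ship_mode": "truck", "province": "South"},
]

parent = entropy([r["province"] for r in rows])
for column in ("priority", "ship_mode"):
    child = split_entropy(rows, column, "province")
    print(column, "drop in entropy:", parent - child)
```

Here splitting on `ship_mode` removes all the uncertainty about the
province, while splitting on `priority` removes none, so the algorithm
would ask the `ship_mode` question first.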
"With a sample size like that, breaking it down this way is probably
simple at a certain level. But when you come up with the actual data
you have a much larger population, so how does breaking it down in
terms of entropy hold up? My question is more about predictive
behaviour when you do this with a larger sample." He is asking a
question which we are actually many light years away from answering.
Honestly, we need to get some more concepts in place. But to answer
this
question: one of the limitations, one of the weaknesses, of decision
trees is this. From the same universe, from the same population, you
take two random samples. Build a decision tree on one and a decision
tree on the other, and they are likely to be different from each other.
Decision trees get easily impacted by the variance in the samples. Yes.
When you're using them, it is very important to make sure that your
sample is a true representation of the population, which is true for
all algorithms, but the decision tree is more sensitive to those
variances. Right? This will not be Greek at all if you know what a
sample is, what the universe is, and all these things. You know? Okay.
What is the difference between the two, the sample and the universe?
What is the difference? Okay, all right: a sample is a subset of the
population. But a very important point, which you must keep in mind:
the population is dynamic. It's constantly changing.
Two weeks ago, maybe a month ago, when I came from the airport to this
place, there was no construction activity going on on the road. This
time the route he took was totally different, and we really got stuck
in the mess on the way from the airport to this place. Okay, so imagine
you're a data scientist working for one of the taxi services, and you
have been assigned the task of creating a model which predicts, the
moment somebody sits in a taxi and starts the ride, what time he's
likely to reach his destination. That's what happens: when you sit in
the taxi, even before the taxi starts moving, it tells you by what time
you're likely to reach your destination. But in school I was told that
time taken is a function of distance and speed. The car hasn't even
started and it's telling me the time. How does that happen? Welcome to
the world of data science. What these data scientists have done is
they've taken data from the past, of all the customer engagements, and
built this model for you. But this data they have taken from the past.
This is a sample, and on this they've built a model. What about the
data which is being generated at this moment, when I'm testing the
model? That is not part of this model. That is why we say the universe,
the population, is always dynamic. It's not only large, it's dynamic;
it keeps changing. Whereas your sample is not only small, it's static.
And this brings a problem into data science. The
problem, the root cause of the problem, is that your sample can never
be a true representation of the real-world data. You might have a big
data set up on a cloud somewhere, on AWS, but still it's a sample,
because it's static. So that is the real difference between the two.
It's not just the size; it's also the fact that the universe is
dynamic. Data is continuously being generated, and new features, new
attributes are coming into play which are not reflected in the sample.
That is why, whatever algorithm you use, you have to make sure that
your samples are as good a representation as possible, and this
algorithm is very sensitive to these differences between the sample and
the universe. Next: preventing overfitting through regularization. This
is a very important concept that you will have to be aware of when
using decision trees. Decision
trees are known as non-parametric algorithms. What do we mean? Remember
the linear model, y = mx + c. In that model, what are the parameters? m
and c were the two parameters. Using the m and c, the algorithm gets
for you the best-fit line, which gives the minimum sum of squared
errors. Are you okay with that? Right. So it controls the m and c to
find the best-fit line for you. We don't have any such parameters in a
decision tree. We don't have any parameters which the decision tree
algorithm learns from your data. Hence it is called a non-parametric
algorithm. So what happens
is, when you make use of a decision tree, the decision tree will
happily go and create a massive tree for you, which places so many of
these cuts in the mathematical space that inside each compartment you
have only one record. Each compartment is one leaf, and in each leaf
you have only one record. But such trees, when you take them to
production, fall flat on their face. They give you zero error on the
training set, but on the test set the error will be very high. We call
such trees overfit models. Have you heard of overfit and underfit and
all these things? Are you all familiar with these concepts? Fantastic.
So the decision tree, when it has only one record at the leaf level, is
perfectly homogeneous, with zero error on the training set, but that
tree is useless as a model, because on the test data or in production
it flops. So what we do is roll back those leaves upwards; that is
called post-pruning. We chop the tree and we roll back those leaves
upwards. That concept is called regularization. You'll come across
regularization in linear models also; if not already, you'll probably
come across it later. There it is called lasso and ridge
regularization: L-A-S-S-O, lasso, and ridge, R-I-D-G-E. So when I
apply regularization to my trees, because I want to productionize them,
then at the leaf level you won't find 100% homogeneity; you'll also
find a mix. Yeah, right, that's what will happen. If you go and
visually analyse the tree, and you see that at the leaf level you have
one record in each leaf, or one or two and so on and so forth, that
means you're in the overfit zone. And as a data scientist I will never
put such a model into production. I will roll it back up; I use
post-pruning for this. So when I do that, I land up with a tree with
leaves which are mixed; they're not homogeneous. "So one step above
homogeneous is where you stop?" Maybe one step, or multiple steps; it
depends on the situation. In fact, that question brings me to another
point which I'll discuss with you. Let me take these two questions,
then I'll come back to it. Yes? In fact I
was going to take the same question: how do you know at what step you
should chop the tree, and not go further? Okay, this is applicable to
all models, not just decision trees. How do you know how much to
regularize? Whenever you build models, you always test your model on
both the training set and the test data simultaneously. Beginners to
data science, the mistake that they make is they try to build a
model... I caught you again, I caught you again! How about a break? You
want to grab a cup of coffee? I don't know whether it's allowed inside.
Okay. So what is this regularization, and when do we apply
regularization or any such technique? You should know how much to apply
and how much not to. Okay. So whenever you
build models: beginners to data science, the mistake that they make is
they try to build a model which gives zero errors on the training set.
That is a mistake. You should never do that. You should always allow
the model to make errors on your training data set. Your objective
should be minimal errors in production, which is represented by the
test data. Your test data represents the production data which your
model has not seen. Your objective should be to minimize the errors on
the test data, in production, not in training. Usually you'll find that
the accuracy you get in your training environment, the controlled
environment, is more than the accuracy you get in the production
environment. You'll find this happens often; 95% of the time, this is
what will happen. Your model is giving you 95% accuracy in your
controlled environment; when you take it into production it becomes
90%. It will rarely happen the other way around, where the production
environment gives you a higher accuracy than your controlled
environment. Very unlikely. What should not happen is 95% in your
controlled environment and 80% in the production environment. That
sudden drop in performance indicates that you're in the overfit zone.
So what we do is, when
we build the models, we test the models on both. On this axis is the
complexity of the model. Complexity is a function of the number of
dimensions or attributes. It also includes whether your model is a
simple linear model, quadratic, or cubic. On this axis is the error. I
check my training error and my test error both. I keep building a
model; I run it on the training data, this is training, and I also run
it on the test data. You'll often find that as you increase the
complexity of the model your errors come down. But after some point the
test error starts going up, while your training error continues towards
zero. After some point your test error starts going up while your
training error keeps going towards zero. Beyond that point, this is the
overfit zone; this is the underfit zone. The right fit is somewhere in
between. So when I build a decision
tree, a non-parametric algorithm, it creates a very complex tree for
you. You'll see this when we do the hands-on. I'll show you that that
particular tree gives you zero errors on the training set, but on the
test set the errors jump significantly. You're in the overfit zone.
What should happen is that whatever model you build should give you
more or less similar accuracies on both your training set and your test
set. Such models are likely to generalize in the real world. Okay. So
in decision trees we will have to prevent overfitting, and how do we do
it? Whenever you're using a decision tree, you have to do
regularization. It does not assume any kind of pre-known relationship
between the independent and target variables. It is a non-parametric
algorithm; it does not have any parameters to find the best-fit surface
for you. If left untouched, the decision tree will create a very
complex, uncontrolled model for you which will be overfit. To avoid
overfitting, we'll have to make use of regularization. I'll show you
how to use that. For that we use certain parameters; those parameters
are called hyperparameters. In modelling there are two types of
parameters. One is like m and c; these parameters are called model
parameters, and they are learned from your data set. There is another
set of parameters, called hyperparameters, which are used to control
the behaviour of the algorithm.
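To see a hyperparameter controlling the tree, here is a small
self-contained sketch: a one-feature tree grown with Gini, where
`min_leaf` (a name chosen here for illustration) is the smallest number
of records a leaf may hold. The training data follow the rule "class 1
when x is above 6" but contain two deliberately mislabelled records;
the test data are clean. Note that this sketch stops growth early
rather than pruning afterwards, so it is a stand-in for the
post-pruning mentioned in the lecture, not the same procedure.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def build(data, min_leaf):
    """Grow a one-feature tree on (x, label) pairs.

    min_leaf is a hyperparameter: the smallest number of records per leaf.
    """
    labels = [y for _, y in data]
    if len(set(labels)) == 1:
        return labels[0]                      # pure node becomes a leaf
    data = sorted(data)
    best = None
    for i in range(min_leaf, len(data) - min_leaf + 1):
        if data[i - 1][0] == data[i][0]:
            continue                          # cannot cut between equal x values
        left, right = data[:i], data[i:]
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right]))
        if best is None or score < best[0]:
            best = (score, (data[i - 1][0] + data[i][0]) / 2, left, right)
    if best is None:
        return majority(labels)               # no allowed cut: mixed leaf
    _, threshold, left, right = best
    return (threshold, build(left, min_leaf), build(right, min_leaf))

def predict(tree, x):
    while isinstance(tree, tuple):            # internal nodes are tuples
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

def error(tree, data):
    return sum(predict(tree, x) != y for x, y in data) / len(data)

# Made-up data: true rule is "1 if x > 6"; x=3 and x=10 are noisy labels.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0),
         (7, 1), (8, 1), (9, 1), (10, 0), (11, 1), (12, 1)]
test = [(1.4, 0), (2.8, 0), (3.3, 0), (4.6, 0), (5.2, 0),
        (7.1, 1), (8.9, 1), (9.7, 1), (10.2, 1), (11.6, 1)]

full_tree = build(train, min_leaf=1)      # grows until every leaf is pure
pruned_tree = build(train, min_leaf=3)    # leaves are allowed to stay mixed

for name, tree in (("min_leaf=1", full_tree), ("min_leaf=3", pruned_tree)):
    print(name, "train error:", error(tree, train),
          "test error:", error(tree, test))
```

The unrestricted tree memorises the two noisy records, so it has zero
training error but does worse on the test set; the restricted tree
accepts a couple of training errors and generalises better on this toy
data.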
I can make out when somebody is lost. If you have any problem, don't
hesitate to ask; don't get left behind just because you didn't ask.
Don't feel that way. If you're stuck somewhere, tell me. You have a
question? What do you want me to repeat? On regularization? Okay, yeah,
sure. Let's do it. Let's start from here then.
We have a plot of error versus complexity. We should always draw this. For the time being,
imagine complexity is equal to the number of dimensions, the attributes, you use in your
model. Complexity can also be y = mx + c versus y = m1 x1 + m2 x2^2 + c. That is another
way of bringing in complexity. But I'm looking at the first one: as you keep on adding more
and more dimensions to your model, you bring in complexity. OK. Whenever you are building
models, you will never get the best model in the first iteration. At least, it has never
happened to me. You will always get good models after multiple iterations. The best model,
the one which will survive in the production world and give you results, you find after a
lot of iterations. So in the first iteration I might model mpg versus weight, acceleration
and something else. In the second iteration I may drop acceleration and take weight and
horsepower. In the third iteration I might say, OK, horsepower and weight are linearly
related to each other; let me combine them into one and use that one dimension for mpg. So
I may go through all these things before I finally settle down on a model which gives me
good results. All these iterations that you do are what is shown here, moving from the
simplest model to the most complex model. In every iteration I test against both the
training data and the test data. Let me show the test one as well. So this is my test
error. I'm deliberately showing the test error slightly higher than the training error,
because usually we break our data 70:30. When you break your data 70:30, you give more data
for learning and less data for testing. Does that make sense? OK. You give more data for
the algorithm to learn from and less data for the algorithm to test on.
There's a very basic assumption you're making here. I'm not sure whether you are aware of
it. Think of when you prepare for your board exams. The reason I'm giving that example is
that my daughter recently took the board exams. The larger the training set she has, the
more sample questions she has, the more likely she is to perform well in the board exam.
Right? In the board exam she is going to get only a few questions, say 20 questions, but
she would have prepared herself on thousands of questions. So we assume that the 20
question patterns will be somewhere in these 1000 question patterns. Are you with me? If
she has done these 1000 well, the 20 questions would be a subset of the 1000; their
patterns will be a subset of these. That is what the assumption is. But in data science
this assumption is many times violated. When I break the data into 70:30 or 80:20, some of
the patterns in the 20 are not there in the 80, or not there in the 70, so the model has
not learned them. Now when you go and check this model on the test data, there are some
patterns in the test data which the model has not seen before. So it starts giving a higher
error on the test data. Yes, but
what you will see is that, as you increase the complexity of the model, both the training
error and the test error keep falling. They both decrease. But what do you notice? After
some point your training error keeps going down while your test error goes back up. This
point, where the test error starts deviating upwards, is my best-fit point. On this side
I'm in the overfit zone. My model will give me zero error in training. My daughter will
give me a 100% score on all her sample questions, but in production, in the board exam, she
will bomb. Right? Fortunately that didn't happen. So that is the overfit zone. In the same
way, underfit is also not good. Underfit is a suboptimal model; overfit is a useless model.
You ought to be somewhere in between: the right-fit model.
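The best-fit point on the error-versus-complexity plot can be picked out mechanically. The sketch below uses made-up error values for five hypothetical models of increasing complexity; the numbers and the helper name `best_fit_index` are assumptions for illustration only, not results from the lecture.

```python
# Made-up error values for five models of increasing complexity.
complexity = [1, 2, 3, 4, 5]
train_error = [0.30, 0.20, 0.12, 0.05, 0.00]   # keeps falling towards zero
test_error = [0.33, 0.24, 0.17, 0.21, 0.29]    # falls, then turns back up


def best_fit_index(test_errors):
    """Index of the model with the lowest test error."""
    return min(range(len(test_errors)), key=lambda i: test_errors[i])


best = best_fit_index(test_error)
for i, c in enumerate(complexity):
    if i < best:
        zone = "underfit"
    elif i == best:
        zone = "best fit"
    else:
        zone = "overfit"
    print(c, train_error[i], test_error[i], zone)
```

Note that the training error alone would pick the most complex model; only the test error shows where the curve turns.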
You expect the same patterns, the same conditions? No, the same patterns will not always be
there. OK, let me finish this off. What time is the tea break? Is it 3:30? OK. So, coming
to your question,
what happens is this. When you are in the overfit zone, your model is flexible. It is
trying to adjust to the whole distribution of the data points in the mathematical space of
the training set. Let me explain this to you. So this is my mathematical space. This is my
independent attribute x1, this is my independent attribute x2, and my data points are
distributed like this. There are two classes, and the data points are distributed like
that: some blues, some reds. I want to separate these two classes. OK. Suppose a simple
model, a simple decision tree with only two attributes, will not give you good results. So
I make use of a nonlinear model, and I draw a very complex surface like this, which cleanly
separates the blues and the reds. Believe it or not, this surface can be represented in
mathematics as an equation. You must have heard about Fourier analysis, if you come from
that background. Fourier analysis can build a very complex function for you. On the
training set this model gives zero error. Now I take this model onto another mathematical
space: the test data. In the test data, these blue points and red points might be
distributed in different ways. They might be in different positions; there is no guarantee
that they are in the same positions. Now if I take this model as it is and transfer it
here, as you can see, it goes something like this, and your model is going to give too many
errors. Can I take the liberty of
showing you something? I will be slightly deviating here so that we get the concepts in
place. It is connected to the regularization concept which I'm going to talk about in
decision trees. This is two-dimensional, and this is the nature of your universe, your
population. Data points are being continuously generated around the central values, and
your data points keep on increasing in volume. OK. What also happens in the universe is
that new attributes come into play, which I am not showing. So your population is always
dynamic. It is not merely large; it is dynamic. Out of this population I pick a snapshot.
That snapshot is my training set. A snapshot is static: it doesn't change, and it is
smaller in size. On this snapshot I create a very complex model to separate the blues from
the reds: zero error in training. I take another snapshot of this population. In that
snapshot the data points might lie in somewhat different positions, because they are
jiggling randomly around their respective central values. When I bring this model and
project it onto this snapshot, it starts making errors. So when you are in the overfit
zone, your model is extremely complex and gives very good results on your training set. But
when you take the model to the test data, the model's performance
drops. That happens because you absorb the distribution patterns and the noise in this data
set into your model. But when you put that model onto this data, the noise is different.
Noise is the jiggle of the data points away from the central values, and here it is
different. So the model performs poorly. This exact point, the one I told you about, the
point where the test error starts to deviate, is likely to be the best fit. Fine. And then
I'll take one more slide; hopefully the concepts will fall in place. Any more questions?
Yes: when you use the model in production, your best-fit line will keep changing. OK, what
he is saying is this, and let me elaborate, because it's actually a very important point.
A model is not for life. Every model will need to be recalibrated. How frequently you
recalibrate the models to keep the same accuracy level depends on your domain. Take the
example of the time taken to travel from A to B, for which I build a model. That model,
which I built three months ago, will not work in today's environment, because reality has
changed. The reality in the world has changed: so much construction activity is going on
that the time it takes to travel from the airport to this place is going to change. So all
models in the real world will drop in accuracy over time, because the population keeps
changing. Right? Yes, you can actually track the drop in performance of a model over time,
and act when it falls below a certain threshold. A few of you may have come from the
quality world, where you use something called control charts. Control charts have an upper
control limit and a lower control limit. The moment I see some trend within the control
chart, or points going beyond the limits, I know that something is going on. Your model is
a process, and a process's behavior has to be monitored. No, I am not aware of any formula
by which you can fix the recalibration frequency; to my knowledge there is none. You have
to keep tracking the performance. There has to be a feedback loop. You must have a feedback
loop. The feedback loop may be automatic or it may go through a manual process. Yes, people
do that. You must have a feedback loop.
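The control-chart idea can be sketched in a few lines. This is only an illustration under assumptions: the accuracy figures are made up, the three-sigma limits follow the usual control-chart convention rather than anything stated in the lecture, and the function names `control_limits` and `out_of_control` are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Lower and upper control limits from a baseline period."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - sigmas * spread, centre + sigmas * spread


def out_of_control(values, lower, upper):
    """Indexes of observations that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Made-up weekly accuracy of a deployed model.
baseline_accuracy = [0.91, 0.90, 0.92, 0.91, 0.89, 0.90, 0.92, 0.91]
recent_accuracy = [0.90, 0.91, 0.89, 0.84, 0.82]

lower, upper = control_limits(baseline_accuracy)
flagged = out_of_control(recent_accuracy, lower, upper)
print("control limits:", round(lower, 4), round(upper, 4))
print("weeks needing attention:", flagged)
```

Here the last two weeks fall below the lower control limit, which is the kind of signal that would feed the recalibration loop described above.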
All right, one last point, and then we will break for tea. It is about this iris data set.
I hope it will bring the problem out clearly to you. This is my iris data set. All of you
know the iris data set? It is about plants, and in it there are three types of plants. I
don't remember the names, so: blue, red and yellow. This is the distribution of the plants
on one particular attribute. This is how the data is distributed for the three different
classes. This is the entire data set. Out of this data set I take a random sample; I break
it into train and test sets, 70:30, 80:20, whatever. Look at the difference in the
distributions. The moment you take a sample, you change the distributions. The moment you
change the distributions, you change the central values, the means and the medians. The
moment you change the means and the medians, the model, the linear model which you are
building, which goes through the means and the medians, the central values, will change. If
you had built the model on this, the model would have been different from what you get on
this. That is why, when you take this model and put it into production, models perform
poorly. Now look at this carefully. When I took a sample from this, the distribution
changed, and look at the difference in distribution between the train set and the test set.
When the distribution is so different between the train set and the test set itself, how do
you expect the model which is performing well here to perform well here? Look at the second
plot. I used another random function, and on this second set of samples look at the
difference in the distributions. The data is the same, but I took those samples using
another random function, and the distributions changed. So whenever you do sampling, you
are introducing bias errors. You can't escape it. So when you break the data into train and
test sets, you introduce more bias errors. That is why a model which performs well in
training is unlikely to perform equally well in testing or in production.
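The effect of sampling on the central values is easy to see for yourself. The sketch below does not use the iris data shown on the slide; it generates a made-up attribute with Python's `random` module, and the helper name `split_70_30` and the seeds are assumptions for illustration.

```python
import random
from statistics import mean, median


def split_70_30(values, seed):
    """Shuffle a copy of the data with the given seed and cut it 70:30."""
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


# Made-up attribute values standing in for one column of a data set.
rng = random.Random(0)
population = [rng.gauss(1.2, 0.4) for _ in range(150)]
print("all data  mean/median:", round(mean(population), 3), round(median(population), 3))

# Same data, two different random splits: the central values move each time.
for seed in (1, 2):
    train, test = split_70_30(population, seed)
    print("seed", seed,
          "train:", round(mean(train), 3), round(median(train), 3),
          "test:", round(mean(test), 3), round(median(test), 3))
```

Running it with a few different seeds shows the train and test means and medians drifting apart from each other and from the full data, even though nothing about the underlying data has changed.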
The decision tree is non-parametric. We have no parameters to control it and find the best
fit, so it creates a very complex model on the training set. On the training set the
decision tree will give zero error. But the moment I take it to the test set, the
distribution is completely different, so the tree will perform poorly. This is a particular
characteristic of this particular algorithm: it can easily become overfit. So we will use
regularization techniques to prevent it from becoming overfit. Now, the question he asked,
if I got him right, was this: should the entropy at this place be less than the entropy at
this place? That's what he asks. Here is the answer. The entropy at the child level should
be less than the entropy at the parent level. I am not saying the entropy of this node and
of this node should each be less than this. I'm not saying that. I'm saying the total
entropy at this level should be less than the entropy at the parent level. So it is the
weighted total. OK, all right. Let's move on now.
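The point about the total entropy at the child level can be checked with a small example. This is a sketch with made-up class counts; the function names `entropy` and `weighted_child_entropy` are assumptions, and the child entropies are combined as an average weighted by node size, which is the usual way of taking the "total" at a level.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def weighted_child_entropy(children):
    """Entropy at the child level: each child weighted by its share of the points."""
    total = sum(len(child) for child in children)
    return sum(len(child) / total * entropy(child) for child in children)


# Made-up split: the parent has 7 blue and 3 red points.
parent = ["blue"] * 7 + ["red"] * 3
left = ["blue"] * 4                   # pure child
right = ["blue"] * 3 + ["red"] * 3    # evenly mixed child

parent_h = entropy(parent)
children_h = weighted_child_entropy([left, right])

print("parent entropy:        ", round(parent_h, 3))
print("right child entropy:   ", round(entropy(right), 3))
print("weighted child entropy:", round(children_h, 3))
print("information gain:      ", round(parent_h - children_h, 3))
```

In this example the right child on its own has a higher entropy than the parent, yet the weighted entropy at the child level is still lower than the parent's, so the split is still a gain.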
Now this one. It looks like slightly dangerous stuff, but believe me, it isn't. All right,
don't worry about this. That you understand the concept is what's important. The formulas
are all there in the libraries; we just need to download the library and use it. But it
doesn't hurt to see the formula and the concept behind it. So don't worry about the
formula. Let me explain this to you. There's a history behind this entropy. There was a
person called Shannon; Claude Shannon was his name. Has anybody heard this name? Yes,
you're right. He is one of those people because of whom we are earning our bread and butter
today. He was the father of information theory, and he did a fantastic job in this area. So
I think it will be good if you just go and explore on Google who this guy is and what he
did. Shannon drew a graph which represents the relationship between probability and
entropy. Entropy is shown by the symbol H. He had a physicist friend who was an expert in
thermodynamics, and we come across entropy in thermodynamics. Shannon asked him for a name
for this concept, and that is how it got the name entropy. Otherwise there was no reason
for it to be called entropy. OK. He wanted to make it look like something great. OK, it is
great, but that is how it
came to be called entropy. What happens is this. Take a situation where multiple outcomes
are equally probable. We have blue and red. When you put your hand into the bag and pull
one out, the probability of getting red was 2 by 4, that is 1 by 2, and the probability of
getting blue was 2 by 4, that is 1 by 2. You are in a situation of maximum uncertainty. You
don't know what's going to happen. Uncertainty is measured on a scale of 0 to 1. So when
you are in a situation where you have to take a decision, and it can lead you to multiple
states with the same probability, you are in a state of maximum uncertainty. So a
probability of 0.5 means maximum uncertainty. Remember the bag where two of the pen caps
were blue and two of them were red? The probability of getting red, or of getting blue, was
0.5. In this situation you have maximum uncertainty: 1. The moment I replace all the reds
with blues, there is zero uncertainty. So when you know the probability of something
happening is 1, in this case the probability of picking a blue from the bag, you have zero
uncertainty.
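The two situations with the bag of pen caps can be worked through numerically. This is a minimal sketch; the function name `bag_entropy` is an assumption, and it uses the standard base-2 Shannon entropy for two outcomes.

```python
from math import log2


def bag_entropy(n_blue, n_red):
    """Uncertainty (base-2 entropy) of drawing one cap from the bag."""
    total = n_blue + n_red
    h = 0.0
    for count in (n_blue, n_red):
        if count:  # a colour that is absent contributes nothing
            p = count / total
            h -= p * log2(p)
    return h


print(bag_entropy(2, 2))  # two blue, two red: maximum uncertainty, 1.0
print(bag_entropy(4, 0))  # all blue: no uncertainty, 0.0
print(bag_entropy(3, 1))  # somewhere in between
```

Any mix between the two extremes gives a value between 0 and 1, which is the curve drawn on the slide.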
So you are at this point here. Are you all OK with me? The lady in the back, what's your
name? Yes, go ahead. "Here we are not caring about the probability; we are coming to a
conclusion that this is a default and this is not a default. But all of machine learning is
about predicting something. So in this case, how do we say that this is a model which gives
a probability? It is like confirming that this belongs to this group." No, no. So what she
is asking is this: the decision tree, at the leaf level, is going to classify something as
defaulter or non-defaulter, so where are the probabilities? All these classification
algorithms, whether you use decision trees, logistic regression, naive Bayes, anything, all
of them have a function. That function is called predict_proba: predict, underscore, proba.
Proba stands for probability. If you don't want the class, it gives you the probabilities.
So a decision tree can give you probabilities as well. Does it answer your question? OK.
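One way to see where a tree's probabilities come from: a leaf usually holds training points of more than one class, and the class fractions in that leaf can be reported instead of just the majority class. The sketch below is plain Python with made-up counts, and `leaf_probabilities` is a hypothetical helper, not a library call. In scikit-learn, `DecisionTreeClassifier.predict_proba` reports the same kind of per-leaf class fraction.

```python
from collections import Counter


def leaf_probabilities(leaf_labels, classes):
    """Fraction of each class among the training points that landed in a leaf."""
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    return {c: counts[c] / total for c in classes}


# Made-up leaf: three defaulters and one non-defaulter ended up here.
leaf = ["defaulter"] * 3 + ["non-defaulter"] * 1
probs = leaf_probabilities(leaf, ["defaulter", "non-defaulter"])
predicted_class = max(probs, key=probs.get)

print(probs)            # class probabilities for any point reaching this leaf
print(predicted_class)  # the hard classification is just the most probable class
```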
Or is there more? Yes? "I just get a feeling that it is looking like the probability
prediction of logistic regression." Yes. All these are the same concepts coming in
different bottles. OK, I'm glad that you noticed that. All these algorithms do the same
thing in your feature space, in a mathematical space, but they do it in different ways. All
right, so it's a very interesting thing. Unfortunately we have not covered that yet. Once
you have covered all these algorithms down the road, then hopefully one day we will meet
and I'll show you how these different algorithms do the same thing but in different ways.
Any questions? No? Are you OK with these two points?
In mathematics there is a function. That function is log to the base 2 of p_i. This
function takes a beautiful shape like this. When p_i is 1, log to the base 2 of p_i is
zero. For anything less than 1 it is negative, so we put a minus sign in front. So log to
the base 2 of 1, you can check it on Google if you wish, is equal to zero, and minus log to
the base 2 of 0.5 is one. So he found a function in mathematics, he knew it, which gives
you this relationship between probability and entropy. There's one small problem. The
problem is that minus log to the base 2 of anything less than 0.5 tends to go towards
infinity. It goes towards infinity, and we wanted it to come down towards zero. Even if I
remove the minus sign, it doesn't help. So I want this to come down. To make it come down,
I multiply it by p_i, the probability of getting blue. When p_i is 1, the log term is zero,
so the product is zero. When the probability of getting blue is very small, this
multiplying term will pull the whole thing down. The log term is going towards infinity
here, but this p_i will pull it down towards zero. So the equation which you see here,
entropy as a function of p_i times log to the base 2 of p_i, reflects the relationship
between entropy and probability. Don't worry about the formula. Just keep in mind how the
probability of something happening or not happening is related to entropy, that is, to
uncertainty. That's all you need to know.
OK, the formula and all is not required. But this is what happens under the
