So let us begin today's lecture.
We will start by recalling what we have been
doing.
We have been looking at the definition of
eigenvalues, eigenvectors and diagonalization
of matrices.
So let us just recall the example we have
done.
We had taken the matrix A with first row -5, -7 and second row 2, 4, and we showed that it has two eigenvalues, lambda = 2 and lambda = -3.
For each of these eigenvalues, we found eigenvectors.
C1 is the eigenvector for the eigenvalue 2 and C2 is the eigenvector for the eigenvalue -3, and finding these eigenvectors is essentially solving a system of linear equations.
This gave us the matrix P with the first eigenvector as its first column and the second eigenvector as its second column.
We showed that since these are eigenvectors corresponding to distinct eigenvalues, these two vectors are linearly independent; as a result, the matrix P has full rank, so it is invertible.
We then checked the property that P inverse AP is a diagonal matrix, the diagonal entries being the eigenvalues 2 and -3.
So this was the illustration of how to find a matrix which will diagonalize a given matrix A.
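Since this recap is entirely computational, here is a minimal numpy sketch (not part of the lecture) that reproduces the example; note that numpy does not guarantee the order in which the eigenvalues are returned.

```python
import numpy as np

# The matrix from the example just recalled: first row -5, -7; second row 2, 4.
A = np.array([[-5.0, -7.0],
              [ 2.0,  4.0]])

# numpy returns the eigenvalues and a matrix P whose columns are eigenvectors.
eigvals, P = np.linalg.eig(A)
print(eigvals)                     # 2 and -3, in some order

# The diagonalization property: P^{-1} A P is diagonal, with the
# eigenvalues on the diagonal.
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))             # diag(2, -3), up to ordering
```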
So this prompted us to ask the question: given a matrix A, when does there exist an invertible matrix P such that P inverse AP is a diagonal matrix, and how do we find that P? So we made a definition, that a matrix A is diagonalizable if there exists an invertible matrix P such that P inverse AP is a diagonal matrix, and we stated a theorem.
Actually we proved it also partially, so let
us just go through the proof again.
If A is an nxn matrix, then A is diagonalizable if and only if (so this is an if and only if statement) there exist scalars lambda 1, lambda 2, ..., lambda n and vectors C1, C2, ..., Cn such that the following hold.
First, A applied to the vector Ci is lambda i Ci; this essentially says that the scalar lambda i is an eigenvalue and Ci is an eigenvector for that eigenvalue.
Second, and this is where diagonalizability is captured, the set of these eigenvectors C1, C2, ..., Cn is a linearly independent set and hence forms a basis for Rn: there are n vectors which are linearly independent, and they form a basis.
So let us just run through the proof again.
So suppose A is diagonalizable; then there is an invertible matrix P such that P inverse AP is a diagonal matrix.
Let us call C1, C2, ..., Cn the columns of that matrix P, so P is given by C1, C2, ..., Cn.
Since P is invertible, none of these vectors can be 0; in fact, an invertible matrix has full rank, so its columns form a linearly independent set.
So let us define D to be the matrix with diagonal entries lambda 1, ..., lambda n and everything else 0.
We are given P and we are given D, with C1, C2, ..., Cn the column vectors of P, and we want to check that lambda 1, lambda 2, ..., lambda n are eigenvalues and that C1 to Cn are the corresponding eigenvectors.
For that, we can rewrite the equation P inverse AP = D slightly differently: multiplying on the left by P, this implies AP = PD.
Now let us write P and D in terms of columns.
P has columns C1, C2, ..., Cn, so AP is the matrix whose columns are AC1, AC2, ..., ACn, while PD, D being the diagonal matrix, is the matrix whose columns are lambda 1 C1, lambda 2 C2, and so on.
This is just matrix multiplication, nothing more than that.
Since the two sides are equal, AC1 = lambda 1 C1, AC2 = lambda 2 C2, and so on; in general, the jth column gives ACj = lambda j Cj, and that precisely says that lambda j is an eigenvalue with Cj as the eigenvector.
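Written out in one line, the column-by-column computation just described is (a LaTeX transcription, nothing beyond what was said):

```latex
AP \;=\; A\,[\,C_1 \mid C_2 \mid \cdots \mid C_n\,]
   \;=\; [\,AC_1 \mid AC_2 \mid \cdots \mid AC_n\,]
   \;=\; [\,\lambda_1 C_1 \mid \lambda_2 C_2 \mid \cdots \mid \lambda_n C_n\,]
   \;=\; PD,
```

and comparing the jth columns gives ACj = lambda j Cj for every j.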
So one way is done.
So conversely, let us suppose that we have got vectors X1, X2, ..., Xn such that they form a linearly independent set and lambda 1, ..., lambda n are the corresponding eigenvalues; so AXi = lambda i Xi is given to us, that is the property.
So what do we want to show?
We want to show that there is an invertible matrix P such that P inverse AP is diagonal.
So let us construct the matrix P with these vectors as its column vectors.
These being linearly independent, the rank of P is n, so it is an invertible matrix.
So we can now compute AP.
Writing AP as A times the column vectors X1 up to Xn, it has columns AX1, ..., AXn, and using the property that AXi = lambda i Xi, this comes out to be precisely PD, where D is the diagonal matrix with entries lambda 1 to lambda n.
So we get AP = PD, and P is invertible, so this can be rewritten as P inverse AP = D.
There is nothing in the proof except writing the matrix multiplication AP as A times the column vectors and expressing it appropriately.
So this proves the theorem: a matrix A is diagonalizable if and only if there are n eigenvalues and eigenvectors corresponding to them forming a linearly independent set.
So that is the condition for diagonalization.
So the question comes.
When the eigenvectors are linearly independent, everything is okay; the problem is, given an nxn matrix, how do you find a linearly independent set of eigenvectors?
So the question is: does a given matrix have n eigenvalues, and for each eigenvalue an eigenvector, such that they form a linearly independent set or not?
If yes, the matrix is diagonalizable.
If not, you cannot help it, and it is not diagonalizable.
So basically what we are saying is that these n eigenvectors will form a basis, because the matrix is nxn.
So saying that a matrix is diagonalizable is equivalent to finding n linearly independent eigenvectors for that matrix.
Finding eigenvectors means first you have to find the eigenvalues anyway.
So that is the problem we want to take up.
So first of all, there is an observation.
Suppose you have got lambda 1, lambda 2, ..., lambda r, distinct eigenvalues of a square matrix.
All the eigenvalues may not be distinct: eigenvalues are roots of the characteristic polynomial, and roots can repeat.
So let us assume that out of the n, lambda 1, lambda 2, ..., lambda r are the distinct eigenvalues of the given matrix and V1, V2, ..., Vr are corresponding eigenvectors.
Then the claim is that this set is linearly independent.
So what we are saying is: even though all the eigenvalues need not be distinct, if you collect eigenvectors corresponding to the distinct eigenvalues, they will form a linearly independent set.
So at least there is some achievement: all of them may not be distinct, but for each one of those that are distinct we find an eigenvector, and that many linearly independent vectors we have got, okay.
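Here is a small numerical illustration of the claim; the 3x3 matrix is my own choice for this sketch, not from the lecture.

```python
import numpy as np

# This matrix has the eigenvalue 2 repeated and the eigenvalue 5 simple,
# so the distinct eigenvalues are 2 and 5 (r = 2 in the lecture's notation).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

v1 = np.array([1.0, 0.0, 0.0])   # eigenvector for lambda = 2: A v1 = 2 v1
v2 = np.array([0.0, 0.0, 1.0])   # eigenvector for lambda = 5: A v2 = 5 v2
assert np.allclose(A @ v1, 2 * v1) and np.allclose(A @ v2, 5 * v2)

# One eigenvector per distinct eigenvalue: the set {v1, v2} is linearly
# independent, as the rank of the matrix with these columns confirms.
print(np.linalg.matrix_rank(np.column_stack([v1, v2])))   # 2
```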
So let us prove this.
So let us assume, see, this number is r: there are r distinct eigenvalues.
Suppose, to the contrary, that the set is linearly dependent; then there is a number l <= r such that V1, V2, ..., Vl is a linearly dependent set, and let us pick the smallest such l.
(Note that l is at least 2, because a single eigenvector, being nonzero, is linearly independent by itself.)
That means, if l is the smallest index such that V1, V2, ..., Vl is a linearly dependent set, then anything smaller must be independent: this is the smallest collection which is linearly dependent, so if you take less, that should be independent.
So V1, V2, ..., Vl-1 is a linearly independent set.
So that means there are scalars, not all 0, expressing Vl in terms of the others: since V1, V2, ..., Vl is linearly dependent and V1, V2, ..., Vl-1 is independent, Vl must be a linear combination of the others.
So let us write Vl = alpha 1 V1 + ... + alpha l-1 Vl-1, where all of the alphas cannot be 0: if all of them were 0, then Vl would be 0, but Vl is an eigenvector, and an eigenvector cannot be 0.
So at least one of the scalars alpha 1, alpha 2, ..., alpha l-1 is not equal to 0.
Let us keep that; call this equation 1.
Now let us apply A to both sides of this equation.
What is A(Vl)?
Vl is the eigenvector for the eigenvalue lambda l, so A(Vl) = lambda l Vl.
On the other hand, from equation 1, Vl is a linear combination, so we put this value here and use the linearity of A: A of the sum is sigma alpha i A(Vi).
But each Vi is again an eigenvector, so A(Vi) = lambda i Vi.
Is it okay?
So the first equality is because lambda l is an eigenvalue with eigenvector Vl, from 1 this value is put here, then by linearity this is sigma alpha i A(Vi), and each Vi being an eigenvector, this is nothing but sigma alpha i lambda i Vi.
So we get lambda l Vl = sigma alpha i lambda i Vi; let us keep this equation as 2.
From equation 1, I can also multiply both sides by lambda l.
What is lambda l Vl then?
That is alpha 1 lambda l V1 + ... + alpha l-1 lambda l Vl-1.
So multiplying equation 1 on both sides by lambda l, I get another value of lambda l Vl; call this equation 3.
So these two must be equal: 2 and 3 both give me the value of lambda l Vl, so if I subtract the two equations, I get 0 = sigma alpha i (lambda l - lambda i) Vi, for i = 1 to l-1, where the scalar lambda l - lambda i comes out.
Subtracting the two equations, that is all.
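Side by side, the three equations and the subtraction read as follows (a LaTeX transcription of the steps just described):

```latex
\begin{aligned}
\text{(1)}\quad & V_l = \sum_{i=1}^{l-1} \alpha_i V_i \\
\text{(2)}\quad & \lambda_l V_l = A V_l = \sum_{i=1}^{l-1} \alpha_i A V_i
                = \sum_{i=1}^{l-1} \alpha_i \lambda_i V_i
  \qquad \text{(apply } A \text{ to (1))} \\
\text{(3)}\quad & \lambda_l V_l = \sum_{i=1}^{l-1} \alpha_i \lambda_l V_i
  \qquad \text{(multiply (1) by } \lambda_l \text{)} \\
\text{(3)}-\text{(2)}:\quad & 0 = \sum_{i=1}^{l-1} \alpha_i (\lambda_l - \lambda_i) V_i
\end{aligned}
```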
Now I know that the Vi's are linearly independent, and this is a linear combination which is equal to 0.
So that means what?
All the scalars must be equal to 0: alpha i (lambda l - lambda i) must be equal to 0 for each i.
But at least one of the alpha i is not equal to 0; for that particular i, if the product has to be 0, then lambda l - lambda i must be equal to 0.
That means for that particular i, lambda l would be equal to lambda i, but that is not possible, because they are distinct.
So our assumption must be wrong.
So that is what we are saying: by linear independence, all these scalars must be 0, but some alpha i is not 0, so for some i, lambda i must be equal to lambda l, which is a contradiction, because all are given to be distinct.
So the assumption that there is an l such that V1, V2, ..., Vl is linearly dependent is wrong; no such l exists.
That means all of them are linearly independent.
So the theorem says that if lambda 1, lambda 2, ..., lambda r are distinct eigenvalues of a square matrix, and you pick an eigenvector for each one of these eigenvalues, V1, V2, ..., Vr, then that is a linearly independent set.
That means eigenvectors corresponding to distinct eigenvalues are linearly independent.
So that is it.
So in case all the eigenvalues are distinct, then you will get n linearly independent eigenvectors.
You are very lucky, okay.
If not, then a problem can come.
The problem can come if the characteristic polynomial has multiple roots, that is, a root is repeated; for that particular eigenvalue, you may be able to find one eigenvector or two.
We do not know.
So let us look at an example, the matrix A with rows 1, 1 and 0, 1.
So how do we find the eigenvalues?
This is A; set the determinant of A - lambda I equal to 0.
That gives you the matrix with entries 1-lambda, 1, 0, 1-lambda, and the determinant of that will be (t-1) squared, if t is the variable you are introducing.
So that means there is only one eigenvalue, t = 1, and it is repeated, okay.
So let us find the eigenvectors corresponding to that eigenvalue.
So how do you find the eigenvectors?
Look at the eigenvalue: the eigenvalue is 1, so form A - lambda I.
What is A - lambda I?
The diagonal entries 1 - 1 become 0, so everything is 0 except the second entry in the first row.
That is A - lambda I for lambda = 1, and this itself is in reduced row echelon form; we do not have to do anything.
So what is the rank of this matrix?
We want to solve the homogeneous system (A - lambda I) applied to X = 0; we have to find its solutions to find the eigenvectors.
For that, we have to know the rank of this matrix, and the rank is 1.
So what is the nullity? rank + nullity = the dimension, which is 2, so the nullity is 1.
Nullity means, for the homogeneous system, the dimension of the solution space, so the dimension is 1.
That means there is only one linearly independent eigenvector for the eigenvalue lambda = 1.
We can find it.
So how do you find it?
This matrix applied to (X1, X2) = 0 means X2 = 0, and X1 is free: you can give X1 any value, say 1.
So you get (1, 0) as a solution.
The solution space of A - lambda I for lambda = 1 is of dimension 1, and you get a vector, (1, 0), which spans this one-dimensional space.
So there is only one eigenvalue, and it is repeated, but there is only one linearly independent eigenvector.
So that means I cannot find a basis of R2 consisting of eigenvectors.
So what does the previous theorem tell me?
This matrix is not diagonalizable, because there is only one eigenvalue, and for that eigenvalue, lambda = 1, I can find only one linearly independent eigenvector, because the solution space has dimension 1.
So for a repeated root, it depends on whether I can find that many linearly independent eigenvectors for it or not.
Essentially it boils down to that.
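A quick numpy check of this example (a sketch; the computation mirrors what we just did by hand):

```python
import numpy as np

# The lecture's example: A = [[1, 1], [0, 1]] with the repeated eigenvalue 1.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

M = A - 1.0 * np.eye(2)            # A - lambda*I = [[0, 1], [0, 0]]
rank = np.linalg.matrix_rank(M)    # rank 1
nullity = 2 - rank                 # rank + nullity = dimension, so nullity 1
print(rank, nullity)               # 1 1

# The eigenspace is one-dimensional, spanned by (1, 0):
print(M @ np.array([1.0, 0.0]))    # [0. 0.], so (1, 0) is an eigenvector
```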
So let us give some names to these things.
The null space of A - lambda I, that is, the solution space, we will start denoting by E lambda; that is the eigenspace for the eigenvalue lambda: all the vectors X which form a solution of (A - lambda I) applied to X = 0.
So that is the null space of that matrix; we have different names for it, okay.
So, if lambda is an eigenvalue, that means it is a root of the characteristic polynomial, and a root may be repeated.
The number of times it is repeated is called the algebraic multiplicity of that eigenvalue.
As a root of the characteristic polynomial, it will appear at least once, because it is a root; it may appear twice, thrice, or many times, and the number of times it appears is called the algebraic multiplicity of the eigenvalue.
We denote it by m lambda, or m lambda of A if you want to stress that the matrix is A.
So that is the algebraic multiplicity.
Now on the other hand, for this eigenvalue we are interested in eigenvectors: how many linearly independent eigenvectors exist?
That depends on the dimension of the null space.
So that we call the geometric multiplicity of the eigenvalue.
What is the geometric multiplicity?
It is the dimension of the null space of A - lambda In, which is the same as the nullity of this matrix; nullity gives the dimension.
So that is called the geometric multiplicity, and it is written g lambda, or g lambda of A.
So g lambda for geometric multiplicity, m lambda for algebraic multiplicity, right.
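As a sketch (my own, not from the lecture), both multiplicities can be computed numerically for the example above; rounding the computed eigenvalues is a crude way to group numerically repeated roots.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
n = A.shape[0]

# Algebraic multiplicity: how often lambda occurs among the roots of the
# characteristic polynomial (numpy returns the roots with repetition).
eigvals = np.round(np.linalg.eigvals(A), 8)
for lam in np.unique(eigvals):
    m_lam = int(np.count_nonzero(eigvals == lam))
    # Geometric multiplicity: nullity of A - lambda*I.
    g_lam = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    print(lam, m_lam, g_lam)       # lambda = 1: m = 2, g = 1
```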
So when do you think a matrix will be diagonalizable?
There may be roots which are repeated, but if you count with multiplicity, there are n roots for a matrix of order nxn.
Some of the roots may be repeated; for a repeated root, the number of times it is repeated is the algebraic multiplicity, and the dimension of the null space of A - lambda I is the geometric multiplicity.
If you can find as many independent eigenvectors as the algebraic multiplicity, then you are through: the total collection of eigenvectors will be linearly independent, and there will be n of them.
If it is less, as happened in the previous example, where lambda = 1 was an eigenvalue of algebraic multiplicity 2 but geometric multiplicity only 1, then there is a defect in diagonalizing the matrix.
Sometimes this difference is called the defect of the matrix with respect to diagonalizability.
If both are equal for every eigenvalue, then the matrix will be diagonalizable, because for each eigenvalue we will have as many eigenvectors as the algebraic multiplicity; put them together and you will have a linearly independent set of eigenvectors forming a basis.
So we can diagonalize that, okay.
So this difference is what is normally called the defect, but what you call it is not really important, okay.
The algebraic multiplicity will always be at least as big as the geometric multiplicity.
Is that clear?
Yes: a root repeated that many times algebraically may not give you that many eigenvectors; you may be able to find only fewer independent ones, never more.
So the theorem says: if the algebraic and the geometric multiplicity agree for every eigenvalue lambda of a matrix, then there is a basis consisting of eigenvectors, and hence the matrix is diagonalizable.
So it is quite clear what the proof is.
Basically, if the distinct eigenvalues have multiplicities m1, m2, ..., mk, then for each we can find a basis of the eigenspace with as many vectors as the algebraic multiplicity.
So find these bases and put them all together: for each eigenvalue there is an eigenspace, and it has a basis with as many eigenvectors as the algebraic multiplicity, if the two multiplicities are equal.
Put them together and you get a basis consisting of eigenvectors for the underlying space Rn, okay.
The sum of the multiplicities is equal to the degree of the characteristic polynomial, which is n.
Algebraic multiplicity equal to geometric multiplicity, total = n; for each one of them there is a basis, put them together.
So there is nothing complicated in this theorem.
Basically, that is how you will analyse whether a matrix is diagonalizable or not.
So the main problem is, given a matrix, to show whether it is diagonalizable or not; what do we have to do?
We have to find, for the given matrix, the eigenvalues.
Once you have the eigenvalues, find the corresponding eigenvectors: find the dimension of the corresponding eigenspace, which is the same as the nullity of A - lambda I.
If for each eigenvalue this dimension is equal to the algebraic multiplicity, then we are through; then we can find a basis consisting of eigenvectors.
These are just rewordings of the same things.
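Putting the whole test just described into one helper (a sketch of my own, assuming real eigenvalues and using rounding to group numerically repeated roots; the function name is hypothetical):

```python
import numpy as np

def is_diagonalizable(A, decimals=8):
    """A is diagonalizable over R if and only if the geometric multiplicity
    equals the algebraic multiplicity for every eigenvalue."""
    n = A.shape[0]
    eigvals = np.round(np.linalg.eigvals(A), decimals)
    for lam in np.unique(eigvals):
        m_lam = np.count_nonzero(eigvals == lam)                  # algebraic
        g_lam = n - np.linalg.matrix_rank(A - lam * np.eye(n))    # geometric
        if g_lam < m_lam:
            return False       # a defect: too few independent eigenvectors
    return True

print(is_diagonalizable(np.array([[-5.0, -7.0], [2.0, 4.0]])))   # True
print(is_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]])))     # False
```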
We say two matrices A and B are similar if A can be obtained from B, or B can be obtained from A, by this process: B = P inverse AP, where P is invertible.
In terms of this, diagonalizability can be stated as: a matrix A is similar to a diagonal matrix if and only if, for every eigenvalue, the algebraic multiplicity is equal to the geometric multiplicity, and we can find that P.
So that is also the diagonalization process.
We are rewriting things in the terminology of similar matrices because that will be useful later on in other contexts also.
So for us, it says that a matrix A is diagonalizable, or similar to a diagonal matrix, if and only if you can find an eigenbasis, that means a basis consisting of eigenvectors, and that is the same as saying that for each eigenvalue, the algebraic multiplicity is the same as the geometric multiplicity.
