Good morning. In this lecture we will study the singular value decomposition. This topic embodies a very deep connection between quite a few different topics in the area of linear algebra. Consider this situation: we have already studied the eigenvalue problem, in which we wanted to decompose a matrix A in this form with U and V equal. All through our study of the eigenvalue problem we faced the question of whether a decomposition of this sort would exist or not, and if it exists, how to handle it, and so on. It would always be nice in the eigenvalue problem if we could make this Lambda diagonal with U and V orthogonal, and so on, but at every step our work was beset with difficulties of several sorts. First, among all matrices we could ask this question only for those which are square; the subset of square matrices constitutes the only matrices for which this question even arises.
So, to begin with, this question arises only for square matrices. Even among square matrices, not all can be diagonalized, so within the square matrices we had a subset, the set of diagonalizable matrices, for which this kind of decomposition is possible. Even among diagonalizable matrices we had another subset, the symmetric matrices, for which this decomposition can be effected with an orthogonal V, which is the same as U. Even among the symmetric matrices, for which we had the valuable theorem that you can work out an orthogonal diagonalization, the diagonal elements of Lambda could still be negative. Finally, even among symmetric matrices we had a further subset, the positive semi-definite matrices, in which case the lambda_i turn out to be non-negative. That is the best possible situation we could think of, and it is a sub-case of a sub-case of a sub-case of the general form of matrices.
Now we can ask this question: suppose we do not insist on a similarity transformation and focus only on this form of the decomposition. When we say we do not ask for similarity, we basically want to allow U and V to be different. In that case we ask: if we do not require U and V to be equal, then what can we ask for and still get results? With just this one relaxation, allowing U and V to be different, different in content as well as in size, we can get a decomposition of this sort which is guaranteed for all matrices irrespective of size and shape, that is, even for rectangular matrices, with orthogonal U and V and with non-negative diagonal entries in the diagonal matrix. In that case we do not refer to it as Lambda, because Lambda has already been used for the matrix of eigenvalues.
So we denote it by Sigma. Just by allowing U and V to be different, we can effect a decomposition of this sort with all the other desirable facets: the decomposition is possible for all matrices, including rectangular ones, so the question arises for all matrices and it is always answerable. We cannot call that decomposition a diagonalization, but it will always be possible, with orthogonal U and V, no longer the same, and with the diagonal entries of this matrix Sigma all non-negative. Such a decomposition is the singular value decomposition, and those diagonal entries are called the singular values of the matrix A. Underlying this is the very important theorem called the SVD theorem, or singular value decomposition theorem. The theorem says: for any real matrix A of size m by n, there exist orthogonal matrices U, which is m by m, and V, which is n by n, such that U^T A V is a diagonal matrix Sigma of size m by n. Now, what is this idea of a diagonal matrix of rectangular size? Its diagonal entries are sigma_1, sigma_2, sigma_3, and so on, all non-negative, which you obtain by forming a square diagonal matrix first, of size p by p, where p is the lesser of the two dimensions m and n.
Now, if you want this diagonal matrix to be of size m by n, then depending on whichever of m and n is larger, you append that many extra zero rows below or that many extra zero columns. These diagonal entries sigma_1 to sigma_p are called the singular values of the matrix A. A similar result holds for complex matrices; for that case the analogous theorem will read: for any complex matrix A belonging to C^(m x n), there exist unitary matrices U and V such that U* A V, where the star denotes the conjugate transpose, equals Sigma, and this Sigma is always real, and so on. So this theorem gives the basis for the decomposition of a matrix A in this manner.
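As a quick numerical illustration of the theorem, here is a minimal sketch, assuming numpy is available; the rectangular matrix A below is an assumed example, not one from the lecture:

```python
# Verify the SVD theorem numerically on a rectangular matrix A (m x n).
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [1.0, 3.0, 4.0]])              # m = 2, n = 3, rectangular

U, s, Vt = np.linalg.svd(A)                  # s holds sigma_1 >= sigma_2 >= ... >= 0
m, n = A.shape
p = min(m, n)
Sigma = np.zeros((m, n))
Sigma[:p, :p] = np.diag(s)                   # p diagonal entries, zeros appended

print(np.allclose(U @ U.T, np.eye(m)))       # U is orthogonal (m x m)
print(np.allclose(Vt @ Vt.T, np.eye(n)))     # V is orthogonal (n x n)
print(np.allclose(U.T @ A @ Vt.T, Sigma))    # U^T A V is "diagonal" of size m x n
```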
Now the question arises: how do we construct U, V and Sigma, the three factors? The way we work out their construction at the same time provides a proof of the SVD theorem, that such factors U, V, Sigma always exist. So let us quickly look at the construction of the singular value decomposition, the factors U, Sigma and V. First, suppose we could decompose A in this manner, A = U Sigma V^T. Then A^T A will be V Sigma^T U^T U Sigma V^T, and since U is orthogonal, U^T U is the identity, so we are left with A^T A = V Sigma^T Sigma V^T. Now consider Sigma^T Sigma.
We have already discussed the shape of Sigma: if m is less, the p by p diagonal block is followed by extra zero columns; if n is less, extra zero rows appear below; only one of these zero blocks appears, never both. If Sigma is of this shape, then Sigma^T Sigma is a square matrix of size n by n, in which the diagonal entries are sigma_1 squared, sigma_2 squared, up to sigma_p squared; if n is larger, there are additional zero entries in the remaining diagonal positions, and all the off-diagonal entries are zero. That is the description of Sigma^T Sigma. Now, this matrix Sigma^T Sigma is being called Lambda here, and there is a reason. You see, A^T A is certainly symmetric; not only symmetric, it is positive semi-definite as well. You cannot say a priori whether it is positive definite or not, but positive semi-definite it certainly is.
Now, since A^T A is a symmetric matrix, it certainly has a diagonalization, in fact an orthogonal diagonalization, and this V Lambda V^T is exactly the decomposition that you obtain when you solve the diagonalization problem of a symmetric matrix. That means the V which you want in the singular value decomposition is in fact the matrix of eigenvectors of A^T A, and this Lambda is then the diagonal matrix of eigenvalues of A^T A. If so, then we already know how to determine V and Lambda, because we have studied the eigenvalue problem of a symmetric matrix in good detail and we can effect this diagonalization. So by effecting the diagonalization of the symmetric matrix A^T A, we determine V and Lambda. The moment V and Lambda are determined, we can work out Sigma, because the first p diagonal entries of Lambda are nothing but sigma_1 squared, sigma_2 squared, and so on up to sigma_p squared. So from the first p lambdas, which are all non-negative, we can take the square root. When you take the square root, there are two square roots,
one positive and one negative for a positive number, so you collect only the positive ones, which you put as sigma_1, sigma_2, sigma_3, and so on up to sigma_p. All the non-trivial entries of this matrix Sigma are now in our hands, and we then append the appropriate number of zero rows or zero columns, depending upon the size of A, which is the same as the size of Sigma. That means V and Sigma are now in our hands.
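Here is a minimal sketch of this step, assuming numpy; it determines V and Lambda from the symmetric eigenvalue problem of A^T A and assembles Sigma from the positive square roots (the same assumed example matrix as before):

```python
# Obtain V, Lambda and Sigma from the eigenvalue problem of A^T A.
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [1.0, 3.0, 4.0]])
m, n = A.shape
p = min(m, n)

lam, V = np.linalg.eigh(A.T @ A)             # orthogonal diagonalization of A^T A
idx = np.argsort(lam)[::-1]                  # sort eigenvalues in decreasing order
lam, V = lam[idx], V[:, idx]

sigma = np.sqrt(np.clip(lam[:p], 0.0, None)) # positive square roots only
Sigma = np.zeros((m, n))
Sigma[:p, :p] = np.diag(sigma)               # append zero rows/columns as needed
print(sigma)                                 # the singular values sigma_1 ... sigma_p
```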
Now remember, A = U Sigma V^T and V is orthogonal. That means we can post-multiply the original definition of the singular value decomposition by V; then V^T V becomes the identity, so on one side you get U Sigma and on the other side you get A V. In this entire equation A was originally given, V and Sigma we have determined, and we are left with the problem of determining the matrix U, the columns of the matrix U. Four situations can arise when we go to determine the columns of U; in fact, in any particular case only three of them will arise, because either the third or the fourth will arise depending upon whether the matrix A has more rows or more columns. The first situation is the one in which you actually have information to determine columns of U. If you equate the two sides column by column, then the left side gives you the columns A v_1, A v_2, A v_3, where v_1, v_2, v_3 are columns of the matrix V, and from the right side you get the corresponding columns as u_1 times sigma_1 plus all zeros, then u_2 times sigma_2 plus all zeros, and so on. That means you get column equations of the form A v_k = sigma_k u_k for the first r columns, where r is the rank, that is, for the nonzero singular values.
Out of these p singular values, some may be zero. For the nonzero singular values, the corresponding column equations give you equations of this kind, and if sigma_k is nonzero, determining the corresponding column of U is easy: you just divide A v_k by sigma_k and you get the column u_k. The columns developed in this way are bound to be mutually orthogonal.
You can verify that. Suppose two columns u_i and u_j have been developed like this and you want to find u_i^T u_j. In fact they are not only orthogonal but orthonormal, that is, each of them is a unit vector as well; so, being orthonormal, u_i^T u_j has to be one if i and j are the same and zero if i and j are different. You can see this as follows: when you consider u_i^T u_j from these expressions for u_i and u_j, you get (1/sigma_i)(1/sigma_j) v_i^T A^T A v_j. Now, A^T A is the matrix for which we actually solved the eigenvalue problem, and v_j is its eigenvector corresponding to the eigenvalue lambda_j, which is sigma_j squared. Collecting the scalars 1/sigma_i and 1/sigma_j together, we are left with v_i^T A^T A v_j; writing A^T A v_j = lambda_j v_j, the scalar lambda_j, that is sigma_j squared, can be brought out, and we are then left with v_i^T v_j. From there you find that if i and j are different, then v_i^T v_j is zero, because V and Lambda together give the orthogonal diagonalization of A^T A, which means
the columns of V are mutually orthogonal. So if i and j are different, then v_i^T v_j is zero and you have the orthogonality of u_i and u_j. On the other hand, if i and j are the same, then you get v_j^T v_j, which is one because V is orthogonal, so each column v_j in particular has unit norm. In that case sigma_j squared cancels with the product sigma_i sigma_j, since i equals j, so you get one, which means u_j^T u_j is one. That shows the orthonormality of all the columns determined this way. So much for the singular values which are not zero. For the singular values which are zero, we still have A v_k = sigma_k u_k, but sigma_k is zero, which means we are talking about A v_k = 0, and the corresponding u_k is left indeterminate: you cannot determine u_k from this relationship because the coefficient is zero. But since it is left indeterminate, you are free to choose a suitable u_k. What is a suitable u_k? A unit vector that is orthogonal to all the other columns already determined. Now, the third case: where m is less than n, that is, U has fewer columns and V has more columns, you will get further equations A v_k = 0 for k greater than m, for which the right side is all zeros and there is no corresponding column of U to determine.
So that case takes care of itself. The fourth case is where m is greater, that is, the matrix A has more rows than columns; in that case, after all these calculations, there will be further columns of U which are left indeterminate. Just as in the second case, these additional columns of U are determined so as to make the entire matrix U orthogonal. That means the additional columns corresponding to zero singular values, and the additional columns of U which have no matching singular values at all, are both determined based on the orthogonality requirement on U. In one line, you can say we extend the columns of U determined so far to an orthonormal basis, and that full set of m vectors gives you the square matrix U. So after the three factors of the singular value decomposition have been constructed in this way, you have A = U Sigma V^T in hand.
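Here is a sketch of the whole construction, assuming numpy; it repeats the eigenvalue step from the earlier snippet, obtains the columns of U from u_k = A v_k / sigma_k, and fills the remaining columns by extending to an orthonormal basis (the complete QR factorization is just one convenient way to do that extension; the tolerance is an assumed cutoff):

```python
# Full SVD construction: V and Sigma from A^T A, then U column by column.
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [1.0, 3.0, 4.0]])
m, n = A.shape
p = min(m, n)

lam, V = np.linalg.eigh(A.T @ A)             # as in the previous sketch
idx = np.argsort(lam)[::-1]
lam, V = lam[idx], V[:, idx]
sigma = np.sqrt(np.clip(lam[:p], 0.0, None))
Sigma = np.zeros((m, n))
Sigma[:p, :p] = np.diag(sigma)

r = int(np.sum(sigma > 1e-12))               # number of nonzero singular values
U = np.zeros((m, m))
for k in range(r):
    U[:, k] = (A @ V[:, k]) / sigma[k]       # case 1: u_k = A v_k / sigma_k

if r < m:                                    # cases 2 and 4: fill the rest of U
    # complete QR supplies an orthonormal basis of the orthogonal complement
    Q, _ = np.linalg.qr(U[:, :r], mode="complete")
    U[:, r:] = Q[:, r:]

print(np.allclose(A, U @ Sigma @ V.T))       # True: A = U Sigma V^T
```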
Having constructed the singular value decomposition like this, you would like to see what the properties of such a decomposition are. The first question after verifying existence is uniqueness: is it unique? The answer is that it is actually not unique. For example, you can apply several changes to it, and the changed U, Sigma, V will still constitute another singular value decomposition of the same matrix. Those changes are listed here, and then you can say that for a given matrix the SVD is unique up to these changes; that is, it is not unique, but such changes will not disturb the requirements, they will not disturb the fact that the decomposition is still an SVD of the given matrix. So what are these changes which are possible? First, the same permutation of the columns of U, the columns of V and the diagonal elements of Sigma: if you interchange sigma_2 and sigma_5 and at the same time interchange columns u_2 and u_5 and columns v_2 and v_5, then the resulting U, Sigma and V will still give an SVD, and so on.
Next, corresponding to equal singular values you have columns of U and V, and among them you can work out an orthogonal reorganization. Suppose sigma_2 and sigma_3 are equal; then you can say, I will take these combinations: this will be my new u_2 and this my new u_3, and correspondingly for V also, between v_2 and v_3, you make the same transformation. This is still fine: the resulting U and V matrices with the same Sigma will still give you a valid singular value decomposition. The particular transformation that we worked out here uses cos theta and sin theta in one column and minus sin theta and cos theta in the other; note the minus, which makes that combination matrix an orthogonal matrix.
So such orthogonal linear combinations of columns of U and the corresponding columns of V are fine; they will not disturb the singular value decomposition. For zero or nonexistent singular values you can take any arbitrary orthonormal linear combinations among the corresponding columns of U or columns of V, and that will still be all right. So these reorganizations in an already existing SVD can be made, and the result will still be an SVD.
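The simplest instance of such a change is a sign flip, a one-dimensional orthogonal transformation applied to a matched pair of columns; here is a minimal sketch, assuming numpy and the same assumed example matrix:

```python
# Flipping the sign of a matched pair (u_k, v_k) leaves the SVD valid.
import numpy as np

A = np.array([[3.0, 1.0, 2.0],
              [1.0, 3.0, 4.0]])
U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

U[:, 0] *= -1.0                              # flip u_1 ...
Vt[0, :] *= -1.0                             # ... and the matching v_1
print(np.allclose(A, U @ Sigma @ Vt))        # still A = U Sigma V^T
```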
Now, if this can be done, then we can do something better than what we have done till now. We have determined sigma_1, sigma_2, sigma_3; if such permutations are allowed, then we can order them, that is, we can organize the columns of U and V in such a manner that the largest singular value comes first, and so on. This is typically done when we work with the singular value decomposition: the nonzero singular values come at the top in this decreasing order, after that come the zero singular values, and after that, of course, additional zero rows or columns may come depending upon the rectangular size and shape of the given matrix. Now, what is r here? r is the rank, and this is a very simple result which you can immediately establish: the rank of the given matrix is the same as the rank of Sigma, which is r here.
You would have already noticed other properties. The matrix A is of size m by n, which means it maps vectors from R^n to R^m, where R^n is the domain and R^m is the codomain. Now, V, being an n by n orthogonal matrix, can give a basis, an orthonormal basis: the columns of V are n-dimensional vectors, all mutually orthonormal, so the columns of V give us an orthonormal basis for the domain. Similarly, the columns of U give an orthonormal basis for the codomain. Here we see how these new bases V and U decompose the domain and the codomain into orthogonal subspaces. Consider the application of A to an arbitrary vector x, with A written as U Sigma V^T. If you represent the vector x of the domain in this new basis V, then the coordinates of x in this new basis will be V^T x, actually V^(-1) x, but since V is orthogonal it is the same as V^T x. If we call that y, then we will have U Sigma y.
U is written here, and recognizing that Sigma is a diagonal matrix with sigma_1, sigma_2, and so on written on the diagonal, among which the top r are nonzero, you will have Sigma y as sigma_1 y_1, sigma_2 y_2, and so on up to sigma_r y_r, and below that everything else is zero. U has been broken up and written in this fashion: r columns here and the rest of them there. Now, when you consider this product, you find that it is sigma_1 y_1 times u_1 plus sigma_2 y_2 times u_2 and so on, up to the r-th term; after that, everything else is zero. See what is happening in this sum: it has nonzero components along only the first r columns of U; the components along u_(r+1), u_(r+2), u_(r+3) and so on are all zero. That means Ax has nonzero components along only the first r columns of U. So U has given us an orthonormal basis for the codomain in which the range, the vectors Ax, is contained entirely within the first r columns of U; that is, U gives an orthonormal basis for the codomain such that the range is exactly described by the first r members of U, and the rest of them describe the orthogonal complement of the range. So the entire codomain has been decomposed into two orthogonal subspaces: the first one is the range, which is described by the first r columns of U,
which correspond to the nonzero singular values, and the rest of the columns give components in the orthogonal complement of the range, which are not in the range. Similarly, on the domain side, you see that V^T x is y. What are the rows of V^T? The rows of V^T are v_1^T, v_2^T, v_3^T, and so on, where v_1, v_2, v_3 are columns of V. So the entries of y, the coordinates y_1, y_2, y_3, are actually v_1^T x, v_2^T x, and so on; that is, v_k^T x = y_k, and this coordinate y_k is the component of x along the unit vector v_k. So the full x is its component along v_1 times the unit vector v_1, plus its component along v_2 times the unit vector v_2, and so on. Now, in this you will find that only the basis vectors in the first group make a contribution to the mapping Ax; those in the second group make no such contribution, because y_(r+1), y_(r+2), and so on are killed in the product Sigma y, as we have already seen: whatever y_(r+1), y_(r+2) and so on may be, Sigma multiplied with them kills their contributions. That means V gives you an orthonormal basis for the domain
such that the components v_(r+1) up to v_n actually constitute a basis of the null space. So you find that on the codomain side the range is spanned by the columns of U corresponding to the nonzero singular values, and on the domain side the null space is spanned by the other columns of V, that is, the columns of V corresponding to the zero or nonexistent singular values. And that is it.
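A small sketch of this decomposition, assuming numpy; the rank-deficient matrix A below is an assumed example, chosen so that the null space is nontrivial:

```python
# Read off the range and the null space from an SVD.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])              # rank 1, null space has dimension 2

U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))                     # rank = number of nonzero sigmas

range_basis = U[:, :r]                       # first r columns of U span the range
null_basis = Vt[r:, :].T                     # last n - r columns of V span the null space
print(np.allclose(A @ null_basis, 0.0))      # A maps the null space basis to zero
```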
With this understanding in the background, we proceed to find a few more interesting things; in particular, we work out the revised definitions of the norm of a matrix and the condition number of a matrix. In the basis V, if we write a vector of the domain in this manner, then it can be written as Vc, where V is the matrix with columns v_1, v_2, and so on up to v_n, and c is the vector of the scalar components. Then, from the definition of the norm which we saw earlier, in chapter seven of the textbook and in an earlier lecture, we say that the norm squared is the maximum over v of norm(Av) squared divided by norm(v) squared. Now, in this, we insert the description of the general vector v, that is, Vc. First of all, from the norm definition we get this expression, and there, in place of the small v, we insert Vc: for v we have Vc and for v^T we have c^T V^T.
We have already seen that the diagonalization of A^T A was carried out with the basis matrix V and the corresponding diagonal matrix Sigma^T Sigma; so in place of V^T A^T A V we can write Sigma^T Sigma. Now, Sigma^T Sigma is the diagonal matrix with entries sigma_1 squared, sigma_2 squared, up to sigma_p squared, and then perhaps additional zeros; so the numerator basically reduces to this. Now we want the maximum of this ratio. When will it be maximum? If sigma_1, sigma_2, sigma_3, sigma_4 are not all of the same magnitude, then the ratio is maximum when c is a vector whose only nonzero component is along the largest singular value, the one which gets magnified by the largest amount; only then do you get the maximum value. So you get the norm squared in the case where the only nonzero c_k is the one for which sigma_k is maximum, that is, sigma_max.
When you put sigma_max there, you get this: the norm is found to be the largest singular value of the matrix. So this is the new, revised definition of the norm of a matrix. Now, for a nonsingular square matrix we had worked out the condition number, so here again we try to do that. For A inverse we get this, which is V Sigma^(-1) U^T. Notice that, by the same definition, if we work out the norm of A inverse, it will be the largest singular value of A inverse, and the smallest singular value of A, through its reciprocal, gives the largest singular value of A inverse. So you find that the norm of A inverse is 1/sigma_min of the original matrix A. The condition number is the norm of A times the norm of A inverse, that is, sigma_max times 1/sigma_min. That brings us to the revised definitions of the norm and the condition number of a matrix: the norm of a matrix is its largest singular value, and the condition number is the ratio of the largest singular value to the smallest.
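A minimal check of these revised definitions, assuming numpy; the square matrix A is an assumed example:

```python
# Norm and condition number in terms of singular values.
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                   # square and nonsingular here

s = np.linalg.svd(A, compute_uv=False)       # singular values, decreasing
print(np.isclose(s[0], np.linalg.norm(A, 2)))          # ||A|| = sigma_max
print(np.isclose(s[0] / s[-1], np.linalg.cond(A, 2)))  # cond(A) = sigma_max / sigma_min
```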
Note that this revised definition of the condition number can cater equally to rectangular matrices; the old definition, based on the inverse, could not do that. Note one more important issue. If you arrange the singular values in decreasing order, as we have been discussing, then, with the rank of the matrix being r, you can write the decomposition in this manner, in which U_r is the submatrix containing all the columns of U corresponding to nonzero singular values, V_r similarly contains the corresponding columns of V, and U-bar and V-bar constitute the rest of the columns. In that case this matrix A, which is U Sigma V^T, can be multiplied out in this block form, in which three of the resulting components are zero because of these zero blocks, and the only nonzero component is U_r Sigma_r V_r^T. This gives you the summation A = sigma_1 u_1 v_1^T + sigma_2 u_2 v_2^T + ... + sigma_r u_r v_r^T. That means if you store only the columns of U and V corresponding to nonzero sigmas, then those alone, along with the nonzero values sigma_k, will let you reconstruct the matrix A. And that means that for a large matrix in which only a few top singular values are nonzero and significant, you can effect a very efficient storage and reconstruction.
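A sketch of this storage idea, assuming numpy; the matrix and the cutoff rank r below are assumed for illustration. The reconstruction error in the matrix norm is the first discarded singular value:

```python
# Rank-r storage and reconstruction: A ~ sum of the top r terms sigma_k u_k v_k^T.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 80))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 10                                        # keep only the top r singular triplets
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

stored = r * (100 + 80 + 1)                   # numbers kept: u's, v's and sigmas
print(stored, 100 * 80)                       # 1810 numbers instead of 8000
print(np.isclose(np.linalg.norm(A - A_r, 2), s[r]))  # error is sigma_(r+1)
```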
With this background, let us now go ahead and see the application and the particular advantage of the singular value decomposition for solving the linear system of equations Ax = b, and let us again revise the definition of the pseudo-inverse compared to what we did earlier in chapter seven. In the background there is the term generalized inverse: for any matrix A you can define a generalized inverse, or g-inverse, G, if for a vector b in the range, Gb is a solution of Ax = b. That is, a matrix G can be considered an inverse of some sort, a generalized inverse, if for a consistent right-side vector b, Gb gives you a solution; in that way G operates something like an inverse. The pseudo-inverse is actually a special case of the generalized inverse.
The pseudo-inverse, or the Moore-Penrose inverse, is defined in this manner, and in order to differentiate it from the ordinary inverse we write it with this symbol, A hash (A#). So A# = (U Sigma V^T)#. Now, wherever an actual inverse exists, we take the pseudo-inverse to be the same as the actual inverse. So the pseudo-inverse of this product is (V^T)# Sigma# U#. V^T and U are orthogonal, so for them actual inverses exist: for (V^T)# we write (V^T)^(-1), which is V, and similarly U# is U^(-1), which is U^T. The actual problem is with Sigma#; this is the factor which requires a definition, and it is defined like this: for this structure of Sigma, in which there is a diagonal block of size r by r carrying the r nonzero singular values and everything else is zero, Sigma# is defined as follows.
That means that those diagonal entries which are nonzero have their reciprocals here, while for those diagonal entries which are zero, rather than infinity we put zero. This is very interesting: in place of 1/0, which is what the ordinary rule would demand, we actually write zero. This is how we define the pseudo-inverse, or Moore-Penrose inverse. In elaboration, you can write Sigma# in this manner: in place of the diagonal entries you write rho_1 to rho_p, where rho_k is the reciprocal of sigma_k
when sigma_k is nonzero. In practical cases, even if sigma_k is merely very small, we consider it as good as zero; so for those cases where sigma_k is zero or extremely small, we put rho_k as zero, rather than putting one divided by an extremely small number or one by zero. So this is the definition of the pseudo-inverse. Sometime at leisure you should compare this expression and this description of the pseudo-inverse with the special full-rank cases which we worked out in chapter seven as the right inverse and the left inverse; in those cases where the matrix has full rank, those definitions will appear as special cases of this.
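Here is a minimal sketch of this definition, assuming numpy; the rank-deficient A is an assumed example, and the tolerance mirrors the "extremely small is as good as zero" rule from above:

```python
# Build A# = V Sigma# U^T by hand and compare with numpy's pinv.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])              # rank-deficient example

U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s[0]
rho = np.array([1.0 / sk if sk > tol else 0.0 for sk in s])  # rho_k, with 0 for 1/0

Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))  # n x m, the transposed shape
Sigma_pinv[:len(s), :len(s)] = np.diag(rho)
A_pinv = Vt.T @ Sigma_pinv @ U.T                 # A# = V Sigma# U^T
print(np.allclose(A_pinv, np.linalg.pinv(A)))    # matches numpy's pinv
```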
Now, what are the inverse-like properties or qualities of this pseudo-inverse? First, the pseudo-inverse of the pseudo-inverse of a matrix is the original matrix, provided only actual zeros were set to zero here and no truncations were made. The second important point, which is inverse-like: if A is actually invertible, that is, a square nonsingular matrix, then this boils down to the ordinary inverse, and A# b gives the correct unique solution of Ax = b. On the other hand, if the situation is not so good and Ax = b is an underdetermined but consistent system, that is, the full-rank case with more unknowns and fewer equations, then A# b selects that solution x* which has the minimum norm out of the infinitely many possible solutions. On the other hand, if the system is inconsistent, then this same A# b, defined with the same formula, minimizes the least-square error; that is, if the system is inconsistent, there is bound to be some error, Ax will never be exactly equal to b, but this same A# b finds you an x* which gives the minimum error. And if that minimum-error solution is also not unique, if there are infinitely many of them, then at the same time it gives you, out of those infinitely many minimum-error solutions, the one which has the least size. So all these sensible things the pseudo-inverse does with the help of a single definition. Now you should contrast this with the solution obtained earlier from Tikhonov regularization.
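Two tiny assumed examples, sketched with numpy's pinv, show these two behaviours:

```python
# A# b: minimum-norm solution when underdetermined and consistent,
# least-squares solution when inconsistent.
import numpy as np

# Underdetermined but consistent: one equation, two unknowns.
A1 = np.array([[1.0, 1.0]])
b1 = np.array([2.0])
x1 = np.linalg.pinv(A1) @ b1
print(x1)                                     # [1, 1]: the smallest solution of x + y = 2

# Inconsistent: three equations, one unknown.
A2 = np.array([[1.0], [1.0], [1.0]])
b2 = np.array([1.0, 2.0, 4.0])
x2 = np.linalg.pinv(A2) @ b2
print(x2)                                     # [7/3]: the least-squares fit (mean of b)
```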
The pseudo-inverse solution is typically used when you want precise values, and also for diagnosing a linear system, to see whether it has any such inconsistency or underdeterminacy problems, and so on. On the other hand, the Tikhonov solution can be used when the coefficient matrix A changes over a domain and you want continuity of solutions. So the Tikhonov solution is preferable for continuity, while for diagnosis and for precise solutions the pseudo-inverse solution is better; the Tikhonov solution will always involve some error. In the exercises of this chapter in the textbook there is an exercise which asks you to determine the Tikhonov solution and the pseudo-inverse solution, and to compare them, for a matrix A that has
one of its components variable. Now we want to see how this whole thing is accomplished by a single formula. For that, first note down the pseudo-inverse solution that we find: this is the pseudo-inverse of A, and when we multiply it with b we get this sum, where the summation is over k from 1 to r, that is, over all the nonzero singular values. So we get this expression, and when we simplify it, we have u_k^T b, which is a scalar, divided by sigma_k, because rho_k is 1/sigma_k. If we write it like this, then we find that the pseudo-inverse solution we are getting is actually a linear combination of the r basis members v_1 to v_r, the corresponding components being the scalar values written in the parentheses. Now we want to pose the problem as, first, minimization of the error, and then, if the minimizing solution is not unique, further minimization of the size of the solution, and then see whether we get this same solution. If we want to minimize the least-square error, that is, half the norm squared of the error Ax minus b, then, as we open this up, we have already encountered this earlier: the first-order condition for a minimum is that its derivative, its gradient with respect to x, must be zero. When we do that, we get this, as we got last time. Now, in place of A we write U Sigma V^T, and through a few steps we come to this point. Note that this is a matrix equation,
and this is the corresponding scalar equation for each component of that vector equation. It holds for each k from 1 to r, where r is the rank, that is, the number of nonzero singular values. From here you find that v_k^T x, that is, the component of x along the unit vector v_k, turns out to be u_k^T b divided by sigma_k; the sigma_k squared goes down into the denominator, and this is exactly what is sitting here. So in this solution, x* is composed of the vectors v_1, v_2, up to v_r, in which the component along v_k is this; that is, x* is giving you this combination of these vectors with these components. Now, this first-order condition for minimality tells you what the components of the solution along the basis vectors v_1, v_2, v_3, up to v_r should be; along v_(r+1), v_(r+2), v_(r+3), what the components should be is not mentioned. That means those components can be anything; the error remains the same because the condition is still satisfied. So the general solution for minimum error you can constitute with the components along v_1 to v_r as specified here and any components along the rest of the directions: components as prescribed along the first r basis members and anything in the rest. That means y is free here: v_(r+1) y_1 plus v_(r+2) y_2 and so on.
Since these y_1, y_2, y_3 can be anything, and V-bar is a basis for the null space, you will appreciate that any null-space member will not change anything in the right-hand side. So now we ask: out of all these infinitely many possible solutions, which one is of least size? That is, we ask how to minimize the size of the vector, subject to the error being at its minimum anyway. You take the solution from here and minimize it with respect to y, that is, you ask which y to select to minimize the size of this vector. So we say: minimize the size, that is, the norm squared of x, which is this.
Now you find that in this expression, the part x* is a linear combination of v_1 to v_r, and the other part is a linear combination of the remaining basis members, and all the remaining basis members are orthogonal to the basis members of the first family. That puts x* in one subspace and this part, V-bar y, in another subspace, the two subspaces being orthogonal to each other. How do you find the norm squared of a sum of two members sitting in mutually orthogonal subspaces? Since they are mutually orthogonal, it is simply norm(x*) squared plus norm(V-bar y) squared. Now, if we ask which y makes this sum minimum, where the first term is already fixed and cannot be tampered with and only y can be changed, then y = 0 makes the second term zero and the sum minimum.
So y = 0 gives you the minimum-size vector x of this form which minimizes the error. Now, how does this whole thing happen, that you get all the optimality conditions in the solution that you construct with the help of the pseudo-inverse? To see that, let us investigate the anatomy of this optimization through the SVD. If we use the basis V for the domain and the basis U for the codomain, then the variables in question, the unknown x and the right-hand side b, are transformed like this: in the new basis V the expression of x will be y, and in the new basis U of the codomain the vector b will be represented as c, which is U^T b.
Now, if we write the system of linear equations Ax = b with A as U Sigma V^T, then V^T x is y, and U, brought to the other side as U^T and multiplied with b, gives U^T b, which is c. So in the new bases, V on this side and U on that side, you basically get the equation Sigma y = c, and this is a completely decoupled system, because if we write out this system of equations Sigma y = c, we find sigma_1, sigma_2, up to sigma_r like this, y_1, y_2, up to y_r, and below that possibly more variables, up to y_n, and on this side c_1, c_2, up to c_r, and below that perhaps more entries. Now, the way the singular value decomposition has been constructed, you get useful information only from the first r rows, the first r equations, and they are completely decoupled: y_1 is simply c_1/sigma_1, y_2 is simply c_2/sigma_2, and so on up to y_r = c_r/sigma_r.
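A short sketch of this decoupling, assuming numpy; A and b are assumed examples, with b chosen consistent so that the first r equations carry all the information:

```python
# Solve in the new bases: y_k = c_k / sigma_k for k = 1..r, then map back.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
b = np.array([1.0, 2.0])                      # consistent right-hand side

U, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))

c = U.T @ b                                   # b expressed in the basis U
y = np.zeros(A.shape[1])
y[:r] = c[:r] / s[:r]                         # the usable, decoupled equations only
x = Vt.T @ y                                  # back to the original coordinates
print(np.allclose(x, np.linalg.pinv(A) @ b))  # same as the pseudo-inverse solution
```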
What happens below? Below, you find all zeros here, which means the left side of each equation gives zero. The question is what sits in the corresponding entries of c. If there are corresponding zeros in c, then the system is consistent, but the information 0 = 0 is completely unusable; it has no information content. On the other hand, if some particular values there are nonzero, then we are talking about zero equal to something nonzero: that is a conflict, and that is the source of inconsistency in the system of equations. So in this situation we find that for k = 1 to r we determine y_k = c_k/sigma_k, and that is the only usable component. For k greater than r, wherever c_k is nonzero you have a pure, undesirable conflict, which is simply the inconsistency decomposed into an orthogonal subspace and which cannot be compensated for by any other component; and wherever c_k is zero you have completely redundant information, which is again collected over an orthogonal subspace and cannot be changed by any other component from outside. So, by setting the appropriate diagonal entries of Sigma# to zero, the SVD extracts this pure redundancy and inconsistency and rejects it: it rejects the redundancy, it rejects the inconsistency, and gives you the best solution achievable. At the same time, since the variables y_(r+1) onwards were free, because the usable components determined only the first r values, setting these variables to zero minimizes the norm of y.
And since the norm of x equals the norm of Vy, and the norm of a vector does not change under multiplication by an orthogonal matrix, a minimum-norm y means a minimum-norm x. Now, the important points to note here are the following. First, the SVD provides you with a complete orthogonal decomposition of the domain and the codomain, and it separates the functionally distinct subspaces: on one side the null space from the rest, on the other side the range from the rest. It offers a complete diagnosis of the pathologies of a system of linear equations, and then the pseudo-inverse solution A# b gives you the most meaningful solution of a linear system in all cases. Apart from these, what has not been clearly noticed so far is that, with the existence of the SVD guaranteed, any matrix, real or complex, can be written as U Sigma V* or U Sigma V^T, and many important mathematical results and many other formulations can then be worked out in a straightforward and direct manner. In many cases in the coming lectures, based on this existence of the SVD, you will find that you can appreciate the deductions of many results quite easily.
So, in this lecture we have connected two important problems, systems of linear equations and eigenvalue problems, through the singular value decomposition. In the next lecture, which will be the last lecture of our linear algebra module, we will consolidate a few important issues based on the abstract fundamental ideas of linear transformations. Thank you.
