In linear algebra, an eigenvector or characteristic
vector of a linear transformation is a non-zero
vector that changes by only a scalar factor
when that linear transformation is applied
to it. More formally, if T is a linear transformation
from a vector space V over a field F into
itself and v is a vector in V that is not
the zero vector, then v is an eigenvector
of T if T(v) is a scalar multiple of v. This
condition can be written as the equation
{\displaystyle T(\mathbf {v} )=\lambda \mathbf {v} ,}
where λ is a scalar in the field F, known
as the eigenvalue, characteristic value, or
characteristic root associated with the eigenvector
v.
If the vector space V is finite-dimensional,
then the linear transformation T can be represented
as a square matrix A, and the vector v by
a column vector, rendering the above mapping
as a matrix multiplication on the left-hand
side and a scaling of the column vector on
the right-hand side in the equation
{\displaystyle A\mathbf {v} =\lambda \mathbf {v} .}
There is a direct correspondence between n-by-n
square matrices and linear transformations
from an n-dimensional vector space to itself,
given any basis of the vector space. For this
reason, it is equivalent to define eigenvalues
and eigenvectors using either the language
of matrices or the language of linear transformations.
Geometrically,
an eigenvector, corresponding to a real nonzero
eigenvalue, points in a direction that is
stretched by the transformation and the eigenvalue
is the factor by which it is stretched. If
the eigenvalue is negative, the direction
is reversed.
== Overview ==
Eigenvalues and eigenvectors feature prominently
in the analysis of linear transformations.
The prefix eigen- is adopted from the German
word eigen for "proper", "characteristic".
Originally utilized to study principal axes
of the rotational motion of rigid bodies,
eigenvalues and eigenvectors have a wide range
of applications, for example in stability
analysis, vibration analysis, atomic orbitals,
facial recognition, and matrix diagonalization.
In essence, an eigenvector v of a linear transformation
T is a non-zero vector that, when T is applied
to it, does not change direction. Applying
T to the eigenvector only scales the eigenvector
by the scalar value λ, called an eigenvalue.
This condition can be written as the equation
{\displaystyle T(\mathbf {v} )=\lambda \mathbf {v} ,}
referred to as the eigenvalue equation or
eigenequation. In general, λ may be any scalar.
For example, λ may be negative, in which
case the eigenvector reverses direction as
part of the scaling, or it may be zero or
complex.
The Mona Lisa example pictured at right provides
a simple illustration. Each point on the painting
can be represented as a vector pointing from
the center of the painting to that point.
The linear transformation in this example
is called a shear mapping. Points in the top
half are moved to the right and points in
the bottom half are moved to the left proportional
to how far they are from the horizontal axis
that goes through the middle of the painting.
The vectors pointing to each point in the
original image are therefore tilted right
or left and made longer or shorter by the
transformation. Notice that points along the
horizontal axis do not move at all when this
transformation is applied. Therefore, any
vector that points directly to the right or
left with no vertical component is an eigenvector
of this transformation because the mapping
does not change its direction. Moreover, these
eigenvectors all have an eigenvalue equal
to one because the mapping does not change
their length, either.
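The shear described above can be sketched in a few lines of Python. This is only an illustration: the function name `shear` and the shear factor `k = 0.5` are assumptions chosen for the example, not values taken from the picture.

```python
def shear(point, k=0.5):
    """Horizontal shear: x moves by k*y, y stays fixed (k is an assumed value)."""
    x, y = point
    return (x + k * y, y)

horizontal = (2.0, 0.0)  # no vertical component
vertical = (0.0, 1.0)    # points straight up

# The horizontal vector is returned unchanged, so it is an eigenvector
# with eigenvalue 1; the vertical vector is tilted, so it is not.
print(shear(horizontal))
print(shear(vertical))
```

Any vector with a non-zero vertical component is tilted by this mapping, which is why only the horizontal direction yields eigenvectors here.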
Linear transformations can take many different
forms, mapping vectors in a variety of vector
spaces, so the eigenvectors can also take
many forms. For example, the linear transformation
could be a differential operator like
{\displaystyle {\tfrac {d}{dx}}}, in which case the eigenvectors are functions
called eigenfunctions that are scaled by that
differential operator, such as
{\displaystyle {\frac {d}{dx}}e^{\lambda x}=\lambda e^{\lambda x}.}
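The eigenfunction relation can be checked numerically. The sketch below approximates the derivative with a central finite difference; the helper `derivative`, the step size `h`, and the sample values of λ and x are all assumptions made for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

lam = 0.7   # assumed eigenvalue
x0 = 1.3    # assumed sample point

def f(x):
    return math.exp(lam * x)

lhs = derivative(f, x0)   # d/dx applied to the function
rhs = lam * f(x0)         # the same function scaled by lambda
print(lhs, rhs)           # the two values agree up to approximation error
```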
Alternatively, the linear transformation could
take the form of an n by n matrix, in which
case the eigenvectors are n by 1 matrices,
that is, column vectors.
If the linear transformation is expressed
in the form of an n by n matrix A, then the
eigenvalue equation above for a linear transformation
can be rewritten as the matrix multiplication
{\displaystyle Av=\lambda v,}
where the eigenvector v is an n by 1 matrix.
For a matrix, eigenvalues and eigenvectors
can be used to decompose the matrix, for example
by diagonalizing it.
Eigenvalues and eigenvectors give rise to
many closely related mathematical concepts,
and the prefix eigen- is applied liberally
when naming them:
The set of all eigenvectors of a linear transformation,
each paired with its corresponding eigenvalue,
is called the eigensystem of that transformation.
The set of all eigenvectors of T corresponding
to the same eigenvalue, together with the
zero vector, is called an eigenspace or characteristic
space of T.
If a set of eigenvectors of T forms a basis
of the domain of T, then this basis is called
an eigenbasis.
== History ==
Eigenvalues are often introduced in the context
of linear algebra or matrix theory. Historically,
however, they arose in the study of quadratic
forms and differential equations.
In the 18th century Euler studied the rotational
motion of a rigid body and discovered the
importance of the principal axes. Lagrange
realized that the principal axes are the eigenvectors
of the inertia matrix. In the early 19th century,
Cauchy saw how their work could be used to
classify the quadric surfaces, and generalized
it to arbitrary dimensions. Cauchy also coined
the term racine caractéristique (characteristic
root) for what is now called eigenvalue; his
term survives in characteristic equation.
Fourier
used the work of Laplace and Lagrange to solve
the heat equation by separation of variables
in his famous 1822 book Théorie analytique
de la chaleur. Sturm developed Fourier's ideas
further and brought them to the attention
of Cauchy, who combined them with his own
ideas and arrived at the fact that real symmetric
matrices have real eigenvalues. This was extended
by Hermite in 1855 to what are now called
Hermitian matrices. Around the same time,
Brioschi proved that the eigenvalues of orthogonal
matrices lie on the unit circle, and Clebsch
found the corresponding result for skew-symmetric
matrices. Finally, Weierstrass clarified an
important aspect in the stability theory started
by Laplace by realizing that defective matrices
can cause instability.
In the meantime, Liouville
studied eigenvalue problems similar to those
of Sturm; the discipline that grew out of
their work is now called Sturm–Liouville
theory. Schwarz studied the first eigenvalue
of Laplace's equation on general domains towards
the end of the 19th century, while Poincaré
studied Poisson's equation a few years later.
At
the start of the 20th century, Hilbert studied
the eigenvalues of integral operators by viewing
the operators as infinite matrices. He was
the first to use the German word eigen, which
means "own", to denote eigenvalues and eigenvectors
in 1904, though he may have been following
a related usage by Helmholtz. For some time,
the standard term in English was "proper value",
but the more distinctive term "eigenvalue"
is standard today.
The first numerical algorithm
for computing eigenvalues and eigenvectors
appeared in 1929, when Von Mises published
the power method. One of the most popular
methods today, the QR algorithm, was proposed
independently by John G.F. Francis and Vera
Kublanovskaya in 1961.
== Eigenvalues and eigenvectors of matrices ==
Eigenvalues and eigenvectors are often introduced
to students in the context of linear algebra
courses focused on matrices. Furthermore,
linear transformations over a finite-dimensional
vector space can be represented using matrices,
which is especially common in numerical and
computational applications.
Consider n-dimensional vectors that are formed
as a list of n scalars, such as the three-dimensional
vectors
{\displaystyle x={\begin{bmatrix}1\\3\\4\end{bmatrix}}\quad {\mbox{and}}\quad y={\begin{bmatrix}-20\\-60\\-80\end{bmatrix}}.}
These vectors are said to be scalar multiples
of each other, or parallel or collinear, if
there is a scalar λ such that
{\displaystyle x=\lambda y.}
In this case λ = −1/20.
Now consider the linear transformation of
n-dimensional vectors defined by an n by n
matrix A,
{\displaystyle Av=w,}
or
{\displaystyle {\begin{bmatrix}A_{11}&A_{12}&\ldots &A_{1n}\\A_{21}&A_{22}&\ldots &A_{2n}\\\vdots &\vdots &\ddots &\vdots \\A_{n1}&A_{n2}&\ldots &A_{nn}\\\end{bmatrix}}{\begin{bmatrix}v_{1}\\v_{2}\\\vdots \\v_{n}\end{bmatrix}}={\begin{bmatrix}w_{1}\\w_{2}\\\vdots \\w_{n}\end{bmatrix}}}
where, for each row,
{\displaystyle w_{i}=A_{i1}v_{1}+A_{i2}v_{2}+\cdots +A_{in}v_{n}=\sum _{j=1}^{n}A_{ij}v_{j}.}
If it occurs that v and w are scalar multiples,
that is if
{\displaystyle Av=w=\lambda v,} (1)
then v is an eigenvector of the linear transformation
A and the scale factor λ is the eigenvalue
corresponding to that eigenvector. Equation
(1) is the eigenvalue equation for the matrix
A.
Equation (1) can be stated equivalently as
{\displaystyle (A-\lambda I)v=0,} (2)
where I is the n by n identity matrix and
0 is the zero vector.
=== Eigenvalues and the characteristic polynomial ===
Equation (2) has a non-zero solution v if
and only if the determinant of the matrix
(A − λI) is zero. Therefore, the eigenvalues
of A are values of λ that satisfy the equation
{\displaystyle |A-\lambda I|=0.} (3)
Using Leibniz' rule for the determinant, the
left-hand side of Equation (3) is a polynomial
function of the variable λ and the degree
of this polynomial is n, the order of the
matrix A. Its coefficients depend on the entries
of A, except that its term of degree n is
always (−1)ⁿλⁿ. This polynomial is called
the characteristic polynomial of A. Equation
(3) is called the characteristic equation
or the secular equation of A.
The fundamental theorem of algebra implies
that the characteristic polynomial of an n-by-n
matrix A, being a polynomial of degree n,
can be factored into the product of n linear
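terms,
{\displaystyle |A-\lambda I|=(\lambda _{1}-\lambda )(\lambda _{2}-\lambda )\cdots (\lambda _{n}-\lambda ),} (4)
where each λi may be real but in general
is a complex number. The numbers λ1, λ2,
…, λn, which may not all have distinct values,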
are roots of the polynomial and are the eigenvalues
of A.
As a brief example, which is described in
more detail in the examples section later,
consider the matrix
{\displaystyle M={\begin{bmatrix}2&1\\1&2\end{bmatrix}}.}
Taking the determinant of (M − λI), the
characteristic polynomial of M is
{\displaystyle |M-\lambda I|={\begin{vmatrix}2-\lambda &1\\1&2-\lambda \end{vmatrix}}=3-4\lambda +\lambda ^{2}.}
Setting the characteristic polynomial equal
to zero, it has roots at λ = 1 and λ = 3,
which are the two eigenvalues of M. The eigenvectors
corresponding to each eigenvalue can be found
by solving for the components of v in the
equation Mv = λv. In this example, the eigenvectors
are any non-zero scalar multiples of
{\displaystyle v_{\lambda =1}={\begin{bmatrix}1\\-1\end{bmatrix}},\quad v_{\lambda =3}={\begin{bmatrix}1\\1\end{bmatrix}}.}
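For a 2 by 2 matrix the characteristic polynomial is a quadratic, λ² − tr(M)λ + det(M), so its roots follow from the quadratic formula. The following is a minimal sketch under that assumption; the helper names `eigenvalues_2x2` and `matvec` are introduced here for illustration only.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace*lambda + det for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # complex-safe square root
    return ((trace - root) / 2, (trace + root) / 2)

def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

M = [[2, 1], [1, 2]]
lam_small, lam_large = eigenvalues_2x2(M)
print(lam_small, lam_large)

# Checking the eigenvalue equation Mv = lambda v for both eigenvectors.
print(matvec(M, [1, -1]))  # equals 1 * [1, -1]
print(matvec(M, [1, 1]))   # equals 3 * [1, 1]
```

Using `cmath.sqrt` rather than `math.sqrt` keeps the sketch valid when the discriminant is negative and the eigenvalues are not real.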
If the entries of the matrix A are all real
numbers, then the coefficients of the characteristic
polynomial will also be real numbers, but
the eigenvalues may still have non-zero imaginary
parts. The entries of the corresponding eigenvectors
therefore may also have non-zero imaginary
parts. Similarly, the eigenvalues may be irrational
numbers even if all the entries of A are rational
numbers or even if they are all integers.
However, if the entries of A are all algebraic
numbers, which include the rationals, the
eigenvalues are complex algebraic numbers.
The non-real roots of a polynomial with
real coefficients can be grouped into pairs
of complex conjugates, namely with the two
members of each pair having imaginary parts
that differ only in sign and the same real
part. If the degree is odd, then by the intermediate
value theorem at least one of the roots is
real. Therefore, any real matrix with odd
order has at least one real eigenvalue, whereas
a real matrix with even order may not have
any real eigenvalues. The eigenvectors associated
with these complex eigenvalues are also complex
and also appear in complex conjugate pairs.
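As an illustration of a real matrix of even order with no real eigenvalues, consider a rotation of the plane by 90 degrees. This matrix is an assumed example, not one taken from the text above, and `eigenvalues_2x2` is a hypothetical helper based on the quadratic formula.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace*lambda + det for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

# Rotation by 90 degrees: real entries, but no real direction is preserved.
R = [[0, -1], [1, 0]]
lam_a, lam_b = eigenvalues_2x2(R)
print(lam_a, lam_b)                 # a pair of purely imaginary numbers
print(lam_a == lam_b.conjugate())   # the two roots are complex conjugates
```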
=== Algebraic multiplicity ===
Let λi be an eigenvalue of an n by n matrix
A. The algebraic multiplicity μA(λi) of
the eigenvalue is its multiplicity as a root
of the characteristic polynomial, that is,
the largest integer k such that (λ − λi)ᵏ
divides that polynomial evenly.
Suppose a matrix
A has dimension n and d ≤ n distinct eigenvalues.
Whereas Equation (4) factors the characteristic
polynomial of A into the product of n linear
terms with some terms potentially repeating,
the characteristic polynomial can instead
be written as the product of d terms each
corresponding to a distinct eigenvalue and
raised to the power of the algebraic multiplicity,
{\displaystyle |A-\lambda I|=(\lambda _{1}-\lambda )^{\mu _{A}(\lambda _{1})}(\lambda _{2}-\lambda )^{\mu _{A}(\lambda _{2})}\cdots (\lambda _{d}-\lambda )^{\mu _{A}(\lambda _{d})}.}
If d = n then the right-hand side is the product
of n linear terms and this is the same as
Equation (4). The size of each eigenvalue's
algebraic multiplicity is related to the dimension
n as
{\displaystyle {\begin{aligned}1&\leq \mu _{A}(\lambda _{i})\leq n,\\\mu _{A}&=\sum _{i=1}^{d}\mu _{A}\left(\lambda _{i}\right)=n.\end{aligned}}}
If μA(λi) = 1, then λi is said to be a
simple eigenvalue. If μA(λi) equals the
geometric multiplicity of λi, γA(λi), defined
in the next section, then λi is said to be
a semisimple eigenvalue.
=== Eigenspaces, geometric multiplicity, and the eigenbasis for matrices ===
Given a particular eigenvalue λ of the n
by n matrix A, define the set E to be all
vectors v that satisfy Equation (2),
{\displaystyle E=\left\{\mathbf {v} :(A-\lambda I)\mathbf {v} =0\right\}.}
On one hand, this set is precisely the kernel
or nullspace of the matrix (A − λI). On
the other hand, by definition, any non-zero
vector that satisfies this condition is an
eigenvector of A associated with λ. So, the
set E is the union of the zero vector with
the set of all eigenvectors of A associated
with λ, and E equals the nullspace of (A
− λI). E is called the eigenspace or characteristic
space of A associated with λ. In general
λ is a complex number and the eigenvectors
are complex n by 1 matrices. A property of
the nullspace is that it is a linear subspace,
so E is a linear subspace of ℂⁿ.
Because the eigenspace E is a linear subspace,
it is closed under addition. That is, if two
vectors u and v belong to the set E, written
u, v ∈ E, then (u + v) ∈ E or equivalently
A(u + v) = λ(u + v). This can be checked
using the distributive property of matrix
multiplication. Similarly, because E is a
linear subspace, it is closed under scalar
multiplication. That is, if v ∈ E and α
is a complex number, (αv) ∈ E or equivalently
A(αv) = λ(αv). This can be checked by noting
that multiplication of complex matrices by
complex numbers is commutative. As long as
u + v and αv are not zero, they are also
eigenvectors of A associated with λ.
The dimension of the eigenspace E associated
with λ, or equivalently the maximum number
of linearly independent eigenvectors associated
with λ, is referred to as the eigenvalue's
geometric multiplicity γA(λ). Because E
is also the nullspace of (A − λI), the
geometric multiplicity of λ is the dimension
of the nullspace of (A − λI), also called
the nullity of (A − λI), which relates
to the dimension and rank of (A − λI) as
{\displaystyle \gamma _{A}(\lambda )=n-\operatorname {rank} (A-\lambda I).}
Because of the definition of eigenvalues and
eigenvectors, an eigenvalue's geometric multiplicity
must be at least one, that is, each eigenvalue
has at least one associated eigenvector. Furthermore,
an eigenvalue's geometric multiplicity cannot
exceed its algebraic multiplicity. Additionally,
recall that an eigenvalue's algebraic multiplicity
cannot exceed n.
{\displaystyle 1\leq \gamma _{A}(\lambda )\leq \mu _{A}(\lambda )\leq n}
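The rank formula for the geometric multiplicity can be evaluated directly. The sketch below computes the rank by Gaussian elimination over exact fractions; the helpers `rank` and `geometric_multiplicity` and the two 2 by 2 matrices are assumptions chosen for illustration. Both matrices have the eigenvalue 2 with algebraic multiplicity 2, but their geometric multiplicities differ.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gauss-Jordan elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def geometric_multiplicity(a, lam):
    """n - rank(A - lam*I), the dimension of the eigenspace of lam."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

J = [[2, 1], [0, 2]]  # one independent eigenvector for eigenvalue 2
D = [[2, 0], [0, 2]]  # two independent eigenvectors for eigenvalue 2

print(geometric_multiplicity(J, 2))
print(geometric_multiplicity(D, 2))
```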
To prove the inequality {\displaystyle \gamma _{A}(\lambda )\leq \mu _{A}(\lambda )}, consider how the definition of geometric
multiplicity implies the existence of {\displaystyle \gamma _{A}(\lambda )} orthonormal eigenvectors
{\displaystyle {\boldsymbol {v}}_{1},\,\ldots ,\,{\boldsymbol {v}}_{\gamma _{A}(\lambda )}}, such that {\displaystyle A{\boldsymbol {v}}_{k}=\lambda {\boldsymbol {v}}_{k}}.
We can therefore find a (unitary) matrix {\displaystyle V} whose first {\displaystyle \gamma _{A}(\lambda )} columns are these eigenvectors, and whose
remaining columns can be any orthonormal set of {\displaystyle n-\gamma _{A}(\lambda )} vectors orthogonal to these eigenvectors of {\displaystyle A}.
Then {\displaystyle V} has full rank and is therefore invertible, and {\displaystyle AV=VD} with {\displaystyle D} a matrix whose top left block is the diagonal
matrix {\displaystyle \lambda I_{\gamma _{A}(\lambda )}} and whose entries below that block are zero, because the first {\displaystyle \gamma _{A}(\lambda )} columns of {\displaystyle AV} are the vectors {\displaystyle \lambda {\boldsymbol {v}}_{k}}.
This implies that {\displaystyle (A-\xi I)V=V(D-\xi I)}. In other words, {\displaystyle A-\xi I} is similar to {\displaystyle D-\xi I}, which implies
that {\displaystyle \det(A-\xi I)=\det(D-\xi I)}. But from the definition of {\displaystyle D} we know that {\displaystyle \det(D-\xi I)} contains a factor
{\displaystyle (\xi -\lambda )^{\gamma _{A}(\lambda )}}, which means that the algebraic multiplicity
of {\displaystyle \lambda } must satisfy {\displaystyle \mu _{A}(\lambda )\geq \gamma _{A}(\lambda )}.
Suppose A has d ≤ n distinct eigenvalues
λ1, λ2, …, λd, where the geometric multiplicity
of λi is γA(λi). The total geometric multiplicity
of A,
{\displaystyle {\begin{aligned}\gamma _{A}&=\sum _{i=1}^{d}\gamma _{A}(\lambda _{i}),\\d&\leq \gamma _{A}\leq n,\end{aligned}}}
is the dimension of the sum of all the eigenspaces
of A's eigenvalues, or equivalently the maximum
number of linearly independent eigenvectors
of A. If γA = n, then
The direct sum of the eigenspaces of all of
A's eigenvalues is the entire vector space
ℂⁿ.
A basis of ℂⁿ can be formed from n linearly
independent eigenvectors of A; such a basis
is called an eigenbasis.
Any vector in ℂⁿ can be written as a linear
combination of eigenvectors of A.
=== Additional properties of eigenvalues ===
Let A be an arbitrary n by n matrix of complex
numbers with eigenvalues λ1, λ2, ..., λn.
Each eigenvalue appears μA(λi) times in
this list, where μA(λi) is the eigenvalue's
algebraic multiplicity. The following are
properties of this matrix and its eigenvalues:
The trace of A, defined as the sum of its
diagonal elements, is also the sum of all
eigenvalues,
{\displaystyle \operatorname {tr} (A)=\sum _{i=1}^{n}A_{ii}=\sum _{i=1}^{n}\lambda _{i}=\lambda _{1}+\lambda _{2}+\cdots +\lambda _{n}.}
The determinant of A is the product of all
its eigenvalues,
{\displaystyle \det(A)=\prod _{i=1}^{n}\lambda _{i}=\lambda _{1}\lambda _{2}\cdots \lambda _{n}.}
The eigenvalues of the kth power of A, that is,
the eigenvalues of Aᵏ for any positive integer
k, are {\displaystyle \lambda _{1}^{k},\lambda _{2}^{k},\ldots ,\lambda _{n}^{k}}.
The matrix A is invertible if and only if
every eigenvalue is nonzero.
If A is invertible, then the eigenvalues of
A⁻¹ are 1/λ1, 1/λ2, …, 1/λn, and the geometric
multiplicity of each 1/λi coincides with that of λi.
Moreover, since the characteristic polynomial
of the inverse is the reciprocal polynomial
of the original, the eigenvalues share the
same algebraic multiplicity.
If A is equal to its conjugate transpose A*,
or equivalently if A is Hermitian, then every
eigenvalue is real. The same is true of any
symmetric real matrix.
If A is not only Hermitian but also positive-definite,
positive-semidefinite, negative-definite,
or negative-semidefinite, then every eigenvalue
is positive, non-negative, negative, or non-positive,
respectively.
If A is unitary, every eigenvalue has absolute
value |λi| = 1.
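The trace and determinant properties can be checked on a small example. The sketch below uses the 3 by 3 matrix from the three-dimensional matrix example in this article, whose eigenvalues are 2, 1, and 11; the helper names `trace` and `det3` are introduced here for illustration.

```python
import math

A = [[2, 0, 0],
     [0, 3, 4],
     [0, 4, 9]]
eigenvalues = [2, 1, 11]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

print(trace(A), sum(eigenvalues))        # both are 14
print(det3(A), math.prod(eigenvalues))   # both are 22
```

Because no eigenvalue is zero, the determinant is non-zero and the matrix is invertible, in line with the invertibility property listed above.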
=== Left and right eigenvectors ===
Many disciplines traditionally represent vectors
as matrices with a single column rather than
as matrices with a single row. For that reason,
the word "eigenvector" in the context of matrices
almost always refers to a right eigenvector,
namely a column vector that right multiplies
the n by n matrix A in the defining equation,
Equation (1),
{\displaystyle Av=\lambda v.}
The eigenvalue and eigenvector problem can
also be defined for row vectors that left
multiply matrix A. In this formulation, the
defining equation is
{\displaystyle uA=\kappa u,}
where κ is a scalar and u is a 1 by n matrix.
Any row vector u satisfying this equation
is called a left eigenvector of A and κ is
its associated eigenvalue. Taking the transpose
of this equation,
{\displaystyle A^{\textsf {T}}u^{\textsf {T}}=\kappa u^{\textsf {T}}.}
Comparing this equation to Equation (1), it
follows immediately that a left eigenvector
of A is the same as the transpose of a right
eigenvector of Aᵀ, with the same eigenvalue.
Furthermore, since the characteristic polynomial
of Aᵀ is the same as the characteristic polynomial
of A, the left and right eigenvectors of A
are associated with the same eigenvalues.
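A small non-symmetric matrix shows that left and right eigenvectors generally differ even though they share eigenvalues. The matrix below and the helper names `vecmat`, `matvec`, and `transpose` are assumptions made for this sketch.

```python
def vecmat(u, a):
    """Row vector times matrix: (uA)_j = sum_i u_i * A_ij."""
    return [sum(u[i] * a[i][j] for i in range(len(u))) for j in range(len(a[0]))]

def matvec(a, v):
    """Matrix times column vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

A = [[1, 2],
     [0, 3]]

u = [1, -1]   # left eigenvector for eigenvalue 1
v = [1, 0]    # right eigenvector for the same eigenvalue

print(vecmat(u, A))             # equals 1 * u
print(matvec(A, v))             # equals 1 * v
print(matvec(transpose(A), u))  # u is a right eigenvector of the transpose
```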
=== Diagonalization and the eigendecomposition ===
Suppose the eigenvectors of A form a basis,
or equivalently A has n linearly independent
eigenvectors v1, v2, …, vn with associated
eigenvalues λ1, λ2, …, λn. The eigenvalues
need not be distinct. Define a square matrix
Q whose columns are the n linearly independent
eigenvectors of A,
{\displaystyle Q={\begin{bmatrix}v_{1}&v_{2}&\cdots &v_{n}\end{bmatrix}}.}
Since each column of Q is an eigenvector of
A, right multiplying A by Q scales each column
of Q by its associated eigenvalue,
{\displaystyle AQ={\begin{bmatrix}\lambda _{1}v_{1}&\lambda _{2}v_{2}&\cdots &\lambda _{n}v_{n}\end{bmatrix}}.}
With this in mind, define a diagonal matrix
Λ where each diagonal element Λii is the
eigenvalue associated with the ith column
of Q. Then
{\displaystyle AQ=Q\Lambda .}
Because the columns of Q are linearly independent,
Q is invertible. Right multiplying both sides
of the equation by Q⁻¹,
{\displaystyle A=Q\Lambda Q^{-1},}
or by instead left multiplying both sides
by Q⁻¹,
{\displaystyle Q^{-1}AQ=\Lambda .}
A can therefore be decomposed into a matrix
composed of its eigenvectors, a diagonal matrix
with its eigenvalues along the diagonal, and
the inverse of the matrix of eigenvectors.
This is called the eigendecomposition and
it is a similarity transformation. Such a
matrix A is said to be similar to the diagonal
matrix Λ or diagonalizable. The matrix Q
is the change of basis matrix of the similarity
transformation. Essentially, the matrices
A and Λ represent the same linear transformation
expressed in two different bases. The eigenvectors
are used as the basis when representing the
linear transformation as Λ.
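The eigendecomposition can be verified with exact arithmetic for the 2 by 2 matrix with rows (2, 1) and (1, 2), whose eigenvectors (1, −1) and (1, 1) have eigenvalues 1 and 3. The helper names `matmul` and `inverse_2x2` are assumptions of this sketch, and the closed-form inverse applies to 2 by 2 matrices only.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Closed-form inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 2]]
Q = [[1, 1], [-1, 1]]     # columns are the eigenvectors
Lam = [[1, 0], [0, 3]]    # matching eigenvalues on the diagonal
Q_inv = inverse_2x2(Q)

reconstructed = matmul(matmul(Q, Lam), Q_inv)   # Q Lambda Q^-1
diagonalized = matmul(matmul(Q_inv, A), Q)      # Q^-1 A Q

print(reconstructed == A)
print(diagonalized == Lam)
```

The order of the columns of Q must match the order of the eigenvalues in the diagonal matrix; swapping one without the other breaks the identity.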
Conversely, suppose a matrix A is diagonalizable.
Let P be a non-singular square matrix such
that P⁻¹AP is some diagonal matrix D. Left
multiplying both sides by P gives AP = PD. Each column
of P must therefore be an eigenvector of A
whose eigenvalue is the corresponding diagonal
element of D. Since the columns of P must
be linearly independent for P to be invertible,
there exist n linearly independent eigenvectors
of A. It then follows that the eigenvectors
of A form a basis if and only if A is diagonalizable.
A matrix that is not diagonalizable is said
to be defective. For defective matrices, the
notion of eigenvectors generalizes to generalized
eigenvectors and the diagonal matrix of eigenvalues
generalizes to the Jordan normal form. Over
an algebraically closed field, any matrix
A has a Jordan normal form and therefore admits
a basis of generalized eigenvectors and a
decomposition into generalized eigenspaces.
=== Variational characterization ===
In the Hermitian case, eigenvalues can be
given a variational characterization. The
largest eigenvalue of {\displaystyle H} is the maximum value of the quadratic form
{\displaystyle x^{\textsf {T}}Hx/x^{\textsf {T}}x} over all non-zero vectors {\displaystyle x}. A value of {\displaystyle x}
that realizes that maximum is an eigenvector.
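The variational characterization can be illustrated by a brute-force search. The sketch below evaluates the quotient for a real symmetric 2 by 2 matrix on a grid of unit vectors; the matrix, the grid resolution `steps`, and the helper name `rayleigh_quotient` are assumptions, and a grid search is used only for illustration rather than as a practical method.

```python
import math

H = [[2.0, 1.0], [1.0, 2.0]]  # real symmetric, eigenvalues 1 and 3

def rayleigh_quotient(h, x):
    """x^T H x divided by x^T x for a non-zero real vector x."""
    hx = [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) for row in h]
    return sum(a * b for a, b in zip(x, hx)) / sum(a * a for a in x)

# Sample directions on the upper half of the unit circle.
steps = 1800
candidates = [(math.cos(math.pi * k / steps), math.sin(math.pi * k / steps))
              for k in range(steps)]

best_x = max(candidates, key=lambda x: rayleigh_quotient(H, x))
best_value = rayleigh_quotient(H, best_x)

print(best_value)  # close to the largest eigenvalue
print(best_x)      # close to a multiple of the eigenvector (1, 1)
```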
=== Matrix examples ===
==== Two-dimensional matrix example ====
Consider the matrix
{\displaystyle A={\begin{bmatrix}2&1\\1&2\end{bmatrix}}.}
The figure on the right shows the effect of
this transformation on point coordinates in
the plane.
The eigenvectors v of this transformation
satisfy Equation (1), and the values of λ
for which the determinant of the matrix (A
− λI) equals zero are the eigenvalues.
Taking the determinant to find the characteristic
polynomial of A,
{\displaystyle {\begin{aligned}|A-\lambda I|&=\left|{\begin{bmatrix}2&1\\1&2\end{bmatrix}}-\lambda {\begin{bmatrix}1&0\\0&1\end{bmatrix}}\right|={\begin{vmatrix}2-\lambda &1\\1&2-\lambda \end{vmatrix}}\\[6pt]&=3-4\lambda +\lambda ^{2}.\end{aligned}}}
Setting the characteristic polynomial equal
to zero, it has roots at λ = 1 and λ = 3,
which are the two eigenvalues of A.
For λ = 1, Equation (2) becomes
{\displaystyle (A-I)v_{\lambda =1}={\begin{bmatrix}1&1\\1&1\end{bmatrix}}{\begin{bmatrix}v_{1}\\v_{2}\end{bmatrix}}={\begin{bmatrix}0\\0\end{bmatrix}}.}
Any non-zero vector with v1 = −v2 solves
this equation. Therefore,
{\displaystyle v_{\lambda =1}={\begin{bmatrix}1\\-1\end{bmatrix}}}
is an eigenvector of A corresponding to λ
= 1, as is any scalar multiple of this vector.
For λ = 3, Equation (2) becomes
{\displaystyle (A-3I)v_{\lambda =3}={\begin{bmatrix}-1&1\\1&-1\end{bmatrix}}{\begin{bmatrix}v_{1}\\v_{2}\end{bmatrix}}={\begin{bmatrix}0\\0\end{bmatrix}}.}
Any non-zero vector with v1 = v2 solves this
equation. Therefore,
{\displaystyle v_{\lambda =3}={\begin{bmatrix}1\\1\end{bmatrix}}}
is an eigenvector of A corresponding to λ
= 3, as is any scalar multiple of this vector.
Thus, the vectors vλ=1 and vλ=3 are eigenvectors
of A associated with the eigenvalues λ = 1
and λ = 3, respectively.
==== Three-dimensional matrix example ====
Consider the matrix
{\displaystyle A={\begin{bmatrix}2&0&0\\0&3&4\\0&4&9\end{bmatrix}}.}
The characteristic polynomial of A is
{\displaystyle {\begin{aligned}|A-\lambda I|&=\left|{\begin{bmatrix}2&0&0\\0&3&4\\0&4&9\end{bmatrix}}-\lambda {\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}}\right|={\begin{vmatrix}2-\lambda &0&0\\0&3-\lambda &4\\0&4&9-\lambda \end{vmatrix}}\\[6pt]&=(2-\lambda ){\bigl [}(3-\lambda )(9-\lambda )-16{\bigr ]}=-\lambda ^{3}+14\lambda ^{2}-35\lambda +22.\end{aligned}}}
The roots of the characteristic polynomial
are 2, 1, and 11, which are the only three
eigenvalues of A. These eigenvalues correspond
to the eigenvectors {\displaystyle {\begin{bmatrix}1&0&0\end{bmatrix}}^{\textsf {T}}}, {\displaystyle {\begin{bmatrix}0&-2&1\end{bmatrix}}^{\textsf {T}}}, and {\displaystyle {\begin{bmatrix}0&1&2\end{bmatrix}}^{\textsf {T}}},
respectively, or any non-zero multiple thereof.
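The expansion of the characteristic polynomial and its roots can be checked with integer arithmetic. In the sketch below the two function names and the search range for integer roots are assumptions made for illustration; a search over integers works here only because all three roots happen to be integers.

```python
def char_poly_factored(lam):
    """(2 - lam) * [(3 - lam)(9 - lam) - 16], the factored determinant."""
    return (2 - lam) * ((3 - lam) * (9 - lam) - 16)

def char_poly_expanded(lam):
    """-lam^3 + 14 lam^2 - 35 lam + 22, the expanded form."""
    return -lam**3 + 14 * lam**2 - 35 * lam + 22

# The two forms agree on every sampled integer, and the expanded
# form vanishes exactly at the three eigenvalues.
agree = all(char_poly_factored(k) == char_poly_expanded(k) for k in range(-20, 21))
roots = [k for k in range(-20, 21) if char_poly_expanded(k) == 0]

print(agree)
print(roots)
```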
==== Three-dimensional matrix example with complex eigenvalues ====
Consider the cyclic permutation matrix
{\displaystyle A={\begin{bmatrix}0&1&0\\0&0&1\\1&0&0\end{bmatrix}}.}
This matrix shifts the coordinates of the
vector up by one position and moves the first
coordinate to the bottom. Its characteristic
polynomial is 1 − λ³, whose roots are
{\displaystyle {\begin{aligned}\lambda _{1}&=1\\\lambda _{2}&=-{\frac {1}{2}}+\mathbf {i} {\frac {\sqrt {3}}{2}}\\\lambda _{3}&=\lambda _{2}^{*}=-{\frac {1}{2}}-\mathbf {i} {\frac {\sqrt {3}}{2}}\end{aligned}}}
where {\displaystyle \mathbf {i} } is an imaginary unit with {\displaystyle \mathbf {i} ^{2}=-1.}
For the real eigenvalue λ1 = 1, any vector
with three equal non-zero entries is an eigenvector.
For example,
{\displaystyle A{\begin{bmatrix}5\\5\\5\end{bmatrix}}={\begin{bmatrix}5\\5\\5\end{bmatrix}}=1\cdot {\begin{bmatrix}5\\5\\5\end{bmatrix}}.}
For the complex conjugate pair of non-real
eigenvalues, note that
{\displaystyle \lambda _{2}\lambda _{3}=1,\quad \lambda _{2}^{2}=\lambda _{3},\quad \lambda _{3}^{2}=\lambda _{2}.}
Then
{\displaystyle A{\begin{bmatrix}1\\\lambda _{2}\\\lambda _{3}\end{bmatrix}}={\begin{bmatrix}\lambda _{2}\\\lambda _{3}\\1\end{bmatrix}}=\lambda _{2}\cdot {\begin{bmatrix}1\\\lambda _{2}\\\lambda _{3}\end{bmatrix}},}
and
{\displaystyle A{\begin{bmatrix}1\\\lambda _{3}\\\lambda _{2}\end{bmatrix}}={\begin{bmatrix}\lambda _{3}\\\lambda _{2}\\1\end{bmatrix}}=\lambda _{3}\cdot {\begin{bmatrix}1\\\lambda _{3}\\\lambda _{2}\end{bmatrix}}.}
Therefore, the other two eigenvectors of A
are complex and are
{\displaystyle v_{\lambda _{2}}={\begin{bmatrix}1&\lambda _{2}&\lambda _{3}\end{bmatrix}}^{\textsf {T}}}
and
{\displaystyle v_{\lambda _{3}}={\begin{bmatrix}1&\lambda _{3}&\lambda _{2}\end{bmatrix}}^{\textsf {T}}}
with eigenvalues λ2 and λ3, respectively.
Note that the two complex eigenvectors also
appear in a complex conjugate pair,
{\displaystyle v_{\lambda _{2}}=v_{\lambda _{3}}^{*}.}
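The complex eigenpairs of the cyclic permutation matrix can be checked numerically with Python's built-in complex numbers. The helper names `matvec` and `is_eigenpair` and the tolerance `tol` are assumptions of this sketch; a tolerance is needed because the square root of 3 is only represented approximately.

```python
import math

A = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

lam2 = complex(-0.5, math.sqrt(3) / 2)
lam3 = lam2.conjugate()

def matvec(a, v):
    """Matrix times column vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def is_eigenpair(a, lam, v, tol=1e-12):
    """True if A v equals lam * v up to the given tolerance."""
    return all(abs(w - lam * x) <= tol for w, x in zip(matvec(a, v), v))

print(is_eigenpair(A, 1, [5, 5, 5]))
print(is_eigenpair(A, lam2, [1, lam2, lam3]))
print(is_eigenpair(A, lam3, [1, lam3, lam2]))
```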
==== Diagonal matrix example ====
Matrices with entries only along the main
diagonal are called diagonal matrices. The
eigenvalues of a diagonal matrix are the diagonal
elements themselves. Consider the matrix
{\displaystyle A={\begin{bmatrix}1&0&0\\0&2&0\\0&0&3\end{bmatrix}}.}
The characteristic polynomial of A is
{\displaystyle |A-\lambda I|=(1-\lambda )(2-\lambda )(3-\lambda ),}
which has the roots λ1 = 1, λ2 = 2, and
λ3 = 3. These roots are the diagonal elements
as well as the eigenvalues of A.
Each diagonal element corresponds to an eigenvector
whose only non-zero component is in the same
row as that diagonal element. In the example,
the eigenvalues correspond to the eigenvectors,
{\displaystyle v_{\lambda _{1}}={\begin{bmatrix}1\\0\\0\end{bmatrix}},\quad v_{\lambda _{2}}={\begin{bmatrix}0\\1\\0\end{bmatrix}},\quad v_{\lambda _{3}}={\begin{bmatrix}0\\0\\1\end{bmatrix}},}
respectively, as well as scalar multiples
of these vectors.
==== Triangular matrix example ====
A matrix whose elements above the main diagonal
are all zero is called a lower triangular
matrix, while a matrix whose elements below
the main diagonal are all zero is called an
upper triangular matrix. As with diagonal
matrices, the eigenvalues of triangular matrices
are the elements of the main diagonal.
Consider the lower triangular matrix,
{\displaystyle A={\begin{bmatrix}1&0&0\\1&2&0\\2&3&3\end{bmatrix}}.}
The characteristic polynomial of A is
{\displaystyle |A-\lambda I|=(1-\lambda )(2-\lambda )(3-\lambda ),}
which has the roots λ1 = 1, λ2 = 2, and
λ3 = 3. These roots are the diagonal elements
as well as the eigenvalues of A.
These eigenvalues correspond to the eigenvectors,
{\displaystyle v_{\lambda _{1}}={\begin{bmatrix}1\\-1\\{\frac {1}{2}}\end{bmatrix}},\quad v_{\lambda _{2}}={\begin{bmatrix}0\\1\\-3\end{bmatrix}},\quad v_{\lambda _{3}}={\begin{bmatrix}0\\0\\1\end{bmatrix}},}
respectively, as well as scalar multiples
of these vectors.
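The eigenpairs quoted for the lower triangular matrix can be checked directly against the defining equation Av = λv. The following is a minimal sketch in exact rational arithmetic; the helper name matvec and the variable names are illustrative choices, not part of the example above.

```python
from fractions import Fraction

# Lower triangular matrix from the example above.
A = [[1, 0, 0],
     [1, 2, 0],
     [2, 3, 3]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Eigenpairs quoted in the text; Fraction keeps the entry 1/2 exact.
pairs = [
    (1, [Fraction(1), Fraction(-1), Fraction(1, 2)]),
    (2, [Fraction(0), Fraction(1), Fraction(-3)]),
    (3, [Fraction(0), Fraction(0), Fraction(1)]),
]

# For a triangular matrix the eigenvalues are the diagonal entries.
diagonal = [A[i][i] for i in range(3)]

# Each pair must satisfy A v = lambda v.
checks = [matvec(A, v) == [lam * x for x in v] for lam, v in pairs]
```

Every entry of checks is true for the three pairs above, and diagonal lists the same values as the eigenvalues.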
==== Matrix with repeated eigenvalues example ====
As in the previous example, the lower triangular
matrix
{\displaystyle A={\begin{bmatrix}2&0&0&0\\1&2&0&0\\0&1&3&0\\0&0&1&3\end{bmatrix}},}
has a characteristic polynomial that is the
product of the factors formed from its diagonal elements,
{\displaystyle |A-\lambda I|={\begin{vmatrix}2-\lambda &0&0&0\\1&2-\lambda &0&0\\0&1&3-\lambda &0\\0&0&1&3-\lambda \end{vmatrix}}=(2-\lambda )^{2}(3-\lambda )^{2}.}
The roots of this polynomial, and hence the
eigenvalues, are 2 and 3. The algebraic multiplicity
of each eigenvalue is 2; in other words they
are both double roots. The sum of the algebraic
multiplicities of all distinct eigenvalues
is μA = 4 = n, the degree of the characteristic
polynomial and the dimension of A.
On the other hand, the geometric multiplicity
of the eigenvalue 2 is only 1, because its
eigenspace is spanned by just one vector,
{\displaystyle {\begin{bmatrix}0&1&-1&1\end{bmatrix}}^{\textsf {T}}},
and is therefore 1-dimensional. Similarly,
the geometric multiplicity of the eigenvalue
3 is 1 because its eigenspace is spanned by
just one vector,
{\displaystyle {\begin{bmatrix}0&0&0&1\end{bmatrix}}^{\textsf {T}}}.
The total geometric multiplicity γA is
2, which is the smallest it could be for a
matrix with two distinct eigenvalues. Geometric
multiplicities are defined in a later section.
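The geometric multiplicities quoted above can be reproduced from the relation dim ker(A − λI) = n − rank(A − λI). The sketch below assumes exact rational arithmetic and a plain Gaussian elimination; the function names rank and geometric_multiplicity are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix via Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        # Look for a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace: n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

A = [[2, 0, 0, 0],
     [1, 2, 0, 0],
     [0, 1, 3, 0],
     [0, 0, 1, 3]]

gamma_2 = geometric_multiplicity(A, 2)
gamma_3 = geometric_multiplicity(A, 3)
```

Both values come out as 1, so the total geometric multiplicity is 2 even though the algebraic multiplicities sum to 4.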
== Eigenvalues and eigenfunctions of differential operators ==
The definitions of eigenvalue and eigenvector
of a linear transformation T remain valid
even if the underlying vector space is an
infinite-dimensional Hilbert or Banach space.
A widely used class of linear transformations
acting on infinite-dimensional spaces are
the differential operators on function spaces.
Let D be a linear differential operator on
the space C∞ of infinitely differentiable
real functions of a real argument t. The eigenvalue
equation for D is the differential equation
{\displaystyle Df(t)=\lambda f(t)}
The functions that satisfy this equation are
eigenvectors of D and are commonly called
eigenfunctions.
=== Derivative operator example ===
Consider the derivative operator {\displaystyle {\tfrac {d}{dt}}}
with eigenvalue equation
{\displaystyle {\frac {d}{dt}}f(t)=\lambda f(t).}
This differential equation can be solved by
multiplying both sides by dt/f(t) and integrating.
Its solution, the exponential function
{\displaystyle f(t)=f(0)e^{\lambda t},}
is the eigenfunction of the derivative operator.
Note that in this case the eigenfunction is
itself a function of its associated eigenvalue.
In particular, note that for λ = 0 the eigenfunction
f(t) is a constant.
The main eigenfunction article gives other
examples.
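As a numerical sanity check of the derivative operator example, one can compare a finite-difference derivative of f(t) = f(0)e^{λt} with λf(t). This is only a sketch: the eigenvalue λ = −0.5, the sample point t = 1.3 and the step size h are arbitrary assumptions made for illustration.

```python
import math

def f(t, lam, f0=1.0):
    """Candidate eigenfunction f(t) = f(0) * exp(lam * t)."""
    return f0 * math.exp(lam * t)

def derivative(g, t, h=1e-6):
    """Central-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

lam = -0.5  # hypothetical eigenvalue chosen for illustration
t = 1.3     # arbitrary sample point

lhs = derivative(lambda s: f(s, lam), t)  # approximates d/dt f(t)
rhs = lam * f(t, lam)                     # lambda * f(t)
```

The two sides agree up to the discretisation and rounding error of the finite difference.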
== General definition ==
The concept of eigenvalues and eigenvectors
extends naturally to arbitrary linear transformations
on arbitrary vector spaces. Let V be any vector
space over some field K of scalars, and let
T be a linear transformation mapping V into
V,
{\displaystyle T:V\to V.}
We say that a non-zero vector v ∈ V is an
eigenvector of T if and only if there exists
a scalar λ ∈ K such that
{\displaystyle T(\mathbf {v} )=\lambda \mathbf {v} .} (5)
This equation is called the eigenvalue equation
for T, and the scalar λ is the eigenvalue
of T corresponding to the eigenvector v. Note
that T(v) is the result of applying the transformation
T to the vector v, while λv is the product
of the scalar λ with v.
=== Eigenspaces, geometric multiplicity, and the eigenbasis ===
Given an eigenvalue λ, consider the set
{\displaystyle E=\left\{\mathbf {v} :T(\mathbf {v} )=\lambda \mathbf {v} \right\},}
which is the union of the zero vector with
the set of all eigenvectors associated with
λ. E is called the eigenspace or characteristic
space of T associated with λ.
By definition of a linear transformation,
{\displaystyle {\begin{aligned}T(\mathbf {x} +\mathbf {y} )&=T(\mathbf {x} )+T(\mathbf {y} ),\\T(\alpha \mathbf {x} )&=\alpha T(\mathbf {x} ),\end{aligned}}}
for x, y ∈ V and α ∈ K. Therefore, if
u and v are eigenvectors of T associated with
eigenvalue λ, namely u, v ∈ E, then
{\displaystyle {\begin{aligned}T(\mathbf {u} +\mathbf {v} )&=\lambda (\mathbf {u} +\mathbf {v} ),\\T(\alpha \mathbf {v} )&=\lambda (\alpha \mathbf {v} ).\end{aligned}}}
So, both u + v and αv are either zero or
eigenvectors of T associated with λ, namely
u + v, αv ∈ E, and E is closed under
addition and scalar multiplication. The eigenspace
E associated with λ is therefore a linear
subspace of V. If that subspace has dimension
1, it is sometimes called an eigenline.
The geometric multiplicity γT(λ) of an eigenvalue
λ is the dimension of the eigenspace associated
with λ, i.e., the maximum number of linearly
independent eigenvectors associated with that
eigenvalue. By the definition of eigenvalues
and eigenvectors, γT(λ) ≥ 1 because every
eigenvalue has at least one eigenvector.
The eigenspaces of T always form a direct
sum. As a consequence, eigenvectors of different
eigenvalues are always linearly independent.
Therefore, the sum of the dimensions of the
eigenspaces cannot exceed the dimension n
of the vector space on which T operates, and
there cannot be more than n distinct eigenvalues.
Any
subspace spanned by eigenvectors of T is an
invariant subspace of T, and the restriction
of T to such a subspace is diagonalizable.
Moreover, if the entire vector space V can
be spanned by the eigenvectors of T, or equivalently
if the direct sum of the eigenspaces associated
with all the eigenvalues of T is the entire
vector space V, then a basis of V called an
eigenbasis can be formed from linearly independent
eigenvectors of T. When T admits an eigenbasis,
T is diagonalizable.
=== Zero vector as an eigenvector ===
While the definition of an eigenvector used
in this article excludes the zero vector,
it is possible to define eigenvalues and eigenvectors
such that the zero vector is an eigenvector.
Consider
again the eigenvalue equation, Equation (5).
Define an eigenvalue to be any scalar λ ∈ K
such that there exists a non-zero vector v
∈ V satisfying Equation (5). It is important
that this version of the definition of an
eigenvalue specify that the vector be non-zero,
otherwise by this definition the zero vector
would allow any scalar in K to be an eigenvalue.
Define an eigenvector v associated with the
eigenvalue λ to be any vector that, given
λ, satisfies Equation (5). Given the eigenvalue,
the zero vector is among the vectors that
satisfy Equation (5), so the zero vector is
included among the eigenvectors by this alternate
definition.
=== Spectral theory ===
If λ is an eigenvalue of T, then the operator
(T − λI) is not one-to-one, and therefore
its inverse (T − λI)−1 does not exist.
The converse is true for finite-dimensional
vector spaces, but not for infinite-dimensional
vector spaces. In general, the operator (T
− λI) may not have an inverse even if λ
is not an eigenvalue.
For this reason, in functional analysis eigenvalues
can be generalized to the spectrum of a linear
operator T as the set of all scalars λ for
which the operator (T − λI) has no bounded
inverse. The spectrum of an operator always
contains all its eigenvalues but is not limited
to them.
=== Associative algebras and representation theory ===
One can generalize the algebraic object that
is acting on the vector space, replacing a
single operator acting on a vector space with
an algebra representation – an associative
algebra acting on a module. The study of such
actions is the field of representation theory.
The representation-theoretical concept of
weight is an analog of eigenvalues, while
weight vectors and weight spaces are the analogs
of eigenvectors and eigenspaces, respectively.
== Dynamic equations ==
The simplest difference equations have the
form
{\displaystyle x_{t}=a_{1}x_{t-1}+a_{2}x_{t-2}+\cdots +a_{k}x_{t-k}.}
The solution of this equation for x in terms
of t is found by using its characteristic
equation
{\displaystyle \lambda ^{k}-a_{1}\lambda ^{k-1}-a_{2}\lambda ^{k-2}-\cdots -a_{k-1}\lambda -a_{k}=0,}
which can be found by stacking into matrix
form a set of equations consisting of the
above difference equation and the k – 1
equations
{\displaystyle x_{t-1}=x_{t-1},\ \dots ,\ x_{t-k+1}=x_{t-k+1},}
giving a k-dimensional system of the first
order in the stacked variable vector
{\displaystyle {\begin{bmatrix}x_{t}&\cdots &x_{t-k+1}\end{bmatrix}}}
in terms of its once-lagged value, and taking
the characteristic equation of this system's
matrix. This equation gives k characteristic
roots
{\displaystyle \lambda _{1},\,\ldots ,\,\lambda _{k},}
for use in the solution equation
{\displaystyle x_{t}=c_{1}\lambda _{1}^{t}+\cdots +c_{k}\lambda _{k}^{t}.}
A similar procedure is used for solving a
differential equation of the form
{\displaystyle {\frac {d^{k}x}{dt^{k}}}+a_{k-1}{\frac {d^{k-1}x}{dt^{k-1}}}+\cdots +a_{1}{\frac {dx}{dt}}+a_{0}x=0.}
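The difference-equation procedure can be illustrated for k = 2. The sketch below assumes the hypothetical recurrence x_t = x_{t−1} + x_{t−2} with initial values x_0 = 0 and x_1 = 1, finds the two characteristic roots, fits the constants c_1 and c_2 to the initial values, and compares the closed form with direct iteration. It also assumes the two roots are real and distinct.

```python
import math

# Hypothetical example: x_t = a1*x_{t-1} + a2*x_{t-2} with a1 = a2 = 1.
a1, a2 = 1.0, 1.0
x0, x1 = 0.0, 1.0  # assumed initial conditions

# Characteristic equation: lambda^2 - a1*lambda - a2 = 0.
disc = math.sqrt(a1 * a1 + 4 * a2)
lam1, lam2 = (a1 + disc) / 2, (a1 - disc) / 2

# Fit c1, c2 to the initial conditions:
#   c1 + c2 = x0   and   c1*lam1 + c2*lam2 = x1.
c1 = (x1 - x0 * lam2) / (lam1 - lam2)
c2 = x0 - c1

def closed_form(t):
    """Solution x_t = c1*lam1^t + c2*lam2^t."""
    return c1 * lam1 ** t + c2 * lam2 ** t

def iterate(t):
    """Solution x_t obtained by stepping the recurrence t times."""
    prev, cur = x0, x1
    for _ in range(t):
        prev, cur = cur, a1 * cur + a2 * prev
    return prev
```

For small t the two functions agree to within floating-point rounding, which is the content of the solution equation above.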
== Calculation ==
The calculation of eigenvalues and eigenvectors
is a topic where theory, as presented in elementary
linear algebra textbooks, is often very far
from practice.
=== Classical method ===
The classical method is to first find the
eigenvalues, and then calculate the eigenvectors
for each eigenvalue. It is in several ways
poorly suited for non-exact arithmetics such
as floating-point.
==== Eigenvalues ====
The eigenvalues of a matrix A
can be determined by finding the roots of
the characteristic polynomial. This is easy
for 2 × 2
matrices, but the difficulty increases rapidly
with the size of the matrix.
In theory, the coefficients of the characteristic
polynomial can be computed exactly, since
they are sums of products of matrix elements;
and there are algorithms that can find all
the roots of a polynomial of arbitrary degree
to any required accuracy. However, this approach
is not viable in practice because the coefficients
would be contaminated by unavoidable round-off
errors, and the roots of a polynomial can
be an extremely sensitive function of the
coefficients (as exemplified by Wilkinson's
polynomial). Even for matrices whose elements
are integers the calculation becomes nontrivial,
because the sums are very long; the constant
term is the determinant, which for an
n × n matrix is a sum of n! different products.
Explicit algebraic formulas
for the roots of a polynomial exist only if
the degree n is 4 or less. According to the Abel–Ruffini
theorem there is no general, explicit and
exact algebraic formula for the roots of a
polynomial with degree 5 or more. (Generality
matters because any polynomial with degree
n is the characteristic polynomial of some companion
matrix of order n.) Therefore, for matrices of order 5 or more,
the eigenvalues and eigenvectors cannot be
obtained by an explicit algebraic formula,
and must therefore be computed by approximate
numerical methods. Even the exact formula
for the roots of a degree 3 polynomial is
numerically impractical.
==== Eigenvectors ====
Once the (exact) value of an eigenvalue is
known, the corresponding eigenvectors can
be found by finding non-zero solutions of
the eigenvalue equation, which becomes a system
of linear equations with known coefficients.
For example, once it is known that 6 is an
eigenvalue of the matrix
{\displaystyle A={\begin{bmatrix}4&1\\6&3\end{bmatrix}}}
we can find its eigenvectors by solving the
equation Av = 6v, that is
{\displaystyle {\begin{bmatrix}4&1\\6&3\end{bmatrix}}{\begin{bmatrix}x\\y\end{bmatrix}}=6\cdot {\begin{bmatrix}x\\y\end{bmatrix}}}
This matrix equation is equivalent to two
linear equations
{\displaystyle \left\{{\begin{aligned}4x+y&=6x\\6x+3y&=6y\end{aligned}}\right.}
that is
{\displaystyle \left\{{\begin{aligned}-2x+y&=0\\6x-3y&=0\end{aligned}}\right.}
Both equations reduce to the single linear
equation y = 2x. Therefore, any vector of the form
{\displaystyle {\begin{bmatrix}a\\2a\end{bmatrix}}},
for any non-zero real number a, is an eigenvector of
A with eigenvalue λ = 6.
The matrix A above has another eigenvalue λ = 1.
A similar calculation shows that the corresponding
eigenvectors are the non-zero solutions of
3x + y = 0, that is, any vector of the form
{\displaystyle {\begin{bmatrix}b\\-3b\end{bmatrix}}},
for any non-zero real number b.
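The classical method can be traced for the 2 × 2 example above. The sketch below assumes the standard identity that a 2 × 2 characteristic polynomial equals λ² − (trace)λ + (determinant), solves it with the quadratic formula, and then checks the eigenvectors with a = 1 and b = 1. The helper matvec is a hypothetical name used only here.

```python
import math

A = [[4, 1],
     [6, 3]]

# For a 2x2 matrix the characteristic polynomial is
# lambda^2 - trace*lambda + det.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(trace * trace - 4 * det)  # assumes real eigenvalues
eigenvalues = sorted([(trace - disc) / 2, (trace + disc) / 2])

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

v6 = [1, 2]    # eigenvector [a, 2a] with a = 1
v1 = [1, -3]   # eigenvector [b, -3b] with b = 1

Av6 = matvec(A, v6)  # should equal 6 * v6
Av1 = matvec(A, v1)  # should equal 1 * v1
```

The roots found this way are 1 and 6, matching the eigenvalues used in the worked example.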
=== Simple iterative methods ===
The converse approach, of first seeking the
eigenvectors and then determining each eigenvalue
from its eigenvector, turns out to be far
more tractable for computers. The easiest
algorithm here consists of picking an arbitrary
starting vector and then repeatedly multiplying
it with the matrix (optionally normalising
the vector to keep its elements of reasonable
size); this makes the vector converge towards
an eigenvector, typically one associated with
the eigenvalue of largest absolute value. A variation
is to instead multiply the vector by
{\displaystyle (A-\mu I)^{-1}};
this causes it to converge to an eigenvector
of the eigenvalue closest to μ ∈ ℂ.
If v is (a good approximation of) an eigenvector
of A, then the corresponding eigenvalue can be
computed as
{\displaystyle \lambda ={\frac {\mathbf {v} ^{*}A\mathbf {v} }{\mathbf {v} ^{*}\mathbf {v} }}}
where v* denotes the conjugate transpose of v.
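The repeated-multiplication scheme and the quotient formula above can be sketched as follows. The example assumes a real matrix with a single eigenvalue of largest absolute value and a starting vector that is not orthogonal to the corresponding eigenvector. The function names power_iteration and rayleigh_quotient, the starting vector and the step count are illustrative assumptions.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def power_iteration(M, v, steps=50):
    """Repeatedly apply M and normalise; returns an approximate eigenvector."""
    for _ in range(steps):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rayleigh_quotient(M, v):
    """(v* M v) / (v* v) for a real vector v."""
    Mv = matvec(M, v)
    return sum(a * b for a, b in zip(v, Mv)) / sum(a * a for a in v)

# Same 2x2 matrix as in the classical-method example.
A = [[4.0, 1.0],
     [6.0, 3.0]]
v = power_iteration(A, [1.0, 0.0])
lam = rayleigh_quotient(A, v)
```

Here the iteration settles on a multiple of the vector with components in ratio 1 : 2, and the quotient recovers the eigenvalue 6 up to rounding.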
=== Modern methods ===
Efficient, accurate methods to compute eigenvalues
and eigenvectors of arbitrary matrices were
not known until the advent of the QR algorithm
in 1961. Combining the Householder transformation
with the LU decomposition results in an algorithm
with better convergence than the QR algorithm.
For large Hermitian sparse matrices, the Lanczos
algorithm is one example of an efficient iterative
method to compute eigenvalues and eigenvectors,
among several other possibilities.
Most numeric
methods that compute the eigenvalues of a
matrix also determine a set of corresponding
eigenvectors as a by-product of the computation,
although sometimes the implementors choose
to discard the eigenvector information as
soon as it is not needed anymore.
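To give a feel for the QR algorithm mentioned above, here is a deliberately simplified sketch: the plain unshifted iteration A_{k+1} = R_k Q_k, with the factorisation done by modified Gram–Schmidt. It is not how production libraries implement the method (those typically reduce the matrix first and use shifts), and it assumes a small real matrix with real eigenvalues of distinct absolute value. All function names are hypothetical.

```python
import math

def qr_decompose(A):
    """QR factorisation by modified Gram-Schmidt (independent columns assumed)."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]  # columns of A
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = a[:]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(x * y for x, y in zip(q, v))
            v = [x - R[i][j] * y for x, y in zip(v, q)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def qr_eigenvalues(A, steps=200):
    """Unshifted QR iteration; the diagonal approaches the eigenvalues."""
    for _ in range(steps):
        Q, R = qr_decompose(A)
        A = matmul(R, Q)
    return sorted(A[i][i] for i in range(len(A)))

# Hypothetical symmetric test matrix.
S = [[2.0, 1.0],
     [1.0, 3.0]]
eigs = qr_eigenvalues(S)
```

For this test matrix the exact eigenvalues are (5 ± √5)/2, which the iteration approaches as the off-diagonal entries shrink.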
== Applications ==
=== Eigenvalues of geometric transformations ===
The following table presents some example
transformations in the plane along with their
2×2 matrices, eigenvalues, and eigenvectors.
Note that the characteristic equation for
a rotation is a quadratic equation with discriminant
{\displaystyle D=-4(\sin \theta )^{2}},
which is a negative number whenever θ is
not an integer multiple of 180°. Therefore,
except for these special cases, the two eigenvalues
are complex numbers,
{\displaystyle \cos \theta \pm \mathbf {i} \sin \theta };
and all eigenvectors have non-real entries.
Indeed, except for those special cases, a
rotation changes the direction of every nonzero
vector in the plane.
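The rotation case can be checked numerically. The sketch below assumes the usual counterclockwise rotation matrix with rows (cos θ, −sin θ) and (sin θ, cos θ), and an arbitrary angle of 30°. It forms the discriminant of the characteristic equation and solves it with a complex square root.

```python
import cmath
import math

def rotation_eigenvalues(theta):
    """Eigenvalues of [[cos t, -sin t], [sin t, cos t]] via the quadratic formula."""
    c, s = math.cos(theta), math.sin(theta)
    trace, det = 2 * c, c * c + s * s
    disc = trace * trace - 4 * det  # equals -4 sin^2(theta)
    root = cmath.sqrt(disc)         # imaginary when disc is negative
    return (trace + root) / 2, (trace - root) / 2, disc

theta = math.radians(30)  # arbitrary angle chosen for illustration
lam_plus, lam_minus, disc = rotation_eigenvalues(theta)
```

The two eigenvalues are complex conjugates of modulus 1, in line with the expression cos θ ± i sin θ.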
A linear transformation that takes a square
to a rectangle of the same area (a squeeze
mapping) has reciprocal eigenvalues.
=== Schrödinger equation ===
An example of an eigenvalue equation where
the transformation T
is represented in terms of a differential
operator is the time-independent Schrödinger
equation in quantum mechanics:
{\displaystyle H\psi _{E}=E\psi _{E}\,}
where H, the Hamiltonian, is a second-order differential
operator and {\displaystyle \psi _{E}},
the wavefunction, is one of its eigenfunctions
corresponding to the eigenvalue E,
interpreted as its energy.
However, in the case where one is interested
only in the bound state solutions of the Schrödinger
equation, one looks for {\displaystyle \psi _{E}}
within the space of square integrable functions.
Since this space is a Hilbert space with a
well-defined scalar product, one can introduce
a basis set in which {\displaystyle \psi _{E}}
and H can be represented as a one-dimensional array
(i.e., a vector) and a matrix respectively.
This allows one to represent the Schrödinger
equation in a matrix form.
The bra–ket notation is often used in this
context. A vector, which represents a state
of the system, in the Hilbert space of square
integrable functions is represented by
{\displaystyle |\Psi _{E}\rangle }.
In this notation, the Schrödinger equation
is:
{\displaystyle H|\Psi _{E}\rangle =E|\Psi _{E}\rangle }
where {\displaystyle |\Psi _{E}\rangle }
is an eigenstate of H and E
represents the eigenvalue. H
is an observable self-adjoint operator, the
infinite-dimensional analog of Hermitian matrices.
As in the matrix case, in the equation above
{\displaystyle H|\Psi _{E}\rangle }
is understood to be the vector obtained by
application of the transformation H
to {\displaystyle |\Psi _{E}\rangle }.
=== Molecular orbitals ===
In quantum mechanics, and in particular in
atomic and molecular physics, within the Hartree–Fock
theory, the atomic and molecular orbitals
can be defined by the eigenvectors of the
Fock operator. The corresponding eigenvalues
are interpreted as ionization potentials via
Koopmans' theorem. In this case, the term
eigenvector is used in a somewhat more general
meaning, since the Fock operator is explicitly
dependent on the orbitals and their eigenvalues.
Thus, if one wants to underline this aspect,
one speaks of nonlinear eigenvalue problems.
Such equations are usually solved by an iteration
procedure, called in this case self-consistent
field method. In quantum chemistry, one often
represents the Hartree–Fock equation in
a non-orthogonal basis set. This particular
representation is a generalized eigenvalue
problem called Roothaan equations.
=== Geology and glaciology ===
In geology, especially in the study of glacial
till, eigenvectors and eigenvalues are used
as a method by which a mass of information
of a clast fabric's constituents' orientation
and dip can be summarized in a 3-D space by
six numbers. In the field, a geologist may
collect such data for hundreds or thousands
of clasts in a soil sample, which can only
be compared graphically such as in a Tri-Plot
(Sneed and Folk) diagram, or as a Stereonet
on a Wulff Net.
The output for the orientation
tensor is in the three orthogonal (perpendicular)
axes of space. The three eigenvectors are
ordered {\displaystyle v_{1},v_{2},v_{3}}
by their eigenvalues {\displaystyle E_{1}\geq E_{2}\geq E_{3}};
{\displaystyle v_{1}} then is the primary orientation/dip of clast,
{\displaystyle v_{2}} is the secondary and
{\displaystyle v_{3}} is the tertiary, in terms of strength. The
clast orientation is defined as the direction
of the eigenvector, on a compass rose of 360°.
Dip is measured as the eigenvalue, the modulus
of the tensor: this is valued from 0° (no
dip) to 90° (vertical). The relative values
of {\displaystyle E_{1}}, {\displaystyle E_{2}}, and {\displaystyle E_{3}}
are dictated by the nature of the sediment's
fabric. If {\displaystyle E_{1}=E_{2}=E_{3}},
the fabric is said to be isotropic. If
{\displaystyle E_{1}=E_{2}>E_{3}},
the fabric is said to be planar. If
{\displaystyle E_{1}>E_{2}>E_{3}},
the fabric is said to be linear.
=== Principal component analysis ===
The eigendecomposition of a symmetric positive
semidefinite (PSD) matrix yields an orthogonal
basis of eigenvectors, each of which has a
nonnegative eigenvalue. The orthogonal decomposition
of a PSD matrix is used in multivariate analysis,
where the sample covariance matrices are PSD.
This orthogonal decomposition is called principal
components analysis (PCA) in statistics. PCA
studies linear relations among variables.
PCA is performed on the covariance matrix
or the correlation matrix (in which each variable
is scaled to have its sample variance equal
to one). For the covariance or correlation
matrix, the eigenvectors correspond to principal
components and the eigenvalues to the variance
explained by the principal components. Principal
component analysis of the correlation matrix
provides an orthonormal eigen-basis for the
space of the observed data: In this basis,
the largest eigenvalues correspond to the
principal components that are associated with
most of the covariability among a number of
observed data.
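For two variables the eigendecomposition of the sample covariance matrix can be written out by hand. The sketch below uses a small made-up data set (purely hypothetical numbers) and the closed-form eigenvalues of a symmetric 2 × 2 matrix. It assumes the off-diagonal covariance is non-zero, so that the eigenvector formula used here is valid.

```python
import math

# Hypothetical two-variable data set (each tuple is one observation).
data = [(2.0, 1.9), (0.5, 0.7), (1.5, 1.4), (3.0, 3.2), (1.0, 0.8)]
n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix in closed form.
trace, det = sxx + syy, sxx * syy - sxy * sxy
disc = math.sqrt(trace * trace - 4 * det)
lam1, lam2 = (trace + disc) / 2, (trace - disc) / 2  # lam1 >= lam2

# First principal component: unit eigenvector for lam1
# (valid because sxy is non-zero for this data).
pc1 = (sxy, lam1 - sxx)
norm = math.hypot(*pc1)
pc1 = (pc1[0] / norm, pc1[1] / norm)

# Share of the total variance explained by the first component.
explained = lam1 / (lam1 + lam2)
```

The larger eigenvalue lam1 is the variance along pc1, and explained is the fraction of total variance that this single component accounts for.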
Principal component analysis is used to study
large data sets, such as those encountered
in bioinformatics, data mining, chemical research,
psychology, and in marketing. PCA is popular
especially in psychology, in the field of
psychometrics. In Q methodology, the eigenvalues
of the correlation matrix determine the Q-methodologist's
judgment of practical significance (which
differs from the statistical significance
of hypothesis testing; cf. criteria for determining
the number of factors). More generally, principal
component analysis can be used as a method
of factor analysis in structural equation
modeling.
=== Vibration analysis ===
Eigenvalue problems occur naturally in the
vibration analysis of mechanical structures
with many degrees of freedom. The eigenvalues
are the natural frequencies (or eigenfrequencies)
of vibration, and the eigenvectors are the
shapes of these vibrational modes. In particular,
undamped vibration is governed by
{\displaystyle m{\ddot {x}}+kx=0}
or
{\displaystyle m{\ddot {x}}=-kx}
that is, acceleration is proportional to position
(i.e., we expect x to be sinusoidal in time).
In n dimensions, m becomes a mass matrix and k
a stiffness matrix. Admissible solutions are
then a linear combination of solutions to
the generalized eigenvalue problem
{\displaystyle -kx=\omega ^{2}mx}
where {\displaystyle \omega ^{2}}
is the eigenvalue and ω
is the (imaginary) angular frequency. Note
that the principal vibration modes are different
from the principal compliance modes, which
are the eigenvectors of k
alone. Furthermore, damped vibration, governed
by
{\displaystyle m{\ddot {x}}+c{\dot {x}}+kx=0}
leads to a so-called quadratic eigenvalue
problem,
{\displaystyle \left(\omega ^{2}m+\omega c+k\right)x=0.}
This can be reduced to a generalized eigenvalue
problem by algebraic manipulation at the cost
of solving a larger system.
The orthogonality properties of the eigenvectors
allow decoupling of the differential equations
so that the system can be represented as a linear
summation of the eigenvectors. The eigenvalue
problem of complex structures is often solved
using finite element analysis, which neatly
generalizes the solution to scalar-valued vibration
problems.
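A small undamped example may make the generalized eigenvalue problem concrete. The sketch below assumes a hypothetical system with two unit masses and a symmetric stiffness matrix. With a diagonal mass matrix the problem −kx = ω²mx reduces to an ordinary eigenvalue problem for −m⁻¹k, which for a 2 × 2 matrix can be solved by the quadratic formula. All numbers are made up for illustration.

```python
import cmath

# Hypothetical 2-degree-of-freedom system with unit masses.
m = [1.0, 1.0]          # diagonal of the mass matrix
k = [[2.0, -1.0],
     [-1.0, 2.0]]       # stiffness matrix

# With a diagonal mass matrix, -k x = w^2 m x is the ordinary
# eigenvalue problem for B = -m^{-1} k.
B = [[-k[i][j] / m[i] for j in range(2)] for i in range(2)]

trace = B[0][0] + B[1][1]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
root = cmath.sqrt(trace * trace - 4 * det)

# The eigenvalues w^2 are real for this system, so keep the real part.
omega_sq = [((trace + root) / 2).real, ((trace - root) / 2).real]

# w is purely imaginary under this sign convention; its magnitude
# is the physical angular frequency.
frequencies = sorted(abs(cmath.sqrt(w2)) for w2 in omega_sq)
```

Under the sign convention used in this section both eigenvalues ω² are negative, so ω is imaginary and the magnitudes in frequencies play the role of the natural angular frequencies.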
=== Eigenfaces ===
In image processing, processed images of faces
can be seen as vectors whose components are
the brightnesses of each pixel. The dimension
of this vector space is the number of pixels.
The eigenvectors of the covariance matrix
associated with a large set of normalized
pictures of faces are called eigenfaces; this
is an example of principal component analysis.
They are very useful for expressing any face
image as a linear combination of some of them.
In the facial recognition branch of biometrics,
eigenfaces provide a means of applying data
compression to faces for identification purposes.
Research related to eigen vision systems determining
hand gestures has also been made.
Similar to this concept, eigenvoices represent
the general direction of variability in human
pronunciations of a particular utterance,
such as a word in a language. Based on a linear
combination of such eigenvoices, a new voice
pronunciation of the word can be constructed.
These concepts have been found useful in automatic
speech recognition systems for speaker adaptation.
=== Tensor of moment of inertia ===
In mechanics, the eigenvectors of the moment
of inertia tensor define the principal axes
of a rigid body. The tensor of moment of inertia
is a key quantity required to determine the
rotation of a rigid body around its center
of mass.
=== Stress tensor ===
In solid mechanics, the stress tensor is symmetric
and so can be decomposed into a diagonal tensor
with the eigenvalues on the diagonal and eigenvectors
as a basis. Because it is diagonal, in this
orientation, the stress tensor has no shear
components; the components it does have are
the principal components.
=== Graphs ===
In spectral graph theory, an eigenvalue of
a graph is defined as an eigenvalue of the
graph's adjacency matrix A,
or (increasingly) of the graph's Laplacian
matrix due to its discrete Laplace operator,
which is either {\displaystyle T-A}
(sometimes called the combinatorial Laplacian)
or {\displaystyle I-T^{-1/2}AT^{-1/2}}
(sometimes called the normalized Laplacian),
where T is a diagonal matrix with {\displaystyle T_{ii}}
equal to the degree of vertex {\displaystyle v_{i}},
and in {\displaystyle T^{-1/2}},
the ith diagonal entry is
{\displaystyle 1/{\sqrt {\deg \left(v_{i}\right)}}}.
The kth principal eigenvector of a graph is defined
as either the eigenvector corresponding to
the kth largest or kth smallest eigenvalue of the Laplacian. The
first principal eigenvector of the graph is
also referred to merely as the principal eigenvector.
The principal eigenvector is used to measure
the centrality of its vertices. An example
is Google's PageRank algorithm. The principal
eigenvector of a modified adjacency matrix
of the World Wide Web graph gives the page
ranks as its components. This vector corresponds
to the stationary distribution of the Markov
chain represented by the row-normalized adjacency
matrix; however, the adjacency matrix must
first be modified to ensure a stationary distribution
exists. The eigenvector corresponding to the second smallest eigenvalue of the Laplacian can
be used to partition the graph into clusters,
via spectral clustering. Other methods are
also available for clustering.
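The idea of ranking vertices by a principal eigenvector can be sketched with a damped power iteration on a tiny made-up link graph. This is an illustrative simplification, not Google's actual implementation: the three-page graph, the damping value 0.85 and the step count are assumptions, and every page is assumed to have at least one outgoing link.

```python
# Hypothetical 3-page web: links[i] lists the pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0]}
damping = 0.85  # commonly quoted damping factor; an assumption here

def pagerank(links, damping, steps=200):
    """Power iteration on the damped, row-normalised link matrix."""
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(steps):
        # The (1 - damping) term is the modification that guarantees
        # a stationary distribution exists.
        new = [(1.0 - damping) / n] * n
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new[target] += share
        rank = new
    return rank

ranks = pagerank(links, damping)
```

The resulting vector sums to 1 and is unchanged by a further iteration, which is what makes it a stationary distribution of the modified chain.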
=== Basic reproduction number ===
The basic reproduction number ({\displaystyle R_{0}})
is a fundamental number in the study of
how infectious diseases spread. If one infectious
person is put into a population of completely
susceptible people, then {\displaystyle R_{0}}
is the average number of people that one typical
infectious person will infect. The generation
time of an infection is the time, {\displaystyle t_{G}},
from one person becoming infected to the
next person becoming infected. In
a heterogeneous population, the next generation
matrix defines how many people in
the population will become infected after
time {\displaystyle t_{G}} has passed.
{\displaystyle R_{0}} is then the largest eigenvalue of
the next generation matrix.
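As a toy illustration, the largest eigenvalue of a 2 × 2 next generation matrix can be computed in closed form. The matrix entries below are invented for the example and do not describe any real disease. The sketch assumes non-negative entries, so the eigenvalues are real.

```python
import math

# Hypothetical next generation matrix for two groups:
# K[i][j] = expected new infections in group i caused by one
# infectious person in group j over one generation.
K = [[1.2, 0.4],
     [0.3, 0.8]]

trace = K[0][0] + K[1][1]
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
disc = math.sqrt(trace * trace - 4 * det)

# R0 is the largest eigenvalue of K.
r0 = (trace + disc) / 2
```

For these made-up entries r0 is greater than 1, the regime in which the number of infections grows from one generation to the next.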
== See also ==
Antieigenvalue theory
Eigenoperator
Eigenplane
Eigenvalue algorithm
Introduction to eigenstates
Jordan normal form
List of numerical analysis software
Nonlinear eigenproblem
Quadratic eigenvalue problem
Singular value
== Footnotes ==
== Notes
