In quantum mechanics, the Hellmann–Feynman
theorem relates the derivative of the total
energy with respect to a parameter to the
expectation value of the derivative of the
Hamiltonian with respect to that same parameter.
According to the theorem, once the spatial
distribution of the electrons has been determined
by solving the Schrödinger equation, all
the forces in the system can be calculated
using classical electrostatics.
The theorem has been proven independently
by many authors, including Paul Güttinger
(1932), Wolfgang Pauli (1933), Hans Hellmann
(1937) and Richard Feynman (1939).

The theorem states

$$\frac{\mathrm{d}E_{\lambda}}{\mathrm{d}\lambda} = \left\langle \psi_{\lambda} \middle| \frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda} \middle| \psi_{\lambda} \right\rangle, \qquad (1)$$

where
$\hat{H}_{\lambda}$ is a Hamiltonian operator depending upon a continuous parameter $\lambda$,
$|\psi_{\lambda}\rangle$ is an eigenstate (eigenfunction) of the Hamiltonian, depending implicitly upon $\lambda$,
$E_{\lambda}$ is the energy (eigenvalue) of the state $|\psi_{\lambda}\rangle$, i.e. $\hat{H}_{\lambda}|\psi_{\lambda}\rangle = E_{\lambda}|\psi_{\lambda}\rangle$.
== Proof ==
This proof of the Hellmann–Feynman theorem
requires that the wavefunction be an eigenfunction
of the Hamiltonian under consideration; however,
one can also prove more generally that the
theorem holds for non-eigenfunction wavefunctions
which are stationary (partial derivative is
zero) for all relevant variables (such as
orbital rotations). The Hartree–Fock wavefunction
is an important example of an approximate
eigenfunction that still satisfies the Hellmann–Feynman
theorem. A notable example where the Hellmann–Feynman
theorem is not applicable is finite-order
Møller–Plesset perturbation theory, which
is not variational.

The proof also employs an identity of normalized
wavefunctions: derivatives of the overlap of a
wavefunction with itself must be zero. Using
Dirac's bra–ket notation these two conditions
are written as
$$\hat{H}_{\lambda}|\psi_{\lambda}\rangle = E_{\lambda}|\psi_{\lambda}\rangle,$$

$$\langle\psi_{\lambda}|\psi_{\lambda}\rangle = 1 \Rightarrow \frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_{\lambda}|\psi_{\lambda}\rangle = 0.$$
The proof then follows through an application
of the derivative product rule to the expectation
value of the Hamiltonian viewed as a function
of λ:
$$\begin{aligned}
\frac{\mathrm{d}E_{\lambda}}{\mathrm{d}\lambda}
&= \frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_{\lambda}|\hat{H}_{\lambda}|\psi_{\lambda}\rangle \\
&= \left\langle\frac{\mathrm{d}\psi_{\lambda}}{\mathrm{d}\lambda}\middle|\hat{H}_{\lambda}\middle|\psi_{\lambda}\right\rangle
 + \left\langle\psi_{\lambda}\middle|\hat{H}_{\lambda}\middle|\frac{\mathrm{d}\psi_{\lambda}}{\mathrm{d}\lambda}\right\rangle
 + \left\langle\psi_{\lambda}\middle|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\middle|\psi_{\lambda}\right\rangle \\
&= E_{\lambda}\left\langle\frac{\mathrm{d}\psi_{\lambda}}{\mathrm{d}\lambda}\middle|\psi_{\lambda}\right\rangle
 + E_{\lambda}\left\langle\psi_{\lambda}\middle|\frac{\mathrm{d}\psi_{\lambda}}{\mathrm{d}\lambda}\right\rangle
 + \left\langle\psi_{\lambda}\middle|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\middle|\psi_{\lambda}\right\rangle \\
&= E_{\lambda}\frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_{\lambda}|\psi_{\lambda}\rangle
 + \left\langle\psi_{\lambda}\middle|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\middle|\psi_{\lambda}\right\rangle \\
&= \left\langle\psi_{\lambda}\middle|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\middle|\psi_{\lambda}\right\rangle.
\end{aligned}$$
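The result of this proof can be checked numerically on a small matrix model. The following is a minimal sketch, assuming a hypothetical 3×3 real symmetric Hamiltonian of the form H(λ) = h0 + λ v (the matrices `h0` and `v`, the value of λ and the finite-difference step are illustrative choices, not taken from the text). It compares a finite-difference estimate of the derivative of the ground-state energy with the expectation value of dH/dλ = v in the ground state.

```python
import numpy as np

def ground_state(h0, v, lam):
    """Return the lowest eigenvalue and its normalized eigenvector of H(lam) = h0 + lam * v."""
    energies, states = np.linalg.eigh(h0 + lam * v)
    return energies[0], states[:, 0]

# Hypothetical model: H(lam) = h0 + lam * v, so dH/dlam = v.
h0 = np.diag([0.0, 1.0, 2.5])
v = np.array([[0.2, 0.5, 0.0],
              [0.5, -0.1, 0.3],
              [0.0, 0.3, 0.4]])

lam, step = 0.7, 1e-5

# Left-hand side: dE/dlam estimated by a central finite difference.
e_plus, _ = ground_state(h0, v, lam + step)
e_minus, _ = ground_state(h0, v, lam - step)
dE_numeric = (e_plus - e_minus) / (2 * step)

# Right-hand side: <psi| dH/dlam |psi> evaluated at lam (psi is real here).
_, psi = ground_state(h0, v, lam)
dE_hellmann_feynman = psi @ v @ psi

print(dE_numeric, dE_hellmann_feynman)
```

The two printed numbers should agree to within the finite-difference error. Note that the right-hand side needs only the eigenvector at a single value of λ, whereas the left-hand side needs the eigenvalue problem to be solved twice.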
== Alternate proof ==
The Hellmann–Feynman theorem is actually
a direct, and to some extent trivial, consequence
of the variational principle (the Rayleigh–Ritz
variational principle) from which the Schrödinger
equation can be derived. This is why
the Hellmann–Feynman theorem holds for wavefunctions
(such as the Hartree–Fock wavefunction)
that, though not eigenfunctions of the Hamiltonian,
do derive from a variational principle. This
is also why it holds, e.g., in density functional
theory, which is not wavefunction based and
for which the standard derivation does not
apply.
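According to the Rayleigh–Ritz variational
principle, the eigenfunctions of the Schrödinger
equation are stationary points of the functional
(which we nickname the Schrödinger functional
for brevity):

$$E[\psi,\lambda] = \frac{\langle\psi|\hat{H}_{\lambda}|\psi\rangle}{\langle\psi|\psi\rangle}. \qquad (2)$$

The eigenvalues are the values that the Schrödinger
functional takes at the stationary points:

$$E_{\lambda} = E[\psi_{\lambda},\lambda], \qquad (3)$$

where $\psi_{\lambda}$ satisfies the variational condition:

$$\left.\frac{\delta E[\psi,\lambda]}{\delta\psi(x)}\right|_{\psi=\psi_{\lambda}} = 0. \qquad (4)$$

Let us differentiate Eq. (3) using the chain
rule:

$$\frac{\mathrm{d}E_{\lambda}}{\mathrm{d}\lambda} = \frac{\partial E[\psi_{\lambda},\lambda]}{\partial\lambda} + \int \frac{\delta E[\psi,\lambda]}{\delta\psi(x)}\,\frac{\mathrm{d}\psi_{\lambda}(x)}{\mathrm{d}\lambda}\,\mathrm{d}x. \qquad (5)$$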
Due to the variational condition, Eq. (4),
the second term in Eq. (5) vanishes. In one
sentence, the Hellmann–Feynman theorem states
that the derivative of the stationary values
of a function(al) with respect to a parameter
on which it may depend, can be computed from
the explicit dependence only, disregarding
the implicit one. On account of the fact that
the Schrödinger functional can only depend
explicitly on an external parameter through
the Hamiltonian, Eq. (1) trivially follows.
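The statement that only the explicit dependence matters can be illustrated with a variational wavefunction that is not an eigenfunction. The sketch below is an assumed example, not part of the derivation above: a hydrogen-like atom in atomic units, described by a single normalized Gaussian trial function exp(−α r²), with the nuclear charge `z` playing the role of λ. For this trial function the energy is 3α/2 − 2z√(2α/π), and the optimal exponent is α = 8z²/(9π). The function names are hypothetical.

```python
import math

def trial_energy(alpha, z):
    """Energy (hartree) of the normalized Gaussian trial exp(-alpha r^2) for nuclear charge z."""
    kinetic = 1.5 * alpha
    potential = -2.0 * z * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential

def optimal_alpha(z):
    """Exponent at which the derivative of trial_energy with respect to alpha vanishes."""
    return 8.0 * z * z / (9.0 * math.pi)

def variational_energy(z):
    """Stationary value of the trial energy for nuclear charge z."""
    return trial_energy(optimal_alpha(z), z)

z, step = 1.0, 1e-5

# Total derivative of the stationary value: alpha re-optimized at each z.
total = (variational_energy(z + step) - variational_energy(z - step)) / (2 * step)

# Explicit derivative only: alpha frozen at its optimal value for z.
alpha_star = optimal_alpha(z)
explicit = (trial_energy(alpha_star, z + step) - trial_energy(alpha_star, z - step)) / (2 * step)

print(total, explicit, -8.0 * z / (3.0 * math.pi))
```

All three printed values should coincide at −8z/(3π): re-optimizing α contributes nothing to first order, because the energy is stationary with respect to α.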
== Example applications ==
=== Molecular forces ===
The most common application of the Hellmann–Feynman
theorem is the calculation of intramolecular
forces. This allows for the calculation
of equilibrium geometries, the nuclear
coordinates where the forces acting upon the
nuclei, due to the electrons and other nuclei,
vanish. The parameter λ corresponds to the
coordinates of the nuclei. For a molecule
with 1 ≤ i ≤ N electrons with coordinates
$\{\mathbf{r}_i\}$, and 1 ≤ α ≤ M nuclei, each located
at a specified point $\mathbf{R}_{\alpha} = (X_{\alpha}, Y_{\alpha}, Z_{\alpha})$ and
with nuclear charge $Z_{\alpha}$, the clamped nucleus
Hamiltonian is
$$\hat{H} = \hat{T} + \hat{U} - \sum_{i=1}^{N}\sum_{\alpha=1}^{M}\frac{Z_{\alpha}}{|\mathbf{r}_i - \mathbf{R}_{\alpha}|} + \sum_{\alpha}^{M}\sum_{\beta>\alpha}^{M}\frac{Z_{\alpha}Z_{\beta}}{|\mathbf{R}_{\alpha} - \mathbf{R}_{\beta}|}.$$
The x-component of the force acting on a given
nucleus is equal to the negative of the derivative
of the total energy with respect to that coordinate.
Employing the Hellmann–Feynman theorem this
is equal to
$$F_{X_{\gamma}} = -\frac{\partial E}{\partial X_{\gamma}} = -\left\langle\psi\middle|\frac{\partial\hat{H}}{\partial X_{\gamma}}\middle|\psi\right\rangle.$$
Only two components of the Hamiltonian contribute
to the required derivative – the electron-nucleus
and nucleus-nucleus terms. Differentiating
the Hamiltonian yields
$$\begin{aligned}
\frac{\partial\hat{H}}{\partial X_{\gamma}}
&= \frac{\partial}{\partial X_{\gamma}}\left(-\sum_{i=1}^{N}\sum_{\alpha=1}^{M}\frac{Z_{\alpha}}{|\mathbf{r}_i - \mathbf{R}_{\alpha}|} + \sum_{\alpha}^{M}\sum_{\beta>\alpha}^{M}\frac{Z_{\alpha}Z_{\beta}}{|\mathbf{R}_{\alpha} - \mathbf{R}_{\beta}|}\right) \\
&= -Z_{\gamma}\sum_{i=1}^{N}\frac{x_i - X_{\gamma}}{|\mathbf{r}_i - \mathbf{R}_{\gamma}|^{3}} + Z_{\gamma}\sum_{\alpha\neq\gamma}^{M}Z_{\alpha}\frac{X_{\alpha} - X_{\gamma}}{|\mathbf{R}_{\alpha} - \mathbf{R}_{\gamma}|^{3}}.
\end{aligned}$$
Inserting this into the Hellmann–Feynman
theorem gives the x-component of the force
on the given nucleus in terms of the electronic
density ρ(r), the atomic coordinates
and the nuclear charges:
$$F_{X_{\gamma}} = Z_{\gamma}\left(\int\mathrm{d}\mathbf{r}\,\rho(\mathbf{r})\frac{x - X_{\gamma}}{|\mathbf{r} - \mathbf{R}_{\gamma}|^{3}} - \sum_{\alpha\neq\gamma}^{M}Z_{\alpha}\frac{X_{\alpha} - X_{\gamma}}{|\mathbf{R}_{\alpha} - \mathbf{R}_{\gamma}|^{3}}\right).$$
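The force expression is purely electrostatic, so it can be evaluated directly once a density is available. The sketch below is illustrative only: it assumes atomic units and a hypothetical representation of ρ(r) as a list of quadrature points, each carrying a weight equal to ρ(r) times its volume element. The function name and data layout are not from any particular quantum chemistry package.

```python
import math

def hellmann_feynman_force_x(gamma, nuclei, density_points):
    """x-component of the force on nucleus `gamma` in atomic units.

    nuclei: list of (charge, (X, Y, Z)) tuples.
    density_points: list of (weight, (x, y, z)) tuples, a hypothetical quadrature
        representation of rho(r) with weight = rho(r) * volume element.
    """
    z_gamma, r_gamma = nuclei[gamma]

    # Attraction towards the electron density: integral of rho (x - X) / |r - R|^3.
    electronic = 0.0
    for weight, point in density_points:
        dist = math.dist(point, r_gamma)
        electronic += weight * (point[0] - r_gamma[0]) / dist**3

    # Repulsion from the other nuclei: sum of Z_alpha (X_alpha - X) / |R_alpha - R|^3.
    nuclear = 0.0
    for alpha, (z_alpha, r_alpha) in enumerate(nuclei):
        if alpha != gamma:
            dist = math.dist(r_alpha, r_gamma)
            nuclear += z_alpha * (r_alpha[0] - r_gamma[0]) / dist**3

    return z_gamma * (electronic - nuclear)

# Toy check: two unit charges at x = -1 and x = +1, with one unit of
# electronic charge concentrated at the midpoint.
nuclei = [(1.0, (-1.0, 0.0, 0.0)), (1.0, (1.0, 0.0, 0.0))]
density = [(1.0, (0.0, 0.0, 0.0))]
print(hellmann_feynman_force_x(0, nuclei, density))  # 0.75: pulled towards the midpoint
print(hellmann_feynman_force_x(1, nuclei, density))  # -0.75
```

In this toy configuration the attraction of each nucleus to the charge at the midpoint (magnitude 1) outweighs the internuclear repulsion (magnitude 1/4), so both nuclei are pulled inwards.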
=== Expectation values ===
An alternative approach for applying the Hellmann–Feynman
theorem is to promote a fixed or discrete
parameter which appears in a Hamiltonian to
be a continuous variable solely for the mathematical
purpose of taking a derivative. Possible parameters
are physical constants or discrete quantum
numbers. As an example, the Hamiltonian of the radial
Schrödinger equation for a hydrogen-like atom is
$$\hat{H}_{l} = -\frac{\hbar^{2}}{2\mu r^{2}}\left(\frac{\mathrm{d}}{\mathrm{d}r}\left(r^{2}\frac{\mathrm{d}}{\mathrm{d}r}\right) - l(l+1)\right) - \frac{Ze^{2}}{r},$$
which depends upon the discrete azimuthal
quantum number l. Promoting l to be a continuous
parameter allows for the derivative of the
Hamiltonian to be taken:
$$\frac{\partial\hat{H}_{l}}{\partial l} = \frac{\hbar^{2}}{2\mu r^{2}}(2l+1).$$
The Hellmann–Feynman theorem then allows
for the determination of the expectation value
of $1/r^{2}$ for hydrogen-like atoms:
$$\begin{aligned}
\left\langle\psi_{nl}\middle|\frac{1}{r^{2}}\middle|\psi_{nl}\right\rangle
&= \frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\left\langle\psi_{nl}\middle|\frac{\partial\hat{H}_{l}}{\partial l}\middle|\psi_{nl}\right\rangle \\
&= \frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{\partial E_{n}}{\partial l} \\
&= \frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{\partial E_{n}}{\partial n}\frac{\partial n}{\partial l} \\
&= \frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{Z^{2}\mu e^{4}}{\hbar^{2}n^{3}} \\
&= \frac{Z^{2}\mu^{2}e^{4}}{\hbar^{4}n^{3}(l+1/2)}.
\end{aligned}$$

The fourth line uses $E_{n} = -Z^{2}\mu e^{4}/(2\hbar^{2}n^{2})$ together with $\partial n/\partial l = 1$, which holds because the principal quantum number is the radial quantum number plus $l + 1$, and the radial quantum number is held fixed.
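The closed form can be compared with direct integration of known radial functions. The sketch below assumes atomic units (μ = e = ħ = 1), in which the result reads Z²/(n³(l + 1/2)), and tabulates the standard hydrogen-like radial functions for the 1s, 2s and 2p states only. The function names, the integration cutoff and the grid size are illustrative choices.

```python
import math

def radial_function(n, l, z, r):
    """Hydrogen-like radial function R_nl(r) in atomic units, for a few low states only."""
    if (n, l) == (1, 0):
        return 2.0 * z**1.5 * math.exp(-z * r)
    if (n, l) == (2, 0):
        return z**1.5 * (2.0 - z * r) * math.exp(-z * r / 2.0) / (2.0 * math.sqrt(2.0))
    if (n, l) == (2, 1):
        return z**1.5 * (z * r) * math.exp(-z * r / 2.0) / (2.0 * math.sqrt(6.0))
    raise ValueError("state not tabulated in this sketch")

def inverse_r_squared(n, l, z, r_max=60.0, steps=200_000):
    """Integral of R_nl(r)^2 dr by the trapezoid rule.

    This equals the expectation value of 1/r^2, because the r^2 factor of the
    radial volume element cancels against the operator.
    """
    h = r_max / steps
    total = 0.0
    for k in range(steps + 1):
        value = radial_function(n, l, z, k * h) ** 2
        # Endpoints carry half weight in the trapezoid rule.
        total += value if k not in (0, steps) else 0.5 * value
    return total * h

def hellmann_feynman_value(n, l, z):
    """Closed form from the text with mu = e = hbar = 1."""
    return z**2 / (n**3 * (l + 0.5))

for n, l in [(1, 0), (2, 0), (2, 1)]:
    print(n, l, inverse_r_squared(n, l, 1.0), hellmann_feynman_value(n, l, 1.0))
```

For Z = 1 the closed form gives 2, 1/4 and 1/12 for the three states, and the numerically integrated values should match these to within the quadrature error.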
=== Van der Waals forces ===
At the end of his paper, Feynman states that,
"Van der Waals's forces can also be interpreted
as arising from charge distributions
with higher concentration between the nuclei.
The Schrödinger perturbation theory for two
interacting atoms at a separation R, large
compared to the radii of the atoms, leads
to the result that the charge distribution
of each is distorted from central
symmetry, a dipole moment of order $1/R^{7}$ being
induced in each atom. The negative charge
distribution of each atom has its center of
gravity moved slightly toward the other. It
is not the interaction of these dipoles which
leads to van der Waals's force, but rather
the attraction of each nucleus for the distorted
charge distribution of its own electrons that
gives the attractive $1/R^{7}$ force."
== Hellmann–Feynman theorem for time-dependent wavefunctions ==
For a general time-dependent wavefunction
satisfying the time-dependent Schrödinger
equation, the Hellmann–Feynman theorem is
not valid.
However, the following identity holds:

$$\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial H_{\lambda}}{\partial\lambda}\middle|\Psi_{\lambda}(t)\right\rangle = i\hbar\frac{\partial}{\partial t}\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\right\rangle$$

for any $\Psi_{\lambda}(t)$ satisfying the time-dependent Schrödinger equation

$$i\hbar\frac{\partial\Psi_{\lambda}(t)}{\partial t} = H_{\lambda}\Psi_{\lambda}(t).$$
=== Proof ===
The proof only relies on the Schrödinger
equation and the assumption that partial derivatives
with respect to λ and t can be interchanged.
$$\begin{aligned}
\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial H_{\lambda}}{\partial\lambda}\middle|\Psi_{\lambda}(t)\right\rangle
&= \frac{\partial}{\partial\lambda}\langle\Psi_{\lambda}(t)|H_{\lambda}|\Psi_{\lambda}(t)\rangle
 - \left\langle\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\middle|H_{\lambda}\middle|\Psi_{\lambda}(t)\right\rangle
 - \left\langle\Psi_{\lambda}(t)\middle|H_{\lambda}\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\right\rangle \\
&= i\hbar\frac{\partial}{\partial\lambda}\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial t}\right\rangle
 - i\hbar\left\langle\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial t}\right\rangle
 + i\hbar\left\langle\frac{\partial\Psi_{\lambda}(t)}{\partial t}\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\right\rangle \\
&= i\hbar\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial^{2}\Psi_{\lambda}(t)}{\partial\lambda\,\partial t}\right\rangle
 + i\hbar\left\langle\frac{\partial\Psi_{\lambda}(t)}{\partial t}\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\right\rangle \\
&= i\hbar\frac{\partial}{\partial t}\left\langle\Psi_{\lambda}(t)\middle|\frac{\partial\Psi_{\lambda}(t)}{\partial\lambda}\right\rangle
\end{aligned}$$
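The time-dependent identity can also be verified numerically. The following sketch assumes a hypothetical two-level system with H = σ_x + λ σ_z, units with ħ = 1, and an initial state that does not depend on λ. The state is propagated exactly through the eigendecomposition of H, and both the λ and t derivatives are approximated by central finite differences. All names and numerical values are illustrative.

```python
import numpy as np

SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)
SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
PSI_0 = np.array([1.0, 0.0], dtype=complex)  # lambda-independent initial state

def evolve(lam, t):
    """Psi_lam(t) = exp(-i H t) Psi_0 for H = SIGMA_X + lam * SIGMA_Z, with hbar = 1."""
    energies, vectors = np.linalg.eigh(SIGMA_X + lam * SIGMA_Z)
    phases = np.exp(-1j * energies * t)
    return vectors @ (phases * (vectors.conj().T @ PSI_0))

def overlap_with_lambda_derivative(lam, t, h=1e-4):
    """<Psi_lam(t) | d Psi_lam(t) / d lam>, with the derivative taken by central difference."""
    d_psi = (evolve(lam + h, t) - evolve(lam - h, t)) / (2 * h)
    return np.vdot(evolve(lam, t), d_psi)

lam, t, dt = 0.3, 1.7, 1e-4

# Left-hand side: <Psi| dH/dlam |Psi>, where dH/dlam = SIGMA_Z in this model.
psi = evolve(lam, t)
lhs = np.vdot(psi, SIGMA_Z @ psi)

# Right-hand side: i * d/dt <Psi | dPsi/dlam>, by central difference in t.
rhs = 1j * (overlap_with_lambda_derivative(lam, t + dt)
            - overlap_with_lambda_derivative(lam, t - dt)) / (2 * dt)

print(lhs, rhs)
```

Both sides should agree to within the finite-difference error, with a negligible imaginary part. Because the propagator is built from the full eigendecomposition, the result does not depend on the arbitrary phases of the eigenvectors returned by `np.linalg.eigh`.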
