In this chapter we discuss some classes
of endomorphisms (or square matrices) whose eigenvalues and
eigenvectors have special properties. Such properties only exist
under further assumptions, and in this chapter our assumptions
concern the relationship between the given endomorphism and its
adjoint endomorphism. Thus, we focus on Euclidean or unitary vector
spaces. This leads to the classes of normal, orthogonal, unitary
and selfadjoint endomorphisms. Each of these classes has a natural
counterpart in the set of square (real or complex) matrices.
18.1 Normal Endomorphisms
We start with the definition of a normal endomorphism or matrix.
Definition
18.1
Let $\mathcal{V}$ be a finite dimensional Euclidean or unitary vector space. An endomorphism $f$ of $\mathcal{V}$ is called normal if $f \circ f^{\mathrm{ad}} = f^{\mathrm{ad}} \circ f$, where $f^{\mathrm{ad}}$ denotes the adjoint of $f$. A matrix $A \in \mathbb{R}^{n,n}$ or $A \in \mathbb{C}^{n,n}$ is called normal if $A^T A = A A^T$ or $A^H A = A A^H$, respectively.
For all $z \in \mathbb{C}$ we have $\bar{z} z = z \bar{z}$. The property of normality can therefore be interpreted as a generalization of this property of complex numbers.
We will first study the properties of
normal endomorphisms on a finite dimensional unitary vector space .
Recall the following results:
Using these results we obtain the
following characterization of normal endomorphisms on a unitary
vector space.
Theorem
18.2
If is a finite dimensional unitary vector
space, then is normal
if and only if there exists an orthonormal basis B of such that is a diagonal matrix, i.e., f is unitarily diagonalizable.
Proof
Let be normal
and let B be an orthonormal
basis of
such that
is an upper triangular matrix. Then , and from we obtain
We now show by induction on $n$ that R is diagonal. This is obvious for $n = 1$.
Let the assertion hold for an
, and
let be upper triangular with
. We
write R as
where is upper triangular,
, and . Then
From we obtain
,
hence and
. By the induction hypothesis,
is diagonal, and therefore
is diagonal as well.
Conversely, suppose that there exists an orthonormal basis B of
such that is
diagonal. Then is diagonal and, since
diagonal matrices commute, we have
which implies , and hence f is normal.
The application of this theorem to the
unitary vector space with the standard scalar
product and a matrix viewed as an element of
yields the
following “matrix version”.
Corollary
18.3
A matrix is normal if and only if there
exists an orthonormal basis of consisting of eigenvectors of
A, i.e., A is unitarily diagonalizable.
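As a numerical illustration of this corollary (a minimal MATLAB sketch; the circulant test matrix A is our own choice and not an example from the text), normality can be checked directly from the definition, and the complex Schur decomposition then reveals the orthonormal eigenvectors:

% a normal (circulant), non-symmetric test matrix -- our own choice
A = [2 1 0; 0 2 1; 1 0 2];
norm(A'*A - A*A')              % = 0, so A is normal
[Q, T] = schur(A, 'complex');  % complex Schur decomposition: A = Q*T*Q'
norm(T - diag(diag(T)))        % ~ 0: the triangular factor is in fact diagonal
norm(Q'*Q - eye(3))            % ~ 0: the columns of Q are orthonormal eigenvectors of A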
The following theorem presents another
characterization of normal endomorphisms on a unitary vector
space.
Theorem
18.4
If is a finite dimensional unitary vector
space, then is normal if
and only if there exists a polynomial $p$ with $p(f) = f^{\mathrm{ad}}$.
Proof
If $f^{\mathrm{ad}} = p(f)$ for a polynomial $p$, then $f \circ f^{\mathrm{ad}} = f \circ p(f) = p(f) \circ f = f^{\mathrm{ad}} \circ f$, and hence f is normal.
Conversely, if f is normal, then there exists an
orthonormal basis B of
,
such that .
Furthermore,
Let be a polynomial with for .
Such a polynomial can be explicitly constructed using the Lagrange
basis of (cp.
Exercise 10.12). Then
and hence also .
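For a concrete normal matrix with pairwise distinct eigenvalues, such an interpolating polynomial can be computed numerically; the following MATLAB sketch (with a test matrix of our own choosing) verifies that $p(A) = A^H$:

% a normal, non-Hermitian test matrix with three distinct eigenvalues -- our own choice
A = [2 1 0; 0 2 1; 1 0 2];
lambda = eig(A);                        % eigenvalues of A
V = [lambda.^2, lambda, ones(3,1)];     % Vandermonde matrix of the interpolation problem
p = (V \ conj(lambda)).';               % coefficients of p with p(lambda_j) = conj(lambda_j)
norm(polyvalm(p, A) - A')               % ~ 0: p(A) equals the adjoint A^H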
Several other characterizations of
normal endomorphisms on a finite dimensional unitary vector space
and of normal matrices can be found in the
article [HorJ12] (see also Exercise 18.8).
We now consider the Euclidean case,
where we focus on real square matrices. All the results can be
formulated analogously for normal endomorphisms on a finite
dimensional Euclidean vector space.
Let $A \in \mathbb{R}^{n,n}$ be normal, i.e., $A^T A = A A^T$. Then $A$ also satisfies $A^H A = A A^H$, and when $A$ is considered as an element of $\mathbb{C}^{n,n}$, it is unitarily diagonalizable, i.e., $A = S D S^H$ holds for a unitary matrix $S$ and a diagonal matrix $D$. Despite the fact that A has real entries, neither S nor D will be real in general, since A as an element of $\mathbb{R}^{n,n}$ may not be diagonalizable. For
instance,
is a normal matrix that is not diagonalizable (over ).
Considered as an element of , it has the eigenvalues
and
and it is unitarily diagonalizable.
To discuss the case of real normal
matrices in more detail, we first prove a “real version” of Schur’s
theorem.
Theorem
18.5
For every matrix there exists an orthogonal
matrix with
where for every
either or
In the second case has, considered as a complex matrix, a pair of
complex conjugate eigenvalues of the form with and . The matrix
R is called a real Schur form of A.
Proof
We proceed via induction on
n. For we have
and .
Suppose that the assertion holds for
some and
let $A$ be given. We consider $A$ as a complex matrix. Then $A$ has an eigenvalue $\lambda = \alpha + \mathrm{i}\beta$ with $\alpha, \beta \in \mathbb{R}$, corresponding to an eigenvector $v = x + \mathrm{i}y$ with real vectors $x, y$, and we have $Av = \lambda v$. Dividing this equation into its real and imaginary parts, we obtain the two real equations
$$A x = \alpha x - \beta y \qquad \text{and} \qquad A y = \beta x + \alpha y. \tag{18.1}$$
We have two cases:
Case 1: $\beta = 0$.
Then the two equations in (18.1) are $Ax = \alpha x$ and $Ay = \alpha y$. Thus at least one of the real vectors x or y is an eigenvector corresponding to the real eigenvalue $\alpha$ of A.
Without loss of generality we assume that this is the vector
x and that . We extend x by the vectors to an orthonormal basis of
with respect to the standard scalar
product. The matrix
then is orthogonal and satisfies
for a matrix . By the induction hypothesis
there exists an orthogonal matrix such that has the desired form. The matrix
is orthogonal and satisfies
where R has the desired
form.
Case 2: $\beta \neq 0$.
We first show that x, y are linearly independent. If
, then
using
in the first equation in (18.1) implies that also . This is
not possible, since the eigenvector must be nonzero. Thus, , and
using
in the second equation in (18.1) implies that also . If
are linearly
dependent, then there exists a with . The
two equations in (18.1) then can be written as
which implies that . Since for all , this implies ,
which contradicts the assumption that . Consequently, x, y are linearly independent.
We can combine the two equations in
(18.1) to the
system
where . Applying the Gram-Schmidt
method with respect to the standard scalar product of to the matrix yields
with
and . It then follows that
The real matrix
has, considered as an element of , the pair of complex conjugate
eigenvalues with .
In particular, the (2, 1)-entry of is
nonzero, since otherwise would have two real eigenvalues.
We again extend by
vectors to an orthonormal basis of
with respect to the standard scalar
product. (For the list
is empty.) Then
is orthogonal and we have
for a matrix . Analogously to the first
case, an application of the induction hypothesis to this matrix
yields the desired matrices R and U.
Theorem 18.5 implies the
following result for real normal matrices.
Corollary
18.6
A matrix is normal if and only if there
exists an orthogonal matrix with
where, for every
either or
In the second case the matrix has, considered as a complex matrix, a pair of
complex conjugate eigenvalues of the form .
Proof
Exercise.
Example
18.7
The matrix
has, considered as a complex matrix, the eigenvalues . It can therefore neither be diagonalized nor triangulated over $\mathbb{R}$.
For the orthogonal matrix
the transformed matrix
is in real Schur form.
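In MATLAB a real Schur form can be computed with the built-in command schur; the following sketch (the test matrix is our own choice, not the matrix of Example 18.7) shows the typical structure:

% a real test matrix with one real and one pair of complex conjugate eigenvalues -- our own choice
A = [3 -2 1; 4 -1 0; 0 0 5];
[U, R] = schur(A);        % for real input, schur returns a real Schur form by default
norm(U'*U - eye(3))       % ~ 0: U is orthogonal
norm(U*R*U' - A)          % ~ 0: A = U*R*U'
% R is block upper triangular; its 2x2 diagonal block corresponds to a pair
% of complex conjugate eigenvalues of A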
18.2 Orthogonal and Unitary Endomorphisms
In this section we extend the concept
of orthogonal and unitary matrices to endomorphisms.
Definition
18.8
Let $\mathcal{V}$ be a finite dimensional Euclidean or unitary vector space. An endomorphism $f$ of $\mathcal{V}$ is called orthogonal or unitary, respectively, if $f^{\mathrm{ad}} \circ f = \mathrm{id}_{\mathcal{V}}$.
If $f^{\mathrm{ad}} \circ f = \mathrm{id}_{\mathcal{V}}$, then $f^{\mathrm{ad}} \circ f$ is bijective and hence f is injective (cp. Exercise 2.7). Corollary 10.11 implies that f is bijective. Hence $f^{\mathrm{ad}}$ is the unique inverse of f, and we also have $f \circ f^{\mathrm{ad}} = \mathrm{id}_{\mathcal{V}}$ (cp. our remarks following Definition 2.21).
Note that an orthogonal or unitary
endomorphism f is normal,
and therefore all results from the previous section also apply to
f.
Lemma
18.9
Let be a finite dimensional Euclidean or
unitary vector space and let be
orthogonal or unitary, respectively. If B is an orthonormal basis of
,
then is
an orthogonal or unitary matrix, respectively.
Proof
Let . For every orthonormal basis
B of
we have
and thus is
orthogonal or unitary, respectively. (In the Euclidean case
.)
In the following lemma we show that
an orthogonal or unitary endomorphism is characterized by the fact
that it does not change the scalar product of arbitrary
vectors.
Lemma
18.10
Let be a finite dimensional Euclidean or
unitary vector space with the scalar product . Then is
orthogonal or unitary, respectively, if and only if for
all .
Proof
If f is orthogonal or unitary and if
, then
On the other hand, suppose that for all
. Then
Since the scalar product is non-degenerate and v can be chosen arbitrarily, we have
for all
, and hence .
We have the following corollary (cp.
Lemma 12.13).
Corollary
18.11
If is a finite dimensional Euclidean or
unitary vector space with the scalar product , is
orthogonal or unitary, respectively, and
is the norm induced by the scalar product, then for all .
For the vector space with the standard scalar
product and induced norm as well as a unitary
matrix , we have for all . Thus,
(cp. (6) in Example 12.4). This holds analogously for
orthogonal matrices .
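A quick MATLAB check of this invariance (the unitary matrix Q below is our own construction, obtained from orth applied to a random complex matrix):

Q = orth(randn(4) + 1i*randn(4));   % a unitary 4x4 matrix (random construction)
x = randn(4,1) + 1i*randn(4,1);     % an arbitrary vector
norm(Q*x) - norm(x)                 % ~ 0: the Euclidean norm is preserved
norm(Q)                             % = 1: the spectral norm of a unitary matrix is 1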
We now study the eigenvalues and
eigenvectors of orthogonal and unitary endomorphisms.
Lemma
18.12
Let be a finite dimensional Euclidean or
unitary vector space and let be
orthogonal or unitary, respectively. If $\lambda$ is an eigenvalue of f, then $|\lambda| = 1$.
Proof
Let be the scalar product
on .
If with , then
and implies that .
The statement of
Lemma 18.12 holds, in particular, for unitary and
orthogonal matrices. However, one should keep in mind that an
orthogonal matrix (or an orthogonal endomorphism) may not have an
eigenvalue. For example, the orthogonal matrix
has the characteristic polynomial , which has no real roots. If considered as
an element of , the matrix A has the eigenvalues
and .
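The modulus-1 property of Lemma 18.12 is easy to check numerically for a plane rotation (the angle below is an arbitrary choice of ours):

theta = 0.7;                                           % an arbitrary angle
A = [cos(theta) -sin(theta); sin(theta) cos(theta)];   % orthogonal rotation matrix
abs(eig(A))    % both values equal 1: the eigenvalues lie on the unit circle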
Theorem
18.13
- (1)
If is unitary, then there exists a unitary matrix with
- (2)
If is orthogonal, then there exists an orthogonal matrix with
Proof
- (1)
- (2)
An orthogonal matrix A is normal and hence by Corollary 18.6 there exists an orthogonal matrix with , where either or
We now study two important classes of
orthogonal matrices.
Example
18.14
Let with and let . We define
The matrix is equal
to the identity matrix except for its entries
For we have
the matrix
which satisfies
One easily sees that each of the matrices is orthogonal. The
multiplication of a vector with the matrix results in a (counterclockwise)
rotation of v by the angle
in
the (i, j)-coordinate plane. In Numerical
Mathematics, the matrices are called Givens rotations. This is illustrated in the figure
below for the vector and the matrices
and , which represent rotations by
90 and 120 degrees, respectively.
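A minimal MATLAB sketch of such a rotation (the indices, the angle and the test vector are our own choices; the sign convention is the one for a counterclockwise rotation in the (i, j)-coordinate plane):

n = 4; i = 1; j = 3; theta = pi/2;   % our own choices
G = eye(n);                          % start from the identity matrix
G([i j], [i j]) = [cos(theta) -sin(theta); sin(theta) cos(theta)];
norm(G'*G - eye(n))                  % ~ 0: G is orthogonal
v = [1; 2; 3; 4];
w = G*v                              % entries i and j of v are rotated by theta;
                                     % all other entries remain unchanged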
Example
18.15
For we define the
Householder matrix
and for we set
.
Then H(u) is an orthogonal matrix for every u (cp.
Exercise 12.17). The multiplication of a
vector with the matrix H(u) describes a reflection of
v at the hyperplane
i.e., the hyperplane of vectors that are orthogonal to u with respect to the standard scalar
product. This is illustrated in the figure below for the vector
and the
Householder matrix
which corresponds to .
(18.2)
MATLAB-Minute.
Let . Apply the command norm(u) to compute the Euclidean norm of u and form the Householder matrix H=eye(3)-(2/(u'*u))*(u*u'). Check the orthogonality of H via the computation of norm(H'*H-eye(3)). Form the vector v=H*u and compare the Euclidean norms of u and v.
18.3 Selfadjoint Endomorphisms
We have already studied selfadjoint
endomorphisms f on a finite
dimensional Euclidean or unitary vector space. The defining
property for this class of endomorphisms is
(cp. Definition 13.13).
Obviously, selfadjoint endomorphisms
are normal and hence the results of Sect. 18.1 hold. We now
strengthen some of these results.
Lemma
18.16
For a finite dimensional Euclidean or
unitary vector space and , the
following statements are equivalent:
(In the Euclidean case .)
- (1)
f is selfadjoint.
- (2)
For every orthonormal basis B of we have .
- (3)
There exists an orthonormal basis B of with .
Proof
We have the following strong result on
the diagonalizability of selfadjoint endomorphisms in both the
Euclidean and the unitary case.
Theorem
18.17
If is a finite dimensional Euclidean or
unitary vector space and is
selfadjoint, then there exists an orthonormal basis B of such that is a real diagonal matrix.
Proof
Consider first the unitary case. If
f is selfadjoint, then
f is normal and hence
unitarily diagonalizable (cp. Theorem 18.2). Let B be an orthonormal basis of
so that is
a diagonal matrix. Then implies
that the diagonal entries of , which are the eigenvalues of f, are real.
Let be an n-dimensional Euclidean vector space.
If is an orthonormal
basis of ,
then is symmetric and in
particular normal. By Corollary 18.6, there exists an
orthogonal matrix with
where for
either or
Since is symmetric, a
block with
cannot occur. Thus, is a real
diagonal matrix.
We define the basis of by
Then, by construction, and
hence .
Therefore, . If
is the scalar product
on ,
then , . With we get
Hence B is an orthonormal
basis of .
This theorem has the following “matrix
version”.
Corollary
18.18
- (1)
If is symmetric, then there exist an orthogonal matrix and a diagonal matrix with .
- (2)
If is Hermitian, then there exist a unitary matrix and a diagonal matrix with .
The statement (1) in this corollary is
known as the principal axes
transformation. We will briefly discuss the background of
this name from the theory of bilinear forms and their applications
in geometry. A symmetric matrix defines a symmetric
bilinear form on via
The map
is called the quadratic
form associated with this symmetric bilinear form.
Since A is symmetric, there exists an
orthogonal matrix such that is a
real diagonal matrix. If , then . The set forms an orthonormal basis of
with respect to the standard scalar
product, and , hence
. For the
change of bases from of to we obtain
(cp. Theorem 11.14). Thus, the real diagonal
matrix D represents the
bilinear form
defined by A with respect
to the basis .
The quadratic form
associated with is
also transformed to a simpler form by this change of bases, since
analogously
Thus, the quadratic form is turned into a “sum of squares”, defined by the
quadratic form .
The principal axes transformation is given
by the change of bases from the canonical basis of to the basis given by the pairwise
orthonormal eigenvectors of A in . The n pairwise orthogonal subspaces
, , form the n principal axes. The geometric
interpretation of this term is illustrated in the following
example.
Example
18.19
For the symmetric matrix
we have
with the orthogonal matrix and
(The numbers here are rounded to the fourth significant digit.)
With the associated quadratic form , we define the set
As described above, the principal axes transformation consists in
the transformation from the canonical coordinate system to a
coordinate system given by an orthonormal basis of eigenvectors of
A. If we carry out this
transformation and replace by the quadratic form , we get
the set
This set forms the ellipse centered at the origin of the two
dimensional Cartesian coordinate system (spanned by the canonical
basis vectors )
with axes of lengths and , which is illustrated on the left part of
the following figure:
The elements are
given by for
.
The orthogonal matrix
is a Givens rotation that rotates the ellipse
counterclockwise by the angle (approximately 22.5 degrees).
Hence is just
a “rotated version” of . The right part of the figure above shows the
ellipse in the
Cartesian coordinate system. The dashed lines indicate the
respective spans of the vectors and , which are the eigenvectors of A and the principal axes of the ellipse
.
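The computations of this example can be reproduced in MATLAB for any symmetric matrix; the following sketch uses a small symmetric matrix of our own choosing and verifies that the quadratic form becomes a sum of squares in the eigenvector coordinates:

A = [5 2; 2 3];          % a symmetric test matrix -- our own choice
[Q, D] = eig(A);         % for symmetric input, eig returns an orthogonal Q with A = Q*D*Q'
norm(Q'*Q - eye(2))      % ~ 0: the eigenvectors are orthonormal
x = [1; -2];             % an arbitrary vector
y = Q'*x;                % coordinates of x with respect to the eigenvector basis
x'*A*x - (D(1,1)*y(1)^2 + D(2,2)*y(2)^2)   % ~ 0: the quadratic form is a sum of squares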
Let be symmetric. For a given vector
and a scalar ,
is a quadratic function in n variables (the entries of the vector
x). The set of zeros of
this function, i.e., the set , is called a
hypersurface of degree 2 or
a quadric. In
Example 18.19 we have already seen quadrics in the
case and with
. We next
give some further examples.
Example
18.20
- (1)
Let , , and . The corresponding quadric
- (2)
Let , , and . The corresponding quadric
- (3)
Let , , and . The corresponding quadric
Corollary 18.18 motivates the
following definition.
Definition
18.21
If is symmetric or is Hermitian with
positive, negative
and zero
eigenvalues (counted with their corresponding multiplicities), then
the triple is called the inertia of A.
Let us first consider, for simplicity,
only the case of real symmetric matrices.
Lemma
18.22
If symmetric has the inertia
, then A and are
congruent.
Proof
Let be symmetric and let
with an orthogonal matrix and .
If A has the inertia
, then we can assume without loss of
generality that
where the diagonal matrices and contain the positive and negative
eigenvalues of A,
respectively, and . We have , where
Here
and thus
This result will be used in the proof
of Sylvester’s law of
inertia.3
Theorem
18.23
The inertia of a symmetric matrix
is invariant under congruence,
i.e., for every matrix the matrices A and have the same inertia.
Proof
The assertion is trivial for
. Let
have
the inertia , then not both and
can be
equal to zero. We assume without loss of generality that
.
(If , then
the following argument can be applied for .)
By Lemma 18.22 there exist
and with
. Let be arbitrary and set
. Then B is symmetric and has an inertia
.
Therefore, for
and a matrix . If we show that and , then also .
We have
and implies that ,
hence .
We set
Let
and .
Since , we
have . If , then
for some that are
not all zero. This implies
If, on the other hand, , then an analogous argument shows
that .
Hence , and the
dimension formula for subspaces (cp. Theorem 9.29) yields
and thus . If we repeat the same
construction by interchanging the roles of and
, then . Thus, and the proof is
complete.
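Sylvester's law of inertia can be illustrated numerically; in the following MATLAB sketch the diagonal matrix A, the invertible matrix G and the tolerance are our own choices:

A = diag([3, 1, -2, 0]);                     % symmetric with inertia (2, 1, 1)
G = [1 2 0 1; 0 1 3 0; 2 0 1 1; 1 1 0 2];    % an invertible matrix
B = G'*A*G;                                  % congruent to A
B = (B + B')/2;                              % symmetrize to guard against rounding errors
count = @(M) [sum(eig(M) > 1e-10), sum(eig(M) < -1e-10), sum(abs(eig(M)) <= 1e-10)];
count(A)    % [2 1 1]
count(B)    % [2 1 1]: the inertia is unchanged under congruence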
Theorem
18.24
Let be Hermitian with the inertia
. Then there exists a matrix
with
Moreover, for every matrix the matrices A and have the same inertia.
Proof
Exercise.
Finally, we discuss a special class
of symmetric and Hermitian matrices.
Definition
18.25
A real symmetric or complex Hermitian
matrix A is called
- (1)
positive semidefinite, if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$ resp. $x^H A x \ge 0$ for all $x \in \mathbb{C}^n$,
- (2)
positive definite, if $x^T A x > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$ resp. $x^H A x > 0$ for all $x \in \mathbb{C}^n \setminus \{0\}$.
If in (1) or (2) the reverse
inequality holds, then the corresponding matrices are called
negative semidefinite or
negative definite,
respectively.
For selfadjoint endomorphisms we
define analogously: If is a finite dimensional Euclidean or
unitary vector space with the scalar product and if is
selfadjoint, then f is
called positive
semidefinite or positive
definite, if for all resp. for all .
The following theorem characterizes
symmetric positive definite matrices; see Exercise 18.19 and
Exercise 18.20 for the transfer of the results to positive
semidefinite matrices and positive definite endomorphisms, respectively.
Theorem
18.26
If is symmetric, then the following
statements are equivalent:
- (1)
A is positive definite.
- (2)
All eigenvalues of A are real and positive.
- (3)
There exists a lower triangular matrix with .
Proof
- (1)
The symmetric matrix A is diagonalizable with real eigenvalues (cp. (1) in Corollary 18.18). If is an eigenvalue with associated eigenvector v, i.e., , then and implies that .
- (2)
Let be a diagonalization A with an orthogonal matrix (cp. (1) in Corollary 18.18) and , . Let be arbitrary and let . Then and , so that
- (3)
If with , then for every we have
- (1)
Let be a diagonalization of A with an orthogonal matrix (cp. (1) in Corollary 18.18). Since A is positive definite, we know from (2) that , . We set
One easily sees that an analogous
result holds for complex Hermitian matrices . In this case in assertion (3)
the lower triangular matrix is with .
The factorization in
(3) is called a Cholesky
factorization of A. It is a special case of the
LU-decomposition in
Theorem 5.4. In fact,
Theorem 18.26 shows that an LU-decomposition of a (real) symmetric
positive definite matrix can be computed without row
permutations.
In order to compute the Cholesky
factorization of the symmetric positive definite matrix
, we consider the
equation
For the first row of A we
obtain
Analogously, for the rows of A we obtain
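A minimal MATLAB sketch of this row-by-row computation (the symmetric positive definite test matrix is our own choice; the result is compared with MATLAB's built-in chol, which returns an upper triangular factor R with A = R'*R):

A = [4 2 2; 2 5 3; 2 3 6];     % symmetric positive definite -- our own choice
n = size(A, 1);
L = zeros(n);
for i = 1:n
    for j = 1:i-1
        L(i,j) = (A(i,j) - L(i,1:j-1)*L(j,1:j-1)') / L(j,j);   % entries left of the diagonal
    end
    L(i,i) = sqrt(A(i,i) - L(i,1:i-1)*L(i,1:i-1)');            % diagonal entry of row i
end
norm(L*L' - A)        % ~ 0: L is a Cholesky factor of A
norm(L - chol(A)')    % ~ 0: it agrees with MATLAB's built-in factorization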
The symmetric or Hermitian positive definite matrices are closely
related to the positive definite bilinear forms on Euclidean or
unitary vector spaces.
Theorem
18.27
If is a finite dimensional Euclidean or
unitary vector space and if is a symmetric or Hermitian bilinear form on
,
respectively, then the following statements are equivalent:
- (1)
is positive definite, i.e., for all .
- (2)
For every basis B of the matrix representation is (symmetric or Hermitian) positive definite.
- (3)
There exists a basis B of such that the matrix representation is (symmetric or Hermitian) positive definite.
Proof
Exercise.
Exercises
- 18.1
Let be normal. Show that for every , for every , and p(A) for every are normal.
- 18.2
Let be normal. Are and A B then normal as well?
- 18.3
Let be normal but not symmetric. Show that then
- 18.4
- 18.5
Show that real skew-symmetric matrices (i.e., matrices with $A^T = -A$) and complex skew-Hermitian matrices (i.e., matrices with $A^H = -A$) are normal.
- 18.6
Let be a finite dimensional unitary vector space and let be normal. Show the following assertions:
- (a)
If , then f is selfadjoint.
- (b)
If , then .
- (c)
If f is nilpotent, then .
- 18.7
Let be a finite dimensional real or complex vector space and let be diagonalizable. Show that there exists a scalar product on such that f is normal with respect to this scalar product.
- 18.8
Let . Show the following assertions:
- (a)
A is normal if and only if there exists a normal matrix B with n distinct eigenvalues that commutes with A.
- (b)
A is normal if and only if is normal for every .
- (c)
Let be the Hermitian and the skew-Hermitian part of A. Show that , and . Show, furthermore, that A is normal if and only if H(A) and S(A) commute.
- 18.9
Show that if is normal and if with is defined on the spectrum of A, then . (The map f(z) is called a Möbius transformation. Such transformations play an important role in Function Theory and in many other areas of Mathematics.)
- 18.10
Let be a finite dimensional Euclidean or unitary vector space and let be orthogonal or unitary, respectively. Show that $f^{-1}$ exists and is again orthogonal or unitary, respectively.
- 18.11
Let and let the Householder matrix H(u) be defined as in (18.2). Show the following assertions:
- (a)
For the matrices H(u) and are orthogonally similar, i.e., there exists an orthogonal matrix with
- (b)
Every orthogonal matrix can be written as a product of n Householder matrices, i.e., there exist with .
- 18.12
Let satisfy . Show that there exists an orthogonal matrix with .
- 18.13
- 18.14
Determine for the symmetric matrix
- 18.15
Let and let be a basis of . Prove or disprove: A matrix is positive definite if and only if for all .
- 18.16
Use Definition 18.25 to test whether the symmetric matrices
- 18.17
- 18.18
Show that is Hermitian positive definite if and only if defines a scalar product on .
- 18.19
Prove the following version of Theorem 18.26 for positive semidefinite matrices. If is symmetric, then the following statements are equivalent:
- (1)
A is positive semidefinite.
- (2)
All eigenvalues of A are real and nonnegative.
- (3)
There exists an upper triangular matrix with .
- 18.20
Let be a finite dimensional Euclidean or unitary vector space and let be selfadjoint. Show that f is positive definite if and only if all eigenvalues of f are real and positive.
- 18.21
Let . A matrix with is called a square root of A (cp. Sect. 17.1).
- (a)
Show that a symmetric positive definite matrix has a symmetric positive definite square root.
- (b)
Show that the matrix
- (c)
Show that the matrix , , does not have a square root.
- 18.22
- 18.23
Let be Hermitian and let B be furthermore positive definite. Show that the polynomial has exactly n real roots.
- 18.24
Prove Theorem 18.27.
Footnotes
3
James Joseph Sylvester (1814–1897)
proved this result for quadratic forms in 1852. He also coined the
name law of inertia which
according to him is “expressing the fact of the existence of an
invariable number inseparably attached to such [bilinear]
forms”.