In this chapter we study vector spaces
over the fields $\mathbb{R}$ and $\mathbb{C}$.
Using the definition of bilinear and sesquilinear forms, we
introduce scalar products on such vector spaces. Scalar products
allow the extension of well-known concepts from elementary
geometry, such as length and angles, to abstract real and complex
vector spaces. This, in particular, leads to the idea of
orthogonality and to orthonormal bases of vector spaces. As an
example for the importance of these concepts in many applications
we study least-squares approximations.
12.1 Scalar Products and Norms
We start with the definition of a
scalar product and of Euclidean and unitary vector spaces.
Definition
12.1
Let $\mathcal{V}$ be a $K$-vector space, where either $K = \mathbb{R}$
or $K = \mathbb{C}$.
A map
$$\langle \cdot, \cdot \rangle : \mathcal{V} \times \mathcal{V} \to K$$
is called a scalar product
on $\mathcal{V}$,
when the following properties hold:
- (1)
If $K = \mathbb{R}$, then $\langle \cdot, \cdot \rangle$ is a symmetric bilinear form. If $K = \mathbb{C}$, then $\langle \cdot, \cdot \rangle$ is a Hermitian sesquilinear form.
- (2)
$\langle \cdot, \cdot \rangle$ is positive definite, i.e., $\langle v, v \rangle \ge 0$ holds for all $v \in \mathcal{V}$, with equality if and only if $v = 0$.
An $\mathbb{R}$-vector space with a scalar product is
called a Euclidean vector
space, and a
$\mathbb{C}$-vector space with a scalar product is
called a unitary vector
space.
Scalar products are sometimes called
inner products. Note that
$\langle v, v \rangle$ is nonnegative and real also when $\mathcal{V}$ is
a $\mathbb{C}$-vector space. It is easy to see that a
subspace $\mathcal{U}$ of
a Euclidean or unitary vector space $\mathcal{V}$ is again a Euclidean or unitary vector
space, respectively, when the scalar product on the space $\mathcal{V}$
is
restricted to the subspace $\mathcal{U}$.
Example
12.2
- (1)
A scalar product on $\mathbb{R}^{n,1}$ is given by $\langle v, w \rangle := w^T v$ (the standard scalar product of $\mathbb{R}^{n,1}$).
- (2)
A scalar product on $\mathbb{C}^{n,1}$ is given by $\langle v, w \rangle := w^H v$ (the standard scalar product of $\mathbb{C}^{n,1}$).
- (3)
For both $K = \mathbb{R}$ and $K = \mathbb{C}$, a scalar product on $K^{m,n}$ is given by $\langle A, B \rangle := \operatorname{trace}(B^H A)$, where $B^H = B^T$ in the real case.
- (4)
A scalar product on the vector space $C([a,b])$ of the continuous and real valued functions on the real interval $[a,b]$ is given by $\langle f, g \rangle := \int_a^b f(x)\, g(x) \, dx$.
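For the finite dimensional cases (1)–(3) these scalar products are easy to evaluate numerically. The following is a minimal MATLAB sketch with illustrative data, assuming the conventions $\langle v,w\rangle = w^Tv$, $\langle v,w\rangle = w^Hv$ and $\langle A,B\rangle = \operatorname{trace}(B^HA)$ stated above:
```matlab
% Standard scalar product on R^{n,1}: <v,w> = w'*v
v = [1; 2; 3]; w = [4; 5; 6];
sp_real = w' * v;            % 4 + 10 + 18 = 32

% Standard scalar product on C^{n,1}: <v,w> = w'*v,
% where ' is the conjugate transpose in MATLAB
x = [1+1i; 2]; y = [1i; 1];
sp_cplx = y' * x;            % = 3 - 1i

% Scalar product on matrices: <A,B> = trace(B'*A)
A = [1 2; 3 4]; B = eye(2);
sp_mat = trace(B' * A);      % = trace(A) = 5
```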
We will now show how to use the
Euclidean or unitary structure of a vector space in order to
introduce geometric concepts such as the length of a vector or the
angle between vectors.
As a motivation of a general concept
of length we have the absolute
value of real numbers, i.e., the map
$$|\cdot| : \mathbb{R} \to \mathbb{R}, \quad x \mapsto |x|.$$
This map has the following properties:
- (1)
$|\lambda x| = |\lambda| \, |x|$ for all $\lambda, x \in \mathbb{R}$.
- (2)
$|x| \ge 0$ for all $x \in \mathbb{R}$, with equality if and only if $x = 0$.
- (3)
$|x + y| \le |x| + |y|$ for all $x, y \in \mathbb{R}$.
These properties are generalized to
real or complex vector spaces as follows.
Definition
12.3
Let $\mathcal{V}$ be a $K$-vector space, where either $K = \mathbb{R}$
or $K = \mathbb{C}$.
A map
$$\| \cdot \| : \mathcal{V} \to \mathbb{R}$$
is called a norm on
$\mathcal{V}$,
when for all $v, w \in \mathcal{V}$ and $\lambda \in K$ the following properties hold:
- (1)
$\|\lambda v\| = |\lambda| \, \|v\|$.
- (2)
$\|v\| \ge 0$, with equality if and only if $v = 0$.
- (3)
$\|v + w\| \le \|v\| + \|w\|$ (triangle inequality).
A K-vector space on which a norm is
defined is called a normed
space.
Example
12.4
- (1)
If $\langle \cdot, \cdot \rangle$ is the standard scalar product on $\mathbb{R}^{n,1}$, then $\|v\| := \langle v, v \rangle^{1/2} = (v^T v)^{1/2}$ is a norm on $\mathbb{R}^{n,1}$, called the Euclidean norm.
- (2)
If $\langle \cdot, \cdot \rangle$ is the standard scalar product on $\mathbb{C}^{n,1}$, then $\|v\| := \langle v, v \rangle^{1/2} = (v^H v)^{1/2}$ is a norm on $\mathbb{C}^{n,1}$, again called the Euclidean norm.
- (3)
For both $K = \mathbb{R}$ and $K = \mathbb{C}$, $\|A\| := (\operatorname{trace}(A^H A))^{1/2}$ is a norm on $K^{m,n}$, called the Frobenius norm.
- (4)
If $C([a,b])$ is the vector space of the continuous and real valued functions on the real interval $[a,b]$, then $\|f\| := \left( \int_a^b f(x)^2 \, dx \right)^{1/2}$ is a norm on $C([a,b])$.
- (5)
Let $K = \mathbb{R}$ or $K = \mathbb{C}$, and let $p \in \mathbb{R}$, $p \ge 1$, be given. Then for $v = [v_1, \dots, v_n]^T \in K^{n,1}$ the p-norm of $v$ is defined by
$$\|v\|_p := \left( \sum_{i=1}^{n} |v_i|^p \right)^{1/p}. \tag{12.1}$$
- (6)
For $K = \mathbb{R}$ or $K = \mathbb{C}$ and an invertible matrix $A \in K^{n,n}$, a further norm on $K^{n,1}$ is given by $v \mapsto \|Av\|_p$ (cp. Exercise 12.7).
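The norms of this example are available in MATLAB through the built-in function norm; a minimal sketch with illustrative data:
```matlab
% Euclidean norm, p-norms and the Frobenius norm
v = [3; -4];
n2   = norm(v);          % Euclidean norm: sqrt(3^2 + 4^2) = 5
n1   = norm(v, 1);       % 1-norm: |3| + |-4| = 7
ninf = norm(v, inf);     % limit case p -> inf: max(|3|, |-4|) = 4
A = [1 2; 3 4];
nF = norm(A, 'fro');     % Frobenius norm: sqrt(trace(A'*A)) = sqrt(30)
```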
The norms in the above examples
(1)–(4) have the form $\|v\| = \langle v, v \rangle^{1/2}$, where
$\langle \cdot, \cdot \rangle$ is a given scalar
product. We will show now that the map $v \mapsto \langle v, v \rangle^{1/2}$ always defines a
norm. Our proof is based on the following theorem.
Theorem
12.5
If $\mathcal{V}$ is a Euclidean or unitary vector space
with the scalar product $\langle \cdot, \cdot \rangle$, then
$$|\langle v, w \rangle| \;\le\; \langle v, v \rangle^{1/2} \, \langle w, w \rangle^{1/2} \quad \text{for all } v, w \in \mathcal{V}, \tag{12.2}$$
with equality if and only if $v, w$ are linearly dependent.
Proof
If $v, w$ are linearly dependent, then
$w = \lambda v$ for a scalar $\lambda \in K$ (or $v = \lambda w$, which is treated analogously),
and hence
$$|\langle v, w \rangle| = |\langle v, \lambda v \rangle| = |\lambda| \, \langle v, v \rangle = \langle v, v \rangle^{1/2} \, \langle \lambda v, \lambda v \rangle^{1/2} = \langle v, v \rangle^{1/2} \, \langle w, w \rangle^{1/2}.$$
On the other hand, let $v, w \in \mathcal{V}$ be given.
If $w = 0$, then both sides of (12.2) are zero and
$v, w$ are linearly dependent. If
$w \ne 0$, then
we define $\lambda := \langle v, w \rangle / \langle w, w \rangle$
and get
$$0 \;\le\; \langle v - \lambda w, \, v - \lambda w \rangle = \langle v, v \rangle - \frac{|\langle v, w \rangle|^2}{\langle w, w \rangle},$$
which is (12.2). If equality holds in (12.2), then $\langle v - \lambda w, v - \lambda w \rangle = 0$. Since the scalar product is positive definite, we have $v - \lambda w = 0$, and thus $v, w$ are linearly dependent.
The inequality (12.2) is called the
Cauchy-Schwarz
inequality.
It is an important tool in Analysis, in particular in the
estimation of approximation and interpolation errors.
Corollary
12.6
If $\mathcal{V}$ is a Euclidean or unitary vector space
with the scalar product $\langle \cdot, \cdot \rangle$, then the map
$$\| \cdot \| : \mathcal{V} \to \mathbb{R}, \quad v \mapsto \langle v, v \rangle^{1/2},$$
is a norm on $\mathcal{V}$
that is called the norm induced by
the scalar product.
Proof
We have to prove the three defining
properties of the norm. Since $\langle \cdot, \cdot \rangle$ is positive definite,
we have $\|v\| = \langle v, v \rangle^{1/2} \ge 0$, with equality if and only if
$v = 0$. If $v \in \mathcal{V}$
and $\lambda \in K$ (where in the Euclidean case $K = \mathbb{R}$
and in the unitary case $K = \mathbb{C}$), then
$$\|\lambda v\|^2 = \langle \lambda v, \lambda v \rangle = |\lambda|^2 \, \langle v, v \rangle,$$
and hence $\|\lambda v\| = |\lambda| \, \|v\|$. In
order to show the triangle inequality, we use the Cauchy-Schwarz
inequality and the fact that $\operatorname{Re}(z) \le |z|$ for every complex number
$z$. For all $v, w \in \mathcal{V}$ we have
$$\|v + w\|^2 = \|v\|^2 + 2 \operatorname{Re} \langle v, w \rangle + \|w\|^2 \le \|v\|^2 + 2 |\langle v, w \rangle| + \|w\|^2 \le \|v\|^2 + 2 \|v\| \, \|w\| + \|w\|^2 = (\|v\| + \|w\|)^2,$$
and thus $\|v + w\| \le \|v\| + \|w\|$.
12.2 Orthogonality
We will now use the scalar product to
introduce angles between vectors. As motivation we consider the
Euclidean vector space $\mathbb{R}^{2,1}$ with the standard scalar product
$\langle v, w \rangle = w^T v$ and the induced Euclidean norm $\|v\| = \langle v, v \rangle^{1/2}$. The
Cauchy-Schwarz inequality shows that
$$-1 \;\le\; \frac{\langle v, w \rangle}{\|v\| \, \|w\|} \;\le\; 1 \quad \text{for all } v, w \in \mathbb{R}^{2,1} \setminus \{0\}.$$
If $v, w \ne 0$, then the
angle between $v$ and
$w$ is the uniquely
determined real number $\alpha \in [0, \pi]$ with
$$\cos \alpha = \frac{\langle v, w \rangle}{\|v\| \, \|w\|}.$$
The vectors $v, w$ are orthogonal if $\alpha = \pi/2$, so that $\cos \alpha = 0$. Thus, $v, w$ are orthogonal if and only if
$\langle v, w \rangle = 0$.
An elementary calculation now leads to
the cosine theorem for
triangles:
$$\|v - w\|^2 = \|v\|^2 + \|w\|^2 - 2 \, \langle v, w \rangle = \|v\|^2 + \|w\|^2 - 2 \, \|v\| \, \|w\| \cos \alpha.$$
If $v, w$ are orthogonal, i.e., $\langle v, w \rangle = 0$, then the cosine theorem
implies the Pythagorean
theorem:
$$\|v - w\|^2 = \|v\|^2 + \|w\|^2.$$
The following figures illustrate the cosine theorem and the
Pythagorean theorem for vectors in $\mathbb{R}^{2,1}$:
In the following definition we
generalize the ideas of angles and orthogonality.
Definition
12.7
Let $\mathcal{V}$ be a Euclidean or unitary vector space
with the scalar product $\langle \cdot, \cdot \rangle$ and the induced norm $\|\cdot\|$.
- (1)
In the Euclidean case, the angle between two vectors $v, w \in \mathcal{V} \setminus \{0\}$ is the uniquely determined real number $\alpha \in [0, \pi]$ with
$$\cos \alpha = \frac{\langle v, w \rangle}{\|v\| \, \|w\|}.$$
- (2)
Two vectors $v, w \in \mathcal{V}$ are called orthogonal, if $\langle v, w \rangle = 0$.
- (3)
A basis $\{v_1, \dots, v_n\}$ of $\mathcal{V}$ is called an orthogonal basis, if $\langle v_i, v_j \rangle = 0$ for all $i \ne j$, and an orthonormal basis, if in addition $\|v_i\| = 1$ for $i = 1, \dots, n$.
Note that the terms in (1)–(3) are
defined with respect to the given scalar
product. Different scalar
products yield different angles between vectors. In particular, the
orthogonality of two given vectors may be lost when we consider a
different scalar product.
Example
12.8
The standard basis vectors $e_1, e_2$
are orthogonal and $\{e_1, e_2\}$
is an orthonormal basis of $\mathbb{R}^{2,1}$ with respect to the standard scalar
product (cp. (1) in Example 12.2). Consider now a symmetric and invertible
matrix $A \in \mathbb{R}^{2,2}$,
which defines a symmetric and non-degenerate bilinear form on $\mathbb{R}^{2,1}$
by
$$\langle v, w \rangle_A := w^T A v$$
(cp. (1) in Example 11.10). If this bilinear form is
positive definite, i.e., $v^T A v > 0$ for all $v \in \mathbb{R}^{2,1} \setminus \{0\}$,
then it is a scalar product on $\mathbb{R}^{2,1}$, which we denote by $\langle \cdot, \cdot \rangle_A$. We denote the
induced norm by $\|\cdot\|_A$.
With respect to such a scalar product $\langle \cdot, \cdot \rangle_A$
the standard basis vectors in general
satisfy $\langle e_1, e_2 \rangle_A \ne 0$ and $\|e_i\|_A \ne 1$.
Clearly, $\{e_1, e_2\}$
then is not an orthonormal basis of $\mathbb{R}^{2,1}$ with respect to $\langle \cdot, \cdot \rangle_A$.
On the other hand, one can find vectors $u_1$
and $u_2$ that satisfy $\langle u_1, u_2 \rangle_A = 0$ and $\|u_1\|_A = \|u_2\|_A = 1$,
so that $\{u_1, u_2\}$ is an orthonormal basis
of $\mathbb{R}^{2,1}$ with respect to the scalar product $\langle \cdot, \cdot \rangle_A$.
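The phenomenon of this example is easy to reproduce numerically. The following MATLAB sketch uses the illustrative symmetric positive definite matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, an assumption made for illustration and not necessarily the matrix of Example 12.8:
```matlab
% Scalar product <v,w>_A = w'*A*v for a symmetric positive definite A
A = [2 1; 1 2];              % illustrative choice (assumption)
spA = @(v, w) w' * A * v;
e1 = [1; 0]; e2 = [0; 1];
spA(e1, e2)                  % = 1 ~= 0: e1, e2 not orthogonal w.r.t. <.,.>_A
sqrt(spA(e1, e1))            % induced norm ||e1||_A = sqrt(2) ~= 1
```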
We now show that every finite
dimensional Euclidean or unitary vector space has an orthonormal
basis.
Theorem
12.9
Let $\mathcal{V}$ be a Euclidean or unitary vector space
with the basis $\{v_1, \dots, v_n\}$. Then there exists an orthonormal
basis $\{u_1, \dots, u_n\}$ of $\mathcal{V}$ with
$$\operatorname{span}\{u_1, \dots, u_k\} = \operatorname{span}\{v_1, \dots, v_k\} \quad \text{for } k = 1, \dots, n.$$
Proof
We give the proof by induction on
$n = \dim \mathcal{V}$. If $n = 1$, then we set $u_1 := \|v_1\|^{-1} v_1$. Then $\|u_1\| = 1$, and $\{u_1\}$ is an orthonormal basis of
$\mathcal{V}$ with $\operatorname{span}\{u_1\} = \operatorname{span}\{v_1\}$.
Let the assertion hold for an
$n \ge 1$. Let $\dim \mathcal{V} = n + 1$
and let $\{v_1, \dots, v_{n+1}\}$ be a basis of $\mathcal{V}$.
Then $\mathcal{V}_n := \operatorname{span}\{v_1, \dots, v_n\}$ is an
$n$-dimensional subspace of
$\mathcal{V}$.
By the induction hypothesis there exists an orthonormal basis
$\{u_1, \dots, u_n\}$ of $\mathcal{V}_n$ with $\operatorname{span}\{u_1, \dots, u_k\} = \operatorname{span}\{v_1, \dots, v_k\}$
for $k = 1, \dots, n$. We define
$$w := v_{n+1} - \sum_{i=1}^{n} \langle v_{n+1}, u_i \rangle \, u_i.$$
Since $v_{n+1} \notin \mathcal{V}_n$,
we must have $w \ne 0$, and Lemma 9.16 yields
$$\operatorname{span}\{u_1, \dots, u_n, w\} = \operatorname{span}\{u_1, \dots, u_n, v_{n+1}\} = \operatorname{span}\{v_1, \dots, v_{n+1}\} = \mathcal{V}.$$
For $j = 1, \dots, n$ we have
$$\langle w, u_j \rangle = \langle v_{n+1}, u_j \rangle - \sum_{i=1}^{n} \langle v_{n+1}, u_i \rangle \, \langle u_i, u_j \rangle = \langle v_{n+1}, u_j \rangle - \langle v_{n+1}, u_j \rangle = 0.$$
Finally, setting $u_{n+1} := \|w\|^{-1} w$ gives an orthonormal basis $\{u_1, \dots, u_{n+1}\}$ of $\mathcal{V}$ with $\operatorname{span}\{u_1, \dots, u_k\} = \operatorname{span}\{v_1, \dots, v_k\}$ for $k = 1, \dots, n+1$,
which completes the proof.
The proof of Theorem 12.9 shows how a given
basis $\{v_1, \dots, v_n\}$ can be orthonormalized, i.e., transformed into
an orthonormal basis $\{u_1, \dots, u_n\}$ with
$$\operatorname{span}\{u_1, \dots, u_k\} = \operatorname{span}\{v_1, \dots, v_k\}, \quad k = 1, \dots, n.$$
The resulting algorithm is called the Gram-Schmidt method:
Algorithm
12.10
Given a basis $\{v_1, \dots, v_n\}$ of $\mathcal{V}$.
- (1)
Set $u_1 := \|v_1\|^{-1} \, v_1$.
- (2)
For $j = 2, \dots, n$ set
$$w_j := v_j - \sum_{i=1}^{j-1} \langle v_j, u_i \rangle \, u_i, \qquad u_j := \|w_j\|^{-1} \, w_j.$$
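A direct MATLAB transcription of Algorithm 12.10 for a basis given as the columns of a matrix V, with respect to the standard scalar product, might look as follows (a minimal sketch of the classical variant):
```matlab
% gram_schmidt.m -- a minimal sketch of Algorithm 12.10 (classical
% Gram-Schmidt) for the columns of V, assumed linearly independent,
% with respect to the standard scalar product <v,w> = w'*v.
function U = gram_schmidt(V)
    [n, m] = size(V);
    U = zeros(n, m);
    U(:,1) = V(:,1) / norm(V(:,1));               % step (1)
    for j = 2:m                                   % step (2)
        w = V(:,j);
        for i = 1:j-1
            w = w - (U(:,i)' * V(:,j)) * U(:,i);  % subtract <v_j,u_i> u_i
        end
        U(:,j) = w / norm(w);                     % normalize
    end
end
```
Then U = gram_schmidt(V) satisfies U'*U ≈ I. In floating point arithmetic the modified Gram-Schmidt variant, which projects against the already orthonormalized part of w instead of the original column, is numerically more reliable.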
A slight reordering and combination of
steps in the Gram-Schmidt method yields
$$[v_1, v_2, \dots, v_n] = [u_1, u_2, \dots, u_n] \begin{bmatrix} \langle v_1, u_1 \rangle & \langle v_2, u_1 \rangle & \cdots & \langle v_n, u_1 \rangle \\ & \langle v_2, u_2 \rangle & \cdots & \langle v_n, u_2 \rangle \\ & & \ddots & \vdots \\ & & & \langle v_n, u_n \rangle \end{bmatrix}.$$
The upper triangular matrix on the right hand side is the
coordinate transformation matrix between the bases $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_n\}$ of $\mathcal{V}$ (cp. Theorem 9.25 or 10.2). Thus, we have shown the
following result.
Theorem
12.11
If $\mathcal{V}$ is a finite dimensional Euclidean or
unitary vector space with a given basis $\{v_1, \dots, v_n\}$, then
the Gram-Schmidt method applied to this basis yields an orthonormal basis $\{u_1, \dots, u_n\}$ of
$\mathcal{V}$,
such that $[v_1, \dots, v_n] = [u_1, \dots, u_n] \, R$, where $R$ is an invertible
upper triangular matrix.
Consider an $m$-dimensional subspace $\mathcal{U}$ of $\mathbb{R}^{n,1}$ or $\mathbb{C}^{n,1}$ with the standard scalar
product, and write the
$m$ vectors of an orthonormal
basis of $\mathcal{U}$ as columns of a matrix,
$Q = [q_1, \dots, q_m]$. Then we obtain in the real case
$$Q^T Q = \left[\, q_i^T q_j \,\right]_{i,j=1}^{m} = I_m,$$
and analogously in the complex case $Q^H Q = I_m$.
If, on the other hand, $Q^T Q = I_m$ or $Q^H Q = I_m$ for a matrix $Q \in \mathbb{R}^{n,m}$ or $Q \in \mathbb{C}^{n,m}$, respectively, then the
$m$ columns of $Q$ form an orthonormal basis (with
respect to the standard scalar product) of an $m$-dimensional subspace of $\mathbb{R}^{n,1}$ or $\mathbb{C}^{n,1}$, respectively. A “matrix version” of
Theorem 12.11 can therefore be formulated as
follows.
Corollary
12.12
Let $K = \mathbb{R}$ or $K = \mathbb{C}$ and let $a_1, \dots, a_m \in K^{n,1}$ be linearly independent.
Then there exists a matrix $Q \in K^{n,m}$ with its $m$ columns being orthonormal with
respect to the standard scalar product of $K^{n,1}$,
i.e., $Q^T Q = I_m$ for $K = \mathbb{R}$
or $Q^H Q = I_m$ for $K = \mathbb{C}$, and an invertible upper triangular matrix
$R \in K^{m,m}$, such that
$$[a_1, \dots, a_m] = QR. \tag{12.3}$$
The factorization (12.3) is called a
QR-decomposition of the
matrix $A = [a_1, \dots, a_m]$. The QR-decomposition has many applications
in Numerical Mathematics (cp. Example 12.16 below).
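In MATLAB, a factorization of the form (12.3) can be computed with the built-in command qr; the second argument 0 requests the "economy-size" decomposition with $Q \in K^{n,m}$ and $R \in K^{m,m}$. A small sketch with an illustrative matrix:
```matlab
% Economy-size QR-decomposition A = Q*R as in (12.3)
A = [1 1; 1 2; 1 3];         % illustrative 3 x 2 matrix, independent columns
[Q, R] = qr(A, 0);           % Q is 3 x 2, R is 2 x 2 upper triangular
norm(Q' * Q - eye(2))        % ~ 0: the columns of Q are orthonormal
norm(A - Q * R)              % ~ 0: the factorization reproduces A
```
Note that MATLAB does not normalize the signs of the diagonal entries of R, so Q and R may differ from the Gram-Schmidt factors by signs of entire columns and rows.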
Lemma
12.13
Let $K = \mathbb{R}$ or $K = \mathbb{C}$ and let $Q \in K^{n,m}$ be a matrix with orthonormal columns
with respect to the standard scalar product of $K^{n,1}$.
Then $\|Qv\| = \|v\|$ holds for all
$v \in K^{m,1}$. (Here $\|\cdot\|$ denotes the Euclidean norm of $K^{n,1}$
and
of $K^{m,1}$, respectively.)
Proof
For $K = \mathbb{C}$ and $v \in \mathbb{C}^{m,1}$ we have
$$\|Qv\|^2 = (Qv)^H (Qv) = v^H Q^H Q v = v^H v = \|v\|^2,$$
and the proof for $K = \mathbb{R}$ is analogous.
We now introduce two important classes
of matrices.
Definition
12.14
- (1)
A matrix $Q \in \mathbb{R}^{n,n}$ whose columns form an orthonormal basis with respect to the standard scalar product of $\mathbb{R}^{n,1}$ is called orthogonal.
- (2)
A matrix $Q \in \mathbb{C}^{n,n}$ whose columns form an orthonormal basis with respect to the standard scalar product of $\mathbb{C}^{n,1}$ is called unitary.
A matrix $Q \in \mathbb{R}^{n,n}$ is therefore
orthogonal if and only if
$$Q^T Q = I_n.$$
In particular, an orthogonal matrix $Q$ is invertible with $Q^{-1} = Q^T$
(cp. Corollary 7.20). The equation $Q Q^T = I_n$
means that the $n$ rows of
$Q$ form an orthonormal basis
of $\mathbb{R}^{1,n}$ (with respect to the standard scalar product
of $\mathbb{R}^{1,n}$).
Analogously, a unitary matrix $Q \in \mathbb{C}^{n,n}$
is invertible with $Q^{-1} = Q^H$
and $Q^H Q = Q Q^H = I_n$. The $n$ columns of $Q$ form an orthonormal basis of
$\mathbb{C}^{n,1}$.
Lemma
12.15
The sets of orthogonal and of unitary
matrices form subgroups of $GL_n(\mathbb{R})$ and $GL_n(\mathbb{C})$, respectively.
Proof
We consider only the orthogonal matrices; the proof for the unitary matrices is analogous.
Since every orthogonal matrix is
invertible, we have that the set of the orthogonal matrices is a subset of $GL_n(\mathbb{R})$. The identity
matrix $I_n$ is
orthogonal, and hence this subset is nonempty. If $Q$ is orthogonal, then also $Q^{-1} = Q^T$, since $(Q^T)^T Q^T = Q Q^T = Q Q^{-1} = I_n$. Finally, if $Q_1, Q_2$ are orthogonal, then
$$(Q_1 Q_2)^T (Q_1 Q_2) = Q_2^T Q_1^T Q_1 Q_2 = Q_2^T Q_2 = I_n,$$
and thus $Q_1 Q_2$ is orthogonal.
Example
12.16
In many applications, measurements or
samples lead to a data set that is represented by tuples
$(t_i, b_i)$, $i = 1, \dots, n$. Here $t_1, \dots, t_n$ are the pairwise
distinct measurement points and $b_1, \dots, b_n$ are the corresponding
measurements. In order to approximate the given data set by a
simple model, one can try to construct a polynomial $p$ of small degree so that the values
$p(t_i)$ are as close as possible
to the measurements $b_i$.
The simplest case is a real polynomial
of degree (at most) 1. Geometrically, this corresponds to the
construction of a straight line in $\mathbb{R}^2$ that has a minimal distance to the given
points, as shown in the figure below (cp. Sect. 1.4). There are many possibilities
to measure the distance. In the following we will describe one of
them in more detail and use the Gram-Schmidt method, or the
QR-decomposition, for the
construction of the straight line. In Statistics this method is
called linear
regression.
A real polynomial of degree (at most)
1 has the form $p(t) = \alpha_0 + \alpha_1 t$, and we are looking for coefficients
$\alpha_0, \alpha_1$ with
$$p(t_i) = \alpha_0 + \alpha_1 t_i \approx b_i, \quad i = 1, \dots, n.$$
Using matrices we can write this problem as
$$\begin{bmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_n \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix} \approx \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}, \quad \text{or} \quad A x \approx b.$$
As mentioned above, there are different possibilities for
interpreting the symbol “$\approx$”. In particular, there are different norms
in which we can measure the distance between the given values $b_i$
and the polynomial values
$p(t_i)$. Here we will use the
Euclidean norm and consider the minimization
problem
$$\min_{x \in \mathbb{R}^{2,1}} \|b - Ax\|_2.$$
The two columns of $A$ are linearly independent,
since the entries of the second column are pairwise distinct, while all entries of
the first column are
equal. Let $A = QR$
be a QR-decomposition. We
extend the columns $q_1, q_2$ of $Q$ to an orthonormal basis
$\{q_1, q_2, \dots, q_n\}$ of $\mathbb{R}^{n,1}$. Then $\tilde{Q} = [q_1, \dots, q_n]$ is an
orthogonal matrix and
$$\|b - Ax\|_2^2 = \|\tilde{Q}^T (b - Ax)\|_2^2 = \left\| \begin{bmatrix} Q^T b \\ \tilde{b} \end{bmatrix} - \begin{bmatrix} R x \\ 0 \end{bmatrix} \right\|_2^2 = \|Q^T b - R x\|_2^2 + \|\tilde{b}\|_2^2, \quad \tilde{b} := [q_3, \dots, q_n]^T b.$$
Here we have used that $\|\tilde{Q}^T y\|_2 = \|y\|_2$ for all $y \in \mathbb{R}^{n,1}$ (cp. Lemma 12.13) and $q_i^T A = 0$ for all $i = 3, \dots, n$. The upper triangular matrix
$R$ is invertible and thus
the minimization problem is solved by
$$\hat{x} = \begin{bmatrix} \hat{\alpha}_0 \\ \hat{\alpha}_1 \end{bmatrix} = R^{-1} Q^T b.$$
Using the definition of the Euclidean norm, we can write the
minimizing property of the polynomial $\hat{p}(t) = \hat{\alpha}_0 + \hat{\alpha}_1 t$
as
$$\sum_{i=1}^{n} (b_i - \hat{p}(t_i))^2 = \min_{\alpha_0, \alpha_1 \in \mathbb{R}} \sum_{i=1}^{n} \big(b_i - (\alpha_0 + \alpha_1 t_i)\big)^2.$$
Since the polynomial $\hat{p}$ minimizes the sum of squares of the
distances between the measurements $b_i$ and the polynomial values $\hat{p}(t_i)$, this polynomial yields a
least squares approximation
of the measurement values.
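A minimal MATLAB sketch of this least squares computation (the data vectors t and b below are illustrative placeholders, not the profit data discussed next):
```matlab
% Least squares straight line via QR, following Example 12.16
t = [1; 2; 3; 4];             % measurement points (illustrative)
b = [1.1; 1.9; 3.2; 3.9];     % measurements (illustrative)
A = [ones(size(t)), t];       % the columns are linearly independent
[Q, R] = qr(A, 0);            % economy-size QR-decomposition, A = Q*R
x = R \ (Q' * b);             % solves R*x = Q'*b, i.e. x = R^{-1}*Q'*b
p = @(s) x(1) + x(2) .* s;    % the least squares polynomial p(t)
```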
Consider the example from
Sect. 1.4. In the four quarters of a year,
a company has certain profits, measured in million Euros. Under the
assumption that the profit grows linearly, i.e., like a straight
line, the goal is to estimate the profit in the last quarter of the
following year. The given quarterly data leads to an approximation problem
$Ax \approx b$ of the above form.
The numerical computation of a QR-decomposition of $A$
and of $\hat{x} = R^{-1} Q^T b$ yields the least squares straight line,
and the resulting profit estimate for the last quarter of the
following year is 11.7 million
Euros.
MATLAB-Minute.
In Example 12.16 one could imagine
that the profit grows quadratically instead of linearly. Determine,
analogously to the procedure in Example 12.16, a polynomial
$p(t) = \alpha_0 + \alpha_1 t + \alpha_2 t^2$
that solves the corresponding least squares problem.
Use the MATLAB command qr for computing a
QR-decomposition, and
determine the estimated profit in the last quarter of the following
year.
We will now analyze the properties of
orthonormal bases in more detail.
Lemma
12.17
If $\mathcal{V}$ is a Euclidean or unitary vector space
with the scalar product $\langle \cdot, \cdot \rangle$ and the orthonormal
basis $\{u_1, \dots, u_n\}$, then
$$v = \sum_{i=1}^{n} \langle v, u_i \rangle \, u_i \quad \text{for all } v \in \mathcal{V}.$$
Proof
For every $v \in \mathcal{V}$ there exist uniquely determined
coordinates $\lambda_1, \dots, \lambda_n \in K$ with $v = \sum_{i=1}^{n} \lambda_i u_i$. For every
$j \in \{1, \dots, n\}$ we then have
$$\langle v, u_j \rangle = \sum_{i=1}^{n} \lambda_i \, \langle u_i, u_j \rangle = \lambda_j.$$
The coordinates $\langle v, u_i \rangle$, $i = 1, \dots, n$, of $v$ with respect to an orthonormal basis
are often called the Fourier coefficients of $v$ with respect to this basis. The
representation $v = \sum_{i=1}^{n} \langle v, u_i \rangle \, u_i$ is called the
(abstract) Fourier
expansion of $v$ in
the given orthonormal basis.
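For $\mathcal{V} = \mathbb{R}^{n,1}$ with the standard scalar product, the Fourier coefficients with respect to an orthonormal basis stored in the columns of a matrix are obtained by a single matrix-vector product; a minimal MATLAB sketch with illustrative data:
```matlab
% Fourier expansion v = sum_i <v,q_i> q_i w.r.t. an orthonormal basis
Q = [1 1; 1 -1] / sqrt(2);   % columns form an orthonormal basis of R^2
v = [3; 4];
c = Q' * v;                  % Fourier coefficients c_i = <v, q_i>
norm(v - Q * c)              % ~ 0: the expansion recovers v (Lemma 12.17)
norm(c) - norm(v)            % ~ 0: Bessel's identity (Corollary 12.18 below)
```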
Corollary
12.18
Let $\mathcal{V}$ be a Euclidean or unitary vector space with the scalar product $\langle \cdot, \cdot \rangle$ and the orthonormal basis $\{u_1, \dots, u_n\}$. Then for all $v, w \in \mathcal{V}$ the following identities hold:
- (1)
$\langle v, w \rangle = \sum_{i=1}^{n} \langle v, u_i \rangle \, \overline{\langle w, u_i \rangle}$ (Parseval's identity).
- (2)
$\langle v, v \rangle = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2$ (Bessel's identity).
Proof
- (1)
We have $v = \sum_{i=1}^{n} \langle v, u_i \rangle \, u_i$ and $w = \sum_{j=1}^{n} \langle w, u_j \rangle \, u_j$, and thus
$$\langle v, w \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} \langle v, u_i \rangle \, \overline{\langle w, u_j \rangle} \, \langle u_i, u_j \rangle = \sum_{i=1}^{n} \langle v, u_i \rangle \, \overline{\langle w, u_i \rangle}.$$
- (2)
is a special case of (1) for $w = v$.
By Bessel’s identity, every vector
$v \in \mathcal{V}$ satisfies
$$|\langle v, u_j \rangle| \le \|v\|, \quad j = 1, \dots, n,$$
where $\|\cdot\| = \langle \cdot, \cdot \rangle^{1/2}$ is the norm induced by the scalar
product. The absolute value of each coordinate of $v$ with respect to an orthonormal basis
of $\mathcal{V}$
is therefore bounded by the norm of $v$. This property does not hold for a
general basis of $\mathcal{V}$.
Example
12.19
Consider $\mathbb{R}^{2,1}$ with the standard scalar
product and the Euclidean norm, and consider a basis $\{v_1, v_2\}$
of $\mathbb{R}^{2,1}$ whose two vectors are almost linearly dependent, their distance
from linear dependence being measured by a real parameter $\varepsilon \ne 0$.
For every vector $v \in \mathbb{R}^{2,1}$ we then have the unique representation $v = \lambda_1 v_1 + \lambda_2 v_2$.
If the entries of $v$ are moderate numbers and if
$\varepsilon$ is (very) small, then $\lambda_1$ and $\lambda_2$ are (very) large. In numerical
algorithms such a situation can lead to significant problems (e.g.
due to roundoff errors) that are avoided when orthonormal bases are
used.
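A hedged numerical illustration: the basis below, $\{[1, 0]^T, [1, \varepsilon]^T\}$, is an assumed example in the spirit of Example 12.19, not necessarily the basis used there.
```matlab
% Coordinates w.r.t. an almost linearly dependent basis blow up
epsilon = 1e-10;
B = [1 1; 0 epsilon];     % basis vectors [1;0] and [1;epsilon] as columns
v = [1; 1];               % a vector with moderate entries
c = B \ v                 % coordinates: c = [1 - 1e10; 1e10], huge
```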
Definition
12.20
Let $\mathcal{V}$ be a Euclidean or unitary vector space
with the scalar product $\langle \cdot, \cdot \rangle$, and let $\mathcal{U} \subseteq \mathcal{V}$ be a subspace. Then
$$\mathcal{U}^\perp := \{\, v \in \mathcal{V} \,:\, \langle v, u \rangle = 0 \ \text{for all } u \in \mathcal{U} \,\}$$
is called the orthogonal
complement of $\mathcal{U}$ (in $\mathcal{V}$).
Lemma
12.21
The orthogonal complement $\mathcal{U}^\perp$
is a subspace of $\mathcal{V}$.
Proof
Exercise.
Lemma
12.22
If $\mathcal{V}$ is an $n$-dimensional Euclidean or unitary
vector space, and if $\mathcal{U} \subseteq \mathcal{V}$ is an $m$-dimensional subspace, then
$\mathcal{V} = \mathcal{U} \oplus \mathcal{U}^\perp$ and $\dim(\mathcal{U}^\perp) = n - m$.
Proof
We know that $m \le n$ (cp.
Lemma 9.27). If $m = n$, then
$\mathcal{U} = \mathcal{V}$, and thus $\mathcal{U}^\perp = \{0\}$,
so that the assertion is trivial.
Thus let $m < n$ and let $\{u_1, \dots, u_m\}$ be an orthonormal basis of
$\mathcal{U}$.
We extend this basis to a basis of $\mathcal{V}$ and apply the Gram-Schmidt method in order
to obtain an orthonormal basis $\{u_1, \dots, u_n\}$ of $\mathcal{V}$.
Then $u_{m+1}, \dots, u_n \in \mathcal{U}^\perp$,
and therefore $\operatorname{span}\{u_{m+1}, \dots, u_n\} \subseteq \mathcal{U}^\perp$, which gives $\mathcal{V} = \mathcal{U} + \mathcal{U}^\perp$. If
$v \in \mathcal{U} \cap \mathcal{U}^\perp$, then
$\langle v, v \rangle = 0$, and hence $v = 0$, since
the scalar product is positive definite. Thus, $\mathcal{U} \cap \mathcal{U}^\perp = \{0\}$, which
implies that $\mathcal{V} = \mathcal{U} \oplus \mathcal{U}^\perp$ and
$\dim(\mathcal{U}^\perp) = n - m$ (cp.
Theorem 9.29). In particular, we have
$\mathcal{U}^\perp = \operatorname{span}\{u_{m+1}, \dots, u_n\}$.
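For a subspace $\mathcal{U} \subseteq \mathbb{R}^{n,1}$ spanned by the columns of a matrix $A$, the orthogonal complement is the kernel of $A^T$, so an orthonormal basis of $\mathcal{U}^\perp$ can be computed in MATLAB with the built-in function null (illustrative sketch):
```matlab
% Orthonormal basis of the orthogonal complement U^perp of U = range(A)
A = [1 0; 0 1; 0 0];      % U = span of the columns of A (illustrative)
N = null(A');             % columns of N: orthonormal basis of U^perp
A' * N                    % ~ 0: each column of N is orthogonal to U
```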
12.3 The Vector Product in $\mathbb{R}^{3,1}$
In this section we consider a further
product on the vector space $\mathbb{R}^{3,1}$ that is frequently used in Physics
and Electrical Engineering.
Definition
12.23
The vector product or cross product in $\mathbb{R}^{3,1}$ is the map
$$\times : \mathbb{R}^{3,1} \times \mathbb{R}^{3,1} \to \mathbb{R}^{3,1}, \quad (v, w) \mapsto v \times w := \begin{bmatrix} v_2 w_3 - v_3 w_2 \\ v_3 w_1 - v_1 w_3 \\ v_1 w_2 - v_2 w_1 \end{bmatrix},$$
where $v = [v_1, v_2, v_3]^T$ and $w = [w_1, w_2, w_3]^T$.
In contrast to the scalar product,
the vector product of two elements of the vector space $\mathbb{R}^{3,1}$ is not a scalar but again a vector in
$\mathbb{R}^{3,1}$. Using the canonical basis vectors $e_1, e_2, e_3$ of
$\mathbb{R}^{3,1}$,
we can write the vector product as
$$v \times w = (v_2 w_3 - v_3 w_2) \, e_1 + (v_3 w_1 - v_1 w_3) \, e_2 + (v_1 w_2 - v_2 w_1) \, e_3.$$
Lemma
12.24
The vector product is linear in both
components, and for all $v, w \in \mathbb{R}^{3,1}$ the following properties
hold:
- (1)
$v \times w = -(w \times v)$, i.e., the vector product is anti-commutative or alternating.
- (2)
$\|v \times w\|^2 = \|v\|^2 \, \|w\|^2 - \langle v, w \rangle^2$, where $\langle \cdot, \cdot \rangle$ is the standard scalar product and $\|\cdot\|$ the Euclidean norm of $\mathbb{R}^{3,1}$.
- (3)
$\langle v \times w, v \rangle = \langle v \times w, w \rangle = 0$, where $\langle \cdot, \cdot \rangle$ is the standard scalar product of $\mathbb{R}^{3,1}$.
Proof
Exercise.
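The assertions of Lemma 12.24 are easy to check numerically with MATLAB's built-in functions cross and dot (illustrative vectors):
```matlab
% Numerical check of Lemma 12.24
v = [1; 2; 3]; w = [4; 5; 6];
u = cross(v, w);                                   % vector product v x w
norm(u + cross(w, v))                              % (1) ~ 0: anti-commutativity
norm(u)^2 - (norm(v)^2 * norm(w)^2 - dot(v, w)^2)  % (2) ~ 0
[dot(u, v), dot(u, w)]                             % (3) ~ [0 0]
```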
By (2) and the Cauchy-Schwarz
inequality (12.2), it follows that $v \times w = 0$
holds if and only if $v, w$ are linearly dependent. From (3) we
obtain that $v \times w$ is orthogonal to $v$ and to $w$
for arbitrary $v, w \in \mathbb{R}^{3,1}$. If $v, w$ are linearly independent, then the
product $v \times w$ is
orthogonal to the plane through the origin spanned by $v$ and $w$ in $\mathbb{R}^{3,1}$, i.e.,
$$v \times w \;\perp\; \operatorname{span}\{v, w\}.$$
Geometrically, there are two possibilities:
The positions of the three vectors $v$, $w$ and $v \times w$
on the left side of this figure
correspond to the “right-handed orientation” of the usual
coordinate system of $\mathbb{R}^{3,1}$, where the canonical basis vectors $e_1, e_2, e_3$
are associated with thumb, index finger and middle finger of the
right hand. This motivates the name right-hand rule. In order to explain
this in detail, one needs to introduce the concept of orientation, which we omit here.
If $\alpha$ is the angle between the vectors
$v, w$, then $\cos \alpha = \frac{\langle v, w \rangle}{\|v\| \, \|w\|}$
(cp. Definition 12.7) and we can write (2) in
Lemma 12.24 as
$$\|v \times w\|^2 = \|v\|^2 \, \|w\|^2 \, (1 - \cos^2 \alpha) = \|v\|^2 \, \|w\|^2 \, \sin^2 \alpha,$$
so that
$$\|v \times w\| = \|v\| \, \|w\| \, \sin \alpha,$$
since $\sin \alpha \ge 0$ for $\alpha \in [0, \pi]$.
A geometric interpretation of this equation is the following:
The norm of the vector product of
$v$ and $w$ is equal to the area of the parallelogram spanned by $v$ and
$w$. This interpretation is illustrated in the following
figure:
Exercises
- 12.1
Let $\mathcal{V}$ be a finite dimensional real or complex vector space. Show that there exists a scalar product on $\mathcal{V}$.
- 12.2
Show that the maps defined in Example 12.2 are scalar products on the corresponding vector spaces.
- 12.3
Let $\langle \cdot, \cdot \rangle$ be an arbitrary scalar product on $\mathbb{R}^{n,1}$. Show that there exists a matrix $A \in \mathbb{R}^{n,n}$ with $\langle v, w \rangle = w^T A v$ for all $v, w \in \mathbb{R}^{n,1}$.
- 12.4
Let $\mathcal{V}$ be a finite dimensional $\mathbb{R}$- or $\mathbb{C}$-vector space. Let $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$ be scalar products on $\mathcal{V}$ with the following property: If $v, w \in \mathcal{V}$ satisfy $\langle v, w \rangle_1 = 0$, then also $\langle v, w \rangle_2 = 0$. Prove or disprove: There exists a real scalar $\lambda$ with $\langle v, w \rangle_2 = \lambda \, \langle v, w \rangle_1$ for all $v, w \in \mathcal{V}$.
- 12.5
Show that the maps defined in Example 12.4 are norms on the corresponding vector spaces.
- 12.6
Show that
- 12.7
Sketch, for the matrix $A$ from (6) in Example 12.4 and $p = 1, 2, \infty$, the sets $\{\, v \in \mathbb{R}^{2,1} \,:\, \|Av\|_p = 1 \,\}$.
- 12.8
Let $\mathcal{V}$ be a Euclidean or unitary vector space and let $\|\cdot\|$ be the norm induced by a scalar product on $\mathcal{V}$. Show that $\|\cdot\|$ satisfies the parallelogram identity $\|v + w\|^2 + \|v - w\|^2 = 2\,\|v\|^2 + 2\,\|w\|^2$ for all $v, w \in \mathcal{V}$.
- 12.9
Let $\mathcal{V}$ be a $K$-vector space ($K = \mathbb{R}$ or $K = \mathbb{C}$) with the scalar product $\langle \cdot, \cdot \rangle$ and the induced norm $\|\cdot\|$. Show that $v, w \in \mathcal{V}$ are orthogonal with respect to $\langle \cdot, \cdot \rangle$ if and only if $\|v + \lambda w\| \ge \|v\|$ for all $\lambda \in K$.
- 12.10
Does there exist a scalar product on $\mathbb{R}^{n,1}$, such that the 1-norm of $\mathbb{R}^{n,1}$ (cp. (5) in Example 12.4) is the norm induced by this scalar product?
- 12.11
Show that the inequality
- 12.12
Let $\mathcal{V}$ be a finite dimensional Euclidean or unitary vector space with the scalar product $\langle \cdot, \cdot \rangle$. Let $f : \mathcal{V} \to \mathcal{V}$ be a map with $\langle f(v), f(w) \rangle = \langle v, w \rangle$ for all $v, w \in \mathcal{V}$. Show that $f$ is an isomorphism.
- 12.13
Let $\mathcal{V}$ be a unitary vector space and suppose that the endomorphism $f$ of $\mathcal{V}$ satisfies $\langle f(v), v \rangle = 0$ for all $v \in \mathcal{V}$. Prove or disprove that $f = 0$. Does the same statement also hold for Euclidean vector spaces?
- 12.14
Let $d_1, \dots, d_n \in \mathbb{R}$ with $d_i > 0$ for $i = 1, \dots, n$. Show that $\langle v, w \rangle := \sum_{i=1}^{n} d_i v_i w_i$ is a scalar product on $\mathbb{R}^{n,1}$. Analyze which properties of a scalar product are violated if at least one of the $d_i$ is zero, or when all $d_i$ are nonzero but have different signs.
- 12.15
Orthonormalize the following basis of the vector space with respect to the scalar product :
- 12.16
Let $Q \in \mathbb{R}^{n,n}$ be an orthogonal or let $Q \in \mathbb{C}^{n,n}$ be a unitary matrix. What are the possible values of $\det(Q)$?
- 12.17
- 12.18
Prove Lemma 12.21.
- 12.19
Let
- 12.20
Let $\mathcal{V}$ be a Euclidean or unitary vector space with the scalar product $\langle \cdot, \cdot \rangle$, let $\mathcal{U} \subseteq \mathcal{V}$ be a subspace and let $\{u_1, \dots, u_m\}$ be a basis of $\mathcal{U}$. Show that for $v \in \mathcal{V}$ we have $v \in \mathcal{U}^\perp$ if and only if $\langle v, u_i \rangle = 0$ for $i = 1, \dots, m$.
- 12.21
In the unitary vector space with the standard scalar product let and be given. Determine an orthonormal basis of .
- 12.22
Prove Lemma 12.24.