© Springer Nature Switzerland AG 2018
Michael Oberguggenberger and Alexander Ostermann: Analysis for Computer Scientists, Undergraduate Topics in Computer Science, https://doi.org/10.1007/978-3-319-91155-7_8

8. Applications of the Derivative

Michael Oberguggenberger (corresponding author) and Alexander Ostermann, University of Innsbruck, Innsbruck, Austria

This chapter is devoted to some applications of the derivative which form part of the basic skills in modelling. We start with a discussion of features of graphs. More precisely, we use the derivative to describe geometric properties like maxima, minima and monotonicity. Even though plotting functions with MATLAB or maple is simple, understanding the connection with the derivative is important, for example, when a function with given properties is to be chosen from a particular class of functions.

In the following section we discuss Newton’s method and the concept of order of convergence. Newton’s method is one of the most important tools for computing zeros of functions and is in almost universal use.

The final section of this chapter is devoted to an elementary method from data analysis. We show how to compute a regression line through the origin. There are many areas of application that involve linear regression. This topic will be developed in more detail in Chap. 18.

8.1 Curve Sketching

In the following we investigate some geometric properties of graphs of functions using the derivative: maxima and minima, intervals of monotonicity and convexity. We further discuss the mean value theorem which is an important technical tool for proofs.

Definition 8.1

A function f: $$[a, b] \rightarrow \mathbb R$$ has

(a) a global maximum at $$x_0 \in [a, b]$$ if
$$ f(x) \le f(x_0)\ \text {for all}\ x \in [a, b]; $$
(b) a local maximum at $$x_0 \in [a, b]$$, if there exists a neighbourhood $$U_\varepsilon (x_0)$$ so that
$$ f(x) \le f(x_0)\ \text {for all} \ x \in U_\varepsilon (x_0) \cap [a, b]. $$
The maximum is called strict if the strict inequality $$f(x) < f(x_0)$$ holds in (a) or (b) for $$x \ne x_0$$.
The definition of a minimum is analogous, with the inequalities reversed. Maxima and minima are subsumed under the term extrema. Figure 8.1 shows some possible situations. Note that the function there does not have a global minimum on the chosen interval.
Fig. 8.1  Minima and maxima of a function

For points $$x_0$$ in the open interval (a, b) one has a simple necessary condition for extrema of differentiable functions:

Proposition 8.2

Let $$x_0 \in (a, b)$$ and f be differentiable at $$x_0$$. If f has a local maximum or minimum at $$x_0$$ then $$f'(x_0) = 0$$.

Proof

Due to the differentiability of f we have
$$ f'(x_0) = \lim _{h \rightarrow 0+} \frac{f(x_0 + h) - f(x_0)}{h} = \lim _{h \rightarrow 0-} \frac{f(x_0 + h) - f(x_0)}{h}. $$
In the case of a maximum the slope of the secant satisfies the inequalities
$$\begin{aligned} \frac{f(x_0 + h) - f(x_0)}{h} \le 0,&\quad \text {if}&h > 0 ,\\ \frac{f(x_0 + h) - f(x_0)}{h} \ge 0,&\quad \text {if}&h < 0 . \end{aligned}$$
Consequently the limit $$f'(x_0)$$ has to be greater than or equal to zero as well as smaller than or equal to zero, thus necessarily $$f'(x_0) = 0$$.    $$\square $$

The function $$f(x) = x^3$$, whose derivative vanishes at $$x =0$$, shows that the condition of the proposition is not sufficient for the existence of a maximum or minimum.

The geometric content of the proposition is that in the case of differentiability the graph of the function has a horizontal tangent at a maximum or minimum. A point $$x_0 \in (a, b)$$ where $$f'(x_0) = 0$$ is called a stationary point.

Remark 8.3

The proposition shows that the following point sets have to be checked in order to determine the maxima and minima of a function f: $$[a, b] \rightarrow \mathbb R$$:
(a) the boundary points $$x_0 = a$$, $$x_0 = b$$;

(b) points $$x_0 \in (a, b)$$ at which f is not differentiable;

(c) points $$x_0 \in (a, b)$$ at which f is differentiable and $$f'(x_0) = 0$$.

The following proposition is a useful technical tool for proofs. One of its applications lies in estimating the error of numerical methods. Similarly to the intermediate value theorem, the proof is based on the completeness of the real numbers. We are not going to present it here but instead refer to the literature, for instance [3, Chap. 3.2].

Proposition 8.4

(Mean value theorem)   Let f be continuous on [a, b] and differentiable on (a, b). Then there exists a point $$\xi \in (a, b)$$ such that
$$ \frac{f(b) - f(a)}{b-a} = f'(\xi ). $$
Geometrically this means that the tangent at $$\xi $$ has the same slope as the secant through $$(a, f(a))$$ and $$(b, f(b))$$. Figure 8.2 illustrates this fact.
Fig. 8.2  The mean value theorem

We now turn to the description of the behaviour of the slope of differentiable functions.

Definition 8.5

A function f: $$I \rightarrow \mathbb R$$ is called monotonically increasing, if
$$ x_1 < x_2 \quad \Rightarrow \quad f(x_1) \le f(x_2) $$
for all $$x_1, x_2 \in I$$. It is called strictly monotonically increasing, if
$$ x_1< x_2 \quad \Rightarrow \quad f(x_1) < f(x_2). $$
A function f is said to be (strictly) monotonically decreasing, if $$-f$$ is (strictly) monotonically increasing.

Examples of strictly monotonically increasing functions are the power functions $$x \mapsto x^n$$ with odd powers n; a monotonically, but not strictly monotonically increasing function is the sign function $$x\mapsto {\text {sign}}\, x$$, for instance. The behaviour of the slope of a differentiable function can be described by the sign of the first derivative.

Proposition 8.6

For differentiable functions f: $$(a, b) \rightarrow \mathbb R$$ the following implications hold:
(a)
$$ \begin{array}{lcl} f' \ge 0 \ \text {on} \ (a, b) &\quad \Leftrightarrow \quad & f \ \text {is monotonically increasing};\\ f' > 0 \ \text {on} \ (a, b) &\quad \Rightarrow \quad & f \ \text {is strictly monotonically increasing}. \end{array} $$

(b)
$$ \begin{array}{lcl} f' \le 0 \ \text {on} \ (a, b) &\quad \Leftrightarrow \quad & f \ \text {is monotonically decreasing};\\ f' < 0 \ \text {on} \ (a, b) &\quad \Rightarrow \quad & f \ \text {is strictly monotonically decreasing}. \end{array} $$

Proof

(a) According to the mean value theorem we have $$f(x_2) - f(x_1) = f'(\xi ) \cdot (x_2-x_1)$$ for a certain $$\xi \in (x_1, x_2)$$. If $$x_1 < x_2$$ and $$f'(\xi ) \ge 0$$ then $$f(x_2) - f(x_1) \ge 0$$. If $$f'(\xi ) > 0$$ then $$f(x_2) - f(x_1) > 0$$. Conversely,
$$ f'(x) = \lim _{h \rightarrow 0} \frac{f(x+h) - f(x)}{h} \ge 0, $$
if f is increasing. The proof for (b) is similar.    $$\square $$

Remark 8.7

The example $$f(x) = x^3$$ shows that f can be strictly monotonically increasing even if $$f' = 0$$ at isolated points.

Proposition 8.8

(Criterion for local extrema)   Let f be differentiable on (ab), $$x_0 \in (a, b)$$ and $$f'(x_0) = 0$$. Then
(a)
$$ \left. \begin{array}{ll} f'(x) > 0 &\quad \text {for} \ x < x_0\\ f'(x) < 0 &\quad \text {for} \ x > x_0 \end{array} \right\} \quad \Rightarrow \quad f \ \text {has a local maximum at} \ x_0, $$

(b)
$$ \left. \begin{array}{ll} f'(x) < 0 &\quad \text {for} \ x < x_0\\ f'(x) > 0 &\quad \text {for} \ x > x_0 \end{array} \right\} \quad \Rightarrow \quad f \ \text {has a local minimum at} \ x_0. $$
Fig. 8.3  Local maximum

Proof

The proof follows from the previous proposition which characterises the monotonic behaviour as shown in Fig. 8.3.    $$\square $$

Remark 8.9

(Convexity and concavity of a function graph)   If $$f'' > 0$$ holds in an interval then $$f'$$ is monotonically increasing there. Thus the graph of f is curved to the left or convex. On the other hand, if $$f'' < 0$$, then $$f'$$ is monotonically decreasing and the graph of f is curved to the right or concave (see Fig. 8.4). A quantitative description of the curvature of the graph of a function will be given in Sect. 14.2.

Let $$x_0$$ be a point where $$f'(x_0) = 0$$. If $$f'$$ does not change its sign at $$x_0$$, then $$x_0$$ is an inflection point. Here f changes from positive to negative curvature or vice versa.

Proposition 8.10

(Second derivative criterion for local extrema)   Let f be twice continuously differentiable on (ab), $$x_0\in (a, b)$$ and $$f'(x_0)= 0$$.

(a) If $$f''(x_0) > 0$$ then f has a local minimum at $$x_0$$.

(b) If $$f''(x_0) < 0$$ then f has a local maximum at $$x_0$$.

Proof

(a) Since $$f''$$ is continuous, $$f''(x) > 0$$ for all x in a neighbourhood of $$x_0$$. According to Proposition 8.6, $$f'$$ is strictly monotonically increasing in this neighbourhood. Because of $$f'(x_0) = 0$$ this means that $$f'(x) < 0$$ for $$x < x_0$$ and $$f'(x) > 0$$ for $$x > x_0$$; according to the criterion for local extrema, f has a local minimum at $$x_0$$. The assertion (b) can be shown similarly.    $$\square $$

Remark 8.11

If $$f''(x_0) = 0$$, there can either be an inflection point or a minimum or maximum. The functions $$f(x) = x^n$$, $$n = 3,4,5,\ldots $$ supply a typical example: for even n they have a global minimum at $$x=0$$, while for odd n they have an inflection point there. More general functions can easily be assessed using Taylor expansion. An extreme value criterion based on this expansion will be discussed in Application 12.14.

Fig. 8.4  Convexity/concavity and second derivative

One of the applications of the previous propositions is curve sketching, which is the detailed investigation of the properties of the graph of a function using differential calculus. Even though graphs can easily be plotted in MATLAB or maple, it is still often necessary to check the graphical output at certain points using analytic methods.

Experiment 8.12

Plot the function
$$ y = x({\text {sign}}\, x - 1)(x+1)^3 + \big ({\text {sign}}\,(x-1)+1\big )\big ((x-2)^4-1/2\big ) $$
on the interval $$-2 \le x \le 3$$ and try to read off the local and global extrema, the inflection points and the monotonic behaviour. Check your observations using the criteria discussed above.
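A possible way to produce this plot in MATLAB is sketched below (a few lines for illustration only; variable names are arbitrary):

% Plot of the function from Experiment 8.12 on the interval [-2, 3]
x = linspace(-2, 3, 1000);
y = x.*(sign(x) - 1).*(x + 1).^3 + (sign(x - 1) + 1).*((x - 2).^4 - 1/2);
plot(x, y)
xlabel('x'), ylabel('y'), grid on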

A further application of the previous propositions consists in finding extrema, i.e. solving one-dimensional optimisation problems. We illustrate this topic using a standard example.

Example 8.13

Which rectangle with a given perimeter has the largest area? To answer this question we denote the lengths of the sides of the rectangle by x and y. Then the perimeter and the area are given by
$$ U = 2x + 2y,\qquad F = xy. $$
Since U is fixed, we obtain $$y = U/2 - x$$, and from that
$$ F = x (U/2 - x), $$
where x can vary in the domain $$ 0 \le x \le U/2$$. We want to find the maximum of the function F on the interval [0, U / 2]. Since F is differentiable, we only have to investigate the boundary points and the stationary points. At the boundary points $$x = 0$$ and $$x = U/2$$ we have $$F(0) = 0$$ and $$F(U/2) = 0$$. The stationary points are obtained by setting the derivative to zero
$$ F'(x) = U/2 - 2x = 0, $$
which brings us to $$x = U/4$$ with the function value $$ F(U/4) = U^2/16$$.

As a result, the maximum area is obtained for $$x = U/4$$ and hence $$y = U/2 - x = U/4$$ as well, that is, in the case of a square.
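The checklist of Remark 8.3 and the computation above can also be mirrored numerically. The following MATLAB sketch (not one of the book's M-files; the value U = 4 is illustrative) evaluates F at the boundary points and at the stationary point located by fzero:

% Candidates for extrema of F(x) = x*(U/2 - x) on [0, U/2], cf. Remark 8.3
U  = 4;                          % illustrative perimeter
F  = @(x) x.*(U/2 - x);
dF = @(x) U/2 - 2*x;             % derivative of F
x_stat = fzero(dF, U/8);         % stationary point, should equal U/4
candidates = [0, U/2, x_stat];   % boundary and stationary points
[Fmax, i] = max(F(candidates));
fprintf('maximum F = %g at x = %g (U/4 = %g)\n', Fmax, candidates(i), U/4)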

8.2 Newton’s Method

With the help of differential calculus, efficient numerical methods for computing zeros of differentiable functions can be constructed. One of the basic procedures is Newton’s method, which will be discussed in this section for the case of real-valued functions f: $$D \subset \mathbb R\rightarrow \mathbb R$$.

First we recall the bisection method discussed in Sect. 6.3. Consider a continuous, real-valued function f on an interval [a, b] with
$$ f(a)< 0,\ f(b)> 0 \quad \text {or} \quad f(a) > 0,\ f(b) < 0. $$
With the help of continued bisection of the interval, one obtains a zero $$\xi $$ of f satisfying
$$ a = a_1 \le a_2 \le a_3 \le \cdots \le \xi \le \cdots \le b_3 \le b_2 \le b_1 = b, $$
where
$$ |b_{n+1} - a_{n+1}| = \frac{1}{2} \, |b_n - a_n| = \frac{1}{4} \, |b_{n-1} - a_{n-1}| = \, \ldots \, = \frac{1}{2^n}\, |b_1 - a_1|. $$
If one stops after n iterations and chooses $$a_n$$ or $$b_n$$ as an approximation for $$\xi $$, then one gets a guaranteed error bound
$$ |\text {error}| \le \varphi (n) = |b_n - a_n|. $$
Note that we have
$$ \varphi (n + 1) = \frac{1}{2}\, \varphi (n). $$
The error thus decays with each iteration by (at least) a constant factor $$\frac{1}{2}$$, and one calls the method linearly convergent. More generally, an iteration scheme is called convergent of order $$\alpha $$ if there exist error bounds $$(\varphi (n))_{n \ge 1}$$ and a constant $$C>0$$ such that
$$ \lim _{n \rightarrow \infty } \frac{\varphi (n + 1)}{(\varphi (n))^\alpha } = C. $$
For sufficiently large n, one thus has approximately
$$ \varphi (n +1) \approx C(\varphi (n))^\alpha . $$
Linear convergence ($$\alpha =1$$) therefore implies
$$\begin{aligned} \varphi (n + 1) \, \approx \, C \varphi (n) \, \approx \, C^2 \varphi (n-1) \, \approx \, \ldots \, \approx C^{n} \,\varphi (1). \end{aligned}$$
Plotting the logarithm of $$\varphi (n)$$ against n (semi-logarithmic representation, as shown for example in Fig. 8.6) results in a straight line:
$$ \log \varphi (n + 1) \, \approx \, n \log C + \log \varphi (1). $$
If $$C < 1$$ then the error bound $$\varphi (n+1)$$ tends to 0 and the number of correct decimal places increases with each iteration by a constant. Quadratic convergence would mean that the number of correct decimal places approximately doubles with each iteration.
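Eliminating C from two consecutive relations $$\varphi (n+1) \approx C(\varphi (n))^\alpha $$ gives a way to estimate the order of convergence from observed errors, namely $$\alpha \approx \log \big (\varphi (n+1)/\varphi (n)\big )\,/\,\log \big (\varphi (n)/\varphi (n-1)\big )$$. A short MATLAB sketch (the error sequence phi is a made-up example of a linearly convergent method):

% Estimate the order of convergence from a vector phi of successive errors
phi = [1e-1 5e-2 2.5e-2 1.25e-2];        % errors halving: linear convergence
alpha = log(phi(3:end)./phi(2:end-1)) ./ log(phi(2:end-1)./phi(1:end-2));
disp(alpha)                              % approximately 1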
Derivation of Newton’s method. The aim of the construction is to obtain a procedure that provides quadratic convergence $$(\alpha = 2)$$, at least if one starts sufficiently close to a simple zero $$\xi $$ of a differentiable function. The geometric idea behind Newton’s method is simple: Once an approximation $$x_n$$ is chosen, one calculates $$x_{n+1}$$ as the intersection of the tangent to the graph of f through $$(x_n, f(x_n))$$ with the x-axis, see Fig. 8.5. The equation of the tangent is given by
$$ y = f(x_n) + f'(x_n)(x - x_n). $$
The point of intersection $$x_{n+1}$$ with the x-axis is obtained from
$$ 0 = f(x_n) + f'(x_n)(x_{n+1} - x_n), $$
thus
$$ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \quad n\ge 1. $$
Obviously it has to be assumed that $$f'(x_n) \ne 0$$. This condition is fulfilled, if $$f'$$ is continuous, $$f'(\xi )\ne 0$$ and $$x_n$$ is sufficiently close to the zero $$\xi $$.
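The iteration is easily programmed. The following MATLAB lines are a minimal sketch (not the book's M-file mat08_2.m); the function, its derivative, the starting value and the stopping criterion are chosen for illustration:

% Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n)
f  = @(x) x.^3 - 2;          % example: the zero is the third root of 2
df = @(x) 3*x.^2;
x = 2;                       % starting value x_1
for n = 1:20
    dx = f(x)/df(x);
    x  = x - dx;
    if abs(dx) < 1e-14       % stop when the update is negligible
        break
    end
end
fprintf('zero after %d steps: %.14f\n', n, x)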
Fig. 8.5  Two steps of Newton’s method

Proposition 8.14

(Convergence of Newton’s method)   Let f be a real-valued function, twice differentiable with a continuous second derivative. Further, let $$f(\xi ) = 0$$ and $$f'(\xi ) \ne 0$$. Then there exists a neighbourhood $$U_\varepsilon (\xi )$$ such that Newton’s method converges quadratically to $$\xi $$ for every starting value $$x_1 \in U_\varepsilon (\xi )$$.

Proof

Since $$f'(\xi ) \ne 0$$ and $$f'$$ is continuous, there exist a neighbourhood $$U_\delta (\xi )$$ and a bound $$m > 0$$ so that $$|f'(x)| \ge m$$ for all $$x \in U_\delta (\xi )$$. Applying the mean value theorem twice gives
$$\begin{aligned} |x_{n+1} - \xi | &= \bigg | x_n - \xi - \frac{f(x_n)-f(\xi )}{f'(x_n)}\bigg |\\ &\le |x_n - \xi | \, \bigg | 1 - \frac{f'(\eta )}{f'(x_n)}\bigg | = |x_n - \xi | \, \frac{|f'(x_n) - f'(\eta )|}{|f'(x_n)|}\\ &\le |x_n - \xi |^2 \, \frac{|f''(\zeta )|}{|f'(x_n)|} \end{aligned}$$
with $$\eta $$ between $$x_n$$ and $$\xi $$ and $$\zeta $$ between $$x_n$$ and $$\eta $$. Let M denote the maximum of $$|f''|$$ on $$U_\delta (\xi )$$. Under the assumption that all iterates $$x_n$$ lie in the neighbourhood $$U_\delta (\xi )$$, we obtain the quadratic error bound
$$ \varphi (n+1) = |x_{n+1} - \xi | \le |x_n - \xi |^2\, \frac{M}{m} = (\varphi (n))^2\frac{M}{m} $$
for the error $$\varphi (n)=|x_n-\xi |$$. Thus, the assertion of the proposition holds with the neighbourhood $$U_\delta (\xi )$$. Otherwise we have to decrease the neighbourhood by choosing an $$\varepsilon < \delta $$ which satisfies the inequality $$\varepsilon \frac{M}{m} \le 1$$. Then
$$ |x_n - \xi | \le \varepsilon \quad \Rightarrow \quad |x_{n+1} - \xi | \le \varepsilon ^2 \frac{M}{m} \le \varepsilon . $$
This means that if an approximate value $$x_n$$ lies in $$U_\varepsilon (\xi )$$ then so does the subsequent value $$x_{n+1}$$. Since $$U_\varepsilon (\xi ) \subset U_\delta (\xi )$$, the quadratic error estimate from above is still valid. Thus the assertion of the proposition is valid with the smaller neighbourhood $$U_\varepsilon (\xi )$$.    $$\square $$

Example 8.15

In computing the root $$\xi = \root 3 \of {2}$$ of $$x^3 - 2 = 0$$, we compare the bisection method with starting interval $$[-2,2]$$ and Newton’s method with starting value $$x_1 = 2$$. The interval boundaries $$[a_n, b_n]$$ and the iterates $$x_n$$ are listed in Tables 8.1 and 8.2, respectively. Newton’s method gives the value
$$ \root 3 \of {2} = 1.25992104989487 $$
correct to 14 decimal places after only six iterations.
Table 8.1  Bisection method for calculating the third root of 2

  n    a_n                 b_n                 Error
  1   −2.00000000000000    2.00000000000000    4.00000000000000
  2    0.00000000000000    2.00000000000000    2.00000000000000
  3    1.00000000000000    2.00000000000000    1.00000000000000
  4    1.00000000000000    1.50000000000000    0.50000000000000
  5    1.25000000000000    1.50000000000000    0.25000000000000
  6    1.25000000000000    1.37500000000000    0.12500000000000
  7    1.25000000000000    1.31250000000000    0.06250000000000
  8    1.25000000000000    1.28125000000000    0.03125000000000
  9    1.25000000000000    1.26562500000000    0.01562500000000
 10    1.25781250000000    1.26562500000000    0.00781250000000
 11    1.25781250000000    1.26171875000000    0.00390625000000
 12    1.25976562500000    1.26171875000000    0.00195312500000
 13    1.25976562500000    1.26074218750000    0.00097656250000
 14    1.25976562500000    1.26025390625000    0.00048828125000
 15    1.25976562500000    1.26000976562500    0.00024414062500
 16    1.25988769531250    1.26000976562500    0.00012207031250
 17    1.25988769531250    1.25994873046875    0.00006103515625
 18    1.25991821289063    1.25994873046875    0.00003051757813

Table 8.2  Newton’s method for calculating the third root of 2

  n    x_n                 Error
  1    2.00000000000000    0.74007895010513
  2    1.50000000000000    0.24007895010513
  3    1.29629629629630    0.03637524640142
  4    1.26093222474175    0.00101117484688
  5    1.25992186056593    0.00000081067105
  6    1.25992104989539    0.00000000000052
  7    1.25992104989487    0.00000000000000

The error curves for the bisection method and Newton’s method can be seen in Fig. 8.6. A semi-logarithmic representation (MATLAB command semilogy) is used there.
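The following MATLAB sketch reproduces the qualitative behaviour shown in Fig. 8.6; it is a simplified stand-in for the book's M-files, with the number of iterations fixed as in the tables above:

% Errors of the bisection method and of Newton's method for x^3 - 2 = 0
f = @(x) x.^3 - 2;  df = @(x) 3*x.^2;  xi = 2^(1/3);

a = -2; b = 2;  err_bis = zeros(1,18);
for n = 1:18                                 % bisection
    err_bis(n) = b - a;                      % error bound |b_n - a_n|
    c = (a + b)/2;
    if f(a)*f(c) <= 0, b = c; else, a = c; end
end

x = 2;  err_newton = zeros(1,7);
for n = 1:7                                  % Newton's method
    err_newton(n) = abs(x - xi);             % error |x_n - xi|
    x = x - f(x)/df(x);
end

semilogy(1:18, err_bis, 'o-', 1:7, err_newton, 's-')
legend('bisection', 'Newton'), xlabel('n'), ylabel('error')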

Remark 8.16

The convergence behaviour of Newton’s method depends on the assumptions of Proposition 8.14. If the starting value $$x_1$$ is too far away from the zero $$\xi $$, the method might diverge, oscillate or converge to a different zero. If $$f'(\xi ) = 0$$, i.e. if the zero $$\xi $$ has multiplicity greater than 1, then the order of convergence may be reduced.

Experiment 8.17

Open the applet Newton’s method and test—using the sine function—how the choice of the starting value influences the result (in the applet the right interval boundary is the initial value). Experiment with the intervals $$[-2,x_0]$$ for $$x_0 = 1, 1.1, 1.2,1.3,1.5,1.57,1.5707,1.57079$$ and interpret your observations. Also carry out the calculations with the same starting values with the help of the M-file mat08_2.m.

Experiment 8.18

With the help of the applet Newton’s method, study how the order of convergence drops for multiple zeros. For this purpose, use the two polynomial functions given in the applet.

Remark 8.19

Variants of Newton’s method can be obtained by evaluating the derivative $$f'(x_n)$$ numerically. For example, the approximation
$$ f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n -1}} $$
provides the secant method
$$ x_{n+1} = x_n - \frac{(x_n - x_{n-1}) f(x_n)}{f(x_n)- f(x_{n-1})}\, , $$
which computes $$x_{n+1}$$ as the intersection of the secant through $$(x_n, f(x_n))$$ and $$(x_{n-1}, f(x_{n-1}))$$ with the x-axis. Its order of convergence is fractional and less than 2.
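A minimal MATLAB sketch of the secant method (the function and the two starting values are chosen for illustration):

% Secant method for f(x) = x^3 - 2 with starting values x_1 = 2, x_2 = 1.5
f = @(x) x.^3 - 2;
x_old = 2;  x = 1.5;
for n = 1:30
    x_new = x - (x - x_old)*f(x)/(f(x) - f(x_old));
    x_old = x;  x = x_new;
    if abs(f(x)) < 1e-14, break, end         % stop when the residual is tiny
end
fprintf('zero: %.14f after %d secant steps\n', x, n)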
Fig. 8.6  Error of the bisection method and of Newton’s method for the calculation of $$\root 3 \of {2}$$

8.3 Regression Line Through the Origin

This section is a first digression into data analysis: Given a collection of data points scattered in the plane, find the line of best fit (regression line) through the origin. We will discuss this problem as an application of differentiation; it can also be solved by using methods of linear algebra. The general problem of multiple linear regression will be dealt with in Chap. 18.

In the year 2002, the height x [cm] and the weight y [kg] of 70 students in Computer Science at the University of Innsbruck were collected. The data can be obtained from the M-file mat08_3.m.

The measurements $$(x_i, y_i), i = 1,\ldots , n$$ of height and weight form a scatter plot in the plane as shown in Fig. 8.7. Under the assumption that there is a linear relation of the form $$y = kx$$ between height and weight, k should be determined such that the straight line $$y = kx$$ represents the scatter plot as closely as possible (Fig. 8.8). The approach that we discuss below goes back to Gauss and understands the data fit in the sense of minimising the sum of squares of the errors.
Fig. 8.7  Scatter plot height/weight

Fig. 8.8  Line of best fit $$y=kx$$

Application 8.20

(Line of best fit through the origin)   A straight line through the origin
$$ y = kx $$
is to be fitted to a scatter plot $$(x_i, y_i), i = 1,\ldots , n$$. If k is known, one can compute the square of the deviation of the measurement $$y_i$$ from the value $$kx_i$$ given by the equation of the straight line as
$$ (y_i - kx_i)^2 $$
(the square of the error). We are looking for the specific k which minimises the sum of squares of the errors; thus
$$ F(k) = \sum ^n_{i= 1} (y_i - kx_i)^2 \rightarrow \min $$
Obviously, F(k) is a quadratic function of k. First we compute the derivatives
$$ F'(k) = \sum ^n_{i = 1} (-2x_i)(y_i - kx_i),\qquad F''(k) = \sum ^n_{i = 1} 2x_i^2. $$
By setting $$F'(k) =0$$ we obtain the formula
$$ F'(k) = -2 \sum ^n_{i = 1} x_i y_i + 2k \sum ^n_{i = 1} x_i^2 = 0. $$
Since evidently $$F'' > 0$$, its solution
$$ k = \frac{\sum x_i y_i}{\sum x_i^2} $$
is the global minimum and gives the slope of the line of best fit.
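In MATLAB the slope of the line of best fit through the origin is a one-line computation. The sketch below uses a small hypothetical data set in place of the height/weight data from mat08_3.m:

% Line of best fit y = k*x through the origin (least squares)
x = [170 175 180 185 190];     % hypothetical heights [cm]
y = [ 60  68  75  80  85];     % hypothetical weights [kg]
k = sum(x.*y) / sum(x.^2);     % k = (sum of x_i*y_i) / (sum of x_i^2)
plot(x, y, 'o', [0 max(x)], k*[0 max(x)], '-')   % data and fitted line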

Example 8.21

To illustrate the regression line through the origin we use the Austrian consumer price index 2010–2016 (data taken from [26]):

year     2010    2011    2012    2013    2014    2015    2016
index   100.0   103.3   105.8   107.9   109.7   110.7   111.7

For the calculation it is useful to introduce new variables x and y, where $$x = 0$$ corresponds to the year 2010 and $$y = 0$$ to the index 100. This means that $$x = (\text {year} - 2010)$$ and $$y = (\text {index} - 100)$$; y describes the relative price increase (in per cent) with respect to the year 2010. The re-scaled data are

$$x_i$$    0     1     2     3     4     5      6
$$y_i$$    0.0   3.3   5.8   7.9   9.7   10.7   11.7

We are looking for the line of best fit to these data through the origin. For this purpose we have to minimise
$$\begin{aligned} F(k)&= (3.3 - k \cdot 1)^2 + (5.8 - k \cdot 2)^2 + (7.9 - k \cdot 3)^2 + (9.7 - k \cdot 4)^2\\&\quad \,\, + (10.7 - k \cdot 5)^2 + (11.7 - k \cdot 6)^2 \end{aligned}$$
which results in (rounded)
$$ k = \frac{1 \cdot 3.3 + 2 \cdot 5.8 + 3 \cdot 7.9 + 4 \cdot 9.7 + 5\cdot 10.7 + 6\cdot 11.7}{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 + 4 \cdot 4 + 5 \cdot 5 + 6 \cdot 6} = \frac{201.1}{91} = 2.21. $$
The line of best fit is thus
$$ y = 2.21 x $$
or transformed back
$$ \text {index} = 100 + (\text {year} - 2010) \cdot 2.21. $$
The result is shown in Fig. 8.9, in a year/index-scale as well as in the transformed variables. For the year 2017, extrapolation along the regression line would forecast
$$ \text {index} (2017) = 100 + 7 \cdot 2.21 = 115.5. $$
The actual consumer price index in 2017 had the value 114.0. Inspection of Fig. 8.9 shows that the consumer price index stopped growing linearly around 2014; thus the straight line is a bad fit to the data in the period under consideration. How to choose better regression models will be discussed in Chap. 18.
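For reference, the computation of Example 8.21 can be carried out in MATLAB as follows (a sketch using the data of the table above):

% Regression line through the origin for the re-scaled index data
x = 0:6;                                  % years since 2010
y = [0.0 3.3 5.8 7.9 9.7 10.7 11.7];      % index minus 100
k = sum(x.*y) / sum(x.^2);                % slope, approximately 2.21
index2017 = 100 + 7*k;                    % extrapolation to 2017, about 115.5
fprintf('k = %.2f, forecast for 2017: %.1f\n', k, index2017)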
Fig. 8.9  Consumer price index and regression line

Fig. 8.10  Failure wedge with sliding surface

8.4 Exercises

1.
Find out which of the following (continuous) functions are differentiable at $$x=0$$:
$$ y = x|x|;\qquad y = |x|^{1/2},\qquad y = |x|^{3/2},\qquad y = x\sin (1/x). $$
2.
Find all maxima and minima of the functions
$$ f(x) = \frac{x}{x^2+1}\quad \mathrm{and}\quad g(x) = x^2 \mathrm{e}^{-x^2}. $$
3.
Find the maxima of the functions
$$ y = \frac{1}{x} \mathrm{e}^{-(\log x)^2/2},\ x > 0 \qquad \text {and} \qquad y = \mathrm{e}^{-x} \mathrm{e}^{-(\mathrm{e}^{-x})},\ x \in \mathbb R. $$
These functions represent the densities of the standard lognormal distribution and of the Gumbel distribution, respectively.
4.
Find all maxima and minima of the function
$$ f(x) = \frac{x}{\sqrt{x^4 + 1}}, $$
determine on what intervals it is increasing or decreasing, analyse its behaviour as $$x\rightarrow \pm \infty $$, and sketch its graph.
5.

Find the proportions of the cylinder which has the smallest surface area F for a given volume V.

Hint. $$F = 2 r \pi h + 2 r^2 \pi \rightarrow \min \!.$$ Calculate the height h as a function of the radius r from $$V = r^2 \pi h$$, substitute and minimise F(r).

6.

(From mechanics of solids) The moment of inertia with respect to the central axis of a beam with rectangular cross section is $$I = \frac{1}{12} b h^3$$ (b the width, h the height). Find the proportions of the beam which can be cut from a log with circular cross section of given radius r such that its moment of inertia becomes maximal.

Hint. Write b as function of h, $$I(h) \rightarrow \max \!.$$

7.
(From soil mechanics) The mobilised cohesion $$c_\mathrm{m}(\theta ) $$ of a failure wedge with sliding surface, inclined by an angle $$\theta $$, is
$$\begin{aligned} c_\mathrm{m}(\theta ) = \frac{\gamma h \sin (\theta - \varphi _\mathrm{m})\, \cos \theta }{2\,\cos \varphi _\mathrm{m}}. \end{aligned}$$
Here h is the height of the failure wedge, $$\varphi _\mathrm{m}$$ the angle of internal friction, $$\gamma $$ the specific weight of the soil (see Fig. 8.10). Show that the mobilised cohesion $$c_\mathrm{m}$$ with given $$h, \, \varphi _\mathrm{m}, \, \gamma $$ is a maximum for the angle of inclination $$\theta = \varphi _\mathrm{m}/2 + 45^\circ $$.
8.
This exercise aims at investigating the convergence of Newton’s method for solving the equations
$$\begin{aligned} x^3 - 3x^2 + 3x - 1&= 0, \\ x^3 - 3x^2 + 3x - 2&= 0 \end{aligned}$$
on the interval [0, 3].
(a)

Open the applet Newton’s method and carry out Newton’s method for both equations with an accuracy of 0.0001. Explain why you need a different number of iterations.

(b)

With the help of the M-file mat08_1.m, generate a list of approximations in each case (starting value x1 = 1.5, tol = 100*eps, maxk = 100) and plot the errors $$ |x_n - \xi |$$ in each case using semilogy. Discuss the results.

9.

Apply the MATLAB program mat08_2.m to the functions which are defined by the M-files mat08_f1.m and mat08_f2.m (with respective derivatives mat08_df1.m and mat08_df2.m). Choose x1 = 2, maxk = 250. How do you explain the results?

10.

Rewrite the MATLAB program mat08_2.m so that termination occurs when either the given number of iterations maxk or a given error bound tol is reached (termination at the nth iteration, if either $$n > \mathtt{maxk}$$ or $$|f(x_n)| < \mathtt{tol}$$). Compute n, $$x_n$$ and the error $$|f(x_n)|$$. Test your program using the functions from Exercise 8 and explain the results.

Hint. Consult the M-file mat08_ex9.m.

11.

Write a MATLAB program which carries out the secant method for cubic polynomials.

12.
(a)

By minimising the sum of squares of the errors, derive a formula for the coefficient c of the regression parabola $$y = cx^2$$ through the data $$(x_1, y_1),..., (x_n, y_n)$$.

(b)

A series of measurements of braking distances s [m] (without taking into account the perception-reaction distance) of a certain type of car in dependence on the velocity v [km/h] produced the following values:

$$v_i$$ [km/h]   10   20   40   50   60   70   80   100   120
$$s_i$$ [m]       1    3    8   13   18   23   31    47    63

Calculate the coefficient c of the regression parabola $$s = cv^2$$ and plot the result.
13.
Show that the best horizontal straight line $$y = d$$ through the data points $$(x_i, y_i), i = 1,\ldots , n$$ is given by the arithmetic mean of the y-values:
$$ d = \frac{1}{n}\sum _{i=1}^n y_i. $$

Hint. Minimise $$G(d) = \sum _{i=1}^n (y_i-d)^2$$.

14.
(From geotechnics) The angle of internal friction of a soil specimen can be obtained by means of a direct shear test, whereby the material is subjected to normal stress $$\sigma $$ and the lateral shear stress $$\tau $$ at failure is recorded. In case the cohesion is negligible, the relation between $$\tau $$ and $$\sigma $$ can be modelled by a regression line through the origin of the form $$\tau = k \sigma $$. The slope of the regression line is interpreted as the tangent of the friction angle $$\varphi $$, $$k = \tan \varphi $$. In a laboratory experiment, the following data have been obtained for a specimen of glacial till (data from [25]):

$$\sigma _i$$ [kPa]   100   150   200   300   150   250   300   100   150   250   100   150   200   250
$$\tau _i$$ [kPa]      68   127   135   206   127   148   197    76    78   168   123    97   124   157

Calculate the angle of internal friction of the specimen.
15.
(a)

Convince yourself by applying the mean value theorem that the function $$f(x) = \cos x$$ is a contraction (see Definition C.17) on the interval [0, 1] and compute the fixed point $$x^* = \cos x^*$$ up to two decimal places using the iteration of Proposition C.18.

(b)

Write a MATLAB program which carries out the first N iterations for the computation of $$x^* = \cos x^*$$ for a given initial value $$ x_1 \in [0,1]$$ and displays $$x_1, x_2, \dots , x_N$$ in a column.