Why Taylor Series Work
Taylor series allow us to very accurately approximate certain infinitely differentiable functions while only having knowledge of their derivatives at a single point. This post deals with understanding how this is possible.
Only a basic familiarity with calculus is assumed, but the motivation to understand the mathematics is essential. Formally, you should know about sequences, series, and the general properties and applications of derivatives.
All of the proofs in this post are presented as a set of exercises. The idea is to guide you through each proof and encourage a better understanding of the underlying motivation and concepts. Solutions and hints are provided in case you get stuck.
Let's say you have an approximation $p$ of $\pi$ so that $|\pi - p| < 1$. That is, suppose we have a pretty close approximation of $\pi$, like $p = 3$ or $p = 3.14$. Let $n$ be the number of decimal places of $\pi$ that $p$ correctly approximates. For example, taking $p = 3.14$ would mean $n = 2$.
Then $p + \sin(p)$ approximates $\pi$ to $3n$ decimal places. That means that $p + \sin(p)$ is at least three times better than $p$ at approximating $\pi$. To see this, we can start with $p = 3.1$. Then $p + \sin(p) = 3.1 + \sin(3.1) \approx 3.14158066243$. Repeating this once more, we have $3.14158066243 + \sin(3.14158066243) \approx 3.14159265359$. After repeating this only twice, we started with a number which approximates $\pi$ to 1 decimal place of accuracy and received a number which approximates $\pi$ to 11 decimal places of accuracy. Why does this happen?
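You can watch this happen numerically. Here is a short Python sketch (my own illustration) that iterates $p \mapsto p + \sin(p)$ and prints the error at each step:

```python
import math

# Iterate the map p -> p + sin(p), printing how far each iterate is from pi.
# Note: double-precision floats cap the accuracy at roughly 16 digits.
p = 3.1
for step in range(3):
    print(f"step {step}: p = {p:.15f}, error = {abs(math.pi - p):.3e}")
    p += math.sin(p)
print(f"final:  p = {p:.15f}, error = {abs(math.pi - p):.3e}")
```

Each step roughly triples the number of correct digits until floating-point precision takes over.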
Like most things in math, this is no coincidence, and the tools provided by calculus allow you to see precisely why this "magic trick" works so well.
The first challenge is determining the role of $\sin$. Clearly it has something to do with the approximations getting closer to $\pi$, but it is not so clear what this relationship is. What we'll try to do is simplify this problem so that instead of studying functions like $\sin$ or $\cos$, we study polynomials.
This is because polynomials are generally easier to work with: they are nicer to differentiate, it is easy to study their roots, and it is easier to describe how they transform a given input (i.e. describing what $x^2$ does to $x$ is easier than describing what $\sin(x)$ does to $x$).
Our new goal will be to write the aforementioned functions (and many more) as polynomials. It turns out that if a function is "nice" enough (we'll figure out what "nice" means later), we can represent it as an infinite series. This infinite series is called the Taylor series of $f$, and its partial sums are polynomials. To understand this, we first need to understand what an "infinite series" and a "partial sum" are.
Let $(a_n)_{n \ge 1}$ be a sequence. We can then define another sequence $(s_n)_{n \ge 1}$ by $s_n = \sum_{k=1}^{n} a_k$. Then we define the infinite series $\sum_{k=1}^{\infty} a_k$ by $$\sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} \sum_{k=1}^{n} a_k = \lim_{n \to \infty} s_n.$$ For a particular positive integer $k$, $s_k$ is called the $k$th partial sum of the series.
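As a quick illustration (a sketch of my own; the name `partial_sum` is just illustrative), partial sums can be computed directly:

```python
def partial_sum(a, n):
    """Compute the n-th partial sum s_n = a(1) + a(2) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))

# The geometric series with a_k = 1/2^k converges to 1,
# and its partial sums approach that value.
for n in [1, 5, 10, 20]:
    print(n, partial_sum(lambda k: 0.5 ** k, n))
```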
If the sequence $(s_n)$ diverges (i.e. does not converge), then $\sum_{k=1}^{\infty} a_k$ is said to diverge and does not represent any value.
There are some properties of series that are important for us to note. The first is that not every infinite series converges. For example, $\sum_{n=1}^{\infty} n$ diverges.
Another, less obvious note is that even if $a_n \to 0$ (read "$a_n$ converges to zero"), $\sum_{n=1}^{\infty} a_n$ does not necessarily converge. For example, choosing $a_n = \frac{1}{n}$ yields what is called the harmonic series: $$\sum_{n=1}^{\infty} \frac{1}{n}.$$ This series diverges, and various proofs of this are well known. However, $a_n$ does converge to zero.
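You can see this slow divergence numerically. The sketch below (my own illustration) shows the partial sums of the harmonic series creeping upward like $\ln(n)$ even as the terms shrink to zero:

```python
import math

# Partial sums of the harmonic series grow without bound,
# roughly like ln(n), even though the terms 1/n tend to 0.
for n in [10, 1_000, 100_000]:
    s = sum(1.0 / k for k in range(1, n + 1))
    print(f"n = {n:>6}: s_n = {s:.4f}, ln(n) = {math.log(n):.4f}")
```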
We now know enough about series to study the Taylor series.
Suppose $f$ is defined on $(a, b)$ and has derivatives of all orders at some point $c$ with $a < c < b$. Then the Taylor series of $f$ centered at $c$ is a function defined by $$T_c(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n.$$ Note that $T_c(x)$ need not converge, meaning that $T_c$ need not be defined for all values in the domain of $f$.
We also define $$P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k$$ to denote the first $n + 1$ terms of the Taylor series. $P_n$ is called the $n$th Taylor polynomial of $f$ centered at $c$. It is a polynomial of degree (at most) $n$. The interesting part about the Taylor series is that for certain functions $f$ and certain values of $x$, $T_c(x) = f(x)$. To see why this happens, we need to study Taylor's theorem.
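To make the definition concrete, here is a minimal sketch (the name `taylor_poly` is my own, purely illustrative) that evaluates $P_n(x)$ from a list of derivative values at the center:

```python
import math

def taylor_poly(derivs, c, x):
    """Evaluate P_n(x) = sum of f^(k)(c)/k! * (x - c)^k for k = 0..n,
    where derivs = [f(c), f'(c), ..., f^(n)(c)]."""
    return sum(d / math.factorial(k) * (x - c) ** k
               for k, d in enumerate(derivs))

# P_3 for sin centered at c = 0: the derivatives at 0 are 0, 1, 0, -1.
print(taylor_poly([0, 1, 0, -1], 0.0, 0.5))  # ~0.479167
print(math.sin(0.5))                          # ~0.479426
```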
Taylor's theorem gives us a way to estimate how closely $P_n$ approximates the original function. To quantify this "closeness," we define the remainder $R_n(x)$ by $$R_n(x) = f(x) - P_n(x) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k.$$ Some simple algebraic manipulation shows us exactly what this remainder means, as well as why it is called the remainder: $$f(x) = P_n(x) + R_n(x).$$ Taylor's theorem gives us a formula for the remainder, which will later prove to be useful in studying the Taylor series of a function.
Taylor's Theorem: Let $n$ be a positive integer. Suppose $f$ is a function defined on $(a, b)$ (we allow $a = -\infty$ and/or $b = \infty$) which is differentiable $n + 1$ times on $(a, b)$. Then for any real numbers $a < c < x < b$, there exists some $c < z < x$ so that $$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n+1)}(z)}{(n+1)!} (x - c)^{n+1}.$$
Exercise 1: Give an example of a function which satisfies the hypothesis of Taylor's theorem on some interval $(a, b)$.
Since $\sin$ has derivatives of all orders on $(-\infty, \infty)$, it satisfies the hypothesis of Taylor's theorem.
Exercise 2: Explain how the theorem tells us that for a function $f$ satisfying the hypothesis above, $$R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!} (x - c)^{n+1}$$ for some $c < z < x$.
Observe that $$\frac{f^{(n+1)}(z)}{(n+1)!} (x - c)^{n+1} = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k = f(x) - P_n(x) = R_n(x).$$ The first equality is given by Taylor's theorem.
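To see that such a $z$ really exists, here is a small numerical sketch (my own, using $f = \exp$ with $c = 0$, $x = 1$, $n = 2$) that recovers the $z$ promised by the theorem:

```python
import math

# For f = exp, c = 0, x = 1, n = 2, solve the Lagrange remainder
# formula R_n(x) = exp(z) * (x - c)^(n+1) / (n+1)! for z.
c, x, n = 0.0, 1.0, 2
p_n = sum(math.exp(c) / math.factorial(k) * (x - c) ** k for k in range(n + 1))
remainder = math.exp(x) - p_n
implied = remainder * math.factorial(n + 1) / (x - c) ** (n + 1)
z = math.log(implied)  # invert f^(n+1) = exp
print(z, c < z < x)    # ~0.2698, True
```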
The proof of Taylor's theorem relies on Rolle's Theorem (see below). If you want to prove Rolle's theorem yourself, you can see a similar set of guiding exercises in my other post.
Rolle's Theorem: If a function $h$ is continuous on $[a, b]$ and differentiable on $(a, b)$ with $h(a) = h(b)$, then there exists at least one $p$ in $(a, b)$ such that $h'(p) = 0$.
Our method of proof will be to observe that there exists an $M$ which satisfies $$f(x) = P_n(x) + \frac{M}{(n+1)!} (x - c)^{n+1}.$$ Now we just need to show that $M = f^{(n+1)}(z)$ for some $c < z < x$.
Exercise 3: Explain why such an M exists.
Solving the equation above for $M$ gives $$M = \frac{(n+1)! \, \bigl(f(x) - P_n(x)\bigr)}{(x - c)^{n+1}},$$ which is well defined since $x \neq c$. So such an $M$ exists for any particular $x$ and $c$.
Let's define a function $g$ on $(a, b)$ by $$g(t) = P_n(t) + \frac{M}{(n+1)!} (t - c)^{n+1} - f(t).$$ We will first try to prove some properties of $g$. Soon, the reason for defining $g$ in this way will become clear.
Exercise 4: Is g differentiable n+1 times on (a,b)? Explain.
The answer is yes. Can you explain why? Try to use the fact that g is a sum of a polynomial and a function which is differentiable n+1 times.
Polynomials have all derivatives on (a,b) and f is differentiable n+1 times on (a,b). Since g is a sum of functions differentiable n+1 times on (a,b), g is itself differentiable n+1 times.
Exercise 5: Let $j$ be an integer such that $0 \le j \le n$. Show that $g^{(j)}(c) = 0$ for all such $j$.
Differentiation can be done term by term, meaning $$g^{(j)}(c) = P_n^{(j)}(c) + \frac{d^j}{dt^j} \left( \frac{M}{(n+1)!} (t - c)^{n+1} \right) \bigg|_{t=c} - f^{(j)}(c).$$
Since $c - c = 0$ and $j < n + 1$, $$\frac{d^j}{dt^j} \left( \frac{M}{(n+1)!} (t - c)^{n+1} \right) \bigg|_{t=c} = \frac{M}{(n+1-j)!} (c - c)^{n+1-j} = 0.$$
Using the fact that $P_n$ is a polynomial, we see that $$P_n^{(j)}(c) = \sum_{k=j}^{n} \frac{f^{(k)}(c)}{(k-j)!} (c - c)^{k-j} = f^{(j)}(c),$$ since only the $k = j$ term survives.
Read in order, the three hints above constitute the full solution.
Exercise 6: Compute $g^{(n+1)}$.
Recall that we defined $g$ by $g(t) = P_n(t) + \frac{M}{(n+1)!} (t - c)^{n+1} - f(t)$. Since $P_n(t)$ is a polynomial of degree at most $n$, $P_n^{(n+1)}(t) = 0$. Similarly, $\frac{d^{n+1}}{dt^{n+1}} \left( \frac{M}{(n+1)!} (t - c)^{n+1} \right) = M$. Hence, $g^{(n+1)}(t) = M - f^{(n+1)}(t)$.
Exercise 7: Show that there exists $c < x_1 < x$ so that $g'(x_1) = 0$.
Does g satisfy the conditions for Rolle's theorem on [c,x]?
Show that g satisfies the conditions for Rolle's theorem on [c,x] by showing that:
- g is continuous on [c,x]
- g is differentiable on (c,x)
- g(c)=g(x)
Since g is differentiable n+1 times on (a,b) by Exercise 4, it is continuous on (a,b). Then g is continuous on [c,x] and differentiable on (c,x) since both [c,x] and (c,x) are contained in (a,b). Moreover, g(c)=g(x)=0 by Exercise 5 and the definition of M.
Thus $g$ satisfies the conditions for Rolle's theorem on $[c, x]$, meaning there exists $c < x_1 < x$ so that $g'(x_1) = 0$.
Exercise 8: Show that there exists $c < x_{n+1} < x$ such that $g^{(n+1)}(x_{n+1}) = 0$. This exercise is the meat of the proof, so make sure you understand it. For this reason, plenty of hints are provided.
Use a technique similar to what was used in Exercise 7.
Observe that $g'(x_1) = g'(c) = 0$.
Apply Rolle's theorem to $g'$ on $[c, x_1]$ (where $x_1$ is defined as in Exercise 7) to find $c < x_2 < x_1 < x$ so that $g''(x_2) = 0$.
Recall $x_2$ from the previous hint. Can we apply Rolle's theorem to $g''$ on $[c, x_2]$?
Recall what was concluded in Exercise 5. This should help you show that every $g^{(j)}$ (where $0 \le j \le n$) satisfies the conditions of Rolle's theorem on some subinterval of $[c, x]$.
Recall $x_1$ from Exercise 7 and observe that $g^{(j)}(c) = 0$ for $0 \le j \le n$.
Since $g'(x_1) = g'(c) = 0$ and $[c, x_1]$ is contained in $[c, x]$, there exists $c < x_2 < x_1 < x$ such that $g''(x_2) = 0$ by Rolle's theorem. By repeating this process, we find some $c < x_{n+1} < \cdots < x_1 < x$ so that $g^{(n+1)}(x_{n+1}) = 0$.
Exercise 9: Use Exercises 6 and 8 to complete the proof of Taylor's theorem.
By Exercise 6, $g^{(n+1)}(t) = M - f^{(n+1)}(t)$. Setting $z = x_{n+1}$ from Exercise 8 results in $$g^{(n+1)}(z) = M - f^{(n+1)}(z) = 0 \iff M = f^{(n+1)}(z),$$ as desired. □
Note that the $\iff$ symbol means "if and only if." As used above, it shows that the two equations are algebraically equivalent to each other.
Exercise 10: Explain why Taylor's theorem also covers the case where $c > x$ (when the center is bigger than $x$) even though we assumed that $a < c < x < b$.
Since $x$ and $c$ are arbitrary, we can choose $x$ as the center, and the theorem gives us an equivalent result for $f(c)$. Thus, simply "switching" $x$ and $c$ suffices when $x < c$.
We have now shown that the remainder $R_n(x)$ can be written as $$R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!} (x - c)^{n+1}$$ for some $z$ between $x$ and $c$. The above expression is called the Lagrange form of the remainder.
Recall that for a function $f$ which is infinitely differentiable at some point $c$, we denoted the Taylor series of $f$ centered at $c$ by $T_c$. Although it was hinted that $T_c(x)$ equals $f(x)$ for certain values of $x$, it is not yet clear when this happens. This is precisely what Taylor's theorem allows us to study.
First, note that by the definition of an infinite series, $T_c(x) = \lim_{n \to \infty} P_n(x)$. Now suppose $\lim_{n \to \infty} R_n(x) = 0$. Then $$\lim_{n \to \infty} \bigl[ f(x) - P_n(x) \bigr] = 0 \iff \lim_{n \to \infty} P_n(x) = f(x) \iff T_c(x) = f(x).$$
So, we can show that $T_c(x) = f(x)$ by showing that $R_n(x) \to 0$. To do this, we often consider the Lagrange error bound of the function, which follows intuitively from Taylor's theorem. Specifically, pick any $B \ge |f^{(n+1)}(t)|$ for all $t$ between $c$ and $x$. Then $$|R_n(x)| \le \frac{B}{(n+1)!} |x - c|^{n+1}.$$
Let's try applying this to the function $\sin$. Let $x$ and $c$ be real numbers. Since $|\sin|$ and $|\cos|$ are both bounded by 1, we see that $$|R_n(x)| \le \frac{|x - c|^{n+1}}{(n+1)!}.$$ Since $\lim_{n \to \infty} \frac{|x - c|^{n+1}}{(n+1)!} = 0$ (the proof of this is beyond the scope of this post, but it can be concluded from Stirling's formula), $\lim_{n \to \infty} R_n(x) = \lim_{n \to \infty} |R_n(x)| = 0$.
Note: For a proof which does not involve Stirling's formula, you can see the last slide of my presentation on the irrationality of π.
Thus, we have shown that the Taylor series $T_c(x)$ for $\sin$ converges to $\sin(x)$ for all $x$, irrespective of the choice of center $c$.
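Here is a numerical sketch of this convergence (my own illustration), comparing the actual remainder $|R_n(x)|$ for $\sin$ centered at $c = 0$ against the Lagrange error bound with $B = 1$:

```python
import math

# The derivatives of sin at 0 cycle through 0, 1, 0, -1 with period 4.
x, c = 2.0, 0.0
derivs = [0.0, 1.0, 0.0, -1.0]
for n in range(1, 16, 2):
    p_n = sum(derivs[k % 4] / math.factorial(k) * (x - c) ** k
              for k in range(n + 1))
    actual = abs(math.sin(x) - p_n)
    bound = abs(x - c) ** (n + 1) / math.factorial(n + 1)
    print(f"n = {n:2}: |R_n(x)| = {actual:.2e} <= bound {bound:.2e}")
```

Both columns shrink to zero as $n$ grows, with the actual remainder always under the bound.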
We now have the tools we need to make sense of the π approximation trick from before. To start, we'll formally write this "trick" in mathematical terms.
Theorem: Let $p$ be a real number so that $|\pi - p| < 10^{-j}$ for some positive integer $j$. Then $|\pi - (p + \sin(p))| < 10^{-3j}$.
You will once again prove this theorem through a series of exercises. Before proceeding, make sure you understand how the above theorem formalizes the math trick presented in the first section.
Exercise 11: Write $P_4(p)$ centered at $\pi$ for the function $\sin$.
First, observe that $\sin(\pi) = 0$, $\sin^{(1)}(\pi) = -1$, $\sin^{(2)}(\pi) = 0$, $\sin^{(3)}(\pi) = 1$, and $\sin^{(4)}(\pi) = 0$. Then we write $$P_4(p) = \pi - p + \frac{1}{6} (p - \pi)^3.$$
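As a quick sanity check (my own, not part of the solution), this cubic really does hug $\sin$ near $\pi$:

```python
import math

# P_4(p) = (pi - p) + (1/6)(p - pi)^3 should closely match sin(p) near pi.
p = 3.1
p4 = (math.pi - p) + (p - math.pi) ** 3 / 6
print(p4, math.sin(p))  # both ~0.0415807
```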
Exercise 12: Use the Lagrange error bound to find an upper bound for $|R_4(p)|$.
Since $\cos$ is bounded by 1, we may write $$|R_4(p)| \le \frac{|p - \pi|^5}{120}.$$
Exercise 13: Show that $|\pi - (p + \sin(p))| < 10^{-3j}$. You may use the triangle inequality, which states that $|a + b| \le |a| + |b|$ for all real numbers $a, b$.
Substitute $P_4(p) + R_4(p)$ for $\sin(p)$.
Substituting $P_4(p) + R_4(p)$ for $\sin(p)$ yields $$\pi - (p + \sin(p)) = \pi - p - \sin(p) = -\frac{1}{6} (p - \pi)^3 - R_4(p) = -\left( \frac{1}{6} (p - \pi)^3 + R_4(p) \right).$$ Invoking the triangle inequality and Exercise 12, we have that $$|\pi - (p + \sin(p))| \le \frac{1}{6} |p - \pi|^3 + \frac{1}{120} |p - \pi|^5 < \frac{1}{6} 10^{-3j} + \frac{1}{120} 10^{-5j}.$$ Since $\frac{1}{120} 10^{-5j} \le \frac{5}{6} 10^{-3j}$, the right-hand side is at most $10^{-3j}$, and the proof is complete.
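Finally, a short sketch (my own) that checks the theorem numerically for a few values of $j$; double-precision floats limit the check to small $j$:

```python
import math

# If |pi - p| < 10^-j, the theorem promises |pi - (p + sin(p))| < 10^-3j.
for j in [1, 2, 3, 4]:
    p = round(math.pi, j)  # an approximation accurate to about j decimal places
    err = abs(math.pi - (p + math.sin(p)))
    print(f"j = {j}: error = {err:.2e}, within 10^-{3 * j}: {err < 10 ** (-3 * j)}")
```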
Last Updated: December 2022