Why Taylor Series Work

Taylor series allow us to approximate certain infinitely differentiable functions to remarkable accuracy, using only knowledge of their derivatives at a single point. This post deals with understanding how this is possible.

Only a basic familiarity with calculus is assumed, but the motivation to understand the mathematics is almost essential. Formally, you should know about sequences, series, and the general properties and applications of derivatives.

All of the proofs in this post are presented as a set of exercises. The idea is to guide you through each proof and encourage a better understanding of the underlying motivation and concepts. Hints and solutions are provided along the way.

A Math Trick

Let's say you have an approximation $p$ of $\pi$ so that $|\pi - p| < 1$. That is, suppose we have a pretty close approximation of $\pi$, like $p = 3$ or $p = 3.14$. Let $n$ be the number of decimal places that $p$ successfully approximates. For example, taking $p = 3.14$ would mean $n = 2$.

Then $p + \sin(p)$ successfully approximates $\pi$ to $3n$ decimal places. That means $p + \sin(p)$ is at least three times better than $p$ at approximating $\pi$. To see this, we can start with $p = 3.1$. Then

$$p + \sin(p) = 3.1 + \sin(3.1) \approx 3.14158066243.$$

Repeating this once more, we have

$$3.14158066243 + \sin(3.14158066243) \approx 3.14159265359.$$

After repeating this only twice, we started at a number which approximates $\pi$ to 1 decimal place of accuracy and received a number which approximates $\pi$ to 11 decimal places of accuracy. Why does this happen?

Like most things in math, this is no coincidence, and the tools provided by calculus allow you to see precisely why this "magic trick" works so well.
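Before digging into the theory, it is worth watching the trick run for a few steps. The sketch below (plain Python, standard `math` module) iterates $p \mapsto p + \sin(p)$ starting from $p = 3.1$ and prints the error at each step:

```python
import math

# Iterate the trick p -> p + sin(p), starting from the rough
# approximation p = 3.1, and watch the error to pi shrink.
p = 3.1
for step in range(1, 4):
    p = p + math.sin(p)
    print(f"step {step}: p = {p:.15f}, error = {abs(math.pi - p):.2e}")
```

Each step roughly triples the number of correct digits, which is the behavior the rest of this post explains.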

The first challenge is determining the role of $\sin$. Clearly it has something to do with the approximations getting closer to $\pi$, but it is not so clear what this relationship is. What we'll try to do is simplify this problem so that instead of studying functions like $\sin$ or $\cos$, we study polynomials.

This is because polynomials are generally easier to work with: they are nicer to differentiate, it is easy to study their roots, and it is easier to describe how they transform a given input (i.e. describing what $x^2$ does to $x$ is easier than describing what $\sin(x)$ does to $x$).

Our new goal will be to write the aforementioned functions (and many more) as polynomials. It turns out that if a function $f$ is "nice" enough (we'll figure out what "nice" means later), we can represent it as an infinite series. This infinite series is called the Taylor series of $f$, and its partial sums are polynomials. To understand this, we first need to understand what an "infinite series" and a "partial sum" are.

Infinite Series: Definition and Properties

Let $(a_n)_{n \geq 1}$ be a sequence. We can then define another sequence $(s_n)_{n \geq 1}$ by

$$s_n = \sum_{k=1}^n a_k.$$

Then we define the infinite series $\sum_{k=1}^\infty a_k$ by

$$\sum_{k=1}^\infty a_k = \lim_{n \to \infty} \sum_{k=1}^n a_k = \lim_{n \to \infty} s_n.$$

For a particular positive integer $k$, $s_k$ is called the $k$th partial sum of the series.

If the sequence $(s_n)$ diverges (i.e. does not converge), then $\sum_{k=1}^\infty a_k$ is said to diverge and does not represent any value.
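To make the definitions concrete, here is a small sketch (the helper name `partial_sum` is made up for illustration) that computes $s_n$ for the geometric sequence $a_k = 1/2^k$, whose series converges to $1$:

```python
def partial_sum(a, n):
    """Return the n-th partial sum s_n = a(1) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))

# Partial sums of sum_{k=1}^infty 1/2^k approach the limit 1.
for n in (1, 5, 10, 20):
    print(n, partial_sum(lambda k: 1 / 2**k, n))
```

The printed values climb toward $1$, illustrating how the series is defined as the limit of its partial sums.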

There are some properties of series that are important for us to note. The first is that not every infinite series converges. For example, $\sum_{n=1}^\infty n$ diverges.

Another, less obvious note is that even if $a_n \to 0$ (read "$a_n$ converges to zero"), $\sum_{n=1}^\infty a_n$ does not necessarily converge. For example, choosing $a_n = \frac{1}{n}$ yields what is called the harmonic series:

$$\sum_{n=1}^\infty \frac{1}{n}.$$

This series diverges (a classical fact with many known proofs), even though $a_n$ converges to zero.
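A quick numerical check makes the divergence plausible: the partial sums keep growing (roughly like $\ln n$) even though the terms shrink to zero. A sketch:

```python
import math

def harmonic(n):
    # n-th partial sum of the harmonic series.
    return sum(1.0 / k for k in range(1, n + 1))

# The terms 1/n go to zero, but the partial sums grow without bound,
# tracking ln(n) closely.
for n in (10, 1_000, 100_000):
    print(n, harmonic(n), math.log(n))
```

No finite computation proves divergence, of course, but the logarithmic growth pattern matches the standard proofs.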

Approximating Functions with Power Series

We now know enough about series to study the Taylor series.

Suppose $f$ is defined on $(a, b)$ and has all derivatives at some point $c$ with $a < c < b$. Then the Taylor series of $f$ centered at $c$ is a function defined by

$$T_c(x) = \sum_{n=0}^\infty \frac{f^{(n)}(c)(x-c)^n}{n!}.$$

Note that $T_c(x)$ need not converge, meaning that $T_c$ need not be defined for all values in the domain of $f$.

We also define

$$P_n(x) = \sum_{k=0}^n \frac{f^{(k)}(c)(x-c)^k}{k!}$$

to denote the first $n + 1$ terms of the Taylor series. $P_n$ is called the $n$th Taylor polynomial of $f$ centered at $c$. It is a polynomial of degree (at most) $n$. The interesting part about the Taylor series is that for certain functions $f$ and certain values of $x$, $T_c(x) = f(x)$. To see why this happens, we need to study Taylor's theorem.
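As a concrete sketch, here is one way to evaluate $P_n(x)$ for $f = \sin$ centered at $c$, using the fact (specific to $\sin$) that its derivatives cycle through $\sin, \cos, -\sin, -\cos$; the helper name `sin_taylor_poly` is ad hoc, not standard:

```python
import math

def sin_taylor_poly(x, c, n):
    """Evaluate the n-th Taylor polynomial of sin centered at c.
    The derivatives of sin at c cycle: sin(c), cos(c), -sin(c), -cos(c)."""
    derivs = [math.sin(c), math.cos(c), -math.sin(c), -math.cos(c)]
    return sum(derivs[k % 4] * (x - c) ** k / math.factorial(k)
               for k in range(n + 1))

# Already at degree 9, the polynomial is very close to sin on [-1, 1].
print(sin_taylor_poly(1.0, 0.0, 9), math.sin(1.0))
```

How close, exactly, and for which $x$? That is precisely the question the next sections answer.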

Taylor's Theorem

Taylor's theorem gives us a way to estimate how closely $P_n$ approximates the original function. To quantify this "closeness," we define the remainder $R_n(x)$ by

$$R_n(x) = f(x) - P_n(x) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(c)(x-c)^k}{k!}.$$

Some simple algebraic manipulation shows us exactly what this remainder means, as well as why it is called the remainder:

$$f(x) = P_n(x) + R_n(x).$$
Taylor's theorem gives us a formula for the remainder, which will later prove to be useful in studying the Taylor series of a function.

Taylor's Theorem: Let $n$ be some positive integer. Suppose $f$ is a function defined on $(a, b)$ (we allow $a = -\infty$ and/or $b = \infty$) which is differentiable $n + 1$ times on $(a, b)$. Then for any real numbers $a < c < x < b$, there exists some $c < z < x$ so that

$$f(x) = \sum_{k=0}^n \frac{f^{(k)}(c)(x-c)^k}{k!} + \frac{f^{(n+1)}(z)(x-c)^{n+1}}{(n+1)!}.$$

Exercise 1: Give an example of a function which satisfies the hypothesis of Taylor's theorem on some interval $(a, b)$.

Since $\sin$ has all derivatives on $(-\infty, \infty)$, it satisfies the hypothesis of Taylor's theorem.

Exercise 2: Explain how the theorem tells us that for a function $f$ satisfying the hypothesis above,

$$R_n(x) = \frac{f^{(n+1)}(z)(x-c)^{n+1}}{(n+1)!}$$

for some $c < z < x$.

Observe that

$$\frac{f^{(n+1)}(z)(x-c)^{n+1}}{(n+1)!} = f(x) - \sum_{k=0}^n \frac{f^{(k)}(c)(x-c)^k}{k!} = f(x) - P_n(x) = R_n(x).$$

The first equality is given by Taylor's theorem.

The proof of Taylor's theorem relies on Rolle's Theorem (see below). If you want to prove Rolle's theorem yourself, you can see a similar set of guiding exercises in my other post.

Rolle's Theorem: If a function $h$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $h(a) = h(b)$, then there exists at least one $p$ in $(a, b)$ such that $h'(p) = 0$.

Our method of proof will be to observe that there exists an $M$ which satisfies

$$f(x) = P_n(x) + \frac{M(x-c)^{n+1}}{(n+1)!}.$$

Now we just need to show that $M = f^{(n+1)}(z)$ for some $c < z < x$.

Exercise 3: Explain why such an $M$ exists.

We can solve for $M$ through algebraic manipulation (every other quantity in the equation is fixed), so it is clear that $M$ exists for a particular $x$ and $c$.

Let's define a function $g$ on $(a, b)$ by

$$g(t) = P_n(t) + \frac{M(t-c)^{n+1}}{(n+1)!} - f(t).$$

We will first try to prove some properties of $g$. Soon, the reason for defining $g$ in this way will become clear.

Exercise 4: Is $g$ differentiable $n + 1$ times on $(a, b)$? Explain.

The answer is yes. Can you explain why? Try to use the fact that $g$ is a sum of a polynomial and a function which is differentiable $n + 1$ times.

Polynomials have all derivatives on $(a, b)$ and $f$ is differentiable $n + 1$ times on $(a, b)$. Since $g$ is a sum of functions differentiable $n + 1$ times on $(a, b)$, $g$ is itself differentiable $n + 1$ times.

Exercise 5: Let $j$ be an integer such that $0 \leq j \leq n$. Show that $g^{(j)}(c) = 0$ for all such $j$.

Differentiation can be done term by term, meaning

$$g^{(j)}(c) = P_n^{(j)}(c) + \frac{d^j}{dt^j}\left(\frac{M(t-c)^{n+1}}{(n+1)!}\right)(c) - f^{(j)}(c).$$

Since $c - c = 0$ and $j < n + 1$,

$$\frac{d^j}{dt^j}\left(\frac{M(t-c)^{n+1}}{(n+1)!}\right)(c) = \frac{M(c-c)^{n+1-j}}{(n+1-j)!} = 0.$$

Using the fact that $P_n$ is a polynomial, we see that

$$P_n^{(j)}(c) = \sum_{k=j}^n \frac{f^{(k)}(c)\,0^{k-j}}{(k-j)!} = f^{(j)}(c).$$

Combining these three facts gives $g^{(j)}(c) = f^{(j)}(c) + 0 - f^{(j)}(c) = 0$.


Exercise 6: Compute $g^{(n+1)}$.

Recall that we defined $g$ by

$$g(t) = P_n(t) + \frac{M(t-c)^{n+1}}{(n+1)!} - f(t).$$

Since $P_n(t)$ is a polynomial of degree $n$,

$$P_n^{(n+1)}(t) = 0.$$

Similarly,

$$\frac{d^{n+1}}{dt^{n+1}}\left(\frac{M(t-c)^{n+1}}{(n+1)!}\right) = M.$$

Hence,

$$g^{(n+1)}(t) = M - f^{(n+1)}(t).$$

Exercise 7: Show that there exists $c < x_1 < x$ so that $g'(x_1) = 0$.

Does $g$ satisfy the conditions for Rolle's theorem on $[c, x]$?

Show that $g$ satisfies the conditions for Rolle's theorem on $[c, x]$ by showing that:

  1. $g$ is continuous on $[c, x]$
  2. $g$ is differentiable on $(c, x)$
  3. $g(c) = g(x)$

Since $g$ is differentiable $n + 1$ times on $(a, b)$ by Exercise 4, it is continuous on $(a, b)$. Then $g$ is continuous on $[c, x]$ and differentiable on $(c, x)$ since both $[c, x]$ and $(c, x)$ are contained in $(a, b)$. Moreover, $g(c) = g(x) = 0$ by Exercise 5 and the definition of $M$.

Thus $g$ satisfies the conditions for Rolle's theorem on $[c, x]$, meaning there exists $c < x_1 < x$ so that $g'(x_1) = 0$.

Exercise 8: Show that there exists $c < x_{n+1} < x$ such that $g^{(n+1)}(x_{n+1}) = 0$. This exercise is the meat of the proof, so make sure you understand it. For this reason, plenty of hints are provided.

Use a technique similar to what was used in Exercise 7.

Observe that $g'(x_1) = g'(c) = 0$.

Apply Rolle's theorem to $g'$ on $[c, x_1]$ (where $x_1$ is defined as in Exercise 7) to find $c < x_2 < x_1 < x$ so that $g''(x_2) = 0$.

Recall $x_2$ from the previous hint. Can we apply Rolle's theorem to $g''$ on $[c, x_2]$?

Recall what was concluded in Exercise 5. This should help you show that every $g^{(j)}$ (where $0 \leq j \leq n$) satisfies the conditions of Rolle's theorem on some subinterval of $[c, x]$.

Recall $x_1$ from Exercise 7 and observe that $g^{(j)}(c) = 0$ for $0 \leq j \leq n$.

Since $g'(x_1) = g'(c) = 0$ and $[c, x_1]$ is contained in $[c, x]$, there exists $c < x_2 < x_1 < x$ such that $g''(x_2) = 0$ by Rolle's theorem. By repeating this process, we find some $c < x_{n+1} < \dots < x_1 < x$ so that $g^{(n+1)}(x_{n+1}) = 0$.

Exercise 9: Use Exercises 6 and 8 to complete the proof of Taylor's theorem.

By Exercise 6,

$$g^{(n+1)}(t) = M - f^{(n+1)}(t).$$

Setting $z = x_{n+1}$ from Exercise 8 results in

$$g^{(n+1)}(z) = M - f^{(n+1)}(z) = 0 \iff M = f^{(n+1)}(z),$$

as desired. $\Box$

Note that the $\iff$ symbol means "if and only if." As used above, it shows that the two equations are algebraically equivalent to each other.

Exercise 10: Explain why Taylor's theorem also handles the case where $c > x$ (when the center is bigger than $x$), even though we assumed that $a < c < x < b$.

Since $x$ and $c$ are arbitrary, we can choose $x$ as the center and the theorem gives us an equivalent result for $f(c)$. Thus, simply "switching" $x$ and $c$ suffices if $x < c$.

We have now shown that the remainder $R_n(x)$ can be written as

$$R_n(x) = \frac{f^{(n+1)}(z)(x-c)^{n+1}}{(n+1)!}$$

for some $z$ between $x$ and $c$. The above expression is called the Lagrange form of the remainder.
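We can sanity-check the Lagrange form numerically. For $f = \sin$, every derivative is bounded by $1$ in absolute value, so taking absolute values in the Lagrange form gives $|R_n(x)| \leq |x-c|^{n+1}/(n+1)!$ for every $n$. A sketch (the helper `sin_taylor_poly` is ad hoc, not part of the post):

```python
import math

def sin_taylor_poly(x, c, n):
    # n-th Taylor polynomial of sin centered at c (derivative cycle of sin).
    derivs = [math.sin(c), math.cos(c), -math.sin(c), -math.cos(c)]
    return sum(derivs[k % 4] * (x - c) ** k / math.factorial(k)
               for k in range(n + 1))

# |R_n(x)| = |sin(x) - P_n(x)| should sit below |x - c|^(n+1) / (n+1)!.
x, c = 2.0, 0.5
for n in range(1, 8):
    remainder = abs(math.sin(x) - sin_taylor_poly(x, c, n))
    bound = abs(x - c) ** (n + 1) / math.factorial(n + 1)
    print(n, remainder <= bound, remainder, bound)
```

The remainder always stays under the bound, exactly as the Lagrange form predicts.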

Convergence of the Taylor Series

Recall that for a function $f$ which is infinitely differentiable at some point $c$, we denoted the Taylor series of $f$ centered at $c$ by $T_c$. Although it was hinted that $T_c(x)$ equals $f(x)$ for certain values of $x$, it is not yet clear when this happens. This is precisely what Taylor's theorem allows us to study.

First, note that by the definition of an infinite series, $T_c(x) = \lim_{n \to \infty} P_n(x)$. Now suppose $\lim_{n \to \infty} R_n(x) = 0$. Then

$$\lim_{n \to \infty} [P_n(x) - f(x)] = 0 \iff \lim_{n \to \infty} P_n(x) = f(x) \iff T_c(x) = f(x).$$

So, we can show that $T_c(x) = f(x)$ by showing that $R_n(x) \to 0$. To do this, we often consider the Lagrange error bound of the function, which follows intuitively from Taylor's theorem. Specifically, pick any $B \geq |f^{(n+1)}(t)|$ for all $t$ between $c$ and $x$. Then

$$|R_n(x)| \leq \frac{B|x-c|^{n+1}}{(n+1)!}.$$

Let's try applying this to the function $\sin$. Let $x$ and $c$ be real numbers. Since $|\sin|$ and $|\cos|$ are both bounded by $1$, we see that

$$|R_n(x)| \leq \frac{|x-c|^{n+1}}{(n+1)!}.$$

Since $\lim_{n \to \infty} \left[\frac{|x-c|^{n+1}}{(n+1)!}\right] = 0$ (the proof of this is beyond the scope of this post, but it can be concluded from Stirling's formula), $\lim_{n \to \infty} R_n(x) = \lim_{n \to \infty} |R_n(x)| = 0$.

Note: For a proof which does not involve Stirling's formula, you can see the last slide of my presentation on the irrationality of π.\pi.
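The claim that $\frac{|x-c|^{n+1}}{(n+1)!}$ vanishes is also easy to check numerically, even for a large $|x - c|$: the ratio of consecutive terms is $\frac{|x-c|}{n+2}$, which eventually drops below $1$. A sketch:

```python
import math

# Even with |x - c| = 10, the factorial in the denominator
# eventually overwhelms the power in the numerator.
d = 10.0
for n in (5, 20, 40, 60):
    print(n, d ** (n + 1) / math.factorial(n + 1))
```

The values climb at first (while $n + 2 < 10$) and then collapse toward zero, which is why the bound works for every $x$ and $c$.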

Thus, we have shown that the Taylor series $T_c(x)$ for $\sin$ converges to $\sin(x)$ for all $x$, irrespective of the choice of center $c$.

Explaining the Math Trick

We now have the tools we need to make sense of the $\pi$ approximation trick from before. To start, we'll formally write this "trick" in mathematical terms.

Theorem: Let $p$ be a real number so that $|\pi - p| < 10^{-j}$ for some positive integer $j$. Then $|\pi - (p + \sin(p))| < 10^{-3j}$.

You will once again prove this theorem through a series of exercises. Before proceeding, make sure you understand how the above theorem formalizes the math trick presented in the first section.

Exercise 11: Write $P_4(p)$ centered at $\pi$ for the function $\sin$.

First, observe that $\sin(\pi) = 0$, $\sin^{(1)}(\pi) = -1$, $\sin^{(2)}(\pi) = 0$, $\sin^{(3)}(\pi) = 1$, and $\sin^{(4)}(\pi) = 0$. Then we write

$$P_4(p) = \pi - p + \frac{1}{6}(p - \pi)^3.$$
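As a quick sanity check of this formula (an illustrative sketch, not part of the proof), compare $P_4(p)$ to $\sin(p)$ for values of $p$ near $\pi$:

```python
import math

def p4(p):
    # P_4(p) for sin centered at pi, from Exercise 11.
    return math.pi - p + (p - math.pi) ** 3 / 6

# Near pi, P_4 matches sin to many digits.
for p in (3.0, 3.1, 3.14):
    print(p, p4(p), math.sin(p), abs(p4(p) - math.sin(p)))
```

The closer $p$ is to the center $\pi$, the more digits agree, which is exactly what the remainder bound in the next exercise quantifies.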

Exercise 12: Use the Lagrange error bound to find an upper bound for $|R_4(p)|$.

Since $\cos$ is bounded by $1$, we may write

$$|R_4(p)| \leq \frac{|p - \pi|^5}{120}.$$

Exercise 13: Show that $|\pi - (p + \sin(p))| < 10^{-3j}$. You may use the triangle inequality, which states that $|a + b| \leq |a| + |b|$ for all real numbers $a, b$.

Substitute $P_4(p) + R_4(p)$ for $\sin(p)$.

Substituting $P_4(p) + R_4(p)$ for $\sin(p)$ yields

$$\pi - (p + \sin(p)) = \pi - p - \sin(p) = -\frac{1}{6}(p - \pi)^3 - R_4(p) = -\left(\frac{1}{6}(p - \pi)^3 + R_4(p)\right).$$

Invoking the triangle inequality and Exercise 12, we have that

$$|\pi - (p + \sin(p))| \leq \frac{1}{6}|p - \pi|^3 + \frac{1}{120}|p - \pi|^5 \leq \frac{1}{6} \cdot 10^{-3j} + \frac{1}{120} \cdot 10^{-5j}.$$

Since

$$\frac{1}{120} \cdot 10^{-5j} \leq \frac{5}{6} \cdot 10^{-3j},$$

the right-hand side is at most $\frac{1}{6} \cdot 10^{-3j} + \frac{5}{6} \cdot 10^{-3j} = 10^{-3j}$, and the proof is complete. $\Box$
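Finally, the theorem is easy to test empirically. The sketch below checks a few approximations $p$ with $|\pi - p| < 10^{-j}$ and confirms that one application of the trick lands within $10^{-3j}$:

```python
import math

# Each (j, p) pair satisfies |pi - p| < 10^-j; the theorem predicts
# |pi - (p + sin(p))| < 10^-(3j).
for j, p in ((1, 3.1), (2, 3.14), (3, 3.142)):
    assert abs(math.pi - p) < 10 ** (-j)
    error = abs(math.pi - (p + math.sin(p)))
    print(j, error, error < 10 ** (-3 * j))
```

This closes the loop on the opening section: the "magic" is nothing more than the Taylor series of $\sin$ centered at $\pi$ having no first-order error term beyond $\pi - p$ itself.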

Last Updated: December 2022