We have now seen some of the most generally useful methods for discovering antiderivatives, and there are others. Unfortunately, some functions have no simple antiderivatives; in such cases if the value of a definite integral is needed it will have to be approximated. We will see two methods that work reasonably well and yet are fairly simple; in some cases more sophisticated techniques will be needed.

Of course, we already know one way to approximate an integral: if we think of the integral as computing an area, we can add up the areas of some rectangles. While this is quite simple, it is usually the case that a large number of rectangles is needed to get acceptable accuracy. A similar approach is much better: we approximate the area under a curve over a small interval as the area of a trapezoid. In figure 10.5.1 we see an area under a curve approximated by rectangles and by trapezoids; it is apparent that the trapezoids give a substantially better approximation on each subinterval.

Use the slider to change the number of subintervals.

Figure 10.5.1. Approximating an area with rectangles and with trapezoids.

As with rectangles, we divide the interval into $n$ equal subintervals of length $\Delta x$. A typical trapezoid is pictured in figure 10.5.2; it has area $\ds{f(x_i)+f(x_{i+1})\over2}\Delta x$. If we add up the areas of all trapezoids we get $$ \eqalign{ {f(x_0)+f(x_1)\over2}\Delta x&+{f(x_1)+f(x_2)\over2}\Delta x+\cdots+ {f(x_{n-1})+f(x_n)\over2}\Delta x=\cr &\left({f(x_0)\over2}+f(x_1)+f(x_2)+\cdots+f(x_{n-1})+{f(x_n)\over2}\right) \Delta x.\cr} $$ This is usually known as the Trapezoid Rule. For a modest number of subintervals this is not too difficult to do with a calculator; a computer can easily do many subintervals.

Figure 10.5.2. A single trapezoid.

In practice, an approximation is useful only if we know how accurate it is; for example, we might need a particular value accurate to three decimal places. When we compute a particular approximation to an integral, the error is the difference between the approximation and the true value of the integral. For any approximation technique, we need an error estimate, a value that is guaranteed to be larger than the actual error. If $A$ is an approximation and $E$ is the associated error estimate, then we know that the true value of the integral is between $A-E$ and $A+E$. In the case of our approximation of the integral, we want $E=E(\Delta x)$ to be a function of $\Delta x$ that gets small rapidly as $\Delta x$ gets small. Fortunately, for many functions, there is such an error estimate associated with the trapezoid approximation.

Theorem 10.5.1 Suppose $f$ has a second derivative $f''$ everywhere on the interval $[a,b]$, and $|f''(x)|\le M$ for all $x$ in the interval. With $\Delta x= (b-a)/n$, an error estimate for the trapezoid approximation is $$ E(\Delta x) = {b-a\over12}M(\Delta x)^2={(b-a)^3\over 12n^2}M. $$ $\qed$

Let's see how we can use this.

Example 10.5.2 Approximate $\ds\int_0^1 e^{-x^2}\,dx$ to two decimal places. The second derivative of $\ds f=e^{-x^2}$ is $\ds(4x^2-2)e^{-x^2}$, and it is not hard to see that on $[0,1]$, $\ds|(4x^2-2)e^{-x^2}|\le 2$. We begin by estimating the number of subintervals we are likely to need. To get two decimal places of accuracy, we will certainly need $E(\Delta x)< 0.005$ or $$ \eqalign{ {1\over12}(2){1\over n^2} &< 0.005\cr {1\over6}(200)&< n^2\cr 5.77\approx\sqrt{100\over3}&< n\cr} $$ With $n=6$, the error estimate is thus $\ds1/6^3< 0.0047$. We compute the trapezoid approximation for six intervals: $$ \left({f(0)\over2}+f(1/6)+f(2/6)+\cdots+f(5/6)+{f(1)\over2}\right){1\over6} \approx 0.74512. $$ So the true value of the integral is between $0.74512-0.0047=0.74042$ and $0.74512+0.0047=0.74982$. Unfortunately, the first rounds to $0.74$ and the second rounds to $0.75$, so we can't be sure of the correct value in the second decimal place; we need to pick a larger $n$. As it turns out, we need to go to $n=12$ to get two bounds that both round to the same value, which turns out to be $0.75$. For comparison, using $12$ rectangles to approximate the area gives $0.7727$, which is considerably less accurate than the approximation using six trapezoids.

In practice it generally pays to start by requiring better than the maximum possible error; for example, we might have initially required $E(\Delta x)< 0.001$, or $$ \eqalign{ {1\over12}(2){1\over n^2} &< 0.001\cr {1\over6}(1000)&< n^2\cr 12.91\approx\sqrt{500\over3}&< n\cr} $$ Had we immediately tried $n=13$ this would have given us the desired answer. $\square$

The trapezoid approximation works well, especially compared to rectangles, because the tops of the trapezoids form a reasonably good approximation to the curve when $\Delta x$ is fairly small. We can extend this idea: what if we try to approximate the curve more closely, by using something other than a straight line? The obvious candidate is a parabola: if we can approximate a short piece of the curve with a parabola with equation $\ds y=ax^2+bx+c$, we can easily compute the area under the parabola.

There are an infinite number of parabolas through any two given points, but only one through three given points. If we find a parabola through three consecutive points $(x_i,f(x_i))$, $(x_{i+1},f(x_{i+1}))$, $(x_{i+2},f(x_{i+2}))$ on the curve, it should be quite close to the curve over the whole interval $[x_i,x_{i+2}]$, as in figure 10.5.3. If we divide the interval $[a,b]$ into an even number of subintervals, we can then approximate the curve by a sequence of parabolas, each covering two of the subintervals. For this to be practical, we would like a simple formula for the area under one parabola, namely, the parabola through $(x_i,f(x_i))$, $(x_{i+1},f(x_{i+1}))$, and $(x_{i+2},f(x_{i+2}))$. That is, we should attempt to write down the parabola $y=ax^2+bx+c$ through these points and then integrate it, and hope that the result is fairly simple. Although the algebra involved is messy, this turns out to be possible. The algebra is well within the capability of a good computer algebra system like Sage, so we will present the result without all of the algebra; you can see how to do it in this Sage worksheet.

To find the parabola, we solve these three equations for $a$, $b$, and $c$: $$ \eqalign{ f(x_i)&=a(x_{i+1}-\Delta x)^2+b(x_{i+1}-\Delta x)+c\cr f(x_{i+1})&=a(x_{i+1})^2+b(x_{i+1})+c\cr f(x_{i+2})&=a(x_{i+1}+\Delta x)^2+b(x_{i+1}+\Delta x)+c\cr} $$ Not surprisingly, the solutions turn out to be quite messy. Nevertheless, Sage can easily compute and simplify the integral to get $$ \int_{x_{i+1}-\Delta x}^{x_{i+1}+\Delta x} ax^2+bx+c\,dx= {\Delta x\over3}(f(x_i)+4f(x_{i+1})+f(x_{i+2})). $$ Now the sum of the areas under all parabolas is $$ \displaylines{ {\Delta x\over3}(f(x_0)+4f(x_{1})+f(x_{2})+f(x_2)+4f(x_{3})+f(x_{4})+\cdots +f(x_{n-2})+4f(x_{n-1})+f(x_{n}))=\cr {\Delta x\over3}(f(x_0)+4f(x_{1})+2f(x_{2})+4f(x_{3})+2f(x_{4})+\cdots +2f(x_{n-2})+4f(x_{n-1})+f(x_{n})).\cr} $$ This is just slightly more complicated than the formula for trapezoids; we need to remember the alternating 2 and 4 coefficients; note that $n$ must be even for this to make sense. This approximation technique is referred to as Simpson's Rule.

Figure 10.5.3. A parabola (dashed) approximating a curve (solid).

As with the trapezoid method, this is useful only with an error estimate:

Theorem 10.5.3 Suppose $f$ has a fourth derivative $f^{(4)}$ everywhere on the interval $[a,b]$, and $|f^{(4)}(x)|\le M$ for all $x$ in the interval. With $\Delta x= (b-a)/n$, an error estimate for Simpson's approximation is $$ E(\Delta x) = {b-a\over180}M(\Delta x)^4={(b-a)^5\over 180n^4}M. $$ $\qed$

Example 10.5.4 Let us again approximate $\ds\int_0^1 e^{-x^2}\,dx$ to two decimal places. The fourth derivative of $\ds f=e^{-x^2}$ is $\ds(16x^4-48x^2+12)e^{-x^2}$; on $[0,1]$ this is at most $12$ in absolute value. We begin by estimating the number of subintervals we are likely to need. To get two decimal places of accuracy, we will certainly need $E(\Delta x)< 0.005$, but taking a cue from our earlier example, let's require $E(\Delta x)< 0.001$: $$ \eqalign{ {1\over180}(12){1\over n^4} &< 0.001\cr {200\over3}&< n^4\cr 2.86\approx\root 4 \of {200\over3}&< n\cr} $$ So we try $n=4$, since we need an even number of subintervals. Then the error estimate is $\ds12/180/4^4< 0.0003$ and the approximation is $$ (f(0)+4f(1/4)+2f(1/2)+4f(3/4)+f(1)){1\over3\cdot4} \approx 0.746855. $$ So the true value of the integral is between $0.746855-0.0003=0.746555$ and $0.746855+0.0003=0.7471555$, both of which round to $0.75$. $\square$

The figure below compares the three methods we have discussed, computing the area under $y=\sin x$, $0\le x\le \pi/2$. Of course, it's easy to compute this exactly: the area is $1$.

Use the slider to change the number of subintervals.

Exercises 10.5

In the following problems, compute the trapezoid and Simpson approximations using 4 subintervals, and compute the error estimate for each. (Finding the maximum values of the second and fourth derivatives can be challenging for some of these; you may use a graphing calculator or computer software to estimate the maximum values.) If you have access to Sage or similar software, approximate each integral to two decimal places. You can use this Sage worksheet to get started.

Ex 10.5.1 $\ds\int_1^3 x\,dx$ (answer)

Ex 10.5.2 $\ds\int_0^3 x^2\,dx$ (answer)

Ex 10.5.3 $\ds\int_2^4 x^3\,dx$ (answer)

Ex 10.5.4 $\ds\int_1^3 {1\over x}\,dx$ (answer)

Ex 10.5.5 $\ds\int_1^2 {1\over 1+x^2}\,dx$ (answer)

Ex 10.5.6 $\ds\int_0^1 x\sqrt{1+x}\,dx$ (answer)

Ex 10.5.7 $\ds\int_1^5 {x\over 1+x}\,dx$ (answer)

Ex 10.5.8 $\ds\int_0^1 \sqrt{x^3+1}\,dx$ (answer)

Ex 10.5.9 $\ds\int_0^1 \sqrt{x^4+1}\,dx$ (answer)

Ex 10.5.10 $\ds\int_1^4 \sqrt{1+1/x}\,dx$ (answer)

Ex 10.5.11 Using Simpson's rule on a parabola $f(x)$, even with just two subintervals, gives the exact value of the integral, because the parabolas used to approximate $f$ will be $f$ itself. Remarkably, Simpson's rule also computes the integral of a cubic function $f(x)=ax^3+bx^2+cx+d$ exactly. Show this is true by showing that $$ \int_{x_0}^{x_2} f(x)\,dx={x_2-x_0\over3\cdot2}(f(x_0)+4f((x_0+x_2)/2)+f(x_2)). $$ Note that the right hand side of this equation is exactly the Simpson approximation for the cubic. This does require a bit of messy algebra, so you may prefer to use this Sage worksheet.