Skip to contentUnited States Department of Transportation - Federal Highway Administration FHWA Home
Research Home
Report
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-04-080
Date: September 2004

Software Reliability: A Federal Highway Administration Preliminary Handbook

PDF Version (697 KB)

PDF files can be viewed with the Acrobat® Reader®

Appendix B: Roundoff Errors in Large Sum

Errors in the Sum of a List of Numbers

Accumulating Errors after Two Operations

To illustrate the errors after two arithmetic operations, the approximate error in (a+b)+c is computed here:

Click for text description

Where the terms containing e1*e2, which are much smaller than the other terms, are thrown away.

There are two interesting aspects of this formula for the error of two additions:

  • Addition in not associative. If a + (b+c) were computed, the result would be

    Click for text description

    where the random error variables e1b and e2b almost always have different values than e1 and e2. Therefore almost always,

    Click for text description

  • The error can be written as an inner product of vectors,

    Click for text description

    • ArgVector is a vector containing all the arguments of arithmetic operations, e.g., (a,b,a+b,c) for the example above.
    • ErrVector is a vector of random variables in the range (+/-)u, for example, (e1,e1,e2,e2) for the example above.

An Example of Nonassociativity

To highlight the effect of order on a sum, it helps to use numbers that vary widely in size. This increases the chance that some error terms will arise in which a relatively large argument is multiplied by a relatively large roundoff error. Since the different orders of addition change the combinations of arguments and errors, these relatively large terms are unlikely to occur in both of the different-ordered additions.

To get a set of such numbers, take the floating point quotient of two random integers, where the numerator ranges over [0,RANDMAX] and the denominator over [1,RANDMAX], where RANDMAX is a C-implementation-dependent constant whose value does not matter, as long as a wide range of random values is permitted. In an actual application, double precision numbers would be used, but floating point numbers are used here to create noticeable errors in fewer actual computations, for illustrative purposes.

Adding various numbers of these random quotients in opposite orders produced the following results.

Table 5. Order Errors for Addition
Size Sum Up Sum Down Diff. Avg. Up Avg/ Down Diff/Size
10 33.325592 33.325592 0.000000 3.332559 3.332559 0.000000
100 495.900269 495.900146 0.000122 4.959002 4.959002 0.000001
1000 6318.028320 6318.027832 0.000488 6.318028 6.318028 0.000000
10000 67116.781250 67116.718750 0.062500 6.711678 6.711672 0.000006
100000 562199.375000 562194.625000 4.750000 5.621994 5.621946 0.000048

The C program producing these numbers appears in averrdemo.c (see below).

Computing the Maximum Error

Because all operations satisfy the same bounds for errors,

Click for text description

repeated application of the previous section's techniques provides a maximum error for a numerical computation.

If the precision is constant throughout the computation, the maximum error, |fl(x)-x| for a real number x in the range of the floating point number system is u*SumArgs, where SumArgs is the sum of the absolute values of the arguments for all arithmetic operations used to compute x.

This maximum error can be computed along with x, because:

  • u is a known quantity for any particular floating point number system and, in particular, has the value 5E-15 for the typical double precision operation.
  • SumArgs can be calculated as a running sum during the computation of x.

Computing the Probable Error

While the small random error variables need not be random, they often behave so, and the observed error is much smaller than the maximum error. Therefore, it is useful to assume that the actual error is a weighted sum of random error variables and to compute the standard deviation of the error.

Assuming that the distribution of errors is uniform over a u radius interval around zero, the actual error is a weighted sum of the random error variables (the e's). Under reasonable assumptions about the probability density functions for the individual e's, such as assuming a uniform distribution of errors in the u interval, the Central Limit Theorem applies.

For e distributions for which the Central Limit Theorem applies, such as a uniform distribution for each of the e's:

  • The best point estimate of the error is 0. This is because the expected error is the weighted sum of the means of the individual error distributions. This is 0, because each of the individual e's is assumed to have a mean of 0.
  • The variance is the weighted sum of the variances of the individual e's. If each e is a uniform random variable on [-u,u], each has a variance of (2*u)^^2/12. The variance of the error entire computation is then SumArgs*(2*u)^^2/12.

Application: Adding N Numbers

To add, in order N, floating point numbers that are:

  • All non-negative (so the example is not complicated with estimates of precision of intermediate results).
  • About the same size, to enable some approximations shown below.

In this case, the intermediate results are:

  • The N numbers themselves.
  • The partial sums of the first k numbers, from k = 2 to k = n.

Let m be the mean of the numbers. Using the assumption that the numbers are about equal, the partial sums are k*m, approximately. The sum of these partial sums is approximately ((N-2)/2)*(N*m), so the total sum of arguments is approximately ((N-1)/2)*(N*m), adding in the errors on the numbers themselves, or approximately N^^2/2*m.

For the usual double precision arithmetic, the variance of a single random error variable is

Click for text description

Because the standard deviation is the square root of the variance, to keep the error standard deviation less than 1 part in 10,000, the variance must be less than E-8. This requires N^^2/2*m < E-15. This shows that the error in most sums on a computer is negligible.

However, some data miners compute statistics from vast samples. If N = E5 and m = E6, for example, there might be a standard deviation greater than E-4.

Large sums also appear in numerical integration. However, the individual summands are the areas of very small approximate trapezoids, so for numerical integration, roundoff error usually can be ignored.

Restrictions on the Probable Error Computation

Note that these statistical estimates break down if the computation contains only a few operations. In this case, direct computation of the errors, as was done for two additions above, is preferred. It is also important to analyze separately those arguments occurring during the computation that have different precisions. These results for different precisions can be combined using elementary statistics in the same way that the above estimates for expected error were obtained.

Averrdemo.c

Random Quotients

Previous | Table of Contents | Next

ResearchFHWA
FHWA
United States Department of Transportation - Federal Highway Administration