U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590

Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-03-093
Date: August 2006

Study of Long-Term Pavement Performance (LTPP): Pavement Deflections

Chapter 3. Inconsistent FWD Deflection Basins

Introduction to Data Screening for Consistency of Measured Deflections

The most basic and obvious type of falling weight deflectometer (FWD) data error is a random data anomaly. When one deflection, normalized to a uniform or target load level, is appreciably different from the other deflections taken at the same time and test point, a random error or a faulty deflection sensor is the likely cause.

The Dynatest® Model 8000 FWD, with either the 8600 or 9000 system processor, is the only FWD device currently owned and operated by LTPP. These models advertise a deflection accuracy of ±2 percent ±2 micrometers (µm) (microns), with the 2 percent figure representing the potential systematic error (or bias) and the 2-µm figure representing the random error (or precision). Since both relative and reference calibrations carried out from time to time have indicated that this claim is generally true, it was determined that a combination of the potential random and systematic error sources could be useful for finding inconsistent drop-to-drop deflection data in the database.

Inconsistent Deflection Basin Identification Criteria

The ±2-µm random variation associated with deflection measurements is generally stated as a one standard deviation limit, not an absolute limit. Therefore, it was immediately clear that a larger limit was necessary so as not to identify potentially good data as a random error or an inconsistent deflection basin. The following procedure was therefore carried out:

  1. After normalizing the deflections to the target load level, all data were marked where the deviation from the average deflection at that drop height was greater than 4 µm. Usually, data from four drops at each given drop height were available. In some cases only three, two, or (very infrequently) one drop was recorded by the equipment operator or transferred to the database through the routine (global) quality control (QC) screening processes. The initial 4-µm criterion was used in all cases except where data from only one drop were available. In those cases, the data were also marked for further screening.
  2. As expected, the standard deviations associated with larger deflections were generally larger than the standard deviations at smaller deflection levels. It was also noted that the LTPP protocol for deflection testing utilized an additional 1 percent buffer before a given drop sequence is flagged for large deviations in recorded deflections (from drop to drop at the same drop height), evidently taking into account the systematic part of the FWD’s accuracy specification. Therefore, 1 percent of the recorded deflection for each sensor was subtracted from the standard deviation as calculated above. This resulted in a considerably reduced list of suspect data records, which were called “initially flagged” data. The set of initially flagged FWD data points thus consisted of data where the standard deviation from drop to drop, less 1 percent of reading, was greater than 4 µm.
  3. Each of the flagged, or marked, sets of data (whether a four-, three-, two-, or one-drop set) was then compared with the average deflection basin from a drop sequence taken at the same time and at the same test point, but from a different (unmarked) drop height. This check used both the statistical correlation (R²) and the standard error of the estimate (SEE) between marked and unmarked data.
  4. Quite expectedly, it was discovered that different types of pavement structures required different threshold levels of correlation and/or SEEs to identify truly suspect data. For example, the pavement-induced deflection variations at joints on portland cement concrete (PCC) pavements are naturally greater from drop to drop than with axisymmetrical cases of asphalt concrete (AC) or PCC interior slab data. Various types of pavement were therefore treated somewhat differently during the flagging and marking process, depending on the natural variation and deflection basin shapes of the overall data.
  5. In the subsequent evaluation, it was possible to evaluate all of the data except the unbound material tests (S = subgrade and G = granular base) where the standard deviations, for all sensors and at virtually all test points, were much larger (see “Creation of LTPP Feedback Reports,” below).
  6. After applying appropriate R² and SEE limits to the suspect FWD test lines, a shortlist of inconsistent basin data records was identified and listed, by LTPP region, in a Microsoft® Excel spreadsheet. The suspect data were then subjected to further data processing, as discussed in the following paragraphs.
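The standard deviation screen in step 2 can be sketched as follows. This is a minimal illustration, not the procedure actually coded for the study: the linear load normalization, the use of the sample (n - 1) standard deviation, the function names, and the target loads are all assumptions. The deflection data are the two drop heights shown in table 3.

```python
from statistics import mean, stdev

LIMIT_UM = 4.0          # screening limit from the text, in micrometers
BUFFER_FRACTION = 0.01  # 1 percent of reading, from the text

def normalize(deflections_um, load, target_load):
    """Scale deflections linearly to the target load (assumed normalization)."""
    return [d * target_load / load for d in deflections_um]

def screen_drop_height(drops, target_load):
    """Screen the repeated drops at one test point and drop height.

    drops: list of (load, [d1, ..., d7]) tuples.
    Returns (adjusted standard deviation per sensor, needs_review).
    """
    if len(drops) < 2:
        # A single drop has no standard deviation; the text marks such
        # data for further screening.
        return [], True
    normalized = [normalize(d, load, target_load) for load, d in drops]
    adjusted = [
        stdev(sensor) - BUFFER_FRACTION * mean(sensor)  # sample std dev (assumed)
        for sensor in zip(*normalized)
    ]
    return adjusted, any(a > LIMIT_UM for a in adjusted)

# Drops from table 3; the target loads (784 and 1056) are hypothetical.
height3 = [
    (787, [678, 427, 214, 118, 82, 57, 35]),
    (783, [677, 426, 214, 118, 82, 56, 36]),
    (784, [676, 425, 213, 118, 82, 56, 35]),
    (783, [675, 425, 213, 118, 84, 56, 37]),
]
height4 = [
    (1058, [817, 529, 282, 164, 113, 76, 48]),
    (1056, [822, 526, 280, 163, 113, 76, 46]),
    (1057, [819, 525, 282, 160, 114, 74, 45]),
    (1053, [810, 523, 284, 161, 113, 80, 32]),
]
_, review3 = screen_drop_height(height3, target_load=784)
_, review4 = screen_drop_height(height4, target_load=1056)
```

Under these assumptions the third drop height passes the screen, while the fourth is singled out because of the scatter at sensor 7.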

These marked and flagged data were an uncommon occurrence in the database. For example, figure 1 shows a typical distribution of the calculated (load-normalized) standard deviations for all AC type pavements tested along the wheelpath (designated in the database as “lane F3”).

Figure 1. Graph. Frequency distribution of standard deviations for repeated deflections. The figure is a line graph showing the typical cumulative distribution of standard deviations for all asphalt concrete pavements tested along the wheelpath. The standard deviations are graphed on the horizontal axis from 0 to 5 µm. The cumulative percentage is graphed on the vertical axis. Three sensors are plotted: sensor 1, sensor 3, and sensor 7. The plot for each sensor begins at the origin, increases rapidly, and levels off as the cumulative percentage approaches 100 percent and the standard deviation approaches 5 µm. At a standard deviation of 1 µm, sensor 1 is plotted at 60 percent, sensor 3 is plotted at 78 percent, and sensor 7 is plotted at 82 percent.

The task of identifying the distribution of standard deviations was facilitated by shifting the standard deviation limits by 1 percent of the deflection reading. As shown, a standard deviation level of 4 µm or greater (after the 1 percent adjustment) exists in only 1 to 2 percent of the entire pre-autumn 1998 FWD load-deflection database.

After a careful review of the data, the correlation and standard deviation criteria were set for various categories or lanes of tested pavements. The data that caused the most extreme outliers were immediately flagged, while the other data were identified and marked for further assessment. These criteria are shown in table 2.

Table 2. Marked or flagged autoidentification criteria for various lanes.

Marked/Flagged   Correlation               SEE        Defl. 1

Marked Autoidentification Criteria: AC surfaces (Lanes F0–F5)
Marked           < 0.9975                  N/A        N/A
Marked           From 0.9990 to 0.9995     > 18 µm    N/A
Marked           From 0.9975 to 0.9990     > 9 µm     N/A
Flagged          < 0.9900                  > 9 µm     N/A
Flagged          N/A                       N/A        > 2,100 µm

Marked Autoidentification Criteria: Lanes C0, C1, J1, J6, J7, and J8
Marked           < 0.995                   N/A        N/A
Marked           From 0.998 to 0.999       > 18 µm    N/A
Marked           From 0.995 to 0.998       > 9 µm     N/A
Flagged          < 0.980                   > 9 µm     N/A
Flagged          N/A                       N/A        > 2,100 µm

Marked Autoidentification Criteria: Lanes C2–C5, J2–J5, Ls, and Ps
Marked           < 0.990                   N/A        N/A
Marked           From 0.997 to 0.998       > 18 µm    N/A
Marked           From 0.990 to 0.997       > 9 µm     N/A
Flagged          < 0.970                   > 9 µm     N/A
Flagged          N/A                       N/A        > 2,100 µm
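
The criteria in table 2 can be expressed as a small lookup, sketched below. The group keys, the function name, the order in which the rules are tested, and the treatment of values falling exactly on a band boundary are assumptions; the table itself does not specify them. The boundary handling was chosen so that a correlation of 0.9990 with an SEE above 9 µm is marked, which agrees with the marked record in table 3.

```python
SENSOR_RANGE_UM = 2100.0  # Defl. 1 limit from table 2, in micrometers

# Hypothetical group key -> (mark_below, band_split, band_top, flag_below),
# all correlation thresholds taken from table 2.
THRESHOLDS = {
    "AC_F0_F5": (0.9975, 0.9990, 0.9995, 0.9900),
    "C0_C1_J1_J6_J7_J8": (0.995, 0.998, 0.999, 0.980),
    "C2_C5_J2_J5_L_P": (0.990, 0.997, 0.998, 0.970),
}

def classify(group, correlation, see_um, defl1_um):
    """Return 'flagged', 'marked', or 'none' for one deflection basin."""
    mark_below, band_split, band_top, flag_below = THRESHOLDS[group]
    # Flagging rules are tested first (assumed precedence).
    if defl1_um > SENSOR_RANGE_UM:
        return "flagged"
    if correlation < flag_below and see_um > 9.0:
        return "flagged"
    # Marking rules; upper band edges are treated as inclusive (assumed).
    if correlation < mark_below:
        return "marked"
    if correlation <= band_split:
        return "marked" if see_um > 9.0 else "none"
    if correlation <= band_top:
        return "marked" if see_um > 18.0 else "none"
    return "none"

# Last record of table 3 (AC wheelpath, lane F3).
result = classify("AC_F0_F5", 0.9990, 13.98, 810)  # "marked"
```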

Creation of LTPP Feedback Reports

The transformed basin, or SLIC, method described in chapter 5 (see also appendix B) was subsequently used to reexamine the marked and flagged data selected by the automatic identification method described above. Although the method of transformed basins was developed primarily to identify sensor position errors, random errors and anomalies can also be detected from these graphs.

The SLIC technique had not been developed before the automatic identification method for inconsistent deflection basins was employed. Also, responses to our original Feedback Report (called RNS–4) of September 1999 suggested that the automatic identification method, on occasion, improperly identified suspect data, and in fact the set of drop heights used to identify anomalous data was sometimes itself a potentially anomalous set of data. Such a situation could occur, for example, when data from all four drops at a particular height were spurious but consistent with each other. These data would then pass the random error screen based on standard deviations, and then be used not only to confirm a record of anomalous data, but all the other records at that same drop height, some of which were potentially correct.

After a reexamination of all 7,045 data records originally recommended for final flagging in Feedback Report RNS–4, and with the aid of the transformed basin graphs, some of the identified data records were reclassified as good or at least okay. In addition, many of the original RNS–4 Feedback Report recommendations have by now been flagged in the current database, as so-called nondecreasing deflections (strictly speaking, these are increasing deflections, as adjacent but equal deflections are not flagged). These currently flagged data records were relabeled accordingly.
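
The distinction drawn above between nondecreasing and strictly increasing deflections can be illustrated with a short check. This is only a sketch of the rule as worded in the parenthetical remark; the function name is hypothetical, the sensors are assumed to be listed in order of increasing distance from the load plate, and the database's actual flagging code is not described in this chapter.

```python
def has_increasing_deflection(basin_um):
    """True if any sensor reads more than the sensor just before it.

    Equal adjacent deflections do not trigger the check, matching the
    remark that adjacent but equal deflections are not flagged.
    """
    return any(later > earlier for earlier, later in zip(basin_um, basin_um[1:]))

normal_basin = [678, 427, 214, 118, 82, 57, 35]  # first record of table 3
equal_pair = [100, 80, 80, 60]                   # illustrative values
reversed_pair = [100, 80, 85, 60]                # illustrative values
```

With these inputs, only `reversed_pair` triggers the check.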

Accordingly, the new criteria only marked the most extreme anomalies or outliers. As an example, section 12–4154 tested on November 9, 1990, is shown in table 3, using the autoidentification criteria listed in table 2.

Table 3. Autoidentification example of a marked FWD data record.

Station  Lane  Hgt.  Load   D.1  D.2  D.3  D.4  D.5  D.6  D.7  Correlation  SEE    Mark?
0        F3    3     787    678  427  214  118  82   57   35   1.0000       0.54   N
0        F3    3     783    677  426  214  118  82   56   36   1.0000       0.33   N
0        F3    3     784    676  425  213  118  82   56   35   1.0000       0.28   N
0        F3    3     783    675  425  213  118  84   56   37   1.0000       0.71   N
0        F3    4     1058   817  529  282  164  113  76   48   0.9996       9.01   N
0        F3    4     1056   822  526  280  163  113  76   46   0.9997       7.92   N
0        F3    4     1057   819  525  282  160  114  74   45   0.9996       8.86   N
0        F3    4     1053   810  523  284  161  113  80   32   0.9990       13.98  Y

In this example, the criteria presented in table 2 were applied, and as a result the last line of data from the fourth drop height was marked. The average basin from the third drop height was used as the reference against which the basins from the fourth drop height were compared. As can be seen in table 3, it appears that deflection sensor #7 had too low a deflection (by about 14 µm) while #6 had too high a deflection, though by a lesser amount. Such magnitudes of deviation can have a significant impact on backcalculated moduli or other basin shape factors.
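
The comparison in this example can be reproduced with a simple linear regression of the marked basin on the average basin from the third drop height. The chapter does not state the formulas used, so the sketch below assumes that "Correlation" is the Pearson coefficient and that the SEE uses n - 2 degrees of freedom; the function name is hypothetical. Under those assumptions the result agrees with the last row of table 3.

```python
from math import sqrt

def correlation_and_see(reference, basin):
    """Pearson correlation and standard error of the estimate (SEE)
    for a least-squares line fitted to basin versus reference."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(basin) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in basin)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, basin))
    r = sxy / sqrt(sxx * syy)
    # Residual sum of squares about the fitted line, guarded against
    # a tiny negative value from floating-point rounding.
    sse = max(syy - sxy ** 2 / sxx, 0.0)
    see = sqrt(sse / (n - 2))  # n - 2 degrees of freedom (assumed)
    return r, see

# Deflections D.1 to D.7 from table 3.
height3 = [
    [678, 427, 214, 118, 82, 57, 35],
    [677, 426, 214, 118, 82, 56, 36],
    [676, 425, 213, 118, 82, 56, 35],
    [675, 425, 213, 118, 84, 56, 37],
]
reference = [sum(col) / len(col) for col in zip(*height3)]  # average basin
marked = [810, 523, 284, 161, 113, 80, 32]  # last drop at the fourth height

r, see = correlation_and_see(reference, marked)
print(f"{r:.4f} {see:.2f}")  # prints "0.9990 13.98"
```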

There were some cases in which none of the drop heights used at a given test point passed the standard deviation test (4 µm less 1 percent of reading), so none of the drop heights could be used for automated comparisons. In these cases, marking was accomplished visually, since in all instances at least a few of the deflection basins seemed reasonable from most or all of the four drop heights. Visual marking was an attempt to avoid flagging data that may, in fact, be acceptable and usable. An attempt was made to always have data from at least one drop left, after marking, at a given test point and drop height, although this could not be achieved 100 percent of the time.

As previously mentioned, the unbound material tests conducted directly on the subgrade or granular base layers (denoted by an S = subgrade or a G = granular in the lane designation) were too variable to separate real errors or anomalies from the actual pavement response. There were several causes for this problem; those causes are described in the following paragraphs.

The FWD equipment provided to SHRP and the LTPP program was not specifically designed to test unbound materials, although with careful handling it can be used successfully for unbound material tests. The load plate on the LTPP program’s FWD equipment is not segmented or split.

When unbound materials are tested, the mean pressure under the loading plate should be reduced to a level similar to what that particular layer will experience, under traffic, after the bound layers are in place. In most cases when unbound materials were tested, not only was the small 300-millimeter (mm) (11.8-inch) loading plate used in lieu of the provided 450-mm (17.7-inch) loading plate, but the ordinary weight package and standard drop heights used for bound material tests were occasionally employed as well. This often resulted in deflections that were too large, sometimes even exceeding the physical limits of the FWD’s ~2,100-µm sensor range. Further, this problem occurred not only on a regular basis for the center deflection, but also from time to time for sensors 2 and 3, especially in the case of subgrade tests. Use of the 450-mm (17.7-inch) plate results in better confinement of the materials under test, which is more realistic.
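
The effect of plate size on mean contact pressure follows from the plate area alone, as the sketch below shows. The 40-kN load is a hypothetical value chosen only for illustration, the function name is an assumption, and the calculation treats the contact pressure as uniform over the plate.

```python
from math import pi

def mean_plate_pressure_kpa(load_kn, plate_diameter_mm):
    """Mean contact pressure (kPa) under a circular plate: load / area."""
    radius_m = plate_diameter_mm / 2000.0  # diameter in mm -> radius in m
    return load_kn / (pi * radius_m ** 2)

load_kn = 40.0  # hypothetical peak load, not taken from the report
small_plate = mean_plate_pressure_kpa(load_kn, 300)
large_plate = mean_plate_pressure_kpa(load_kn, 450)
ratio = small_plate / large_plate  # (450 / 300) ** 2 = 2.25
```

For the same load, the 300-mm plate therefore applies 2.25 times the mean pressure of the 450-mm plate, which is consistent with the overly large deflections described above.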

All of these factors contributed to several spurious deflection readings observed throughout the S- and G-tests conducted for the LTPP program and later uploaded as level E data into the database. In addition, unbound materials often behave nonelastically, with plastic deformations, punching, and shear deformations taking place simultaneously on a fairly regular basis.

Nevertheless, no flags or other changes are recommended for the S- and G-data, because some of the deflection readings, particularly those between d2 and d5, often appear to be reasonable. These data alone may prove to be valuable for analysis of unbound material tests, since backcalculation is not likely to be used to derive stiffness data for one, or at the most two, layer(s) in the pavement structure.

Except for the unbound material test data, the automated and manual (visual) processes described in the foregoing paragraphs were applied to all of the pre-autumn 1998 load-deflection data from all four regions. A list of recommended flags was developed, and a revision of Feedback Report RNS–4 was created, called RNS–4M (shown in appendix A). Table 4 shows the number of recommended flagged records versus the approximate number of records in the corresponding pre-autumn 1998 “*.M06” files in the LTPP database.

Table 4. FWD records identified for flagging in the pre-autumn 1998 database.

All LTPP Data   Total Number of Records (lines)   Total Number of Recommended Flags   Percentage of Recommended Flags
TOTALS          4,422,000                         2,642                               0.06

As shown in table 4, on a percentage basis the overall number of recommended flags is very small. The errors identified were possibly attributable to equipment operators not noticing when something was wrong with a particular sensor or sensors. Much of the flagged data was sequential (i.e., from the same day and along the same test section). It is also possible that there were intermittent problems with the equipment that could not be immediately rectified in the field.

Finally, it should be noted that other flags were originally recommended in Feedback Report RNS–4; however, most of these were corrected or changed through other screening methods, such as the visual SLIC method (see chapter 5) or the use of the nondecreasing deflections flag mentioned above. With respect to the use of flags in the load-deflection tables, such as those recommended in Feedback Report RNS–4M (see appendix A), a Feedback Report designated RNS–7 was submitted. This feedback report is also presented in appendix A.

Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101