U.S. Department of Transportation/Federal Highway Administration

Enhancement of the Pavement Health Track (PHT) Analysis Tool Final Report

Sensitivity Analysis between MEPDG version 0.8 and 1.0Se

Introduction

The PHT Analysis Tool version 1.1 pavement performance prediction models were developed based on the simplified MEPDG version 0.8 models. Since then, significant improvements have been made to the MEPDG models and to the algorithms used for computing fundamental pavement responses, and the performance models as a whole have been recalibrated to reflect both the changes in computational algorithms and the improved, more extensive LTPP calibration data. The objective of this task was to investigate whether there are significant differences between the pavement performance predictions of MEPDG version 0.8 (PHT Tool) and DARWin-ME 1.0, and how any such differences affect the results of the PHT Analysis Tool. This investigation involved conducting a sensitivity analysis to develop remaining service life (RSL) estimates under different scenarios using the PHT Tool and the DARWin-ME pavement performance models, and performing a statistical comparison of the RSL outputs from each application.

The test experiment used a total of 40 unique LTPP sites from 20 states, covering four climate zones, three base types, and four pavement surface types. A complete summary of the projects selected for the sensitivity analysis is provided in Appendix A of this document.

Sensitivity Analysis Methodology

The test methodology for statistical comparison is presented below.

Methodology:
  1. Develop a matrix of projects adapted from in-service LTPP pavement projects for use in conducting the sensitivity analysis. The criteria for developing the matrix of projects were as follows:
    1. Include all pavement types of interest (new flexible, asphalt overlaid flexible pavement, new jointed plain concrete pavement (JPCP), and asphalt overlaid JPCP).
    2. Cover all four LTPP climate zones (wet-freeze, wet-nofreeze, dry-freeze, dry-nofreeze).
    3. Based on the first two criteria, the matrix has a total of 16 super cells. Each super cell contains on average 2 to 3 projects representing a mix of AC and PCC thicknesses, subgrade types, and highway functional classes/traffic levels.
  2. Develop RSL computation parameters as follows:
    1. Terminal distress and smoothness.
    2. Maximum service life.
    3. Analysis types of interest:
      1. First to critical distress.
      2. Weighted average (equal weights for all distress and smoothness).
  3. Create PHT Tool and DARWin-ME input files.
  4. Run PHT Tool and DARWin-ME for all the projects included in the matrix.
  5. Obtain PHT estimates of RSL for all analysis types of interest.
  6. Use the DARWin-ME predictions of future distress and smoothness to estimate RSL for all analysis types of interest.
  7. Perform a statistical analysis to compare the RSL output from the PHT Analysis Tool with the DARWin-ME 1.0 estimated RSL (goodness of fit, bias, t-test, etc.).
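
The project matrix in step 1 can be sketched as the cross product of pavement types and climate zones. The snippet below is only an illustration: the function and variable names are hypothetical, and the 40-site total is taken from the introduction to check the "2 to 3 projects per super cell" figure.

```python
from itertools import product

# Pavement types and LTPP climate zones as listed in the methodology.
PAVEMENT_TYPES = ["new flexible", "AC overlaid flexible", "new JPCP", "AC overlaid JPCP"]
CLIMATE_ZONES = ["wet-freeze", "wet-nofreeze", "dry-freeze", "dry-nofreeze"]
TOTAL_SITES = 40  # number of LTPP sites stated in the introduction


def build_super_cells(pavement_types, climate_zones):
    """Return every (pavement type, climate zone) combination (hypothetical helper)."""
    return list(product(pavement_types, climate_zones))


super_cells = build_super_cells(PAVEMENT_TYPES, CLIMATE_ZONES)
average_projects_per_cell = TOTAL_SITES / len(super_cells)

print(len(super_cells), "super cells")
print(average_projects_per_cell, "projects per super cell on average")
```

Four pavement types crossed with four climate zones give the 16 super cells, and 40 sites spread over 16 cells is consistent with an average of 2 to 3 projects per cell.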

This methodology was designed to ensure that both the PHT Analysis Tool and the DARWin-ME pavement RSL estimates were based on the respective tool's pavement performance models. Thus, the usual constraints in estimating pavement RSL, such as a maximum service life that depends on pavement type, were relaxed. Details of the analysis framework are presented in the following sections. Internal calibration of predicted distress/IRI using measured pavement performance data was not used.

The criteria for the PHT Analysis Tool and DARWin-ME pavement RSL estimates include performance thresholds that define the terminal distress and smoothness values indicating the end of the pavement's service life, as well as the maximum service life for each surface type. The terminal values for smoothness and each distress type are shown in Table 1, and the maximum service life is shown in Table 2.

Table 1. Terminal Distress/Smoothness Values
Suggested Performance Criteria for Use in Pavement Design
Pavement Type | Performance Criteria | Maximum Value at End of Design Life
AC pavement and AC/PCC overlays | AC alligator cracking | Interstate: 10 percent lane area; Primary: 20 percent lane area; Local: 45 percent lane area
AC pavement and AC/PCC overlays | Rutting | Interstate: 0.75 inch mean; Primary: 1 inch mean
AC pavement and AC/PCC overlays | Transverse cracking | Interstate: crack length < 1,000 ft/mile; Primary/Secondary: crack length < 1,500 ft/mile
AC pavement and AC/PCC overlays | IRI | Interstate: 150 inch/mile; Primary: 175 inch/mile; Secondary: 200 inch/mile
New JPCP | Mean joint faulting | Interstate: 0.10 inch mean all joints; Primary: 0.15 inch mean all joints; Secondary: 0.25 inch mean all joints
New JPCP | Percent slabs with transverse cracking | Interstate: 10 percent; Primary: 15 percent; Secondary: 20 percent
New JPCP | IRI | Interstate: 150 inch/mile; Primary: 175 inch/mile; Secondary: 200 inch/mile

Table 2. Maximum Service Life
Surface Type | Maximum Service Life, years
New HMA | 30
New PCC | 40
Thick AC Overlay of AC Pavement | 15
Thin AC Overlay of AC Pavement | 10
Thick AC Overlay of PCC Pavement | 25
Unbonded PCC Overlay of PCC Pavement | 30
Bonded PCC Overlay of PCC Pavement | 15
Thin AC Overlay of AS/PCC Pavement | 25

Two analysis types were used. In the first to critical distress analysis, the RSL is the time until the first distress or smoothness threshold is reached. In the weighted average analysis, the RSL values computed separately for each distress type and for smoothness are averaged using an equal weight for each.
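
The two analysis types can be sketched in a few lines of Python. This is a minimal illustration, not the PHT Tool or DARWin-ME implementation: the function names, the yearly prediction series, and the choice to cap RSL at the maximum service life are assumptions made here; only the terminal values (Interstate, Table 1) and the 30-year New HMA limit (Table 2) come from the report.

```python
def years_to_terminal(predicted_by_year, terminal_value, max_service_life):
    """RSL in years for one distress or smoothness measure (hypothetical helper).

    predicted_by_year[0] is the predicted value one year after the analysis
    year, predicted_by_year[1] two years after, and so on. The result is
    capped at max_service_life, which is an assumption of this sketch.
    """
    for year, value in enumerate(predicted_by_year, start=1):
        if year > max_service_life:
            break
        if value >= terminal_value:
            return year
    return max_service_life


def first_to_critical(rsl_by_measure):
    """RSL governed by whichever measure reaches its threshold first."""
    return min(rsl_by_measure.values())


def weighted_average(rsl_by_measure):
    """Equal-weight average of the per-measure RSL values."""
    return sum(rsl_by_measure.values()) / len(rsl_by_measure)


MAX_LIFE_NEW_HMA = 30  # Table 2
years = range(1, MAX_LIFE_NEW_HMA + 1)

# Made-up linear prediction series paired with Interstate terminal values (Table 1).
predictions = {
    "alligator cracking, percent lane area": ([0.5 * y for y in years], 10.0),
    "rutting, inch": ([y / 40 for y in years], 0.75),
    "IRI, inch/mile": ([50 + 10 * y for y in years], 150),
}

rsl_by_measure = {
    name: years_to_terminal(series, terminal, MAX_LIFE_NEW_HMA)
    for name, (series, terminal) in predictions.items()
}
critical_rsl = first_to_critical(rsl_by_measure)
average_rsl = weighted_average(rsl_by_measure)

print(rsl_by_measure)
print("first to critical:", critical_rsl, "weighted average:", average_rsl)
```

With these made-up series, IRI governs the first to critical result, while the weighted average is pulled upward by the slower-developing rutting and cracking.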

For the analysis, the original pavement or overlay construction date was assumed to be 2011. This was done to ensure that the RSL estimates were computed mostly from predicted future pavement condition rather than from constraints on typical service life.

Statistical Analysis

Statistical analyses were performed to determine if pavement RSL predicted from the PHT Analysis Tool and from the DARWin-ME are sufficiently similar or significantly different and if a bias exists in the PHT Analysis Tool RSL predictions.

The goodness of fit between DARWin-ME and PHT Tool estimates of RSL was assessed by performing linear regression on the paired estimates and determining diagnostic statistics, including the coefficient of determination (R2) and the standard error of the estimate (SEE). Engineering judgment was then used to determine the practical reasonableness of both diagnostic statistics. Models exhibiting a poor R2 (less than 50 percent) or an excessive SEE (greater than 3 years of RSL) were deemed inadequate. Such a result implies that, statistically, the PHT Analysis Tool inadequately predicts RSL as defined by DARWin-ME. However, the magnitude of the difference in RSL must also be assessed for its practical importance.
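
As a rough sketch of this goodness-of-fit check, the code below fits an ordinary least squares line and computes R2 and SEE using only the Python standard library. The function names and the sample RSL pairs are hypothetical; the SEE here uses n - 2 degrees of freedom, which is the usual convention for simple linear regression and is assumed rather than stated in the report. The adequacy thresholds (R2 of at least 50 percent, SEE of at most 3 years) are the ones given above.

```python
import math


def goodness_of_fit(x, y):
    """Return (r_squared, see) for the least squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - residual_ss / total_ss
    see = math.sqrt(residual_ss / (n - 2))  # assumes n - 2 degrees of freedom
    return r_squared, see


def is_adequate(r_squared, see, min_r_squared=0.50, max_see_years=3.0):
    """Apply the adequacy thresholds described in the text."""
    return r_squared >= min_r_squared and see <= max_see_years


# Hypothetical paired RSL estimates in years (x: PHT Tool, y: DARWin-ME).
pht_rsl = [12, 18, 25, 40, 60, 95]
dme_rsl = [14, 20, 19, 27, 30, 28]

r_squared, see = goodness_of_fit(pht_rsl, dme_rsl)
print(f"R2 = {r_squared:.3f}, SEE = {see:.2f} years, adequate: {is_adequate(r_squared, see)}")

# The values reported in Figure 1 (R2 = 4.3 percent, SEE = 11.1 years) fail both thresholds.
print(is_adequate(0.043, 11.1))
```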

Bias is defined as the consistent under- or over-prediction of pavement RSL. Bias was determined by performing linear regression using DARWin-ME and PHT Analysis Tool estimates of RSL and performing the following two hypothesis tests. A significance level of 5 percent was assumed for all hypothesis testing. This level of significance is often used in similar analyses and gives a relatively low probability of making a false judgment on bias of the RSL estimates.

Hypothesis One:
A paired t-test was done to determine whether the DARWin-ME and PHT Tool estimates of RSL represented the same population of RSL. The paired t-test was performed as follows:
  1. Assume the following null and alternative hypotheses:
    1. H0: Mean DARWin-ME RSL = mean PHT Tool RSL
    2. HA: Mean DARWin-ME RSL ≠ mean PHT Tool RSL
  2. Compute the test p-value.
  3. Compare the computed p-value to the predetermined level of significance for this test. Note that a significance level of 5 percent was adopted for this analysis.
Note:
A rejection of the null hypothesis (p-value < 0.05) would imply that the DARWin-ME and PHT Tool estimates of RSL are from different populations. This indicates that, for the range of RSL used in the analysis, the PHT Analysis Tool will produce biased predictions of RSL as defined by DARWin-ME.
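
The paired t-test above can be sketched with the Python standard library alone. The helper names and sample data are hypothetical, and the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule, a stand-in chosen here to avoid external dependencies; a statistical package would normally be used instead. The sketch assumes the paired differences are not all identical.

```python
import math


def t_pdf(t, df):
    """Student t probability density with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t_stat|]; steps must be even."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t_stat|
    return max(0.0, 1.0 - 2.0 * central_area)


def paired_t_test(a, b):
    """Return (t statistic, two-sided p-value) for paired samples a and b."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_d / math.sqrt(var_d / n)  # fails if all differences are equal
    return t_stat, two_sided_p_value(t_stat, n - 1)


# Hypothetical paired RSL estimates in years for the same projects.
dme_rsl = [14, 20, 19, 27, 30, 28, 16, 22]
pht_rsl = [12, 18, 25, 40, 60, 95, 15, 30]

t_stat, p_value = paired_t_test(dme_rsl, pht_rsl)
ALPHA = 0.05  # significance level adopted in the report
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {p_value < ALPHA}")
```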
Hypothesis Two:
Determine whether the linear regression model developed using DARWin-ME and PHT Tool estimates of RSL has an intercept of 0 and a slope of 1.0:
  1. Using the results of the linear regression analysis, test the following null and alternative hypotheses to determine if the fitted linear regression model has an intercept of 0 and a slope of 1.0:
    1. Intercept
      1. H01: Model intercept = 0
      2. HA1: Model intercept ≠ 0
    2. Slope
      1. H02: Model slope = 1.0
      2. HA2: Model slope ≠ 1.0
  2. Compute the test p-values.
  3. Compare the computed p-values to the predetermined level of significance for this test. Note that a significance level of 5 percent was adopted for this analysis.
Note:
A rejection of null hypotheses 1 and 2 (p-value < 0.05) would imply that the linear model has an intercept significantly different from 0 and a slope significantly different from 1.0 at the 5 percent significance level. This indicates that using the PHT Analysis Tool could produce biased estimates of RSL as defined by DARWin-ME.

The presence of bias does not necessarily imply that the PHT Analysis Tool is inadequate; rather, it means that there is statistical bias in the estimates of pavement RSL. The significance of the bias (over- or under-prediction) can also be judged on a practical engineering basis. If the PHT Analysis Tool RSL estimates are deemed to be biased or to have an inadequate goodness of fit, and the differences are deemed too large for practical use, recalibration of the PHT Analysis Tool will be needed.

Sensitivity Analysis

Comparisons of PHT Analysis Tool and DARWin-ME estimates of RSL are presented graphically in Figure 1 and Figure 2. For the PHT Tool analysis, RSL was computed as described in the software user guide and other reference documents. For DARWin-ME, RSL was computed from the relevant predicted pavement distress and IRI, following the guidelines in the appropriate PHT guide documents and the analysis criteria, such as terminal distress/IRI and maximum service life.

A preliminary review of the comparison plots shows the following:

  • Correlation between PHT Tool and DARWin-ME estimates of RSL was poor, with R2 ranging from 2.6 to 4.3 percent.
  • PHT Tool estimates of RSL were generally far higher than DARWin-ME estimates.
  • Error in the RSL estimates was high (SEE ranging from 8.7 to 11.1 years).
  • Coefficient of variation (COV) ranged from 35 to 53 percent, which is also high.
A scatter graph shows values for first to critical DARWin-ME RSL estimate in years over first to critical PHT RSL estimate in years for R squared equals 4.3 percent, SEE equals 11.1 years, and N equals 40. The DARWin-ME points trend along a value of 30 years from zero to 20 years PHT, scatter in a cluster between 20 and 30 years PHT and trend back to a value near 30 for 80 to 150 years PHT.
Figure 1. Comparison of PHT and DARWin-ME RSL Estimates, Critical Failure
A scatter graph shows values for weighted average DARWin-ME RSL estimate in years over weighted average PHT RSL estimate in years for R squared equals 2.6 percent, SEE equals 8.7 years, and N equals 40. From about 40 years to 150 years PHT, DARWin-ME values fall in a band from lows of 10 to highs of 30 years.
Figure 2. Comparison of PHT and DARWin-ME RSL Estimates, Weighted Average

A more detailed statistical evaluation of PHT Analysis Tool and DARWin-ME RSL estimates was performed, and the results are presented in Table 3, which shows the following:

  • For the first to critical threshold analysis, PHT estimates of pavement RSL were significantly different from those estimated from DARWin-ME for most parameters. The slopes of the regression lines indicated significant bias toward overprediction of RSL. The only parameter not significantly different was the mean RSL (17.97 versus 21.4 years), and this narrowly missed significance (paired t-test p-value of 0.0594).
  • For the weighted average analysis, PHT estimates of pavement RSL were significantly different from those estimated from DARWin-ME for all parameters.
Table 3. Statistical Test Results Comparing PHT and DARWin-ME RSL Estimates
Analysis Type | Statistical Hypothesis Test | Mean Intercept Value, years | 95 Percent Confidence Limits | p-Value
Critical (First distress or IRI) | Intercept = 0 | 17.97 | 12.7 to 23.2 | < 0.0001
Critical (First distress or IRI) | Slope = 1 | 0.43 | 0.29 to 0.56 | < 0.0001
Critical (First distress or IRI) | Paired t-test | | | 0.0594
Weighted Average (Equal weights for all distress and IRI) | Intercept = 0 | 21.4 | 13.7 to 29.1 | < 0.0001
Weighted Average (Equal weights for all distress and IRI) | Slope = 1 | 0.322 | 0.27 to 0.37 | < 0.0001
Weighted Average (Equal weights for all distress and IRI) | Paired t-test | | | < 0.0001
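
One way to read the intercept and slope rows of Table 3 is through their confidence limits: for a two-sided t-test at the 5 percent level, the null value is rejected exactly when it falls outside the 95 percent confidence interval built from the same standard error. The sketch below applies that check to the tabulated values; the data structure and function name are hypothetical.

```python
# (analysis type, parameter, hypothesized value, estimate, 95 percent limits) from Table 3.
TABLE_3_ROWS = [
    ("Critical", "intercept", 0.0, 17.97, (12.7, 23.2)),
    ("Critical", "slope", 1.0, 0.43, (0.29, 0.56)),
    ("Weighted Average", "intercept", 0.0, 21.4, (13.7, 29.1)),
    ("Weighted Average", "slope", 1.0, 0.322, (0.27, 0.37)),
]


def rejects_null(hypothesized, limits):
    """True when the hypothesized value lies outside the confidence limits."""
    lower, upper = limits
    return not (lower <= hypothesized <= upper)


results = {
    (analysis, parameter): rejects_null(hypothesized, limits)
    for analysis, parameter, hypothesized, _estimate, limits in TABLE_3_ROWS
}

for (analysis, parameter), rejected in results.items():
    print(f"{analysis:17s} {parameter:9s} reject H0: {rejected}")
```

None of the four intervals contains its hypothesized value, which agrees with the p-values below 0.0001 reported for those rows.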

Finally, a more detailed review of individual distress types and IRI was done to determine which of the individual PHT models produced estimates of RSL significantly different from those from DARWin-ME. The results are presented in Table 4.

Table 4. Statistical Test Results for each Distress/IRI and Pavement Type
Pavement Type | AC Fatigue Cracking | AC Rutting | AC Transverse Cracking (Thermal Reflection) | JPCP Slab Transverse Cracking | JPCP Faulting | IRI
New AC | <0.0001* | 0.1871 | 0.0035* | Not applicable | Not applicable | 0.0710
AC/AC | <0.0001* | 0.0318* | 0.3493 | Not applicable | Not applicable | 0.0343*
New JPCP | Not applicable | Not applicable | Not applicable | 0.0539* | <0.0001* | 0.0029*
AC/JPCP | Not applicable | Not applicable | 0.6845 | Not applicable | Not applicable | <0.0001*
Note: Cells marked with an asterisk (highlighted in the original table) indicate that distress/IRI based RSL estimates from the PHT Tool are significantly different from those computed using DARWin-ME predicted distress/IRI.

The AC rutting model coefficients adapted from the original MEPDG version 0.8 were modified during PHT Tool development to reduce bias and enhance precision. Thus, this version of the rutting model was developed using more recent versions of the MEPDG, released after the original 2004 NCHRP 1-37A submissions. This is why this version of the PHT Analysis Tool rutting model was found to be adequate.

The data shown above illustrate that 8 of the 13 distress/IRI and pavement type combinations exhibited PHT Analysis Tool RSL estimates significantly different from those estimated by the DARWin-ME tool.
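
The count of significant combinations can be reproduced from the Table 4 p-values with a strict p < 0.05 cutoff. In this sketch the dictionary layout is hypothetical and entries reported as "<0.0001" are stored as 0.0001, an upper bound. Under the strict cutoff, the JPCP slab transverse cracking value (0.0539) falls just above the threshold even though that cell is marked in the table.

```python
ALPHA = 0.05  # significance level adopted in the report

# p-values from Table 4; "<0.0001" entries are stored as the upper bound 0.0001.
TABLE_4_P_VALUES = {
    ("New AC", "AC fatigue cracking"): 0.0001,
    ("New AC", "AC rutting"): 0.1871,
    ("New AC", "AC transverse cracking"): 0.0035,
    ("New AC", "IRI"): 0.0710,
    ("AC/AC", "AC fatigue cracking"): 0.0001,
    ("AC/AC", "AC rutting"): 0.0318,
    ("AC/AC", "AC transverse cracking"): 0.3493,
    ("AC/AC", "IRI"): 0.0343,
    ("New JPCP", "JPCP slab transverse cracking"): 0.0539,
    ("New JPCP", "JPCP faulting"): 0.0001,
    ("New JPCP", "IRI"): 0.0029,
    ("AC/JPCP", "AC transverse cracking"): 0.6845,
    ("AC/JPCP", "IRI"): 0.0001,
}

significant = sorted(key for key, p in TABLE_4_P_VALUES.items() if p < ALPHA)

print(f"{len(significant)} of {len(TABLE_4_P_VALUES)} combinations have p < {ALPHA}")
for pavement_type, measure in significant:
    print(f"  {pavement_type}: {measure}")
```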

Bias and Precision

Any pavement distress/IRI and RSL prediction tool should be a representation of reality. How well reality is represented is dependent upon factors such as reasonableness of the input data, validity of the underlying mathematical algorithms, and the model bias and precision.

Having been used and tested over many years, the DARWin-ME mathematical algorithms and the LTPP data have proven to be reasonable and robust. Thus, a good match between the PHT Analysis Tool and DARWin-ME output would indicate that the PHT Analysis Tool also has sound mathematical formulations and has been calibrated to reflect the same reality that the DARWin-ME tool does. An imperfect match, however, indicates that some type of defect exists that will need to be rectified through changes in mathematical formulations and/or recalibration to make the PHT Analysis Tool reflect the DARWin-ME reality.

Of utmost importance for such a comparative analysis is the definition of an adequate match between the outputs. The statistics commonly used to characterize how well a tool's output reflects reality are bias and precision.

Bias is the systematic difference that arises between the observed and predicted values, as illustrated in Figure 3. Specific to this analysis, bias occurs when PHT estimates of RSL systematically over- or under-predict DARWin-ME RSL estimates.

A dimensionless line chart shows the relationship of frequency over value. The curve for a population is represented as a bell curve, while the curve for the sampled data is represented as a bell curve shifted to the right. The interval between the peak of the population curve and the peak of the sampled data curve is designated as bias.
Figure 3. Bias in Sampled Data [1]

Precision is the measure of how closely the observed and predicted data are related to each other as illustrated in Figure 4. Specific to this analysis, precision will be how closely PHT estimates of RSL relate to DARWin-ME RSL estimates.

A dimensionless line chart shows the relationship of frequency over value. The curve for a population is represented as a bell curve, while the curve for the sampled data is represented as a bell curve that is less steep and slightly lower. With the peaks aligned, precision is indicated by the closeness of pairs of values on the curves, referenced to the peaks.
Figure 4. Precision in Sampled Data [1]
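
Bias and precision as described above can be quantified in several ways. The sketch below uses one common convention, assumed here rather than taken from the report: bias is the mean of the prediction errors, and precision is summarized by the sample standard deviation of those errors (a smaller spread means higher precision). The function name and the RSL values are hypothetical.

```python
import math


def bias_and_spread(predicted, observed):
    """Return (mean error, sample standard deviation of errors)."""
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    bias = sum(errors) / n
    spread = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    return bias, spread


# Hypothetical RSL estimates in years: PHT Tool as "predicted", DARWin-ME as "observed".
pht_rsl = [22, 31, 45, 60, 38]
dme_rsl = [18, 20, 26, 29, 24]

bias, spread = bias_and_spread(pht_rsl, dme_rsl)
print(f"bias = {bias:.1f} years, spread = {spread:.1f} years")
```

A positive bias in this convention means the predicted values (here, the PHT Tool) run higher than the observed baseline on average.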

The four scenarios illustrated in Figure 5 show how bias and precision fit into the context of adequacy when comparing observed and predicted data. Specific to this analysis, the outputs from the PHT Analysis Tool will be compared to a baseline reality RSL as estimated from the DARWin-ME tool.

Four dimensionless area charts show data spreads for predicted versus observed values with reference to the linear plot for observed equals predicted value. Chart A plots values with low bias and high precision, where the area plot is narrow and bisected by the linear plot. Chart B plots values with low bias and low precision, where the area plot is circular and bisected by the linear plot. Chart C plots values with high bias and high precision, where the area plot is narrow but offset below the linear plot. Chart D plots values with high bias and low precision, where the area plot is circular but offset from the linear plot.
Figure 5. Bias and Precision for Four Scenarios [1]

In the figure above, the shaded ellipse represents DARWin-ME estimated RSL which has been plotted against PHT Analysis Tool estimated RSL. The solid line at 45 degrees is the line of equality, where DARWin-ME and PHT estimated RSL are supposed to be equal. Each chart above is described in more detail in Table 5.

A pavement management and planning tool such as PHT must, at a minimum, satisfy the requirements of a model exhibiting low bias and low precision, as illustrated in Figure 5B, to be described as adequate. The PHT sensitivity analysis results shown in Figure 1 and Figure 2 illustrate an outcome closer to that of Figure 5C, with high bias and high precision. High bias and high precision imply that even though there is a relationship between the RSL estimates from DARWin-ME and PHT, the outputs from the two tools represent two very different populations.

Table 5. Bias and Precision Scenarios for Pavement Analysis
Scenario: Low bias and high precision (see Figure 5A)
  Description: This is the best-case scenario for distress/IRI prediction. Low bias and high precision are characteristic of well-calibrated mechanistic-empirical (cause and effect) pavement models. Such models employ a large input dataset that describes and characterizes key pavement structure, materials, design, and site properties. The detailed input data combined with complex mathematical equations form the basis for distress/IRI prediction.
  Applicability: Pavement Design and Analysis. For such analysis, the goal is to obtain the best estimate of pavement performance for a specific section under design/analysis. Extensive resources can thus be employed to obtain the data needed for the models. A reliability factor is added when used for design.
  Needed Enhancement: None

Scenario: Low bias and low precision (see Figure 5B)
  Description: This is the second-best scenario for distress/IRI prediction. Low bias and low precision are characteristic of empirical pavement models requiring limited amounts of input data for characterizing pavement structure, materials, design, and site properties.
  Applicability: Pavement Management and Planning. For such analysis, the goal is to obtain an overall best estimate of pavement condition across a pavement network or corridor. Thus, the accuracy of pavement distress/IRI prediction for each pavement section does not matter as much as the accuracy of overall network/corridor condition.
  Needed Enhancement: None

Scenario: High bias and high precision (see Figure 5C)
  Description: Although distress/IRI is predicted with high precision, the magnitude of distress/IRI is significantly different from "reality." High precision typically implies that although the underlying model assumptions and algorithms may be reasonable, the model's calibration factors are inadequate, leading to the presence of significant bias.
  Applicability: Not Suitable for Any Kind of Analysis. Models exhibiting high bias cannot be used for pavement design, forensic, management, or planning purposes.
  Needed Enhancement: Apply needed translation or rotation correction factors through calibration of models (see Figure 6).

Scenario: High bias and low precision (see Figure 5D)
  Description: Distress/IRI prediction with low precision and high bias basically indicates a flawed model.
  Applicability: Not Suitable for Any Kind of Analysis. Models exhibiting high bias and low precision cannot be used for pavement design, forensic, management, or planning purposes.
  Needed Enhancement: Apply needed translation or rotation correction factors through calibration of models (see Figure 6). Basic model formulation may have to be revised.

The reasons for high bias and high precision include the following:

  • The PHT Analysis Tool models were developed based on original DARWin-ME technology and outputs.
  • Since original development, there have been significant changes to the DARWin-ME models and recalibration using practically new enhanced datasets with over 10 years of additional performance data and 5 years of additional climate data.

There was a need to develop correction factors that can be used to make the necessary adjustments to the existing PHT models to make them more compatible with DARWin-ME. The correction factors are typically obtained through calibration. Examples of the correction factors needed for the different situations are illustrated in Figure 6.

Three dimensionless area charts show data spreads for predicted versus observed values with reference to the linear plot for observed equals predicted value. Chart A shows the correction factor termed rotation, which addresses the horizontal difference between the axis of the area plot and the linear plot. Chart B shows the correction factor termed translation, which addresses the vertical difference between the axis of the area plot and the linear plot. Chart C shows the correction factor termed rotation and translation, which addresses the differences in combination.
Figure 6. Correction Factors to Reduce Bias and Increase Precision [1]
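
The geometry of translation and rotation corrections can be illustrated with a simple linear adjustment. The sketch below is only an illustration of the idea in Figure 6, under the assumption that a correction of the form translation + rotation × predicted is fitted by least squares to made-up data. It is not the recalibration procedure itself, which adjusts the underlying distress/IRI models, and all names in it are hypothetical.

```python
def fit_correction(predicted, observed):
    """Least squares (translation, rotation) so that
    observed is approximately translation + rotation * predicted."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    rotation = sxy / sxx                      # slope adjustment
    translation = mean_o - rotation * mean_p  # offset adjustment
    return translation, rotation


def apply_correction(predicted, translation, rotation):
    """Apply the fitted linear correction to each predicted value."""
    return [translation + rotation * p for p in predicted]


# Made-up values: predictions that are both offset and too steep.
predicted = [10, 20, 30]
observed = [15, 20, 25]

translation, rotation = fit_correction(predicted, observed)
corrected = apply_correction(predicted, translation, rotation)
print(f"translation = {translation}, rotation = {rotation}, corrected = {corrected}")
```

A rotation different from 1.0 corresponds to the slope correction in Figure 6, and a nonzero translation corresponds to the offset correction; the combined case needs both.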

Recommendations

Based on the statistical comparison results presented, it can be concluded that the PHT Tool pavement performance models mostly produce RSL estimates that are different from those from DARWin-ME. There is therefore a great need to recalibrate the PHT Tool pavement models to make their estimates more comparable to those from DARWin-ME.

The exact cause of the differences between PHT Tool and DARWin-ME RSL estimates was further investigated by performing a t-test on the individual distress/IRI RSL values obtained from the PHT Tool and DARWin-ME, as it is from the individual distress/IRI RSL estimates that the overall first to critical or weighted average RSL is computed. The results of the t-test showed the following models exhibiting significant bias at the 0.05 level of significance:

  • New AC
    • Alligator cracking
    • Transverse "thermal" cracking
  • AC overlaid AC pavement
    • Alligator cracking
    • Rutting
    • IRI
  • New JPCP
    • Transverse cracking
    • Faulting
    • IRI
  • AC overlaid JPCP
    • IRI

Thus, it is recommended that:

  • All the distress/IRI models listed above be recalibrated to bring their predictions more in line with predictions from DARWin-ME. Even the few models that were not biased could be improved through recalibration. The many changes made between MEPDG Version 0.8 in 2005 and the 2011 DARWin-ME software require a recalibration of the models.
  • The recalibrated distress/IRI models must be verified using a limited selection of LTPP and HPMS projects.

The reason for this verification is to further ensure that PHT Tool estimates of RSL are reasonable. By enhancing precision and reducing bias for the models listed above, the PHT Tool will improve individual distress/IRI based RSL predictions and produce overall RSL estimates with high precision and low bias.

In practical terms, use of the current PHT Analysis Tool to predict RSL may result in overprediction of RSL. The magnitude of the overprediction depends on the pavement design and site conditions. Recalibration of the models is required for them to produce unbiased predictions.


1 Bennett and Paterson, 2000
Updated: 11/22/2013