The pavement performance prediction models in PHT Analysis Tool version 1.1 were developed from the simplified MEPDG version 0.8 models. Significant improvements have since been made to the MEPDG models and to the algorithms used for computing fundamental pavement responses, and the performance models as a whole have been recalibrated to reflect both the changes in computational algorithms and the improved, more extensive LTPP calibration data. The objective of this task was to investigate whether there are significant differences between the pavement performance predictions of MEPDG version 0.8 (PHT Tool) and DARWinME 1.0, and how any such differences affect the results of the PHT Analysis Tool. The investigation involved conducting a sensitivity analysis to develop RSL estimates under different scenarios using the PHT Tool and DARWinME pavement performance models, and performing a statistical comparison of the RSL outputs from each application.
The test experiment used a total of 40 unique LTPP sites from 20 states, covering four climate zones, three base types, and four pavement surface types. The complete summary of the projects selected for the sensitivity analysis is provided in Appendix A of this document.
The test methodology for statistical comparison is presented below.
Methodology:

This methodology was designed to ensure that both the PHT Analysis Tool and the DARWinME pavement RSL estimates were based on the respective tool's pavement performance models. Thus, the usual constraints in estimating pavement RSL, such as a maximum service life that depends on pavement type, were relaxed. Details of the analysis framework are presented in the following sections. Internal calibration of predicted distress/IRI using measured pavement performance data was not used.
The criteria for the PHT Analysis Tool and DARWinME pavement RSL estimates include performance thresholds that define the terminal distress and smoothness values indicating the end of the pavement's service life, as well as the maximum service life for each surface type. The terminal values for smoothness and each distress type are shown in Table 1, and the maximum service life is shown in Table 2.
Suggested Performance Criteria for Use in Pavement Design  

Pavement Type  Performance Criteria  Maximum Value at End of Design Life 
AC pavement and AC/PCC overlays  AC alligator cracking  Interstate: 10 percent lane area Primary: 20 percent lane area Local: 45 percent lane area 
AC pavement and AC/PCC overlays  Rutting  Interstate: 0.75 inch mean Primary: 1 inch mean 
AC pavement and AC/PCC overlays  Transverse cracking  Interstate: Crack length < 1,000 ft/mile Primary/Secondary: Crack length < 1,500 ft/mile 
AC pavement and AC/PCC overlays  IRI  Interstate: 150 inch/mile Primary: 175 inch/mile Secondary: 200 inch/mile 
New JPCP  Mean joint faulting  Interstate: 0.10 inch mean all joints Primary: 0.15 inch mean all joints Secondary: 0.25 inch mean all joints 
New JPCP  Percent slabs with transverse cracking  Interstate: 10 percent Primary: 15 percent Secondary: 20 percent 
New JPCP  IRI  Interstate: 150 inch/mile Primary: 175 inch/mile Secondary: 200 inch/mile 
Surface Type  Maximum Service Life, years 

New HMA  30 
New PCC  40 
Thick AC Overlay of AC Pavement  15 
Thin AC Overlay of AC Pavement  10 
Thick AC Overlay of PCC Pavement  25 
Unbonded PCC Overlay of PCC Pavement  30 
Bonded PCC Overlay of PCC Pavement  15 
Thin AC Overlay of AC/PCC Pavement  25 
Two analysis types are used. The critical distress analysis takes the RSL as the time until the first distress or smoothness threshold is reached. The weighted average analysis averages the RSL values obtained for each distress type and for smoothness, using an equal weight for each.
For the analysis, the original or overlay construction date of each pavement was assumed to be 2011. This was done to ensure that the RSL estimates were computed mostly from predicted future pavement condition rather than from typical service life constraints.
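The two analysis types can be sketched as follows. This is a minimal illustration, not the PHT Analysis Tool implementation: the function names and the per-distress RSL values are hypothetical, and each per-distress RSL is assumed to be the number of years until that distress or IRI reaches its terminal value.

```python
def critical_rsl(rsl_by_distress):
    """RSL governed by the first distress/IRI threshold reached (the minimum)."""
    return min(rsl_by_distress.values())


def weighted_average_rsl(rsl_by_distress, weights=None):
    """Weighted average of the per-distress RSL values (equal weights by default)."""
    if weights is None:
        weights = {name: 1.0 for name in rsl_by_distress}
    total_weight = sum(weights[name] for name in rsl_by_distress)
    weighted_sum = sum(rsl_by_distress[name] * weights[name] for name in rsl_by_distress)
    return weighted_sum / total_weight


# Hypothetical years until each terminal value is reached for one AC section.
example = {
    "fatigue_cracking": 12.0,
    "rutting": 18.0,
    "transverse_cracking": 22.0,
    "iri": 16.0,
}

critical = critical_rsl(example)          # governed by fatigue cracking
average = weighted_average_rsl(example)   # equal weight for each distress and IRI
print(critical, average)
```

For the same section, the critical analysis is always less than or equal to the weighted average, because the minimum of a set of values cannot exceed their average.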
Statistical analyses were performed to determine if pavement RSL predicted from the PHT Analysis Tool and from the DARWinME are sufficiently similar or significantly different and if a bias exists in the PHT Analysis Tool RSL predictions.
The goodness of fit between DARWinME and PHT Tool estimates of RSL was assessed by performing linear regression on the two sets of estimates and determining diagnostic statistics, including the coefficient of determination (R^{2}) and the standard error of the estimate (SEE). Engineering judgment was then used to determine the practical reasonableness of both diagnostic statistics. Models exhibiting a poor R^{2} (less than 50 percent) or an excessive SEE (greater than 3 years of RSL) were deemed inadequate. Such a result implies that, statistically, the PHT Analysis Tool inadequately predicts RSL as defined by DARWinME. However, the magnitude of the difference in RSL must also be assessed in terms of its practical importance.
Bias is defined as the consistent under- or over-prediction of pavement RSL. Bias was determined by performing linear regression using DARWinME and PHT Analysis Tool estimates of RSL and carrying out the following two hypothesis tests. A significance level of 5 percent was assumed for all hypothesis testing. This level of significance is often used in similar analyses and gives a relatively low probability of making a false judgment on bias of the RSL estimates.
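The goodness-of-fit check can be sketched as below. The data are hypothetical, and treating the DARWinME estimate as the independent variable is an assumption, since the text does not state which estimate was regressed on which. Only the 50 percent R^{2} and 3-year SEE limits come from the text.

```python
import math


def goodness_of_fit(x, y):
    """Least-squares fit of y = intercept + slope * x, returning R^2 and SEE."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    see = math.sqrt(sse / (n - 2))  # n - 2 because two parameters are fitted
    return slope, intercept, r_squared, see


def is_adequate(r_squared, see, min_r2=0.50, max_see_years=3.0):
    """Apply the screening limits quoted in the text."""
    return r_squared >= min_r2 and see <= max_see_years


# Hypothetical RSL estimates (years) for five sections.
darwin_rsl = [5.0, 10.0, 15.0, 20.0, 25.0]
pht_rsl = [10.0, 20.0, 25.0, 20.0, 25.0]

slope, intercept, r_squared, see = goodness_of_fit(darwin_rsl, pht_rsl)
adequate = is_adequate(r_squared, see)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r_squared:.2f} SEE={see:.2f}")
```

In this made-up case the R^{2} of 0.60 passes the 50 percent limit, but the SEE of about 4.5 years exceeds 3 years, so the fit would be screened as inadequate.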
Hypothesis One: A paired t-test was done to determine whether the DARWinME and PHT Tool estimates of RSL represented the same population of RSL. The paired t-test was performed as follows:
A rejection of the null hypothesis (p-value < 0.05) would imply that the DARWinME and PHT Tool estimates of RSL are from different populations. This indicates that, for the range of RSL used in the analysis, the PHT Analysis Tool will produce biased predictions of RSL as defined by DARWinME. 
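A paired t-test of this kind can be sketched as follows, assuming the standard formulation in which the null hypothesis is that the mean of the paired differences (DARWinME RSL minus PHT RSL) is zero. The data and function names are hypothetical. The p-value helper integrates the Student's t density numerically so that the sketch needs only the standard library; a statistics package would normally be used instead.

```python
import math
from statistics import mean, stdev


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, using Simpson's rule on the density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return coef * (1 + v * v / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def paired_t_test(a, b):
    """Return (t statistic, two-sided p-value) for the paired differences a - b."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, t_two_sided_p(t_stat, n - 1)


# Hypothetical RSL estimates (years) for six sections.
darwin_rsl = [10.0, 12.0, 15.0, 18.0, 20.0, 22.0]
pht_rsl = [14.0, 15.0, 20.0, 21.0, 26.0, 25.0]

t_stat, p_value = paired_t_test(darwin_rsl, pht_rsl)
reject_null = p_value < 0.05
print(f"t={t_stat:.3f} p={p_value:.5f} reject={reject_null}")
```

Here the PHT values are consistently higher than the DARWinME values, so the null hypothesis is rejected at the 5 percent level.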
Hypothesis Two: Determine whether the linear regression model developed using DARWinME and PHT Tool estimates of RSL has an intercept of 0 and a slope of 1.0:
Rejection of both null hypotheses (intercept = 0 and slope = 1.0; p-value < 0.05) would imply that the linear model has an intercept significantly different from 0 and a slope significantly different from 1.0 at the 5 percent significance level. This indicates that using the PHT Analysis Tool could produce biased estimates of RSL as defined by DARWinME. 
The presence of bias does not necessarily imply that the PHT Analysis Tool is inadequate; it means only that there is statistical bias in the estimates of pavement RSL. The significance of the bias (over- or under-prediction) can also be judged on a practical engineering basis. If the PHT Analysis Tool RSL estimates are deemed to be biased or to have an inadequate goodness of fit, and the differences are deemed too large for practical use, recalibration of the PHT Analysis Tool will be needed.
Comparisons of PHT Analysis Tool and DARWinME estimates of RSL are presented graphically in Figure 1 and Figure 2. For the PHT Tool analysis, RSL was computed as described in the software user guide and other reference documents. For DARWinME, RSL was computed from the relevant predicted pavement distress and IRI, following the guidelines in the appropriate PHT guide documents and the analysis criteria, such as terminal distress/IRI and maximum service life.
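The intercept and slope tests can be sketched with the usual simple-regression standard errors, testing the intercept against 0 and the slope against 1.0 with n - 2 degrees of freedom. This is an assumed formulation with hypothetical data and function names, and the choice of DARWinME as the independent variable is also an assumption.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, using Simpson's rule on the density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return coef * (1 + v * v / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * area * h / 3)


def bias_hypothesis_tests(x, y):
    """Test H0: intercept = 0 and H0: slope = 1 for the fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))
    se_slope = see / math.sqrt(sxx)
    se_intercept = see * math.sqrt(1.0 / n + mean_x ** 2 / sxx)
    t_intercept = (intercept - 0.0) / se_intercept
    t_slope = (slope - 1.0) / se_slope
    df = n - 2
    return {
        "t_intercept": t_intercept,
        "p_intercept": t_two_sided_p(t_intercept, df),
        "t_slope": t_slope,
        "p_slope": t_two_sided_p(t_slope, df),
    }


# Hypothetical RSL estimates (years) for five sections.
darwin_rsl = [5.0, 10.0, 15.0, 20.0, 25.0]
pht_rsl = [10.0, 20.0, 25.0, 20.0, 25.0]

result = bias_hypothesis_tests(darwin_rsl, pht_rsl)
print({name: round(value, 4) for name, value in result.items()})
```

With only five hypothetical points, neither null hypothesis is rejected at the 5 percent level even though the fitted intercept (11 years) and slope (0.6) are far from 0 and 1.0. The example shows why sample size matters when interpreting these tests.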
A preliminary review of the comparison plots shows the following:
A more detailed statistical evaluation of PHT Analysis Tool and DARWinME RSL estimates was performed and the results are presented in Table 3 that shows the following:
Analysis Type  Statistical Hypothesis Test  Mean Value (intercept in years)  95 Percent Confidence Limits  p-Value 

Critical (First distress or IRI)  Intercept = 0  17.97  12.7 to 23.2  < 0.0001 
Critical (First distress or IRI)  Slope = 1  0.43  0.29 to 0.56  < 0.0001 
Critical (First distress or IRI)  Paired t-test  -  -  0.0594 
Weighted Average (Equal weights for all distress and IRI)  Intercept = 0  21.4  13.7 to 29.1  < 0.0001 
Weighted Average (Equal weights for all distress and IRI)  Slope = 1  0.322  0.27 to 0.37  < 0.0001 
Weighted Average (Equal weights for all distress and IRI)  Paired t-test  -  -  < 0.0001 
Finally, a more detailed review of individual distress types and IRI was done to determine which of the individual PHT models produced estimates of RSL significantly different from those from DARWinME. The results are presented in Table 4.
Pavement Type  AC Fatigue Cracking  AC Rutting  AC Transverse Cracking (Thermal Reflection)  JPCP Slab Transverse Cracking  JPCP Faulting  IRI 

New AC  <0.0001  0.1871  0.0035  Not applicable  Not applicable  0.0710 
AC/AC  <0.0001  0.0318  0.3493  Not applicable  Not applicable  0.0343 
New JPCP  Not applicable  Not applicable  Not applicable  0.0539  <0.0001  0.0029 
AC/JPCP  Not applicable  Not applicable  0.6845  Not applicable  Not applicable  <0.0001 
The AC rutting model coefficients adapted from the original MEPDG version 0.8 were modified during PHT Tool development to reduce bias and enhance precision. This version of the rutting model was therefore developed using versions of the MEPDG more recent than the original 2004 NCHRP 1-37A submissions, which is why the PHT Analysis Tool rutting model was found to be adequate.
The data above show that 8 of the 13 distress/IRI and pavement type combinations exhibited PHT Analysis Tool RSL estimates significantly different from those estimated by the DARWinME tool.
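The count of 8 out of 13 can be checked directly from Table 4, assuming its entries are p-values compared against the 0.05 significance level. In the sketch below the "<0.0001" entries are stored as 0.0001, which is an upper bound rather than the exact value.

```python
ALPHA = 0.05  # significance level used throughout the analysis

# p-values transcribed from Table 4; "Not applicable" cells are omitted.
table4 = {
    "New AC": {
        "AC fatigue cracking": 0.0001,
        "AC rutting": 0.1871,
        "AC transverse cracking": 0.0035,
        "IRI": 0.0710,
    },
    "AC/AC": {
        "AC fatigue cracking": 0.0001,
        "AC rutting": 0.0318,
        "AC transverse cracking": 0.3493,
        "IRI": 0.0343,
    },
    "New JPCP": {
        "JPCP slab transverse cracking": 0.0539,
        "JPCP faulting": 0.0001,
        "IRI": 0.0029,
    },
    "AC/JPCP": {
        "AC transverse cracking": 0.6845,
        "IRI": 0.0001,
    },
}

significant = [
    (pavement, distress)
    for pavement, row in table4.items()
    for distress, p_value in row.items()
    if p_value < ALPHA
]
total = sum(len(row) for row in table4.values())

print(f"{len(significant)} of {total} combinations differ significantly")
for pavement, distress in significant:
    print(f"  {pavement}: {distress}")
```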
Any pavement distress/IRI and RSL prediction tool should be a representation of reality. How well reality is represented is dependent upon factors such as reasonableness of the input data, validity of the underlying mathematical algorithms, and the model bias and precision.
Having been used and tested over many years, the DARWinME mathematical algorithms and the LTPP data have proven to be reasonable and robust. A good match between the PHT Analysis Tool and DARWinME output would therefore indicate that the PHT Analysis Tool also has sound mathematical formulations and has been calibrated to reflect the same reality that the DARWinME tool does. An imperfect match, however, indicates that some type of defect exists that will need to be rectified through changes in mathematical formulations and/or recalibration to make the PHT Analysis Tool reflect the DARWinME reality.
Of utmost importance for such a comparative analysis is the definition of an adequate match between the outputs. The statistics commonly used to characterize how well a tool's output reflects reality are bias and precision.
Bias is the systematic difference that arises between the observed and predicted values, as illustrated in Figure 3. Specific to this analysis, bias occurs when PHT estimates of RSL systematically over- or under-predict DARWinME RSL estimates.
Precision is the measure of how closely the observed and predicted data are related to each other as illustrated in Figure 4. Specific to this analysis, precision will be how closely PHT estimates of RSL relate to DARWinME RSL estimates.
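One simple way to put numbers on these two ideas is sketched below. The measures are assumptions chosen for illustration, not the report's definitions: bias is taken as the mean of the differences (PHT minus DARWinME), and the scatter of those differences, measured by their sample standard deviation, is used as a proxy for precision, with a smaller spread meaning higher precision. All data are hypothetical.

```python
from statistics import mean, stdev


def bias_and_spread(reference, predicted):
    """Return (mean difference, sample standard deviation of the differences)."""
    diffs = [p - r for r, p in zip(reference, predicted)]
    return mean(diffs), stdev(diffs)


# Hypothetical DARWinME RSL estimates (years) for four sections.
darwin_rsl = [10.0, 15.0, 20.0, 25.0]

# Case 1: PHT estimates are offset by a constant 4 years.
pht_offset = [14.0, 19.0, 24.0, 29.0]
bias_offset, spread_offset = bias_and_spread(darwin_rsl, pht_offset)

# Case 2: PHT estimates scatter around the DARWinME values with no net offset.
pht_scattered = [8.0, 18.0, 17.0, 27.0]
bias_scattered, spread_scattered = bias_and_spread(darwin_rsl, pht_scattered)

print(bias_offset, spread_offset)        # large bias, no scatter
print(bias_scattered, spread_scattered)  # no bias, noticeable scatter
```

The first case is biased but precise, since every estimate is off by the same amount. The second is unbiased but imprecise, since the errors cancel on average yet individual estimates miss by 2 to 3 years.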
The four scenarios illustrated in Figure 5 show how bias and precision fit into the context of adequacy when comparing observed and predicted data. Specific to this analysis, the outputs from the PHT Analysis Tool will be compared to a baseline reality RSL as estimated from the DARWinME tool.
In the figure above, the shaded ellipse represents DARWinME estimated RSL which has been plotted against PHT Analysis Tool estimated RSL. The solid line at 45 degrees is the line of equality, where DARWinME and PHT estimated RSL are supposed to be equal. Each chart above is described in more detail in Table 5.
A pavement management and planning tool such as PHT must, as a minimum, satisfy the requirements of a model exhibiting low bias and low precision, as illustrated in Figure 5B, to be described as adequate. The PHT sensitivity analysis results shown in Figure 1 and Figure 2 illustrate an outcome closer to that of Figure 5C, with high bias and high precision. High bias and high precision imply that, even though there is a relationship between the RSL estimates from DARWinME and PHT, the outputs from the two tools represent two very different populations.
Distress/IRI Prediction Scenario  Description  Applicability  Needed Enhancement 

Low bias & high precision (see Figure 5A)  This is the best-case scenario for distress/IRI prediction. Low bias and high precision are characteristic of well-calibrated mechanistic-empirical (cause and effect) pavement models. Such models employ a large input dataset that describes and characterizes key pavement structure, materials, design, and site properties. The detailed input data combined with complex mathematical equations form the basis for distress/IRI prediction.  Pavement Design and Analysis: For such analysis, the goal is to obtain the best estimate of pavement performance for a specific section under design/analysis. Extensive resources can thus be employed to obtain the data needed for the models. A reliability factor is added when used for design.  None 
Low bias & low precision (see Figure 5B)  This is the second-best-case scenario for distress/IRI prediction. Low bias and low precision are characteristic of empirical pavement models requiring limited amounts of input data for characterizing pavement structure, materials, design, and site properties.  Pavement Management & Planning: For such analysis, the goal is to obtain an overall best estimate of pavement condition across a pavement network or corridor. Thus the accuracy of pavement distress/IRI prediction for each pavement section does not matter as much as the accuracy of overall network/corridor condition.  None 
High bias & high precision (see Figure 5C)  Although distress/IRI is predicted with high precision, the magnitude of distress/IRI is significantly different from "reality." High precision typically implies that although the underlying model assumptions and algorithms may be reasonable, the model's calibration factors are inadequate, leading to the presence of significant bias.  Not Suitable for Any Kind of Analysis: Models exhibiting high bias cannot be used for pavement design, forensic, management, or planning purposes.  Apply needed translation or rotation correction factors through calibration of models (see Figure 6). 
High bias & low precision (see Figure 5D)  Distress/IRI prediction with low precision and high bias indicates a flawed model.  Not Suitable for Any Kind of Analysis: Models exhibiting high bias and low precision cannot be used for pavement design, forensic, management, or planning purposes.  Apply needed translation or rotation correction factors through calibration of models (see Figure 6). Basic model formulation may have to be revised. 
The reasons for high bias and high precision include the following:
There was a need to develop correction factors that can be used to adjust the existing PHT models and make them more compatible with DARWinME. Such correction factors are typically obtained through calibration. Examples of the correction factors needed in different situations are illustrated in Figure 6.
Based on the statistical comparison results presented, it can be concluded that the PHT Tool pavement performance models mostly produce RSL estimates that are different from those from DARWinME. There is therefore a need to recalibrate the PHT Tool pavement models to make their estimates more comparable to those from DARWinME.
The exact cause of the differences between PHT Tool and DARWinME RSL estimates was further investigated by performing a t-test on the individual distress/IRI RSL estimates obtained from the PHT Tool and DARWinME, since it is from these individual estimates that the overall critical (first distress or IRI) or weighted average RSL is computed. The results of the t-test showed the following models exhibiting significant bias at the 0.05 level of significance:
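As a rough illustration of translation and rotation corrections, the sketch below fits a straight line that maps PHT estimates onto the DARWinME scale, with the intercept acting as the translation and the slope as the rotation. This least-squares approach is an assumption made for illustration and is not the calibration procedure used for the PHT models. The data and function names are hypothetical.

```python
def fit_correction(pht, reference):
    """Least-squares line reference = translation + rotation * pht."""
    n = len(pht)
    mean_x, mean_y = sum(pht) / n, sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in pht)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pht, reference))
    rotation = sxy / sxx                      # slope correction
    translation = mean_y - rotation * mean_x  # intercept correction
    return translation, rotation


def apply_correction(pht, translation, rotation):
    """Adjust PHT estimates with the fitted correction factors."""
    return [translation + rotation * x for x in pht]


# Hypothetical RSL estimates (years); here DARWinME = 3 + 0.5 * PHT exactly.
pht_rsl = [10.0, 20.0, 30.0, 40.0]
darwin_rsl = [8.0, 13.0, 18.0, 23.0]

translation, rotation = fit_correction(pht_rsl, darwin_rsl)
corrected = apply_correction(pht_rsl, translation, rotation)
print(translation, rotation, corrected)
```

A pure translation corresponds to a rotation of 1.0 with a nonzero intercept, while a pure rotation corresponds to a zero intercept with a slope other than 1.0.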
Thus, it is recommended that:
The reason for this verification is to further ensure that PHT Tool estimates of RSL are reasonable. By enhancing precision and reducing bias for the models listed above, the PHT Tool will improve individual distress/IRI based RSL predictions and produce overall RSL estimates with high precision and low bias.
In practical terms, use of the current PHT Analysis Tool to predict RSL may result in overprediction of RSL. The magnitude of the overprediction depends on the pavement design and site conditions. Recalibration is required so that the models produce unbiased predictions.