U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590


Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-17-104    Date:  June 2018


Using Multi-Objective Optimization to Enhance Calibration of Performance Models in the Mechanistic-Empirical Pavement Design Guide




The first step in the proposed research study was a comprehensive literature review regarding calibration of the pavement performance models in the AASHTOWare® Pavement ME Design software. The literature review also included multi-objective model calibration studies in research areas other than pavement engineering. The major objective of this literature review was to identify important sources of information for model calibration and to formulate corresponding objective functions. The review also enabled researchers to base their selection of range and precision of possible calibration factors on previous calibration studies.

Results of previous calibration efforts indicated little to no problem in calibration of thermal (transverse) cracking and smoothness prediction models. Therefore, the existing single-objective calibration procedure seems to be sufficient for these two models.(3) There seem to be difficulties in calibration of the longitudinal cracking model associated with the lack of fit of the global model.(5) These difficulties require reconsideration of the formulation for the longitudinal cracking model, and therefore, this model is not considered for this research project.

The permanent deformation model has been reported to consistently overpredict measured pavement rutting. On the other hand, the fatigue cracking model has been reported to underpredict actual pavement distress in most studies. A more sophisticated calibration procedure could address these consistent deviations of model predictions from measured pavement performance. This research project was therefore originally focused on local calibration of prediction models for rutting and fatigue cracking in flexible pavements. Hence, the following literature review includes information regarding both models. However, due to limitations in the resources and schedule of this project, only the permanent deformation models were considered to demonstrate the proof of concept for multi-objective calibration.


The amount of total permanent deformation in flexible pavements is calculated as the sum of plastic deformations in each of the hot-mix asphalt (HMA), base, and subgrade layers. The model for predicting rutting (permanent deformation) in HMA layers (inches) of flexible pavements has the form of equation 1:(1)

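$$\Delta_p = \varepsilon_p\, h_{HMA} = \beta_{r1}\, k_z\, \varepsilon_r\, 10^{k_1}\, T^{k_2 \beta_{r2}}\, N^{k_3 \beta_{r3}}$$     (1)

Where: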
Δp = predicted rutting (inches).
hHMA= thickness of the HMA layer (inches).
εp= plastic strain in the layer (inch/inch).
εr= resilient (recoverable) strain in the layer (inch/inch).
T = layer temperature.
N = number of load repetitions.
k1, k2, k3 = global field calibration parameters (from NCHRP 1-40D Recalibration, k1 = –3.35412, k2 = 1.5606, k3 = 0.4791).
βr1, βr2, βr3 = local or mixture field calibration factors; these factors were all set to 1.0 for the global calibration.
kz= depth confinement factor, which is calculated through equation 2:

$$k_z = (C_1 + C_2 D)\, 0.328196^{D}$$     (2)

Where:
D = depth below the surface (inches).
C1 and C2 = coefficients to calculate the depth confinement factor; these coefficients are calculated according to equations 3 and 4:

$$C_1 = -0.1039\, h_{HMA}^2 + 2.4868\, h_{HMA} - 17.342$$     (3)

$$C_2 = 0.0172\, h_{HMA}^2 - 1.7331\, h_{HMA} + 27.428$$     (4)
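
As a rough illustration of how equations 1 through 4 combine, the following Python sketch evaluates HMA rutting for a single sublayer. The functional form and the polynomial constants in `depth_confinement_factor` follow the commonly published MEPDG formulation and should be read as assumptions here, as should the function names and sample inputs. The k values are the NCHRP 1-40D values quoted above.

```python
# Global calibration parameters quoted in the text (NCHRP 1-40D).
K1, K2, K3 = -3.35412, 1.5606, 0.4791


def depth_confinement_factor(depth_in, h_hma_in):
    """kz = (C1 + C2 * D) * 0.328196 ** D (constants assumed, see lead-in)."""
    c1 = -0.1039 * h_hma_in ** 2 + 2.4868 * h_hma_in - 17.342
    c2 = 0.0172 * h_hma_in ** 2 - 1.7331 * h_hma_in + 27.428
    return (c1 + c2 * depth_in) * 0.328196 ** depth_in


def hma_rutting(eps_r, h_layer_in, depth_in, h_hma_in, temp, n_loads,
                beta_r1=1.0, beta_r2=1.0, beta_r3=1.0):
    """Rutting (inches) in one HMA sublayer: plastic strain times thickness."""
    kz = depth_confinement_factor(depth_in, h_hma_in)
    eps_p = (beta_r1 * kz * eps_r * 10 ** K1
             * temp ** (K2 * beta_r2) * n_loads ** (K3 * beta_r3))
    return eps_p * h_layer_in


# Hypothetical inputs: 2-inch sublayer centered 3 inches deep in a 6-inch HMA layer.
rut = hma_rutting(eps_r=1e-4, h_layer_in=2.0, depth_in=3.0, h_hma_in=6.0,
                  temp=70.0, n_loads=1e6)
```

With all local factors at 1.0 this reproduces the global model; βr1 scales the result linearly, whereas βr2 and βr3 act as exponents on temperature and load repetitions.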

The model for predicting rutting in unbound (base, subbase, and subgrade soil) layers (inches) of flexible pavements has the form of equation 5:(1)

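$$\Delta_p = \beta_{s1}\, k_1\, \varepsilon_v\, h_{soil} \left( \frac{\varepsilon_0}{\varepsilon_r} \right) e^{-\left( \frac{\rho}{N} \right)^{\beta}}$$     (5)

Where: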
hsoil = thickness of the unbound layer/sublayer (inches).
k1 = global calibration coefficient; k1 = 2.03 for granular materials, and k1 = 1.35 for fine-grained materials.
βs1 = local calibration factor; this factor was set to 1.0 for the global calibration; it is also called βGB for unbound base layers and βSG for subgrade layers.
εv = average vertical resilient or elastic strain in the layer (inch/inch) calculated by the structural response model.
ε0 = strain intercept (inch/inch) determined from laboratory repeated-load permanent deformation tests.
εr = resilient (recoverable) strain (inch/inch) imposed in laboratory test to obtain material properties.
ε0/εr = strain ratio that is calculated using equation 6:

No 508 description provided     (6)

β and ρ are material properties that are calculated according to equations 7 and 8:

$$\log \beta = -0.61119 - 0.017638\, W_c$$     (7)

$$\rho = 10^{9} \left( \frac{C_0}{1 - (10^{9})^{\beta}} \right)^{1/\beta}$$     (8)

Where Wc is water content (%) that is calculated using equation 9:

$$W_c = 51.712 \left[ \left( \frac{M_r}{2555} \right)^{\frac{1}{0.64}} \right]^{-0.3586\, GWT^{0.1192}}$$     (9)

Where:
GWT = depth of ground water table (ft).
C0 = factor depending on the material resilient modulus and is calculated through equation 10:

$$C_0 = \ln \left( \frac{a_1 M_r^{b_1}}{a_9 M_r^{b_9}} \right)$$     (10)

Where:
Mr = resilient modulus of the unbound layer/sublayer (psi).
a1, a9, b1, b9 = regression constants; a1 = 0.15, a9 = 20.0, b1 = 0.0, and b9 = 0.0.
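
The chain of dependencies in the unbound-layer model (water content to β, resilient modulus to C0, then ρ, then rutting) can be sketched as below. This is a minimal sketch that assumes the commonly published forms of equations 5, 7, 8, and 10. The strain ratio ε0/εr and the water content are passed in directly rather than computed from equations 6 and 9, and all function names and sample inputs are hypothetical.

```python
import math

# Regression constants quoted in the text.
A1, A9, B1, B9 = 0.15, 20.0, 0.0, 0.0


def beta_from_water_content(wc_pct):
    """Assumed form of equation 7: log10(beta) = -0.61119 - 0.017638 * Wc."""
    return 10.0 ** (-0.61119 - 0.017638 * wc_pct)


def c0_factor(mr_psi):
    """Assumed form of equation 10: natural log of the ratio of the a/b terms."""
    return math.log((A1 * mr_psi ** B1) / (A9 * mr_psi ** B9))


def rho_from_beta(beta, c0):
    """Assumed form of equation 8."""
    return 1e9 * (c0 / (1.0 - (1e9) ** beta)) ** (1.0 / beta)


def unbound_rutting(eps_v, h_soil_in, strain_ratio, n_loads, wc_pct, mr_psi,
                    k1=2.03, beta_s1=1.0):
    """Rutting (inches) in one unbound sublayer (assumed form of equation 5)."""
    beta = beta_from_water_content(wc_pct)
    rho = rho_from_beta(beta, c0_factor(mr_psi))
    return (beta_s1 * k1 * eps_v * h_soil_in * strain_ratio
            * math.exp(-((rho / n_loads) ** beta)))


# Hypothetical granular base sublayer (k1 = 2.03 as quoted in the text).
rut = unbound_rutting(eps_v=3e-4, h_soil_in=6.0, strain_ratio=20.0,
                      n_loads=1e6, wc_pct=10.0, mr_psi=30000.0)
```

Because βs1 (βGB or βSG) multiplies the whole expression, adjusting it shifts predictions up or down without changing the shape of the rutting curve over N.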

The prediction model for fatigue (bottom–up or alligator) cracking (percent of total lane area) in flexible pavements has the form of equation 11:(1)

No 508 description provided     (11)


FCBottom–up = bottom–up alligator cracking.
C*1 and C*2 are coefficients that can be calculated using equations 12 and 13:

$$C_1^{*} = -2\, C_2^{*}$$     (12)

$$C_2^{*} = -2.40874 - 39.748\, (1 + h_{HMA})^{-2.856}$$     (13)

Where:
hHMA = total HMA thickness.
DIbottom–up = damage index that is calculated using equation 14:

$$DI_{bottom\text{-}up} = \sum (\Delta DI)_{j,m,l,p,T} = \sum \left( \frac{N}{N_{f\,HMA}} \right)_{j,m,l,p,T}$$     (14)

Where:
ΔDI = incremental damage index.
N = actual number of axle load applications within a specific period.
j = axle load interval.
m = axle load type (single, tandem, tridem, quad, or special axle configuration).
l = truck type using the truck classification groups included in the MEPDG.
p = month.
T = median temperature for the five temperature intervals used to subdivide each month.
Nf HMA = allowable number of axle load applications for a flexible pavement to fatigue cracking, and it is calculated using equation 15:

$$N_{f\,HMA} = k_{f1}\, C\, C_h\, \beta_{f1}\, (\varepsilon_t)^{k_{f2} \beta_{f2}}\, (E)^{k_{f3} \beta_{f3}}$$     (15)

Where:
εt = tensile strain at the critical location.
E = dynamic modulus measured in compression.
kf1, kf2, kf3 = global field calibration parameters (from NCHRP 1-40D Recalibration, kf1 = 0.007566, kf2 = –3.9492, kf3 = –1.281).(6)
βf1, βf2, βf3 = local or mixture field calibration factors; these factors were all set to 1.0 for the global calibration.
C = constant depending on mix properties and calculated using equations 16 and 17:

$$C = 10^{M}$$     (16)

$$M = 4.84 \left( \frac{V_{be}}{V_a + V_{be}} - 0.69 \right)$$     (17)

Where:
Va = air voids at the time the roadway is opened to traffic (%).
Vbe = effective asphalt content by volume of the mix placed on the roadway (%).
Ch = thickness correction term, and it is calculated using equation 18:

$$C_h = \frac{1}{0.000398 + \dfrac{0.003602}{1 + e^{(11.02 - 3.49\, h_{HMA})}}}$$     (18)
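
The following Python sketch illustrates how equations 14 through 18 fit together: the allowable load repetitions are computed from strain, modulus, and mix properties, and damage is accumulated as a sum of load ratios. The numeric constants inside `thickness_correction` and `mix_constant` follow the commonly published MEPDG forms and are assumptions here, as are the function names and sample values. The transfer function of equation 11 is deliberately left out.

```python
import math

# Global field calibration parameters quoted in the text (NCHRP 1-40D).
KF1, KF2, KF3 = 0.007566, -3.9492, -1.281


def thickness_correction(h_hma_in):
    """Ch for bottom-up cracking (assumed form of equation 18)."""
    return 1.0 / (0.000398
                  + 0.003602 / (1.0 + math.exp(11.02 - 3.49 * h_hma_in)))


def mix_constant(va_pct, vbe_pct):
    """C = 10 ** M, with M from air voids and effective binder volume."""
    m = 4.84 * (vbe_pct / (va_pct + vbe_pct) - 0.69)
    return 10.0 ** m


def allowable_loads(eps_t, e_psi, h_hma_in, va_pct, vbe_pct,
                    beta_f1=1.0, beta_f2=1.0, beta_f3=1.0):
    """Nf: allowable axle load applications to fatigue cracking."""
    return (KF1 * mix_constant(va_pct, vbe_pct) * thickness_correction(h_hma_in)
            * beta_f1 * eps_t ** (KF2 * beta_f2) * e_psi ** (KF3 * beta_f3))


def damage_index(increments):
    """Sum of N / Nf over (applied, allowable) pairs for each analysis cell."""
    return sum(n / nf for n, nf in increments)


# Hypothetical single analysis cell: 50,000 applications at 150 microstrain.
nf = allowable_loads(eps_t=150e-6, e_psi=500000.0, h_hma_in=6.0,
                     va_pct=7.0, vbe_pct=11.0)
di = damage_index([(50000.0, nf)])
```

In a full analysis the list passed to `damage_index` would hold one pair per combination of axle load interval, axle type, truck type, month, and temperature interval.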

The local calibration procedure is aimed at determining the calibration factors that minimize the difference between measured and predicted pavement performance. This process includes reducing bias through minimization of average prediction error and lessening error variation through reduction of the standard deviation of error. Table 1 lists the calibration factors or coefficients that need to be determined in the model calibration process for rutting and fatigue cracking in flexible pavements.

Table 1. Calibration factors in prediction models for rutting and fatigue cracking in flexible pavements.(3)

| Performance Model | Calibration Objective: Reduce Bias | Calibration Objective: Reduce STE |
| --- | --- | --- |
| Permanent deformation | k1, βr1, βGB, and/or βSG | k2, k3 and βr2, βr3 |
| Fatigue cracking | C2 or βf1 | βf2, βf3 and C1 |

STE = standard error.


NCHRP Project 1-40B found that the calibration factors in table 1 contribute to bias and standard error (STE), respectively.(4) The current single-objective calibration procedure determines the calibration factors in two steps corresponding to “eliminating” bias and reducing STE, respectively.(3) However, the multi-objective calibration approach in this research project will involve the determination of optimum values for all calibration factors to reduce bias and STE at the same time.
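
To make the two objectives concrete, the sketch below computes bias as the mean prediction error and STE as the standard deviation of the errors, then keeps only the candidate calibration sets that are not dominated on both objectives. The Pareto filter is one common way to frame a two-objective comparison and is used here purely as an illustration; the function names and sample numbers are hypothetical.

```python
import statistics


def bias_and_ste(predicted, measured):
    """Return (mean error, sample standard deviation of error)."""
    errors = [p - m for p, m in zip(predicted, measured)]
    return statistics.fmean(errors), statistics.stdev(errors)


def pareto_front(candidates):
    """Keep candidates not dominated on (abs_bias, ste); lower is better.

    Each candidate is a tuple (label, abs_bias, ste).
    """
    front = []
    for cand in candidates:
        dominated = any(
            other[1] <= cand[1] and other[2] <= cand[2]
            and (other[1] < cand[1] or other[2] < cand[2])
            for other in candidates
        )
        if not dominated:
            front.append(cand)
    return front


# Hypothetical rut depths (inches) for three sections.
bias, ste = bias_and_ste([0.30, 0.42, 0.55], [0.25, 0.40, 0.45])

# Hypothetical calibration sets scored on both objectives.
sets = [("A", 0.01, 0.10), ("B", 0.02, 0.05), ("C", 0.03, 0.12)]
front = pareto_front(sets)  # "C" is worse than "A" on both objectives
```

A two-step procedure would pick the lowest-bias set first and only then look at STE, whereas a simultaneous approach weighs the trade-off between sets such as A and B directly.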


A very important task in calibration and implementation of AASHTOWare® Pavement ME Design software is selection of accurate values for input variables. Input variables fall into three main categories (pavement structure, climate, and traffic), and determining their values for every design project requires considerable effort. Accordingly, many State agencies have sponsored research efforts to characterize local pavement materials, determine local climatic data, and classify local traffic patterns. In fact, several State agencies have developed databases or software that specify corresponding values for each input variable to be used in the implementation of AASHTOWare®.(7,8)

The majority of the States have used LTPP data in combination with their State pavement management system (PMS) database to develop their MEPDG calibration database.(9,10) Differences in distress identification protocols between LTPP and State PMS surveys are a source of concern regarding the combination of these data sources to be used in model calibration efforts. Some States have addressed this issue by interpreting their distress data according to the Distress Identification Manual for the Long-Term Pavement Performance Program and using the transformed data.(11)

There are three levels of data precision (hierarchical input levels) for MEPDG input variables. Level 1 input values are site-specific data based on laboratory or field measurements that are the most accurate values. Level 2 values are derived based on correlations with other locally measured parameters or available historical data that were not necessarily measured at the specific site. Level 3 data are the default values that were established based on national averages, correlations, or both. Depending on the sensitivity of the predicted output to each input variable, it is important to use level 1 data when available.

Regarding asphalt material characterization in the MEPDG performance models, the most important (influential) input variable is the dynamic modulus of HMA. FHWA has developed software based on Artificial Neural Network (ANN) models to populate the LTPP database with dynamic modulus data.(12) Several State departments of transportation (DOTs) have also conducted HMA material characterization studies to determine asphalt binder and mixture properties to be used as level 1 (agency-specific) input values in AASHTOWare® Pavement ME Design software.(13) One of the key efforts in these studies was an evaluation of the Witczak model for calculation of dynamic modulus. Most of these studies found the Witczak model to produce reasonable predictions for dynamic modulus of HMA with conventional binders and mixtures. However, further modifications were required for binders with higher performance grades (PGs) and nonconventional mixtures, such as high recycled asphalt pavement (RAP) content, stone-matrix asphalt, cold-recycled asphalt, and warm-mix asphalt mixtures.

Most of the studies on characterization of unbound materials in flexible pavements have focused on determining resilient modulus values for typical granular aggregate base materials and local subgrade soils.(13) Several studies have also developed a resilient modulus prediction model based on soil parameters. In addition, falling weight deflectometer (FWD) and other nondestructive test results have been used to determine the resilient modulus values. The LTPP database contains repeated-load resilient modulus test results and FWD-measured deflections that could be used in this regard.

The LTPP database contains extensive climatic data either measured at LTPP sites or estimated from adjacent weather stations. The impact of climatic and environmental parameters on material properties of unbound pavement layers is captured using the Enhanced Integrated Climatic Model (EICM) in MEPDG. Several studies have evaluated the predictions of EICM with test data.(13) Change in resilient modulus values due to seasonal variations and behavior of unsaturated soils is another topic currently under research in this area.

Traffic data inputs for the AASHTOWare® Pavement ME Design software have been calculated and are accessible for LTPP sites. LTPP data have been utilized to establish level 3 traffic inputs for the MEPDG. Several States have developed agency-specific traffic data and axle load spectra.(13) Some have also developed customized software to calculate MEPDG traffic inputs from weigh-in-motion (WIM) data.


Sensitivity analysis of performance prediction models is a qualitative assessment that can be implemented for multiple purposes, such as the following:

This research project will include a sensitivity analysis on the final calibrated models for the second purpose. The majority of sensitivity analyses conducted on MEPDG performance models in the literature correspond to the first purpose and have been carried out before calibration. In this project, the results of the previous studies will be utilized to determine the suitable range of input variables and calibration factors. The following are two types of sensitivity analyses on MEPDG performance models in the literature:

Sensitivity to Input Variables

The most comprehensive sensitivity analysis of MEPDG performance models to changes in input variables was carried out in NCHRP Project 01-47, and the results provide valuable information regarding range and precision of input values to be considered for calibration of each model.(14) The adopted sensitivity metric was a Normalized Sensitivity Index (NSI), which represents the percent change in predicted performance relative to its design limit value, normalized to the percent change in an input variable.
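
One reading of this definition is NSI = (ΔY / DL) / (ΔX / X), where Y is the predicted distress, DL is its design limit, and X is the input being varied. The sketch below estimates that quantity with a one-at-a-time finite difference. The formula as written, the function name, the step size, and the toy model are all assumptions for illustration and are not taken from NCHRP Project 01-47.

```python
def normalized_sensitivity_index(model, inputs, name, design_limit,
                                 rel_step=0.01):
    """Estimate NSI for one input by perturbing it by rel_step (for example 1%)."""
    base = model(inputs)
    perturbed = dict(inputs)
    perturbed[name] = inputs[name] * (1.0 + rel_step)
    delta_y = model(perturbed) - base
    # Change in output as a fraction of the design limit,
    # divided by the fractional change in the input.
    return (delta_y / design_limit) / rel_step


def toy_model(x):
    """Hypothetical distress model: rutting falls as HMA thickness rises."""
    return 0.02 * x["traffic_millions"] ** 0.5 * (10.0 / x["thickness_in"])


inputs = {"traffic_millions": 4.0, "thickness_in": 8.0}
nsi_thickness = normalized_sensitivity_index(
    toy_model, inputs, "thickness_in", design_limit=0.5)
nsi_traffic = normalized_sensitivity_index(
    toy_model, inputs, "traffic_millions", design_limit=0.5)
```

A negative value means the predicted distress falls as the input increases, which matches the sign convention used for the NSI values reported in table 2.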

This study comprised extensive one-at-a-time (OAT) sensitivity analyses in addition to comprehensive global sensitivity analysis (GSA). In contrast to the OAT analyses, the GSA varied all design inputs simultaneously across the entire problem domain. The general agreement between OAT and GSA rankings of sensitivity to various input variables suggests that there were no significant interactions among design inputs. Therefore, the OAT analyses, which are computationally less demanding, could be adequate for sensitivity analysis of MEPDG performance models.

Multivariate linear regression and ANNs were utilized to fit response surface models (RSMs) to the GSA results, allowing for evaluation of sensitivities to design input variables. The ANN resulted in more accurate and robust representations of the compound relations between input design variables and output performance values. Based on frequency distributions and summary statistics generated using the ANN RSM, a “mean plus/minus two standard deviations” (m ± 2s) normalized sensitivity metric (NSIμ±2σ) was derived, which incorporates the mean sensitivity and the variability of the sensitivity across the problem domain. This metric was used to develop the following sensitivity categories:

The hypersensitive, very sensitive, and sensitive design inputs for rutting and fatigue cracking models are listed in table 2. As indicated in this table, the performance predictions are most sensitive to the dynamic modulus (E*) of HMA layers. Poisson’s ratio and thickness of the HMA layer and the surface shortwave absorptivity are also important input variables to which these models have shown high sensitivity.

The extreme sensitivity of performance models to the lower and upper shelves of HMA dynamic modulus master curve (alpha and delta parameters) is a questionable behavior. Nevertheless, this calls for careful characterization of dynamic modulus using mix-specific laboratory measurements. In addition, accurate representation is required for thickness and Poisson’s ratio values. The most challenging insight from this sensitivity analysis is that the performance models are very sensitive to several uncertain variables, such as the surface shortwave absorptivity for HMA, thermal conductivity, and heat capacity of stabilized bases, that cannot be readily measured.

Table 2. Sensitive design inputs for rutting and fatigue cracking models.(14) NSIμ±2σ values are given in parentheses.

| Distress | Input Group | Design Inputs, Listed From Most to Least Sensitive (Hypersensitive, Very Sensitive, Sensitive) |
| --- | --- | --- |
| Fatigue cracking | HMA properties | E* alpha (–15.9); E* delta (–13.2); Thickness (–7.5); Air voids (+3.4); Effective binder volume; Surface shortwave absorptivity (+1.3); Poisson’s ratio (–1.0); Unit weight (+1.0); Heat capacity (–0.6); High-temperature PG (–0.5); Thermal conductivity (–0.4) |
| Fatigue cracking | Base properties | Resilient modulus (–2.7); Thickness (–1.0); Poisson’s ratio (+0.9) |
| Fatigue cracking | Subgrade properties | Resilient modulus (–3.4); Liquid limit (–0.8); Percent passing no. 200 (–0.7); Poisson’s ratio (–0.6); Groundwater depth (–0.2); Plasticity index (+0.1) |
| Fatigue cracking | Other properties | Traffic volume (+3.9); Operating speed (–0.8) |
| AC rutting | HMA properties | E* alpha (–24.4); E* delta (–24.4); Surface shortwave absorptivity (+4.6); Poisson’s ratio (–4.3); Thickness (–4.2); Unit weight (–0.9); Heat capacity (–0.8); High-temperature PG (–0.7); Low-temperature PG (+0.2); Thermal conductivity (+0.2) |
| AC rutting | Base properties | Thickness (+0.2); Poisson’s ratio (–0.2); Resilient modulus (+0.1) |
| AC rutting | Subgrade properties | Percent passing no. 200 (–0.1); Liquid limit (–0.1) |
| AC rutting | Other properties | Traffic volume (+1.9); Operating speed (–1.1) |
| Total rutting | HMA properties | E* alpha (–9.0); E* delta (–9.0); Surface shortwave absorptivity (+1.7); Thickness (–1.6); Poisson’s ratio (–1.5); Unit weight (–0.3); Heat capacity (–0.3); High-temperature PG (–0.2) |
| Total rutting | Base properties | Resilient modulus (–0.2) |
| Total rutting | Subgrade properties | Resilient modulus (–0.3); Percent passing no. 200 (–0.1) |
| Total rutting | Other properties | Traffic volume (+0.7); Operating speed (–0.4) |

AC = asphalt concrete; E* = dynamic modulus of the HMA layer.


Sensitivity to Calibration Factors

Li et al. (2009) introduced another kind of sensitivity analysis, which is used to determine range and precision of calibration factors.(15) This study on calibration of MEPDG flexible pavement models for Washington DOT examines sensitivity of distress output to the change in each calibration factor. This sensitivity is represented by a metric called elasticity, which was calculated as in equation 19:(15)

$$E_{C_i}^{distress} = \frac{\partial(\text{distress}) / \text{distress}}{\partial C_i / C_i}$$     (19)

Where:
$E_{C_i}^{distress}$ = the elasticity of calibration factor Ci for the associated distress condition.
∂ (distress) = change in distress.
distress = initial distress.

The elasticity is calculated as the ratio of the normalized change in predicted distress to the normalized change in calibration factor. A positive value means that the predicted distress increases as the calibration factor increases, and a negative value implies that the predicted distress decreases as the calibration factor increases. Based on typical pavement structure, traffic, and climatic data in the Washington DOT PMS database, table 3 indicates elasticity values for calibration factors in MEPDG rutting and fatigue cracking models.(15)

Table 3. Elasticity of MEPDG calibration factors in rutting and fatigue cracking models for Washington State DOT flexible pavements.(15)

| Distress | Calibration Factor | Elasticity | Related Input Variables |
| --- | --- | --- | --- |
| Fatigue cracking | βf1 | –3.3 | Effective binder content, air voids, AC thickness |
| Fatigue cracking | βf2 | –40 | Tensile strain |
| Fatigue cracking | βf3 | 20 | Material stiffness |
| Fatigue cracking | C1 | 1 | AC thickness |
| Fatigue cracking | C2 | 0 | Fatigue damage, AC thickness |
| Fatigue cracking | C3 | ≈0 | No related variable |
| Rutting | βr1 | 0.6 | Layer thickness, layer resilient strain |
| Rutting | βr2 | 20.6 | Temperature |
| Rutting | βr3 | 8.9 | Number of load repetitions |

AC = asphalt concrete.


The higher absolute values of elasticity for βf2, βf3, βr2, and βr3 indicate that model predictions are more sensitive to these calibration factors. As a result, successful calibration requires a higher degree of precision for these factors than for the others in the optimization procedure. It should be noted that increasing the precision of calibration factors raises the computational cost of the optimization procedure. Therefore, the selected precision for each factor should be commensurate with its corresponding elasticity.
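
The elasticity described above can be approximated numerically by perturbing one calibration factor at a time. The sketch below does this for a toy power-law rutting expression. The toy model, its scale constant, the 1-percent step, and the function names are assumptions for illustration, and the resulting values are not meant to reproduce table 3.

```python
K2, K3 = 1.5606, 0.4791  # exponents quoted earlier in the text (NCHRP 1-40D)


def elasticity(predict, factors, name, rel_step=0.01):
    """(change in distress / distress) / (change in factor / factor)."""
    base = predict(factors)
    bumped = dict(factors)
    bumped[name] = factors[name] * (1.0 + rel_step)
    return ((predict(bumped) - base) / base) / rel_step


def toy_rutting(f, temp=70.0, n_loads=1e6):
    """Hypothetical single-layer rutting with an arbitrary scale constant."""
    return (f["br1"] * 1e-4
            * temp ** (K2 * f["br2"]) * n_loads ** (K3 * f["br3"]))


factors = {"br1": 1.0, "br2": 1.0, "br3": 1.0}
e_br1 = elasticity(toy_rutting, factors, "br1")  # linear factor
e_br2 = elasticity(toy_rutting, factors, "br2")  # exponent on temperature
e_br3 = elasticity(toy_rutting, factors, "br3")  # exponent on load repetitions
```

Even in this toy setting the exponent factors show much larger elasticities than the linear multiplier, which is the qualitative pattern that motivates a finer search resolution for βr2 and βr3.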

It should also be noted that the elasticity metric needs to be identified according to the local pavement structure, climate, and traffic data. In another study on calibration of AASHTOWare® Pavement ME Design software for Iowa, Ceylan et al. used a similar sensitivity metric to calculate the change in performance prediction caused by change in calibration factors.(16)


Global calibration and validation of MEPDG performance models were completed using a subset of LTPP data based on national averages.(2) Since then, numerous State DOTs have been calibrating these models to their own regional materials–traffic–climate conditions. Two important studies, NCHRP 9-30 and NCHRP 1-40B, have provided guidelines in this regard.(17,4) The NCHRP Synthesis 457 provides a comprehensive report on the pavement design practices and MEPDG implementation status in various States across the country.(10) This report also includes agency implementation challenges and details case examples of the MEPDG implementation process in three States.

NCHRP 1-40B provides the following 11-step procedure for verification, calibration, and validation of the MEPDG models for local conditions, which has been adopted by AASHTO:(3)

  1. Select hierarchical input level.
  2. Develop experimental plan and sampling template.
  3. Estimate sample size.
  4. Select roadway segments.
  5. Evaluate project and distress data.
  6. Conduct field testing and forensic investigation.
  7. Assess local bias.
  8. Eliminate local bias.
  9. Assess STE of the estimate.
  10. Reduce STE of the estimate.
  11. Interpret the results.

Statistical significance testing is recommended at various steps to determine if the models need further calibration. At the seventh step, the significance of the bias (the average difference between predicted and measured performance) is tested. If there is a significant bias in prediction of pavement performance measures, the first round of calibration is conducted at the eighth step to eliminate bias. For example, during this step for the rutting models, the sum of squared errors (SSE) is minimized by adjusting the βr1, βGB, and βSG calibration factors.

At the ninth step, the STE (standard deviation of error among the calibration dataset) is evaluated by comparing it to the STE from the national global calibration. If there is a significant STE, the second round of calibration at the 10th step tries to reduce the STE by adjusting the βr2 and βr3 calibration factors. A final validation step checks for the reasonableness of performance predictions. The flowcharts depicted in figure 1 and figure 2 demonstrate this calibration process.(3)
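
Steps 7 and 8 can be sketched in a few lines: a t-statistic for the mean prediction error, followed by a search for the multiplicative factor that minimizes the SSE. This is a minimal sketch under simplifying assumptions, namely a single scale factor applied to precomputed predictions and a coarse candidate grid, and the function names and numbers are hypothetical. An actual calibration reruns the design software for each trial factor and compares the statistic with a critical value at the chosen confidence level.

```python
import math
import statistics


def bias_t_statistic(predicted, measured):
    """t = mean(error) / (stdev(error) / sqrt(n)) for paired observations."""
    errors = [p - m for p, m in zip(predicted, measured)]
    mean_err = statistics.fmean(errors)
    sd_err = statistics.stdev(errors)
    return mean_err / (sd_err / math.sqrt(len(errors)))


def sse(predicted, measured):
    """Sum of squared errors between predictions and measurements."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def best_scale_factor(raw_predictions, measured, candidates):
    """Pick the candidate multiplier that minimizes SSE (step 8 analogue)."""
    return min(
        candidates,
        key=lambda b: sse([b * p for p in raw_predictions], measured),
    )


# Hypothetical rut depths (inches): the uncalibrated model overpredicts.
raw = [0.2, 0.4, 0.6]
obs = [0.1, 0.2, 0.3]
t_before = bias_t_statistic(raw, obs)
beta = best_scale_factor(raw, obs, [0.25, 0.5, 0.75, 1.0])
```

Reducing STE in step 10 would follow the same pattern but vary the exponent-type factors (βr2 and βr3 for rutting) instead of a simple multiplier.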

This figure includes a flowchart that describes the AASHTO recommended process for calibration of the performance models in the Mechanistic–Empirical Pavement Design Guide to local materials, traffic, and climatic data. This is the first part of the flowchart, and the second part continues in figure 2. Step 1 is to select hierarchical input levels for use in local calibration, a policy decision. There is an arrow from step 1 to a connection point A, from which the flowchart will continue in figure 2. There is another arrow from step 1 to step 2, which is to develop experimental design and matrix: a fractional, blocked, or stratified factorial design. Step 3 is to estimate sample size for each distress simulation model. A precursor task in this flowchart that provides input to both the step 2 and step 3 boxes is to decide on level of confidence for accepting or rejecting the null hypotheses, which are the assumptions of no bias and that the local standard error equals the global standard error. This determined level of confidence will be used later in the flowchart to determine the number of condition surveys to be included in the experimental matrix. From step 3, there is an arrow to step 4, which is to select roadway segments. The next box is the type and number of test sections. From there, the next box is roadway segments, PMS sites, that are used to determine and eliminate bias. The next box is the roadway segments, research grade (LTPP), that are used to determine and eliminate bias and determine standard error. The next box is APT with simulated truck loading and APT with full-scale truck loading, which are used to minimize the number of roadway segments and quantify components of error term. There are arrows from all of these test sections and from the confidence level to a box that is for determination of the number of condition surveys available for each section included in the experimental matrix and time-history distress data. 
The next box is step 5, which is to extract and evaluate roadway segment or test section data. Next is time-history distress data, from which there are arrows to two boxes: One is to APT and research-grade segments, and the other is to PMS segments. For PMS segments, MEPDG and PMS distress are compared, and two options are explored: Either perform detailed distress surveys (using LTPP protocol) over time if needed, or just use the PMS distress data. Next, for both the PMS and other (APT and research grade) data, the outliers or segments with irrational trends in data are identified and removed from the database. The next task is to extract other pavement data to determine inputs to MEPDG for remaining sites, including layer type and thickness, material and soil properties, and traffic and climate data. These other data need to be generated based on the hierarchical levels established in step 1. The next task is to identify missing data elements for MEPDG execution. There is an arrow from here to a connection point B, from which the flowchart will continue in figure 2.

Reprinted from Guide for the Local Calibration of the Mechanistic–Empirical Pavement Design Guide, 2010, by the American Association of State Highway and Transportation Officials, Washington, DC. Used by permission.

Figure 1. Flowchart. The AASHTO recommended procedure for local calibration of MEPDG performance models, steps 1 through 5.(3)


This figure includes a flowchart that describes the AASHTO recommended process for calibration of the performance models in the Mechanistic–Empirical Pavement Design Guide to local materials, traffic, and climatic data. This is the second part of the flowchart, and the first part is in figure 1. From the flowchart connection point B from figure 1, there is an arrow to step 6 here, which is to conduct field investigations of test sections to define missing data. From step 6 and the flowchart connection point A from figure 1, there are arrows into a box with the task to develop material sampling and data collection plan. From there, there is a question to be answered: Should the MEPDG assumptions be accepted? from which there are two options. If MEPDG assumptions are accepted, then forensic investigations are not required, and only field tests are needed to obtain missing data. If MEPDG assumptions are questioned or rejected, then forensic investigations are required; trenches and cores are needed to determine direction of crack propagation and amount of rutting in each layer to confirm or reject assumptions. Then, the field testing and materials sampling plan are conducted to define missing data. The next box is an optional activity to reevaluate experimental matrix to ensure hypothesis can be properly evaluated and accepted or rejected. The next task is to conduct laboratory materials testing plan to determine missing data. Next is step 7, to assess bias for experimental matrix or sampling template. Then, the inputs for each road segment are determined, and the MEPDG software is executed for distress predictions. At this stage of the flowchart, if there are only PMS segments with PMS distress data, then the PMS distress measurements are adjusted or combined to match the MEPDG distress protocol. If there are more detailed or research grade (LTPP) or APT data, then there is no need to adjust the distress. 
Then, the MEPDG predicted distress is compared to the measured distress to compute local bias for distress transfer functions. The next is the decision to accept or reject the hypothesis related to bias. If the hypothesis is rejected, meaning there is a statistically significant bias between measured and predicted distress, step 8 is to determine the local calibration coefficients to eliminate bias of the transfer function. This step is carried out by minimizing the bias, while adjusting the local calibration coefficients. These local calibration coefficients are then used to predict distress and calculate standard error of the estimate. Whether the bias-related hypothesis is rejected or accepted, step 9 is to assess standard error for the transfer function. Then, a decision needs to be made regarding accepting or rejecting the hypothesis for standard error. If this hypothesis is rejected, meaning that the standard error of the estimate is too large, then step 10 is to improve precision of the model by modifying coefficients and exponents or developing calibration functions. Whether the hypothesis for standard error is accepted or rejected, step 11 is an interpretation of the results and to decide on adequacy of calibration coefficients. Finally, the calibration coefficients are accepted for use in design.

Reprinted from Guide for the Local Calibration of the Mechanistic–Empirical Pavement Design Guide, 2010, by the American Association of State Highway and Transportation Officials, Washington, DC. Used by permission.

Figure 2. Flowchart. The AASHTO recommended procedure for local calibration of MEPDG performance models, steps 6 through 11.(3)


The most recent literature review on calibration of MEPDG in different States was conducted as part of a study for Georgia DOT.(13) Results of this literature review and other similar studies have been compiled to provide a summary of the State calibration efforts in table 4 and a list of reported calibration factors for flexible pavement fatigue cracking and rutting models in table 5.

Most of the past calibration studies suggest that the MEPDG rutting prediction models overpredict rutting in unbound pavement layers. However, the rutting predicted in asphalt concrete (AC) layers seems to be easily calibrated. The fatigue (alligator) cracking model seems to underpredict actual pavement distress and has high variation in the predicted values. There seems to be little to no problem in calibration of transverse cracking and smoothness prediction models. There seems to be no specific trend for the flexible pavement longitudinal cracking model, and none of the studies reported a successful calibration of it. The MEPDG longitudinal cracking model is not considered in the scope of this study because several past studies have expressed concern about the lack of fit of this model.(18) Difficulty in differentiating longitudinal cracks in the wheelpath from alligator cracking patterns might have contributed to errors in measured longitudinal cracking values.

Differences among various distress identification protocols (e.g., LTPP versus State PMS) and the subjective nature of identifying distress type and severity have been noted as sources of measurement error that cause significant challenges in calibration of mechanistic models to field-measured performance data.(19,20)

Table 4. Major State efforts for calibration of MEPDG performance models.

Study | Scope | Major Findings
NCHRP 1-37A(2) | National calibration of MEPDG models | National calibration of MEPDG models
NCHRP 9-30(17) | Calibration of flexible pavement performance models for structural and mix design | Procedures for adjusting global coefficients according to lab data
 | Independent review of the MEPDG | Rutting is overpredicted in unbound pavement layers.
NCHRP 1-40B(4) | 11-step recommended calibration procedure | 11-step recommended calibration procedure
NCHRP 1-40D(6) | National recalibration of MEPDG models | National recalibration of MEPDG models
Von Quintus and Moulthrop 2007(5) | Calibration of MEPDG flexible pavement performance models for Montana | Lack of fit for the longitudinal flexible pavement cracking model
Kang et al. 2007(7) | Midwest regional pavement performance database for MEPDG calibration | Database creation is very labor intensive and unreliable.
Von Quintus 2008(18) | Overview of selected studies on local calibration of MEPDG | Summary of flexible pavement local calibration factors from national and local calibrations
Muthadi and Kim 2008(22) | Calibration of MEPDG flexible pavement performance models for North Carolina | Calibration factors for rutting and fatigue cracking models. MEPDG models underpredict fatigue cracking.
Banerjee et al. 2009(23) | Calibration of MEPDG flexible pavement performance models for Texas | Regional and local calibration factors for rutting
Li et al. 2009(15) | Calibration of MEPDG flexible pavement performance models for Washington | The important calibration factors were identified according to the sensitivity of the models to them.
Titus-Glover and Mallela 2009(24) | Calibration of MEPDG performance models for Ohio | Calibration of MEPDG performance models for Ohio
Souliman et al. 2010(25) | Calibration of MEPDG flexible pavement performance models for Arizona | Calibration of MEPDG flexible pavement performance models for Arizona
Hoegh et al. 2010(26) | Calibration of MEPDG rutting models for Minnesota | Modified rutting model based on MnROAD data
Hall et al. 2011(27) | Calibration of MEPDG flexible pavement performance models for Arkansas | Variation in predicted fatigue cracking remains high and is not improved by calibration.
Williams and Shaidur 2013(28) | Calibration of MEPDG performance models for Oregon | Calibration of MEPDG performance models for Oregon
Ceylan et al. 2013(16) | Calibration of MEPDG performance models for Iowa | Nationally calibrated rutting model provides acceptable predictions for Iowa.
Mallela et al. 2013(29) | Calibration of MEPDG performance models for Colorado | Calibration of MEPDG performance models for Colorado
MnROAD = Minnesota Department of Transportation pavement test track.


Table 5. Local calibration factors for MEPDG fatigue cracking and rutting prediction models.

Performance Model | HMA Fatigue | HMA Fatigue | HMA Fatigue | Bottom–Up Cracking | Bottom–Up Cracking | HMA Rutting | HMA Rutting | HMA Rutting | Base | Subgrade Rutting
Coefficient | βf1 | βf2 | βf3 | C1 | C2 | βr1 | βr2 | βr3 | βGB | βSG
National | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
AR | 1 | 1 | 1 | 0.688 | 0.294 | 1.2 | 1 | 0.8 | 1 | 0.5
AZ* | 0.729 | 0.8 | 0.8 | 0.732 | 0.732 | 3.63 | 1.1 | 0.7 | 0.111 | 1.38
CO^ | 130.367 | 1 | 1.2178 | 0.07 | 2.35 | 1.34 | 1 | 1 | 0.4 | 0.84
IA | 1 | 1 | 1 | 1 | 1 | 1 | 1.15 | 1 | 0 | 0
MO | 1 | 1 | 1 | 1 | 1 | 1.07 | 1 | 1 | 0.01 | 0.4375
MT | 13.21 | 1 | 1.25 | 1 | 1 | 7 | 1.13 | 0.7 | 1 | 0.3
NC* | 1.41 | –2.82 | –6.67 | 0.4372 | 0.15049 | 1.0175 | 1 | 1 | 1.5803 | 1.10491
OH | 1 | 1 | 1 | 1 | 1 | 0.51 | 1 | 1 | 0.32 | 0.33
OR | 1 | 1 | 1 | 0.56 | 0.225 | 1.48 | 1 | 0.9 | 0 | 0
UT | 1 | 1 | 1 | 1 | 1 | 0.56 | 1 | 1 | 0.604 | 0.4
WA* | 0.96 | 0.97 | 1.03 | 1.071 | 1 | 1.05 | 1.109 | 1.1 |  | 0
WI* | 1 | 1.2 | 1.5 | 1 | 1 | 1.0157 | 1 | 1 | 0.01 | 0.5731
WY^ | 1 | 1 | 1 | 0.4951 | 1.469 | 1.0896 | 1 | 1 | 0.9475 | 0.6897
Midwest | 1 | 1.2 | 1.5 | 1 | 1 | 1 | 1 | 1 | 1 | 1
Average” | 2.1190 | 0.6682 | 0.4009 | 0.8488 | 0.7638 | 1.7757 | 1.0445 | 0.9273 | 0.4039 | 0.4569
Range” | 0.729 to 13.21 | –2.82 to 1.2 | –6.67 to 1.5 | 0.4372 to 1.071 | 0.15049 to 1.469 | 0.51 to 7 | 1 to 1.15 | 0.7 to 1.1 | 0.0 to 1.5803 | 0.0 to 1.38
COV (%)” | 174 | 174 | 588 | 27 | 47 | 108 | 6 | 15 | 139 | 97
*Calibration factors reported by Von Quintus et al. (2013) were different from the ones found in this literature search (references in table 4).(5)
^These values are not final.
”These statistics exclude CO and WY values.
COV = coefficient of variation.


Table 5 shows significant variation among the States in the βf1, βf2, βf3, βr1, βGB, and βSG calibration factors, as indicated by their high coefficients of variation. Therefore, it is important to determine the optimum values of these calibration factors so that predictions agree with local pavement performance. C1 and C2 also show some variation among different calibration efforts. More calibration factors were left at the global calibration value of 1.0 for the fatigue cracking model than for the permanent deformation model. This could be interpreted as a superior global model having been developed for fatigue cracking compared to rutting.
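
As a worked check of how the summary statistics in table 5 can be computed, the sketch below recalculates the average and COV for two of the rutting coefficients. It assumes that the statistics cover the 11 State rows other than CO and WY (that is, the National and Midwest rows are also left out) and that the COV uses the sample standard deviation. Neither assumption is stated in the table, and the variable and function names are introduced here for illustration.

```python
import statistics

# Values transcribed from table 5, excluding National, Midwest, CO, and WY
# (order: AR, AZ, IA, MO, MT, NC, OH, OR, UT, WA, WI).
beta_r1 = [1.2, 3.63, 1, 1.07, 7, 1.0175, 0.51, 1.48, 0.56, 1.05, 1.0157]
beta_r2 = [1, 1.1, 1.15, 1, 1.13, 1, 1, 1, 1, 1.109, 1]

def cov_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

for name, values in [("beta_r1", beta_r1), ("beta_r2", beta_r2)]:
    print(name, round(statistics.mean(values), 4), round(cov_percent(values)))
```

Under these assumptions, the calculation matches the tabulated averages (1.7757 and 1.0445) and COVs (108 and 6 percent) for βr1 and βr2 to the reported precision.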


The measurement error in the performance data records is known to greatly undermine the precision of calibrated MEPDG models.(18) Therefore, Hall et al. suggested a new output format for the performance models to predict ranges of distress instead of an exact value.(27)

To account for the effect of maintenance or rehabilitation activities, Li et al. suggested developing piecewise performance models for Washington State.(30) Pavement serviceable life was divided into three time periods of early age, rehabilitation, and overdistressed situations. They used regression to develop models for each time period.

In addition to the national research studies conducted to determine the global calibration factors for the permanent deformation model, some States have conducted their own laboratory tests in this regard.(31) For example, Jadoun and Kim used results of the triaxial repeated load permanent deformation test to determine the global k factors for 12 different HMA mixtures.(32)

The majority of these studies used gradient-based search methods such as the generalized reduced gradient (GRG) method to minimize the SSE between measured and predicted performance. These methods are local optimization techniques: their results depend on the seed values, and they can become trapped at a local minimum of error. Jadoun and Kim compared a genetic algorithm (GA) to the GRG method for calibration of rutting and fatigue cracking models for North Carolina.(32) They demonstrated that the GA method finds a lower, more nearly global minimum of SSE than the GRG method in predicting rutting. However, this superior optimization did not result in a reasonable match between predicted and measured fatigue cracking.
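
The seed dependence of local optimization can be illustrated with a toy problem. In the sketch below, plain gradient descent stands in for the GRG method, and the one-parameter objective is a hypothetical error surface with two minima, not an actual SSE surface from any calibration study. All names are assumptions made for illustration.

```python
def objective(x):
    """Hypothetical one-parameter error surface with two minima."""
    return (x * x - 1) ** 2 + 0.3 * x

def gradient(x):
    """Analytical derivative of the objective."""
    return 4 * x * (x * x - 1) + 0.3

def local_search(seed, rate=0.01, steps=5000):
    """Plain gradient descent: follows the slope downhill from the seed."""
    x = seed
    for _ in range(steps):
        x -= rate * gradient(x)
    return x

x_from_pos = local_search(2.0)    # ends in the shallower basin near x = +1
x_from_neg = local_search(-2.0)   # ends in the deeper basin near x = -1
print(x_from_pos, objective(x_from_pos))
print(x_from_neg, objective(x_from_neg))
```

Both runs converge, but only the run seeded at -2.0 reaches the lower of the two minima. A population-based method such as a GA samples many starting points at once, which is why it is less sensitive to this effect.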

It should be noted that the applied GA code is highly sensitive to the control parameters used to manipulate the evolutionary process of optimization. Therefore, there might be variants of this GA code that perform better, and the best set of control parameters needs to be determined for each optimization problem. Several evolution strategies (ESs) have been developed in the evolutionary computation literature that evolve and adapt control parameters along with optimization solutions and with respect to the objective function space. Application of these ESs would result in a more robust optimization.


All of the MEPDG calibration studies focus on minimization of a single-objective function (SSE) for all distress severity levels and all pavement ages in the considered network. Incorporating multiple sources of information might reveal unknown aspects of this calibration problem and result in more reasonable calibration coefficients. Multi-objective evolutionary algorithms (MOEAs) are derivative-free, global optimization heuristics that provide a set of tradeoff solutions independent of seed values.(33)
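
The idea of a set of tradeoff solutions can be made concrete with a nondominated (Pareto) filter. The sketch below shows only that filtering step, which is one building block of an MOEA and not a complete evolutionary algorithm. The function names and the candidate objective values are made up for illustration, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the objective vectors that no other candidate dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up (objective 1, objective 2) values for five candidate coefficient sets.
candidates = [(0.02, 0.15), (0.05, 0.09), (0.04, 0.16), (0.10, 0.08), (0.06, 0.12)]
front = pareto_front(candidates)
print(front)
```

Here three of the five candidates survive. None of them is best in both objectives, so the choice among them is left to engineering judgment rather than to the optimizer.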

MOEAs have been used in pavement management studies to optimize the allocation of resources to various treatment alternatives considering multiple criteria.(34–36) They have also been widely implemented in water resources research to design long-term groundwater monitoring schemes and to calibrate hydrologic models.(37,38) The multi-criteria framework provided by this kind of calibration has enabled recognition and handling of errors and uncertainties and detection of prominent behavioral solutions with acceptable tradeoffs in hydrologic modeling efforts within the past decade.(39)


The following are the key observations drawn from this literature review:


Based on the findings of this literature review, the following considerations corresponding to the above observations were recommended for the research approach:

Several scenarios can be devised for multi-objective formulation of calibration, all of which could overcome cognitive challenges and add to the knowledge of this problem. More than one set of multiple objectives will be considered to explore new aspects of the calibration problem. The idea is to optimize multiple objectives simultaneously. The following are the proposed sets of objectives up to this stage of the study:

In the primary multi-objective scenario, the mean and standard deviation of prediction error are minimized simultaneously to reduce the bias and STE at the same time. In this manner, the information from a single calibration run is fully used, and an additional round of computationally intensive calibration is avoided.
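
A minimal sketch of the two objectives in this scenario follows. It assumes that bias is measured as the absolute mean prediction error and that spread is the sample standard deviation of the error. The single multiplicative factor beta, the function name, and the data are hypothetical and serve only to show how one candidate is scored on both objectives.

```python
import statistics

def bias_and_spread(measured, predicted):
    """Return (|mean error|, sample standard deviation of error)."""
    errors = [p - m for p, m in zip(predicted, measured)]
    return abs(statistics.mean(errors)), statistics.stdev(errors)

measured = [0.10, 0.20, 0.30, 0.40]      # made-up measured rut depths, inches
uncalibrated = [0.20, 0.35, 0.50, 0.65]  # made-up predictions with factor 1.0

# Evaluate both objectives for a few trial values of a hypothetical
# multiplicative calibration factor applied to the predictions.
for beta in (0.5, 0.6, 0.7):
    predicted = [beta * p for p in uncalibrated]
    print(beta, bias_and_spread(measured, predicted))
```

In this made-up data set, the factor with the smallest bias (0.6) is not the factor with the smallest spread (0.7), which is the kind of tradeoff that a multi-objective search keeps visible instead of collapsing into one number.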

In the second multi-objective scenario for calibration of MEPDG performance models, the error in predicting the performance of pavements within different performance data sources will be used as separate objective functions to be minimized simultaneously. In addition to LTPP test sections, data from State PMS or APT facilities in the same region can be considered for this scenario. This scenario comprises an objective approach to incorporate different sources of data. Finally, a combination of two or more of the above scenarios could also be considered for the multi-objective calibration approach.
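
The second scenario can be sketched by computing one error measure per data source instead of pooling all sections into a single value. The example below uses SSE as the per-source objective, which is an assumption. The source labels follow the text, while the paired values and names are made up for illustration.

```python
def sse(measured, predicted):
    """Sum of squared errors between paired measured and predicted values."""
    return sum((p - m) ** 2 for m, p in zip(measured, predicted))

# Made-up (measured, predicted) rut depths, inches, grouped by data source.
sources = {
    "LTPP": ([0.10, 0.20, 0.30], [0.12, 0.18, 0.33]),
    "PMS": ([0.15, 0.25], [0.25, 0.30]),
    "APT": ([0.40, 0.50], [0.38, 0.52]),
}

# One objective per source, kept separate instead of pooled into a single SSE.
objectives = {name: sse(m, p) for name, (m, p) in sources.items()}
print(objectives)
```

Keeping the objectives separate prevents a source with many sections or large errors from dominating the fit, and it lets the tradeoff between sources be examined explicitly.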

[1] For consistency with how measurements are recorded in the LTPP database, all layer thickness measurements are presented in inches in this report. These measurements can be converted to centimeters: 1 inch = 2.54 cm.


