U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590


Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-03-037
Date: May 2005

Validation of Accident Models for Intersections

FHWA Contact: John Doremi,
HRDI-10, (202) 493-3052, John.doremi@dot.gov



The research described in this report consists of two separate yet complementary activities: validation and recalibration of crash models for rural intersections. Both activities were conducted with one overriding research objective:

Given existing database limitations, make marginal improvements to an existing set of statistical models for predicting crashes at two- and four-lane intersections, with the primary intent to provide robust predictive models for use in the Interactive Highway Safety Design Model (IHSDM).

The five types of intersection models addressed in this research effort include:

  • Type I: Three-legged stop-controlled intersections of two-lane roads.
  • Type II: Four-legged stop-controlled intersections of two-lane roads.
  • Type III: Three-legged stop-controlled intersections with two lanes on minor and four lanes on major roads.
  • Type IV: Four-legged stop-controlled intersections with two lanes on minor and four lanes on major roads.
  • Type V: Signalized intersections of two-lane roads.

The models that are the focus of this research are presented in three different Federal Highway Administration (FHWA) reports: Vogt and Bared (Types I and II);(1) Vogt (Types III, IV, and V);(2) and Harwood et al. (Types I, II, and V).(3) Each report presents several variants of the models for each type of intersection. The first two reports include models for total as well as injury accidents and present what are referred to as full models. The Harwood et al. report presents base models for Types I, II, and V intersections.(3) These base models included variables that were statistically significant at the 15 percent level and form the backbone of an algorithm for predicting accidents at intersections that differ in one or more features from the specified base conditions. Specifically, accident modification factors (AMFs) for the features of interest are applied to the base model prediction to estimate accidents per unit of time for a specific intersection. This algorithm is intended for use in the Crash Prediction Module of FHWA's IHSDM. The anticipated practical application of these models has motivated the research directions taken throughout this investigation.
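The way AMFs adjust a base model prediction can be sketched in a few lines. This is a minimal illustration, assuming the AMFs combine multiplicatively with the base prediction; the function name `predict_crashes` and all numeric values are hypothetical and are not taken from the report.

```python
from math import prod


def predict_crashes(base_prediction, amfs):
    """Scale a base-model prediction (crashes per year) by a set of AMFs.

    Assumption: each AMF is a positive multiplier, equal to 1.0 when the
    intersection matches the base condition for that feature.
    """
    if base_prediction < 0:
        raise ValueError("base prediction must be non-negative")
    if any(amf <= 0 for amf in amfs):
        raise ValueError("AMFs must be positive")
    # prod() of an empty list is 1, so an intersection at base
    # conditions simply keeps its base prediction.
    return base_prediction * prod(amfs)


# Hypothetical example: base prediction of 2.0 crashes/year, one feature
# that lowers expected crashes (0.75) and one that raises them (1.2).
estimate = predict_crashes(2.0, [0.75, 1.2])
print(f"adjusted estimate: {estimate:.2f} crashes/year")
```

An AMF below 1.0 indicates fewer expected crashes than under the base condition, and an AMF above 1.0 indicates more.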

The data in support of this research were derived from three sources:

  1. The original data used for the calibration of the main models for total accidents were obtained from the researchers who developed those models.
  2. Highway Safety Information Systems (HSIS) data were obtained for additional years for the same intersections used in the calibration and for injury accidents for the original and additional years.
  3. An independent validation data set of intersections and their relevant crash, traffic, and geometric data in Georgia was specially assembled for this project.(4)

The research team faced a number of challenges while conducting this research, including data collection, independent variable characteristics, and the models' intended end-use:

  1. The observational data on which the statistical models are based suffered from intercorrelation.
  2. The interactions among variables had to be carefully considered in model estimation.
  3. The need to forecast crashes across States posed significant difficulties.
  4. The observational data limited the amount of variation in independent variables, reducing overall model precision.
  5. Resource and data reliability restrictions precluded assembling a sufficiently large, randomly selected, fully comprehensive data set on which to estimate statistical models.
  6. Incongruencies between data sets across States and across time periods posed serious challenges.

Despite these challenges, the research team conducted a model validation and then recalibrated the five intersection models.


The four sets of validation activities were:

  1. Re-estimation of the model coefficients using the original data. This validation activity was used to determine the reproducibility of the published results and to ensure an "equivalent" launching point for all validation and calibration activities. This activity represented a logical starting point for the research effort and was successful in that modeling results were reproduced satisfactorily.
  2. Validation of the models against additional years of accident data for the same intersections used in the calibration. Because the crash models were developed as direct inputs into the IHSDM's Accident Analysis Module, the models will be used by highway agencies to estimate the safety performance of an existing or proposed roadway. Therefore, the models should be able to forecast crashes across time and space. This validation activity was used to assess the models' ability to forecast crashes across time, determining the models' temporal capability and stability.
  3. Validation of the models against Georgia data. This validation activity assessed the models' ability to forecast crashes across space. This activity tested numerous aspects of model prediction: comparing data from different jurisdictions; capturing variables that describe regional and jurisdictional differences; and assessing the consistency of crash processes across space.
  4. Validation of the Accident Prediction Algorithm. While this research's primary focus was to validate crash models, validating the Accident Prediction Algorithms based on these models was also important.

Two basic sets of performance tests were employed. First, the models were re-estimated using the same variables and functional forms as those published in the original reports; the parameters for the original and re-estimated models were then compared, using a level of alpha = 0.10 to establish statistical significance. Second, the model (or algorithm) was used to predict accident frequencies at individual intersections, from which the following summary statistics were calculated:

  • Pearson product-moment linear correlation coefficients.
  • Mean Prediction Bias (MPB); MPB/year.
  • Mean Absolute Deviation (MAD); MAD/year.
  • Mean Squared Error (MSE); MSE/year².
  • Mean Square Prediction Error (MSPE); MSPE/year².
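The summary statistics listed above can be computed from paired observed and predicted crash counts. The sketch below uses commonly cited definitions, which are assumptions here rather than formulas quoted from this summary: MPB is the mean of (predicted minus observed), MAD is the mean absolute difference, MSE divides the squared error by n minus the number of model parameters (estimation data), and MSPE divides it by n (validation data). The function name `validation_stats` and the sample values are hypothetical.

```python
from math import sqrt


def validation_stats(observed, predicted, n_params=0):
    """Return Pearson r, MPB, MAD, MSE, and MSPE for paired crash counts."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    if n - n_params <= 0:
        raise ValueError("n must exceed the number of model parameters")

    residuals = [p - o for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)

    # Pearson product-moment correlation, computed from deviations.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")

    return {
        "pearson_r": cov / sqrt(var_o * var_p),
        "mpb": sum(residuals) / n,                  # > 0 means overprediction
        "mad": sum(abs(r) for r in residuals) / n,
        "mse": sse / (n - n_params),                # estimation data
        "mspe": sse / n,                            # validation data
    }


# Hypothetical counts at four intersections.
stats = validation_stats([1, 0, 3, 2], [1.5, 0.5, 2.0, 2.0], n_params=2)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, the per-year forms follow by dividing MPB and MAD by the number of years of data, and MSE and MSPE by the square of that number.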

The details of validation activities 1 through 4 are presented in sections 3.4 through 3.7 respectively, while the results are discussed in section 3.8.


Model recalibration was focused on improving the existing set of intersection crash models through use of an improved and expanded database and through lessons learned in the validation and recalibration activities.

For each of the five intersection types, the research team developed and/or refined three different sets of models, described in detail in chapter 3. The first type is Annual Average Daily Traffic (AADT) Models, which represent base models for predicting crashes as a function of major and minor road AADT. The analytical results of these models can be found in subsequent sections of this report. The second type is Full Models. These statistical models forecast crashes as a function of a relatively large set of independent variables. Details of the Full Models can be found in section 3.4 of this report. The third type is AMFs. These factors, better described as countermeasure correction factors, represent our best efforts to estimate the effect of geometric countermeasures on safety relative to base model predictions. AMF details can be found in section 3.5 of this report.
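To make the idea of an AADT-only base model concrete, the sketch below evaluates a log-linear form that is common in crash modeling: crashes per year = exp(a) × (major AADT)^b1 × (minor AADT)^b2. Both the functional form and the coefficient values are assumptions for illustration; the recalibrated models and their estimated coefficients are given in the body of the report, not here.

```python
from math import exp, log


def aadt_model(aadt_major, aadt_minor, intercept, b_major, b_minor):
    """Predicted crashes/year from major- and minor-road AADT.

    Assumed form: exp(intercept) * aadt_major**b_major * aadt_minor**b_minor,
    evaluated on the log scale.
    """
    if aadt_major <= 0 or aadt_minor <= 0:
        raise ValueError("AADT values must be positive")
    return exp(intercept
               + b_major * log(aadt_major)
               + b_minor * log(aadt_minor))


# Hypothetical placeholder coefficients, not values from the report.
low = aadt_model(5000, 500, intercept=-9.0, b_major=0.6, b_minor=0.5)
high = aadt_model(10000, 500, intercept=-9.0, b_major=0.6, b_minor=0.5)
print(f"major AADT 5,000:  {low:.3f} crashes/year")
print(f"major AADT 10,000: {high:.3f} crashes/year")
```

With positive exponents, predicted crashes rise with traffic volume on either road, and an exponent below 1 makes that increase less than proportional.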

Sensitivity analyses (tables of AMFs as a function of AADT and other factors) are provided in section 3.6.


The research supported the proposed IHSDM accident prediction algorithm. An updated set of base models for predicting crashes using only AADT is recommended (see Summary, Discussion, and Conclusions section and Table 235). The updated statistical models are based on larger sample sizes and, in some cases, include slightly modified sets of independent variables compared to the originally estimated models. AMFs should be selected on a case-by-case basis and should be updated continually to improve the predictive ability of the crash models. AMFs derived from expert opinion should be replaced with the results of state-of-the-practice before-after studies as time progresses and research allows. If expert opinion accurately reflects safety conditions, then carefully conducted future studies should reveal general agreement with expert expectation. When expert opinions are not confirmed over time, empirical results should replace them. Full regression models are recommended for crash forecasts and find logical applications in the Highway Safety Manual and Safety Analyst (see Summary, Discussion, and Conclusions section and Table 236).


Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101