U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590


Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-99-207

Prediction of the Expected Safety Performance of Rural Two-Lane Highways


One of the most critical gaps in the management of highway safety is the lack of a reliable method for estimating the safety performance of an existing or planned roadway. Accident record systems have been developed and maintained by highway agencies to monitor the safety performance of their roadways, but these provide historical or retrospective data. Effective management requires a prospective viewpoint. Highway engineers need to know not what the safety performance of a roadway was in the recent or distant past, but what it is now and what it is likely to be in the future if particular proposed actions are taken.

In the past, when current or future safety performance estimates for a roadway were needed, they were developed by one of four approaches: averages from historical accident data, predictions from statistical models based on regression analysis, results of before-and-after studies, and expert judgments made by experienced engineers. Each of these methods, used alone, has significant weaknesses, which are described below. A new approach combining elements of each of these methods into an accident prediction algorithm is then described. This new accident prediction algorithm, developed specifically for application to rural two-lane highways, is the subject of this report.

Estimates from Historical Accident Data

Historical accident data are an important indicator of the safety performance of a roadway, but they suffer from the weakness of being highly variable. Given this high variability, it is difficult to estimate the long-term expected accident rate using a relatively short-duration sample of 1 to 3 years of accident data. This is especially true for rural roadway sections and intersections where accidents are very rare events and many locations experience no accidents, or at most one accident, over a period of several years. If a location has experienced no accidents in the past several years, it is certainly not correct to think that it will never experience an accident, yet the available data for that site alone provide an insufficient basis for estimating its long-term expected safety performance.
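
The difficulty with short samples can be illustrated with a small calculation. The sketch below assumes that accident counts at a site follow a Poisson distribution, which is a common modeling assumption rather than something established in this section, and it uses a hypothetical long-term rate of 0.2 accidents per year. The function names are illustrative only.

```python
import math


def poisson_pmf(count, mean):
    """Probability of exactly `count` events when the expected number is `mean`."""
    return math.exp(-mean) * mean ** count / math.factorial(count)


def prob_at_most_one(rate_per_year, years):
    """Probability that a site records zero or one accident over `years` years."""
    mean = rate_per_year * years
    return poisson_pmf(0, mean) + poisson_pmf(1, mean)


if __name__ == "__main__":
    # Hypothetical site: long-term expected rate of 0.2 accidents per year.
    for years in (1, 3):
        print(years, round(prob_at_most_one(0.2, years), 3))
```

Under these assumptions, a site with a nonzero long-term rate still has roughly a 55 percent chance of recording no accidents at all in 3 years, so a clean short-term record says little about its expected performance.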

Roadway improvement programs based on safety are often managed with accident surveillance systems that use accident records to identify high-accident locations. A high-accident location is a roadway section or intersection identified because it experienced more than a specified threshold number of accidents during a recent period (typically 1 to 3 years). Each high-accident location is investigated by the engineering staff of the responsible highway agency and, at locations where a particular accident pattern is clearly evident and an appropriate countermeasure is feasible, an improvement project may be programmed and constructed. The decisionmaking concerning such projects often involves a benefit-cost or cost-effectiveness calculation based on the expected percentage reduction in accidents from the level of recent accident experience found by the accident surveillance program. However, both statistical theory and actual experience show that, because of the random nature of accidents, locations with high short-term accident experience are likely to experience fewer accidents in the future even if no improvement is made. This phenomenon, known as regression to the mean, makes it difficult both to identify potential problem locations through accident surveillance and to estimate the potential (or actual) effectiveness of improvements made at such locations.
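
Regression to the mean can be reproduced in a short simulation. The sketch below is a simplified illustration, not a procedure from this report: it assumes Poisson-distributed counts, gives every site the same hypothetical true expected count in both periods, and flags sites whose "before" count meets a threshold. All names and parameter values are assumptions chosen for the example.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_regression_to_mean(n_sites, true_mean, threshold, seed=0):
    """Compare before and after counts at sites flagged as high-accident.

    Every site has the same true expected count in both periods and no
    improvement is made, so any drop at flagged sites is purely random.
    Returns (number flagged, mean before count, mean after count).
    """
    rng = random.Random(seed)
    before = [sample_poisson(rng, true_mean) for _ in range(n_sites)]
    after = [sample_poisson(rng, true_mean) for _ in range(n_sites)]
    flagged = [i for i in range(n_sites) if before[i] >= threshold]
    if not flagged:
        return 0, 0.0, 0.0
    mean_before = sum(before[i] for i in flagged) / len(flagged)
    mean_after = sum(after[i] for i in flagged) / len(flagged)
    return len(flagged), mean_before, mean_after


if __name__ == "__main__":
    # Hypothetical setup: 10,000 sites, true mean of 1 accident per period,
    # flagged when 3 or more accidents occur in the "before" period.
    print(simulate_regression_to_mean(10000, 1.0, 3, seed=1))
```

Because the flagged sites were selected for their high counts, their average "after" count falls back toward the true mean even though nothing was changed. A naive before-and-after comparison would credit that drop to whatever improvement had been installed.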

Estimates from Statistical Models

Safety analysts have, for many years, applied statistical techniques to develop models to predict the accident experience of roadways and intersections. Such models are developed by obtaining a database of accident data and roadway characteristics (e.g., traffic volumes, geometric design features, and traffic control features) from highway agency records, selecting an appropriate functional form for the model, and using regression analysis to estimate the values of the coefficients or parameters in that model. Historically, most such models were developed with multiple regression analysis. Recently, researchers have begun to use Poisson and negative binomial regression analyses, which are theoretically better suited to accident data based on small counts (i.e., zero or nearly zero accidents at many sites). However, regardless of the statistical technique used, accident prediction models never quite seem to meet the expectations of their developers and potential users.

Regression models are very accurate tools for predicting the expected total accident experience for a location or a class of locations, but they have not proved satisfactory in isolating the effects of individual geometric or traffic control features. There is a strong temptation to interpret each coefficient in a regression model as representing the true effect of an incremental change in its associated roadway feature. This is a reasonable assumption in some cases, but not in others. A key drawback of regression models is that they are based on statistical correlations between roadway characteristics and accidents that do not necessarily represent cause-and-effect relationships. Furthermore, if the independent variables in the model are strongly correlated to one another, it is difficult to separate their individual effects. In addition, if a variable in the model is strongly correlated to an important variable that happens not to be included in the available database, the coefficient of the variable in the model may represent the effect of the unavailable variable rather than its own effect. Thus, the value of the coefficient of a particular geometric feature may be a good estimate of the actual effect of that feature on safety, or it may be merely an artifact of, or a surrogate for, its correlation to other variables.
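
The surrogate effect described above can be demonstrated with synthetic data. The following sketch is a deliberately simplified illustration using ordinary least squares with a single predictor, not the Poisson or negative binomial models discussed in this section. The data are generated, not observed, and every name and parameter value is an assumption made for the example.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate_omitted_variable(n_sites=5000, seed=0):
    """Return the fitted slope for a feature whose true effect is zero."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
    # The observed feature is correlated with the hidden variable...
    observed = [h + rng.gauss(0.0, 0.5) for h in hidden]
    # ...but only the hidden variable actually drives the outcome.
    outcome = [2.0 * h + rng.gauss(0.0, 1.0) for h in hidden]
    return ols_slope(observed, outcome)


if __name__ == "__main__":
    print(simulate_omitted_variable())
```

The fitted slope for the observed feature is clearly nonzero even though, by construction, that feature has no effect on the outcome. The coefficient is picking up the influence of the variable that was left out of the model.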

As an example, consider the following negative binomial regression model developed in a recent FHWA study to predict the accident experience at urban, four-leg intersections with STOP control on the minor road:(1)

$$
Y = e^{-5.073}\, X_1^{0.635}\, X_2^{0.294}\, \exp(-0.969 X_3)\, \exp(-0.518 X_4)\, X_5^{-0.091}\, \exp(0.340 X_6)\, \exp(0.087 X_7)\, \exp(-0.331 X_8)\, \exp(-0.175 X_9) \qquad (1)
$$

where:

Y = expected number of total multiple-vehicle accidents in a 3-year period;
X1 = average daily traffic on major road (veh/day);
X2 = average daily traffic on minor road (veh/day);
X3 = 1 if left turns are prohibited on one or more major-road approaches; 0 otherwise;
X4 = 1 if no access control is present along the major road approaches; 0 otherwise;
X5 = average lane width on major road (ft)*;
X6 = 1 if major road has three or fewer through lanes in both directions of travel combined; 0 otherwise;
X7 = 1 if major road has four or five through lanes in both directions of travel combined; 0 otherwise;
X8 = 1 if there is no channelization for free right turns; 0 otherwise; and
X9 = 1 if the intersection has no lighting; 0 otherwise.

* Average lane width in this equation is specified in conventional units of measure (feet). See the explanation in the section entitled Units of Measure in this report.
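
As a minimal sketch, equation (1) can be evaluated directly from the variable definitions above. The function and argument names are hypothetical, the input values in the usage example are invented for illustration, and the X4 coefficient is taken as -0.518, which assumes the decimal point follows the leading zero.

```python
import math


def predict_accidents(adt_major, adt_minor, left_turn_prohibited, no_access_control,
                      lane_width_ft, three_or_fewer_lanes, four_or_five_lanes,
                      no_right_turn_channelization, no_lighting):
    """Expected total multiple-vehicle accidents in a 3-year period, per equation (1).

    Indicator arguments are 1 or 0, as defined for X3, X4, and X6 to X9.
    """
    log_y = (
        -5.073
        + 0.635 * math.log(adt_major)
        + 0.294 * math.log(adt_minor)
        - 0.969 * left_turn_prohibited
        - 0.518 * no_access_control  # X4 coefficient, assumed decimal placement
        - 0.091 * math.log(lane_width_ft)  # lane width in feet, as in the source
        + 0.340 * three_or_fewer_lanes
        + 0.087 * four_or_five_lanes
        - 0.331 * no_right_turn_channelization
        - 0.175 * no_lighting
    )
    return math.exp(log_y)


if __name__ == "__main__":
    # Invented inputs: 10,000 and 1,000 veh/day, 12-ft lanes, two through lanes,
    # access control present, no right-turn channelization.
    lit = predict_accidents(10000, 1000, 0, 0, 12, 1, 0, 1, 0)
    unlit = predict_accidents(10000, 1000, 0, 0, 12, 1, 0, 1, 1)
    print(lit, unlit, unlit / lit)
```

Because each indicator variable enters through an exponential term, switching one from 0 to 1 multiplies the prediction by exp(coefficient) regardless of the other inputs. For the lighting indicator X9, that factor is exp(-0.175), or about 0.84.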

This model, overall, provides quite reliable predictions of the total accident experience of urban, four-leg, STOP-controlled intersections. In addition, the coefficients of many of the terms appear to reasonably represent the expected effects of their associated variables. However, two of the variables in the model have coefficients that are in a direction opposite to that which safety engineers normally presume for those variables. Specifically, the negative coefficient of the access control factor (X4) implies that more accidents would be expected at an intersection with access-controlled approaches than at an intersection without access-controlled approaches. Furthermore, the negative coefficient of the lighting factor (X9) implies that lighted intersections have more accidents than unlighted intersections. Such interpretations are unreasonable. The negative signs for the access control and lighting variables in equation (1) could result merely from correlations of access control and lighting with the variables already accounted for in the model, such as traffic volumes, or with other important variables that are not included in the model because no data for those variables are available. It is also possible that lighting has been installed as an accident countermeasure at high-accident locations, so that lighting appears to be associated with locations that have more accidents. Thus, while regression equations may provide useful predictive models, their coefficients may be unreliable indicators of the incremental effects of individual roadway features on safety.

Estimates from Before-and-After Studies

Before-and-after studies have been used for many years to evaluate the effectiveness of highway improvements in reducing accidents. However, most before-and-after studies reported in the literature have design flaws such that the study design cannot account for the effects of regression to the mean. Therefore, the potential user of the before-and-after study results cannot be certain whether they represent the true effectiveness of the potential improvement in reducing accidents or an overoptimistic forecast that is biased by regression to the mean.

Safety experts are generally of the opinion that, if the potential bias caused by regression to the mean can be overcome, a before-and-after study may provide the best method to quantify the safety effects of roadway geometric and traffic control features. Hauer(2) has developed a new approach that remedies the problem of regression to the mean that has, in the past, caused before-and-after studies to provide unreliable results. However, very few of these well-designed before-and-after studies have been conducted.

Estimates from Expert Judgment

Expert judgment, developed from many years of experience in the highway safety field, can have an important role in making reliable safety estimates. Experts may have difficulty in making quantitative estimates with no point of reference, but experts are usually very good at making comparative judgments (e.g., A is likely to be less than B, or C is likely to be about 10 percent larger than D). Thus, experts need a frame of reference based on historical accident data, statistical models, or before-and-after study results to make useful judgments.

A New Approach

This report presents a new approach to accident prediction that combines the use of historical accident data, regression analysis, before-and-after studies, and expert judgment to make safety predictions that are better than those that could be made by any of these four approaches alone. The recommended approach to accident prediction has its basis in published safety literature, including both before-and-after evaluations and regression models, is sensitive to the geometric features that are of greatest interest to highway designers, and incorporates judgments made by a broadly based group of safety experts.

This report shows how this new approach can be implemented in an accident prediction algorithm for rural two-lane highways. This same approach can potentially be adapted in the future to rural multilane highways, urban arterial streets, and rural or urban freeways.

The Federal Highway Administration (FHWA) is currently developing an Interactive Highway Safety Design Model (IHSDM) for use by highway designers to incorporate more explicit consideration of safety into the highway design process. IHSDM will consist of a set of computer tools that can work interactively with the Computer-Aided Design (CAD) systems used by many agencies to design highway improvements. The components of the IHSDM will include a Crash Prediction Module (CPM), Roadside Safety Module (RSM), Intersection Diagnostic Review Module (DRM), Design Consistency Module (DCM), Policy Review Module (PRM), Driver/Vehicle Module (D/VM), and Traffic Analysis Module (TAM). Initial priority in IHSDM development is being given to evaluation of rural two-lane highways.

The accident prediction algorithm presented in this report has been developed for incorporation in the IHSDM as the CPM for rural two-lane highways, but is also suitable for use as a stand-alone model to predict the safety performance of rural two-lane highways. This report documents how the accident prediction algorithm was developed and how it will function within the IHSDM.

Organization of this Report

The remainder of this report is organized as follows. Section 2 presents an overview of the accident prediction algorithm and its two primary components, base models and accident modification factors. A more detailed description of the base models and accident modification factors is presented in sections 3 and 4, respectively. Section 5 presents the results of sensitivity analyses conducted with the accident prediction algorithm, and section 6 explains how the accident prediction algorithm will be implemented within the IHSDM. The conclusions and recommendations of the report are presented in section 7 and a list of references is presented in section 8.

Appendix A identifies the members of the expert panels that developed the accident modification factors. Appendix B documents the development of the base models. Appendix C presents a calibration procedure that can be used by any highway agency to adapt the accident prediction algorithm to its own local conditions and to the safety performance of its highways. Appendix D documents the definitions of the roadside hazard ratings used in the accident prediction algorithm to represent roadside design features.

Units of Measure

The text of this report presents all measured quantities in SI (metric) units with equivalent quantities in conventional (English) units following in parentheses. However, virtually all of the research on which the report is based was conducted using conventional units of measure. Therefore, all equations in the report, like equation (1) above, use conventional units. A metric conversion chart is included for the convenience of readers. The software developed to implement the accident prediction algorithm will allow users to provide input and obtain output at their option in either SI or conventional units.


Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101